Should I make PDFs searchable on my site?
Making the text inside PDFs searchable through your website's internal search can harm your site's user experience.
While it's good to use search results to direct users to a resource that is relevant to them, making a PDF searchable may not be the best way of doing this.
The PDF appears in search results without context
When a user enters search terms into a search bar, they are generally taken to a search results page that shows teasers of content related to the search. A teaser for a PDF often contains less useful information than a teaser for a web page.
On a Sector site, search results include some preview text which highlights snippets of content related to the search term.
This gives the user context for the search result, so they understand why it's being shown to them and whether it could be what they're looking for. If the search result is a PDF (especially an image-only PDF, such as a scanned document), its content might not be pulled into the search result preview. The user then can't see how the PDF in the search results relates to their search.
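To illustrate why a scanned PDF can end up with an empty preview, here is a minimal Python sketch of how a search teaser might be built. It is not how Sector generates previews: the function name `build_teaser`, the `radius` parameter and the `**` highlight markers are all assumptions made for this example.

```python
import re


def build_teaser(text: str, term: str, radius: int = 40) -> str:
    """Return a short excerpt around the first match of `term`, with the match marked.

    Hypothetical helper for illustration only; not part of Sector or any search library.
    """
    if not text.strip():
        # An image-only (scanned) PDF has no extractable text, so there is nothing to preview.
        return ""
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return ""
    start = max(match.start() - radius, 0)
    end = min(match.end() + radius, len(text))
    excerpt = (
        text[start:match.start()]
        + "**" + match.group(0) + "**"
        + text[match.end():end]
    )
    # Mark the excerpt as truncated where it does not reach the edge of the text.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + excerpt + suffix


web_page_text = "Apply for a building consent before you start any construction work."
scanned_pdf_text = ""  # assumed: no text layer could be extracted from the scan

print(build_teaser(web_page_text, "consent"))
print(repr(build_teaser(scanned_pdf_text, "consent")))
```

The web page produces an excerpt with the search term marked, while the scanned PDF produces an empty string, which is the "result without context" problem described above.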
The user loses even more context when they open the PDF
In the search results, users can be directed to a PDF which has the relevant keywords in it, but that's as far as you can direct them.
...the user is taken away from the website when they open a PDF. This means they lose the context of the website and its navigation, making it harder for them to go back if they need to.
This is even more of an issue if the user goes directly to the PDF from a search engine. Without the context of the site the PDF is hosted on, they can’t easily browse to related content or search the website.
When the user clicks on the search result and opens the PDF, they are taken to the first page and it might not be clear how the PDF relates to their search term. It's difficult for the user to know where in the PDF their search keywords appear, so they still end up having to scroll through much of the document.
PDF pages lack navigation bars and other apparatus that might help users move within the information space and relate to the rest of the site. Because PDF documents can be very big, the inability to easily navigate them takes a toll on users.
That's not to say you should avoid PDFs entirely...
If you want to use PDFs on your site (and allow users to find them using your site search), a good option is to make the content available in both HTML and PDF format, so that the web page version of the content is what gets searched.
Avoid or minimize using PDF files as a sole source of online information... Alternative file formats, such as Word files or web pages, should also be considered in addition to PDF.
For example, Sector Resource allows you to create a resource page and attach a variety of files as 'Available formats and related files'. You can see this in action on a sample resource on the Sector demo.
Want to learn more about online PDFs and resources?
Read related Sector documentation:
Read about best practice for online PDFs:
- Avoid PDF for On-Screen Reading - Nielsen Norman Group
- Why GOV.UK content should be published in HTML and not PDF - Government Digital Service UK
- Accessible PDF files on the Web - Western Washington University
- PDF Issues and Recommendations - Penn State University