On Wed, Feb 20, 2013 at 2:33 PM, Nathan Tallman wrote:
> @Péter: The VuFind solution I mentioned is very similar to what you use
> here. It uses Aperture (although soon to use Tika instead) to grab the
> full-text and shoves everything inside a Solr index. The import is managed
> through a PHP scr
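
For readers unfamiliar with the approach described in the quoted message above (extract the PDF text, then put it in a Solr index), here is a rough sketch of one way it could look. It is not VuFind's import script, which the thread does not show. It assumes a Solr instance with the extracting request handler (Solr Cell, which uses Tika) enabled at /update/extract; the base URL, the core name "pdfs", and the document id are all hypothetical.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base, core, doc_id, commit=True):
    """Build the URL for posting one PDF to Solr's extracting handler.

    solr_base, core, and doc_id are placeholders for illustration;
    the handler path /update/extract must be enabled in the Solr config.
    """
    params = {
        "literal.id": doc_id,  # stored as the document's id field
        "commit": "true" if commit else "false",
    }
    return f"{solr_base.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr instance and core name.
    print(build_extract_url("http://localhost:8983/solr", "pdfs",
                            "finding-aid-001.pdf"))
```

The PDF itself would then be sent as the body of an HTTP POST to that URL, for example with curl or any HTTP client.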
As for the Google Custom Search solution, I'd add that it sometimes
yields weird results: for instance, we indexed a site and, for a given
search term, Google says "about 16 results" (we have 10 hits displayed
on the page), and when we click on page 2, it says "about 12 results"
(showing the
Yes, Google Custom Search is not too bad, if your PDFs are sorted
meaningfully by directory, and if you submit a site map to Google for more
complete indexing. You can use Xenu to make a site map, put the site map
online as a static XML file, and then use Google Webmaster Tools to pass
the locatio
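
As an illustration of the sitemap step described above, here is a minimal sketch that writes a static sitemap XML file from a list of PDF URLs. It is an alternative to generating the file with Xenu, not a description of what Xenu produces. The example.org URLs and the function name are made up; the XML namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical PDF locations linked from finding aids.
    pdf_urls = [
        "http://example.org/findingaids/collection1/box1.pdf",
        "http://example.org/findingaids/collection1/box2.pdf",
    ]
    with open("sitemap.xml", "w", encoding="utf-8") as handle:
        handle.write(build_sitemap(pdf_urls))
```

The resulting sitemap.xml can be placed on the web server as a static file and its location submitted to the search engine.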
@Jason and @Michele: I'd rather stay away from a Google solution, because
they don't index everything. Our sitemap is submitted nightly, and out of
about 6000 URLs only 1500 are indexed. I can't make sure Google indexes the
PDFs, or be sure that it always will. (If I'm
misunderstandin
an
> Sent: Wednesday, February 20, 2013 12:54 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: [CODE4LIB] Providing Search Across PDFs
>
> My institution is looking for ways to provide search across PDFs through our
> website. Specifically, PDFs linked from finding aids. Ideally sear
What about just a Google site search?
-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nathan
Tallman
Sent: Wednesday, February 20, 2013 12:54 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Providing Search Across PDFs
My institution is
This might not fit your need exactly, but a Google Custom Search (
http://www.google.com/cse/) should do the job. You can have the Custom
Search only index a given directory, or only PDFs, whichever is more useful.
Jason
On Wed, Feb 20, 2013 at 12:53 PM, Nathan Tallman wrote:
> My institution
My institution is looking for ways to provide search across PDFs through
our website. Specifically, PDFs linked from finding aids. Ideally searching
within a collection's PDFs or possibly across all PDFs linked from all
finding aids.
We do not have a CMS or a digital repository. A digital reposito