On Mon, Apr 22, 2013 at 11:45:43AM +0100, Mike Whitaker wrote:
>On a similar subject, what PDF (or even text, assuming I can find something to
>extract the text on a page by page basis) indexing solutions are there out
>there in Perl?
Run pdftotext, then throw the text at a generic indexing package. I keep meaning to do something with Plucene.
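For the page-by-page part, a rough sketch of what I mean. It relies on pdftotext ending each page of its output with a form feed, so splitting on \f gives one chunk per page. The index here is just a toy hash of word => page numbers standing in for a real indexing package; it is not Plucene's API, and split_pages and build_index are names made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split pdftotext output into pages. pdftotext ends each page with a
# form feed (\f), so splitting on it yields one string per page.
# (split_pages is a hypothetical helper name.)
sub split_pages {
    my ($text) = @_;
    my @pages = split /\f/, $text;
    return @pages;
}

# Build a toy inverted index: lowercased word => sorted page numbers.
# This stands in for a real indexing package; it is NOT Plucene.
sub build_index {
    my (@pages) = @_;
    my %seen;
    for my $i (0 .. $#pages) {
        # Record every word on this page against its 1-based page number.
        while ($pages[$i] =~ /(\w+)/g) {
            $seen{ lc($1) }{ $i + 1 } = 1;
        }
    }
    my %index;
    for my $word (keys %seen) {
        $index{$word} = [ sort { $a <=> $b } keys %{ $seen{$word} } ];
    }
    return \%index;
}

# Only run pdftotext when a PDF path is given on the command line.
if (@ARGV) {
    my $pdf = shift @ARGV;
    # List-form pipe open avoids the shell; "-" sends the text to stdout.
    open my $fh, '-|', 'pdftotext', $pdf, '-'
        or die "cannot run pdftotext: $!";
    my $text = do { local $/; <$fh> };
    close $fh or die "pdftotext failed on $pdf\n";

    my $index = build_index(split_pages($text // ''));
    print "$_: @{ $index->{$_} }\n" for sort keys %$index;
}
```

Save it under whatever name you like and pass it a PDF path; it prints each word followed by the pages it appears on. Swapping the hash for a proper indexer should only mean changing build_index.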