Solr would work fine for this. Your PDF files would have to be interpreted
by Tika; see the Data Import Handler, FileListEntityProcessor, and
TikaEntityProcessor. I don't think Nutch is the right tool here.
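
For anyone wiring those three pieces together, here is a rough sketch of what the Data Import Handler config might look like, generated from Python so it is easy to parameterize. It is only an illustration under assumptions: the helper name build_dih_config, the entity names "files" and "pdf", and the schema fields "id" and "text" are made up for the example, and the directory you pass in is whatever holds your PDFs. Check the attribute names against the DIH documentation for your Solr version before relying on it.

```python
import xml.etree.ElementTree as ET


def build_dih_config(base_dir, file_pattern=r".*\.pdf"):
    """Return a DIH data-config (XML string) that walks base_dir and
    sends each matching file through Tika. Names are illustrative."""
    root = ET.Element("dataConfig")
    # BinFileDataSource hands the raw file bytes to Tika.
    ET.SubElement(root, "dataSource", {"type": "BinFileDataSource", "name": "bin"})
    document = ET.SubElement(root, "document")

    # Outer entity: only lists files, so it is not the root entity.
    files = ET.SubElement(document, "entity", {
        "name": "files",
        "processor": "FileListEntityProcessor",
        "dataSource": "null",
        "baseDir": base_dir,
        "fileName": file_pattern,  # regular expression matched against file names
        "recursive": "true",
        "rootEntity": "false",
    })
    # Assumed schema field "id": use the file's absolute path as the key.
    ET.SubElement(files, "field", {"column": "fileAbsolutePath", "name": "id"})

    # Inner entity: one Solr document per file, body text extracted by Tika.
    pdf = ET.SubElement(files, "entity", {
        "name": "pdf",
        "processor": "TikaEntityProcessor",
        "dataSource": "bin",
        "url": "${files.fileAbsolutePath}",
        "format": "text",
    })
    # Assumed schema field "text" receives the extracted body.
    ET.SubElement(pdf, "field", {"column": "text", "name": "text"})

    return ET.tostring(root, encoding="unicode")
```

Printing build_dih_config("/path/to/pdfs") gives XML you could save as the handler's data-config file. The nesting is the point: the outer entity enumerates files and the inner one does the Tika extraction for each.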
You'll want highlighting and a couple of other things.
You'll spend some
Excellent, thanks for the confirmation, Erick. I've started working with
Solr (just getting my feet wet at this point).
-Matt
On 07/20/2011 05:38 PM, Erick Erickson wrote:
Solr would work fine for this. Your PDF files would have to be interpreted
by Tika; see the Data Import Handler,
Greetings,
I'm interested in having a server-based personal document library with
a few specific features and I'm trying to determine what the most
appropriate tools are to build it.
I have the following content which I wish to include in the archive:
1. A smallish collection of technical