Re: Index entire filesystem

Erik Hatcher Wed, 05 Nov 2003 03:01:25 -0800

On Wednesday, November 5, 2003, at 03:51 AM, Marcel Stor wrote:

Hi all,

I'm thinkin' about writing a search tool for my filesystem. I know such things exist already, but programming it myself is much more fun ;-) So, I would have Lucene crawl through my filesystem and pass each file to an appropriate indexer (PDF -> PDFbox, etc.). Yes, I run a Windows system and would depend on the file extension to distinguish the file type. Is this a good idea in general? Is there a list of available indexers for the different file types? Any other comments are also welcome.

The general idea (limited to .txt files intentionally) is included in this code:

http://today.java.net/pub/a/today/2003/07/30/LuceneIntro.html

The Ant <index> task in the jakarta-lucene-sandbox CVS repository has a document handler interface that is designed to allow for pluggability. You named the PDF pieces, and there is POI for dealing with Office documents.
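
To make the extension-based dispatch concrete, here is a minimal sketch of the crawl-and-dispatch part only. It is not the sandbox's document handler interface and it does not call Lucene, PDFBox, or POI: the class ExtensionDispatchCrawler and the TextExtractor interface are hypothetical names made up for illustration, and the code assumes Java 8 or later. A real tool would hand the extracted text to a Lucene IndexWriter instead of collecting it in a map.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Stream;

class ExtensionDispatchCrawler {

    /** Hypothetical handler: turns one file into plain text for indexing. */
    interface TextExtractor {
        String extract(Path file) throws IOException;
    }

    // Maps a lower-cased file extension (without the dot) to its extractor.
    private final Map<String, TextExtractor> extractors = new HashMap<>();

    void register(String extension, TextExtractor extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the lower-cased extension, or "" if the name has no dot. */
    static String extensionOf(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /**
     * Walks the tree under root and runs the matching extractor on every
     * regular file. Files with no registered extractor are skipped.
     */
    Map<Path, String> crawl(Path root) throws IOException {
        Map<Path, String> extracted = new HashMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (!Files.isRegularFile(p)) {
                    continue;
                }
                TextExtractor extractor = extractors.get(extensionOf(p));
                if (extractor != null) {
                    extracted.put(p, extractor.extract(p));
                }
            }
        }
        return extracted;
    }
}
```

A plain-text extractor could be registered with something like crawler.register("txt", p -> new String(Files.readAllBytes(p), StandardCharsets.UTF_8)), with PDF and Office extractors plugged in the same way. Note that Files.walk may fail with an UncheckedIOException on directories it cannot read, which is likely when crawling an entire filesystem, so a real crawler would need to handle that.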

Erik


