On 23 June 2013 01:27, Sourabh107 <sourabh.jain....@gmail.com> wrote:
> I want to create a search engine for my computer. My question is: can I
> crawl my G:/ drive, or any drive on my network, to search for any string in
> any file (of any type, such as XML, .log, or .properties) using Solr? If
> yes, please guide me. I went through the tutorials on the Solr site but did
> not find them useful: everywhere they use a database as the example, but I
> want to crawl my file system.

Yes, use a FileDataSource along with an appropriate entity processor
such as the PlainTextEntityProcessor, typically nested inside a
FileListEntityProcessor that walks the directory tree. Please see
http://wiki.apache.org/solr/DataImportHandler . This blog post might
be of help, though it is somewhat outdated now:
http://robotlibrarian.billdueber.com/an-exercise-in-solr-and-dataimporthandler-hathitrust-data/

You could also write a script to crawl the filesystem, or
use something like Apache ManifoldCF, and dump the
contents of the files it finds into Solr. If you want structured
data indexing for log files and other types of files, you
will probably need to do more work to extract structure
from the text in the files.
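
As a rough illustration of the script approach, here is a minimal Python
sketch that walks a directory, reads matching files as plain text, and
posts them to Solr's JSON update handler. The URL, the core name "files",
and the field names "id" and "content" are assumptions for the example;
they have to match your own Solr setup and schema. It does no structure
extraction and loads every file fully into memory.

```python
import json
import os
import urllib.request

# Assumed values: adjust the host, port, and core name to your installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/files/update?commit=true"
EXTENSIONS = (".xml", ".log", ".properties")


def find_files(root, extensions=EXTENSIONS):
    """Yield paths under root whose names end with one of the extensions."""
    suffixes = tuple(ext.lower() for ext in extensions)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(suffixes):
                yield os.path.join(dirpath, name)


def build_doc(path):
    """Read one file as text and return a dict shaped like a Solr document."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        text = handle.read()
    # "id" and "content" are hypothetical field names; your schema must define them.
    return {"id": os.path.abspath(path), "content": text}


def post_docs(docs, url=SOLR_UPDATE_URL):
    """POST a list of documents to Solr as a JSON array and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With Solr running, something like
post_docs([build_doc(p) for p in find_files("G:/")]) would index the
whole drive in one request. For a large drive you would want to send the
documents in smaller batches instead.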

Regards,
Gora
