And of course Heritrix: http://crawler.archive.org/
I think this one's quite cool.  You'll see example usage in my book.

~ David Smiley
Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/

On Nov 26, 2009, at 5:01 AM, Shalin Shekhar Mangar wrote:

> On Thu, Nov 26, 2009 at 1:54 PM, Jörg Agatz <joerg.ag...@googlemail.com> wrote:
> 
>> *Hey guys*, I'm looking for a full-text crawler for Solr to index HTML,
>> OpenOffice and MS Office documents, PDF, and many more formats.
>> How do you index your data?
>> 
>> Maybe you can help me find a crawler.
>> 
> 
> If you need a web crawler, look at Nutch. Otherwise, you may need to build
> something using Droids or Aperture.
> 
> http://lucene.apache.org/nutch/
> http://incubator.apache.org/droids/
> http://aperture.sourceforge.net/
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.

