Re: Reading Nutch indexes w/ Lucene.NET

Andrzej Bialecki Thu, 11 Jun 2009 02:32:43 -0700

Robert Sanford wrote:

I used Nutch to crawl an intranet site and am now trying to search using Lucene.net. When 
I attempt to open the index using the IndexReader I get an exception trying to access 
"segments".


I started looking at a simple index that I created: there, "segments" is a file, but in the 
index that was created by Nutch it is a directory. That's a pretty significant mismatch.

Is it possible to directly read an index created by Nutch using Lucene.NET? If 
so, how?

Nutch creates several partial indexes, which is why they are stored in several directories. You could either merge them into a single index (using Lucene IndexMerger, or the bin/nutch merge command), or you can open them all with MultiReader.
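
To make the MultiReader route more concrete, here is a rough sketch of the first step only: finding which subdirectories actually hold a partial Lucene index. It uses plain Java with no Lucene dependency. The class name, the method names and the default path "crawl/indexes" are made up for illustration, and the check (a regular file whose name starts with "segments") is a heuristic I am assuming, not something Nutch or Lucene provides.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Nutch or Lucene.
class PartialIndexFinder {

    // Heuristic: a Lucene index directory contains a regular FILE whose name
    // starts with "segments". A subdirectory called "segments" does not count.
    static boolean looksLikeLuceneIndex(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "segments*")) {
            for (Path f : files) {
                if (Files.isRegularFile(f)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Returns the immediate subdirectories of parent that look like Lucene
    // indexes, sorted by path.
    static List<Path> findPartialIndexes(Path parent) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(parent)) {
            for (Path child : children) {
                if (looksLikeLuceneIndex(child)) {
                    result.add(child);
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // "crawl/indexes" is an assumed location; pass your own as the first argument.
        Path parent = Paths.get(args.length > 0 ? args[0] : "crawl/indexes");
        if (!Files.isDirectory(parent)) {
            System.err.println("Not a directory: " + parent);
            return;
        }
        for (Path index : findPartialIndexes(parent)) {
            // Each path printed here is a candidate for its own IndexReader;
            // the readers would then be combined with MultiReader.
            System.out.println(index);
        }
    }
}
```

Each directory this prints would then be opened with its own IndexReader, and the resulting array of readers handed to MultiReader. That part is left out here so the sketch does not depend on a particular Lucene or Lucene.Net version.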

Please note that there were reports of some incompatibilities between Java Lucene and Lucene.Net; IIRC it had to do with compressed stored fields.


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
