Robert Sanford wrote:
I used Nutch to crawl an intranet site and am now trying to search using Lucene.Net. When
I attempt to open the index using the IndexReader I get an exception trying to access
"segments".
I started looking. In a simple index that I created, "segments" is a file, but in the
index that was created by Nutch it is a directory. That's a pretty significant
mismatch.
Is it possible to directly read an index created by Nutch using Lucene.Net? If
so, how?
Nutch creates several partial indexes, which is why they are stored in
several directories. You can either merge them into a single index
(using Lucene IndexMerger, or the bin/nutch merge command), or open
them all with MultiReader.
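
Before merging or opening anything, it may help to find out which directories under the crawl output are actual Lucene indexes, i.e. which ones directly contain a "segments" file rather than a "segments" directory. The following is a rough sketch in plain Java (standard library only, no Lucene calls). The class name NutchIndexLocator, its method names, and the search depth of 4 are made up for illustration, and the "segments_N" check is an assumption added to cover the file name used by later Lucene releases.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: lists directories that look like Lucene indexes.
final class NutchIndexLocator {

    // True if dir directly contains a regular file named "segments"
    // (or "segments_N", an assumption covering later Lucene releases).
    static boolean looksLikeLuceneIndex(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.anyMatch(p -> {
                String name = p.getFileName().toString();
                return Files.isRegularFile(p)
                        && (name.equals("segments") || name.startsWith("segments_"));
            });
        }
    }

    // Walks root down to maxDepth levels and collects every directory
    // that passes the check above, sorted by path.
    static List<Path> findIndexDirs(Path root, int maxDepth) throws IOException {
        List<Path> result = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root, maxDepth)) {
            for (Path p : (Iterable<Path>) walk::iterator) {
                if (looksLikeLuceneIndex(p)) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // The crawl directory is taken from the first argument (default: current directory).
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path dir : findIndexDirs(root, 4)) {
            System.out.println(dir);
        }
    }
}
```

Each directory it prints is a candidate to open with its own IndexReader and hand to MultiReader, or to feed to the merge step. A directory that is itself named "segments" is not reported unless it directly holds a segments file.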
Please note that there were reports of some incompatibilities between
Java Lucene and Lucene.Net; IIRC they had to do with compressed stored fields.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com