Alexander Aristov wrote:
Hi

Thank you for Katta

But is there any built-in Nutch functionality that can do this? What I am
looking for is a way to run distributed search, as I am planning to build
quite a large index, so it will not be possible to keep it on one
server.

What are best practices for doing this?

There is no single built-in tool in Nutch that does this. Common practice is to create one index per segment (without merging them), deploy each segment together with its index to the search servers, and then merge the indexes there, on each search server. Whenever you add new segments or remove old ones, you re-merge the new set of active indexes on each search server.

This way it's easy to phase out outdated segments and their indexes and to add new segments, while still using a merged index on each search server for maximum performance.
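
A minimal sketch of the bookkeeping this strategy implies, in Python. It only tracks which segments are active on which search server and which servers need a re-merge after a change; it does not call Nutch or Lucene. The class name, the server and segment names, and the "least-loaded server" placement policy are all assumptions made for illustration, not part of Nutch.

```python
class SearchCluster:
    """Tracks active segments per search server (hypothetical helper, not a Nutch API)."""

    def __init__(self, servers):
        # server name -> set of active segment names deployed there
        self.active = {server: set() for server in servers}
        # servers whose set of active segments changed since the last merge
        self.dirty = set()

    def add_segment(self, segment):
        # Assumed policy: place the segment (and its index) on the server
        # holding the fewest segments; ties are broken by server name.
        server = min(self.active, key=lambda s: (len(self.active[s]), s))
        self.active[server].add(segment)
        self.dirty.add(server)
        return server

    def remove_segment(self, segment):
        # Phase out an outdated segment wherever it is deployed.
        for server, segments in self.active.items():
            if segment in segments:
                segments.remove(segment)
                self.dirty.add(server)
                return server
        raise KeyError(segment)

    def pending_merges(self):
        # For each changed server, list the per-segment indexes that should
        # be merged into that server's single search index.
        plan = {s: sorted(self.active[s]) for s in sorted(self.dirty)}
        self.dirty.clear()
        return plan


if __name__ == "__main__":
    cluster = SearchCluster(["search1", "search2"])
    for seg in ["20080101", "20080102", "20080103"]:
        cluster.add_segment(seg)
    print(cluster.pending_merges())   # every server that received a segment
    cluster.remove_segment("20080101")
    print(cluster.pending_merges())   # only the server that lost a segment
```

Only servers whose set of active segments changed show up in the plan, so a re-merge is run just where it is needed; the merge itself is left to whatever index-merging tool you run on each server.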

PS. It's possible to implement a low-level Lucene tool to split indexes, using FilterIndexReader and IndexWriter.addIndexes(...). But that's not very relevant if you use the strategy I explained above.


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
