Good. Last question. Does Nutch guarantee that it will create a new index for each separate segment? Could it happen that it creates one index for two segments, or vice versa?
Alexander

2008/8/4 Andrzej Bialecki <[EMAIL PROTECTED]>

> Alexander Aristov wrote:
>
>> Hi
>>
>> Thank you for Katta
>>
>> But are there any built-in Nutch functionality which can do this stuff. What
>> I am looking forward is to make distributed search as I am planning to build
>> an index of quite big size and so it will be not possible to keep it on one
>> server.
>>
>> What are best practices for doing this?
>>
>
> There is no built-in single tool in Nutch to do this. Common practice is to
> create indexes per segment (without merging them), and deploy pairs of
> segment plus its index to the search servers, and then doing the index
> merging there, on each search server. Whenever you add new segments or
> remove old ones, you perform a merge of the new set of active indexes on
> each search server.
>
> This way it's easy to phase out outdated segments and their indexes, and
> adding new segments, while still using a merged index on each search server
> for maximum performance.
>
> PS. it's possible to implement a low-level Lucene tool to split indexes,
> using FilteredIndexReader and IndexWriter.addIndexes(...). But it's not that
> relevant if you use the strategy that I explained above.
>
>
> --
> Best regards,
> Andrzej Bialecki <><
> ___. ___ ___ ___ _ _ __________________________________
> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
> ___|||__|| \| || | Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com
>
>

--
Best Regards
Alexander Aristov
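
For illustration, the per-server bookkeeping behind the strategy quoted above (one index per segment, re-merge whenever the active set changes) might be sketched roughly as follows. This is only a sketch in plain Java: the class ActiveSegments, its methods, and the segment and index paths are hypothetical and are not part of Nutch or Lucene. It only computes which per-segment indexes would be fed into the next merge; the merge itself is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical bookkeeping for one search server; not a Nutch class. */
public class ActiveSegments {
    // Segment name -> path of the index built for that single segment.
    // Insertion order is kept so the merge input list is deterministic.
    private final Map<String, String> indexBySegment = new LinkedHashMap<>();

    /** Deploy a new (segment, index) pair to this server. */
    public void addSegment(String segment, String indexPath) {
        indexBySegment.put(segment, indexPath);
    }

    /** Phase out an outdated segment together with its index. */
    public void removeSegment(String segment) {
        indexBySegment.remove(segment);
    }

    /** Indexes of all currently active segments: the inputs of the next merge. */
    public List<String> mergeInputs() {
        return new ArrayList<>(indexBySegment.values());
    }

    public static void main(String[] args) {
        ActiveSegments server = new ActiveSegments();
        server.addSegment("20080801", "indexes/20080801");
        server.addSegment("20080802", "indexes/20080802");
        System.out.println(server.mergeInputs());

        // Rotate: drop the oldest segment, add a fresh one, then re-merge.
        server.removeSegment("20080801");
        server.addSegment("20080803", "indexes/20080803");
        System.out.println(server.mergeInputs());
    }
}
```

After every change to the active set, the list returned by mergeInputs() is what would be merged again on that search server, so an outdated segment simply stops contributing to the next merged index.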
