> The merge program doesn't care what the name of the folder is. It cares
> that it is in a certain structure.
>
> So if we assume you have a folder named indexes, the program wants each
> folder inside indexes (each represents a previous index run) to have a
> Lucene index in it (it looks for a folder named segments).

Thanks, Gal, for the explanation. It makes sense. What doesn't, though, is that

    bin/nutch merge crawl/index crawl/index_1 crawl/index_2 crawl/index

(i.e. merging three indexes, including the previously merged one) does not
generate part-00000 in crawl/index; it just dumps the merged Lucene index
directly into crawl/index. So the next time I do a crawl merge, I have to
manually move crawl/index/* to crawl/index/part-00000/. But knowing this is at
least helpful, so I can update my scripts!

-Brian

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
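
For anyone scripting the manual move described above, here is a minimal sketch of that step. The function name wrap_index_as_part is hypothetical (it is not part of Nutch), and the sketch assumes only what this thread says: the merged index sits flat in crawl/index and the next merge expects it under crawl/index/part-00000. It renames the whole directory and re-creates it, rather than using mv crawl/index/*, so that hidden files are carried along and nothing is moved into itself.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nutch: move a flat merged index
# into <index_dir>/part-00000 so it has the layout the next merge expects.
wrap_index_as_part() {
    index_dir=${1%/}                      # drop a trailing slash, if any
    part_dir="$index_dir/part-00000"

    [ -d "$index_dir" ] || { echo "no such directory: $index_dir" >&2; return 1; }

    # Already wrapped by an earlier run: nothing to do.
    [ -d "$part_dir" ] && return 0

    # Rename the flat index out of the way, re-create the parent,
    # then move the renamed directory back in as part-00000.
    tmp_dir="$index_dir.tmp.$$"
    mv "$index_dir" "$tmp_dir" || return 1
    mkdir -p "$index_dir" || return 1
    mv "$tmp_dir" "$part_dir"
}

# Example use (paths taken from the message above):
#   wrap_index_as_part crawl/index
```

Calling it a second time on the same directory is a no-op, so it should be safe to run before every merge.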
