I found this message while trying to update the subcollection field on a reindexing operation. I had no explanation for my issue: I fetched and indexed some sites using subcollection.xml, then changed subcollection.xml and reindexed. When I inspected the index with Luke or used the web search, the subcollections looked unchanged. The whole story is here:
http://www.nabble.com/subcollections-tf2821188.html

I manually looked over all the files, and this is what I found: a reindex operation changes only the files under "indexes"; the files under "index" stay the same. Since you say the "index" folder takes precedence over "indexes", this means that... it's a bug of some sort! To benefit from the new subcollection.xml after a reindex, I need to remove the "index" folder (unchanged by the reindex) and let the searcher work only with the "indexes" folder.

Please tell me if I am wrong, or if there is any other way to accomplish this.

Also, what are the drawbacks or advantages of using "index" versus "indexes"? And could you point me to a source for browsing the internals of Nutch in a "tutorial style"?

Thanks!

Andrzej Bialecki wrote:
>
> [EMAIL PROTECTED] wrote:
>> I am just curious if someone could explain the difference between the
>> 'index' folder and the 'indexes' folder inside the output directory of
>> the crawl?
>
>> The motivation for my question is that I am trying to determine what
>> parts of the crawl need to be deployed to my searcher machines (I don't
>> use servlet searcher but a custom class using the Nutch API). It looks
>> like it works with just 'index' and 'segments', but I want to be sure
>> that I should not be deploying 'indexes' instead/in-addition.
>>
>
> That's correct. NutchBean first tries to use "index", if it can't be
> found then it tries "indexes".
>
>

--
View this message in context: http://www.nabble.com/0.8-output%5Cindex-versus-output%5Cindexes-tf2320120.html#a7994100
Sent from the Nutch - User mailing list archive at Nabble.com.
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
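
To make the lookup order from the quoted reply concrete, here is a minimal sketch in Python. It is not Nutch's code and does not use the Nutch API; the function name `resolve_index_dir` and the variable names are hypothetical, and the only behaviour modelled is the one stated in the reply: "index" is tried first, and "indexes" is used only if "index" cannot be found.

```python
from pathlib import Path
from typing import Optional


def resolve_index_dir(crawl_dir: Path) -> Optional[Path]:
    """Pick the directory a searcher would open inside a crawl output dir.

    Hypothetical helper: it only mirrors the order described in the quoted
    reply ("index" first, then "indexes"), not Nutch's real implementation.
    """
    primary = crawl_dir / "index"      # tried first
    fallback = crawl_dir / "indexes"   # used only when "index" is missing
    if primary.is_dir():
        return primary
    if fallback.is_dir():
        return fallback
    return None  # neither directory exists
```

Under this assumed order, a leftover "index" directory always wins, so changes that a reindex writes only into "indexes" would never be seen by the searcher. That matches the observation in the message, and it is why removing (or moving aside) "index" makes the searcher fall through to "indexes".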
