Aled,

Excellent, I'll try that today, and thanks for the heads-up on the db directory. I'll let you know how it goes.
r/d

-----Original Message-----
From: Aled Jones [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 30, 2006 12:24 AM
To: nutch-user@lucene.apache.org
Subject: ATB: Multiple crawls how to get them to work together

Hi Dan,

I'll presume you've done the crawls already.

Each resulting crawl folder should have 3 folders: db, index and segments.

Create your search.dir folder and create a segments folder in that. Each segments folder in each crawl folder should contain folders with timestamps as the names. Copy the contents of:

crawlA/segments
crawlB/segments
crawlC/segments

(i.e. the folders with timestamps as names) into:

search.dir/segments

Next, delete the duplicates from the segments by running the command:

bin/nutch dedup -local search.dir/segments

Then you need to merge the segments to create an index folder, so run the command:

bin/nutch merge -local search.dir/index search.dir/segments/*

You should now have two folders in your search.dir:

search.dir/segments
search.dir/index

That's all you need for serving pages (the db folder is only used when fetching).

Now just set the searcher.dir property value in nutch-site.xml to the location of search.dir.

That's how I've been doing it, although it may not be the "right" way. :-)

Hope this helps.

Cheers
Aled

> -----Neges Wreiddiol-----/-----Original Message-----
> Oddi wrth/From: Dan Morrill [mailto:[EMAIL PROTECTED]
> Anfonwyd/Sent: 29 March 2006 18:06
> At/To: nutch-user@lucene.apache.org
> Copi/Cc: [EMAIL PROTECTED]
> Pwnc/Subject: Multiple crawls how to get them to work together
>
> Hi folks,
>
> I have 3 crawls, crawlA, crawlB, and crawlC. I would like all of them to be available to the search.jsp page.
>
> I went through the site, saw merge, index, make new db, and followed all the directions that I could find, but still no resolution on this one.
> So what I need are some ideas on where to proceed from here. I intend to have 2 or 3 boxes make a crawl, then somehow merge the crawls together and form a "master" under search.dir. I would also want to update this one on a regular basis.
>
> Unfortunately, the instructions to date have all been tried, and none of them have worked. There is also no indexmerger or indexsegments directive in nutch 0.7.1. Any support ideas, direct pointers, or even step-by-step instructions on how to do this would be welcome (outside of what is in the tutorials, because that has been tried already, including the support ideas on the user mailing list).
>
> Cheers/r/dan
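
For anyone following the thread, the steps in Aled's message could be collected into a small shell function along these lines. This is only a sketch: the two bin/nutch invocations are copied as written in his message and have not been checked against any particular Nutch release, and the function name merge_crawls and the NUTCH variable are made up for this example.

```shell
#!/bin/sh
# Sketch of the procedure from Aled's message. The dedup and merge
# invocations are taken verbatim from that message; merge_crawls and
# NUTCH are hypothetical names introduced for illustration only.
NUTCH="${NUTCH:-bin/nutch}"

# usage: merge_crawls SEARCH_DIR CRAWL_DIR [CRAWL_DIR ...]
merge_crawls() {
    search_dir="$1"
    shift

    # Create search.dir with a segments folder inside it.
    mkdir -p "$search_dir/segments" || return 1

    # Copy the timestamp-named folders from each crawl's segments folder.
    for crawl in "$@"; do
        cp -R "$crawl"/segments/* "$search_dir/segments/" || return 1
    done

    # Delete duplicates from the combined segments.
    $NUTCH dedup -local "$search_dir/segments" || return 1

    # Merge the segments to create the index folder.
    $NUTCH merge -local "$search_dir/index" "$search_dir"/segments/* || return 1
}
```

It would be called from the Nutch install directory as, for example, `merge_crawls search.dir crawlA crawlB crawlC`. Setting the searcher.dir property in nutch-site.xml to the location of search.dir is still a manual step afterwards.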