So the 'db' is never used during searching. Interesting. 'segments' is more for run-time use.
On 3/30/06, Aled Jones <[EMAIL PROTECTED]> wrote:
> Hi Dan
>
> I'll presume you've done the crawls already.
>
> Each resulting crawled folder should have 3 folders: db, index and
> segments.
>
> Create your search.dir folder and create a segments folder in that.
>
> Each segments folder in each crawl folder should contain folders with
> timestamps as the names. Copy the contents of:
>
> crawlA/segments
> crawlB/segments
> crawlC/segments
>
> (i.e. the folders with timestamps as names) into:
>
> search.dir/segments
>
> Next, delete the duplicates from the segments by running the command:
>
> bin/nutch dedup -local search.dir/segments
>
> Then you need to merge the segments to create an index folder, so run
> the command:
>
> bin/nutch merge -local search.dir/index search.dir/segments/*
>
> You should now have two folders in your search.dir:
> search.dir/segments
> search.dir/index
>
> That's all you need for serving pages (the db folder is only used when
> fetching).
>
> Now just set the searcher.dir property value in nutch-site.xml to be the
> location of search.dir.
>
> That's how I've been doing it, although it may not be the "right" way.
> :-) Hope this helps.
>
> Cheers
> Aled
>
> > -----Neges Wreiddiol-----/-----Original Message-----
> > Oddi wrth/From: Dan Morrill [mailto:[EMAIL PROTECTED]
> > Anfonwyd/Sent: 29 March 2006 18:06
> > At/To: nutch-user@lucene.apache.org
> > Copi/Cc: [EMAIL PROTECTED]
> > Pwnc/Subject: Multiple crawls how to get them to work together
> >
> > Hi folks,
> >
> > I have 3 crawls, crawlA, crawlB, and crawlC. I would like all
> > of them to be available to the search.jsp page.
> >
> > I went through the site, saw merge, index, and make new db, and
> > followed all the directions that I could find, but still have no
> > resolution on this one. So what I need are some ideas on
> > where to proceed from here. I intend to have 2 or
> > 3 boxes make a crawl, then somehow merge the crawls together
> > and form a "master" under search.dir. I would also want to
> > update this one on a regular basis.
> >
> > Unfortunately, the instructions to date have all been tried,
> > and have all led to the idea not working. There is also no
> > indexmerger or indexsegments directive in nutch 0.7.1. Any
> > support ideas, direct pointers, or even step-by-step
> > instructions on how to do this would be appreciated (outside
> > of what is in the tutorials, because that has been tried
> > already, including support ideas in the user web mail list).
> >
> > Cheers/r/dan
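
For anyone who wants to script this, the steps Aled describes above could be wrapped up roughly as follows. This is only a sketch: the function name merge_crawls and the NUTCH variable are made up for illustration, and the dedup and merge invocations are copied as quoted in this thread, not checked against any other Nutch version. It also assumes the timestamp-named segment folders do not collide between crawls.

```shell
#!/usr/bin/env bash
# Sketch: collect the segments of several crawls into one search directory.
# merge_crawls and NUTCH are hypothetical names, not part of Nutch itself.

NUTCH="${NUTCH:-bin/nutch}"   # path to the nutch launcher script

# Usage: merge_crawls SEARCH_DIR CRAWL_DIR [CRAWL_DIR...]
merge_crawls() {
  local search_dir="$1"; shift
  mkdir -p "$search_dir/segments" || return 1

  # Copy every timestamp-named segment folder out of each crawl.
  local crawl seg
  for crawl in "$@"; do
    for seg in "$crawl"/segments/*; do
      [ -d "$seg" ] || continue          # skip unmatched globs and stray files
      cp -r "$seg" "$search_dir/segments/" || return 1
    done
  done

  # Delete duplicates, then merge the segments into one index folder
  # (both commands as quoted earlier in this thread).
  "$NUTCH" dedup -local "$search_dir/segments" || return 1
  "$NUTCH" merge -local "$search_dir/index" "$search_dir"/segments/* || return 1
}
```

Run from the Nutch install directory, it would be called as: merge_crawls search.dir crawlA crawlB crawlC. The searcher.dir property in nutch-site.xml still has to be pointed at search.dir by hand afterwards.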