Thanks so much, Graham. This should do it.
A related question: after the merge, is it possible to build a new webdb
as well? The link data for the merged db can differ from that of the two
original dbs, so it should be updated in order to get accurate page
ranking.

AJ

On 10/25/05, Graham Stead <[EMAIL PROTECTED]> wrote:
>
> I am by no means a Nutch expert yet, but this is how I merged two
> separate segments so I could search through them:
>
> Step 1:
> $ bin/nutch mergesegs -local -o testmerge -i \
>     ../crawls/foo/segments/20051018224434/ \
>     ../crawls/bar/segments/20051018225505/
> < bunch of stuff happens >
>
> This creates a segment 20051023112848 in the testmerge folder. The
> segment contains a combined index as well as copies of all information
> from the two input segments.
>
> Step 2:
> This wasn't quite enough to search with, however. I copied the index
> folder and organized the directories into the same structure as used
> during a crawl, then was able to run the Tomcat searcher on the new
> segment.
>
> After copying/moving/reorganizing I have:
>
> $ ls -l testmerge/
> total 0
> drwxrwxrwx+ 2 Oct 23 11:42 index
> drwxrwxrwx+ 3 Oct 23 11:42 segments
>
> $ ls -l testmerge/segments/
> total 0
> drwxrwxrwx+ 7 Oct 23 11:28 20051023112848
>
>
> Step 3:
> Then place this in Tomcat's nutch-site.xml file:
>
> <nutch-conf>
> <property>
> <name>searcher.dir</name>
> <value>C:\path_to_testmerge\testmerge</value>
> </property>
> </nutch-conf>
>
> Run Tomcat and search away.
>
> Hope this helps,
> -Graham
>
> > -----Original Message-----
> > From: AJ Chen [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, October 25, 2005 4:03 PM
> > To: nutch-dev@lucene.apache.org
> > Subject: merge indices from multiple webdb
> >
> > Has anyone merged indices from two separate webdbs? I have two
> > separate webdbs and need to find a good way to combine them
> > for unified search.
> > AJ
> >
>
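
For anyone following along, here is my reading of Step 2 as a small shell
sketch. It is only a guess at what the copying/moving amounted to, based on
the directory listings above: it assumes mergesegs wrote the new segment
directly under the output folder with an index folder inside it. The helper
name reorganize_merged is made up for illustration and is not part of Nutch,
and the demo runs on a throwaway directory with a placeholder file rather
than real segment data.

```shell
#!/bin/sh
# Sketch of Step 2: rearrange mergesegs output into a crawl-style layout.
# Assumption: mergesegs produced <out_dir>/<segment>/ containing index/.
set -eu

# reorganize_merged is a hypothetical helper, not a Nutch command.
reorganize_merged() {
    out_dir=$1    # e.g. testmerge
    segment=$2    # e.g. 20051023112848

    # Move the merged segment under segments/, as a crawl would lay it out.
    mkdir -p "$out_dir/segments"
    mv "$out_dir/$segment" "$out_dir/segments/$segment"

    # Copy (not move) the index to the top level, so the segment keeps its own.
    cp -R "$out_dir/segments/$segment/index" "$out_dir/index"
}

# Demo on a temporary directory with placeholder content only.
demo=$(mktemp -d)
mkdir -p "$demo/testmerge/20051023112848/index"
touch "$demo/testmerge/20051023112848/index/placeholder"

reorganize_merged "$demo/testmerge" 20051023112848
ls "$demo/testmerge"
```

If the real layout matches that assumption, the resulting folder has the
same index and segments entries shown in the listing above, and searcher.dir
can point at it as in Step 3.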
