How do you build a new webdb from the merged segment/index? Could you provide
detailed steps for the process you described? Thanks.

AJ

On 10/25/05, Andrey Ilinykh <[EMAIL PROTECTED]> wrote:
>
> If you merge two segments, the page ranks are off. You have to build a new
> webdb, calculate page rank, and then build the segment once more.
>
> Thank you,
> Andrey
>
> -----Original Message-----
> From: AJ Chen [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 25, 2005 2:02 PM
> To: nutch-dev@lucene.apache.org
> Subject: Re: merge indices from multiple webdb
>
>
> Thanks so much, Graham. This should do it.
> A related question: After the merge, is it possible to build the new webdb
> as well? The link data for the merged db can be different from the two
> original db. In order to have accurate page ranking, the link data should
> be updated.
>
> AJ
>
> On 10/25/05, Graham Stead <[EMAIL PROTECTED]> wrote:
> >
> > I am by no means a Nutch expert yet, but this is how I merged two
> > separate segments so I could search through them:
> >
> > Step 1:
> > $ bin/nutch mergesegs -local -o testmerge -i
> > ../crawls/foo/segments/20051018224434/
> > ../crawls/bar/segments/20051018225505/
> > < bunch of stuff happens >
> >
> > This creates a segment 20051023112848 in the testmerge folder. The
> > segment contains a combined index as well as copies of all information
> > from the two input segments.
> >
> > Step 2:
> > This wasn't quite enough to search with, however. I copied the index
> > folder and organized the directories into the same structure as used
> > during a crawl, then was able to run the Tomcat searcher on the new
> > segment.
> >
> > After copying/moving/reorganizing I have:
> >
> > $ ls -l testmerge/
> > total 0
> > drwxrwxrwx+ 2 Oct 23 11:42 index
> > drwxrwxrwx+ 3 Oct 23 11:42 segments
> >
> > $ ls -l testmerge/segments/
> > total 0
> > drwxrwxrwx+ 7 Oct 23 11:28 20051023112848
> >
> >
> > Step 3:
> > Then place this in Tomcat's nutch-site.xml file:
> >
> > <nutch-conf>
> > <property>
> > <name>searcher.dir</name>
> > <value>C:\path_to_testmerge\testmerge</value>
> > </property>
> > </nutch-conf>
> >
> > Run Tomcat and search away.
> >
> > Hope this helps,
> > -Graham
> >
> > > -----Original Message-----
> > > From: AJ Chen [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, October 25, 2005 4:03 PM
> > > To: nutch-dev@lucene.apache.org
> > > Subject: merge indices from multiple webdb
> > >
> > > Has anyone merged indices from two separate webdb? I have two
> > > separate webdb and need to find a good way to combine them
> > > for unified search.
> > > AJ
> > >
> >
>

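A minimal sketch of the Step 2 reorganization quoted above, assuming the merged segment keeps its combined index in an "index" subfolder (the post does not say exactly which folder was copied). The function name arrange_merged and its two arguments are made up for illustration; only the resulting layout (index and segments/<segment> under testmerge) comes from the listing in the post.

```shell
# Hypothetical helper: arrange a merged segment into the crawl-style layout
# shown in the listing (testmerge/index and testmerge/segments/<segment>).
# Assumption: the combined index lives in <segment>/index.
arrange_merged() {
    merge_dir=$1   # output folder that was passed to mergesegs with -o
    segment=$2     # name of the merged segment directory

    # Refuse to overwrite an index folder that is already in place.
    if [ -e "$merge_dir/index" ]; then
        echo "arrange_merged: $merge_dir/index already exists" >&2
        return 1
    fi

    # Move the merged segment under segments/, then copy its index
    # to the top level of the merge folder.
    mkdir -p "$merge_dir/segments" &&
    mv "$merge_dir/$segment" "$merge_dir/segments/$segment" &&
    cp -R "$merge_dir/segments/$segment/index" "$merge_dir/index"
}
```

For the example in the post this would be called as `arrange_merged testmerge 20051023112848`, after which searcher.dir can point at the testmerge folder as in Step 3.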