Re: [Nutch-dev] Distributed Search Fails - More Info

Byron Miller Tue, 04 May 2004 19:19:00 -0700

I basically did 4 crawls with an equal number of URLs
each, then put segments 1 & 2 on server 1 and segments
3 & 4 on server 2, each server with a merged index.


This is basically what I have on each server:

[EMAIL PROTECTED] nutch]$ pwd
/home/nutch
[EMAIL PROTECTED] nutch]$ cd segments/
[EMAIL PROTECTED] segments]$ ls
20040429050904  20040430103038  index
[EMAIL PROTECTED] segments]$ ls 20040429050904/
fetcher  fetcher_content  fetcher.done  fetcher_text  fetchlist  index  index.done
[EMAIL PROTECTED] segments]$ 


I then just start the server:

bin/nutch server 6969 segments/

And it says:

[EMAIL PROTECTED] nutch]$ bin/nutch server 6969 segments/
040504 190945 10 opening merged index in /home/nutch/segments/index
040504 190945 11 Server listener on port 6969: starting
040504 190945 12 Server handler on 6969: starting
040504 190945 13 Server handler on 6969: starting
040504 190945 14 Server handler on 6969: starting
040504 190945 15 Server handler on 6969: starting
040504 190945 17 Server handler on 6969: starting
040504 190945 16 Server handler on 6969: starting
040504 190945 18 Server handler on 6969: starting
040504 190945 19 Server handler on 6969: starting
040504 190945 21 Server handler on 6969: starting
040504 190945 20 Server handler on 6969: starting

I may just remove the merged index and see whether the
server then finds the segments' individual indexes.
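
Before removing anything, the layout Doug describes in the quoted
message below can be checked mechanically. This is only a rough
sketch under stated assumptions: check_search_dir is a hypothetical
helper and not part of Nutch, it assumes every directory under
'segments' other than 'index' is a segment, and it only looks at
whether directories exist, not at what the indexes contain.

```python
from pathlib import Path

# Data directories that every searchable segment is expected to contain,
# per the layout described in the quoted reply.
REQUIRED_SUBDIRS = ("fetcher", "fetcher_content", "fetcher_text")


def check_search_dir(server_dir):
    """Hypothetical helper (not part of Nutch).

    Returns (index_source, incomplete) where index_source is "merged",
    "per-segment" or "none", and incomplete maps a segment name to the
    required subdirectories it is missing.
    """
    segments = Path(server_dir) / "segments"
    if not segments.is_dir():
        return "none", {}

    merged = segments / "index"
    # Assumption: anything under segments/ except the merged index is a segment.
    seg_dirs = sorted(
        p for p in segments.iterdir() if p.is_dir() and p.name != "index"
    )

    incomplete = {}
    for seg in seg_dirs:
        missing = [name for name in REQUIRED_SUBDIRS if not (seg / name).is_dir()]
        if missing:
            incomplete[seg.name] = missing

    # If both kinds of index exist, the merged index is the one used.
    if merged.is_dir():
        index_source = "merged"
    elif seg_dirs and all((seg / "index").is_dir() for seg in seg_dirs):
        index_source = "per-segment"
    else:
        index_source = "none"
    return index_source, incomplete
```

Calling check_search_dir("/home/nutch") on each server would show
which kind of index is present and whether any segment is missing
its data directories. It cannot tell whether a segment named inside
the merged index is absent from disk, which is the case Doug
suspects.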

-byron


--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> Have you perhaps (re)moved a segment directory after it was
> indexed, or somehow not kept the segments with the index?  From
> the backtrace below, it looks like a hit's segment directory does
> not exist on a server.  The segment name is indexed with each
> document, so that, after indexes are merged, each still knows the
> name of the directory that contains its summary and cache data.
> (I don't think you included all of the server log data.  There
> should be a line before the "starting" message indicating where
> it is finding the index.)
> 
> Each search server should be started in a directory with a
> subdirectory named 'segments' containing all segments that the
> server is to search, complete with 'fetcher', 'fetcher_content'
> and 'fetcher_text' directories, and either:
>
>    1. a subdirectory named 'index' containing the merged index; or
>    2. an 'index' directory in each segment.
>
> If both exist, the merged index is used.
> 
> (In fact, you don't really need to keep things quite so
> coordinated.  All that's really required is that some server has
> a segment directory for every indexed document.)
> 
> Doug
> 
> 
> Byron Miller wrote:
> > java.lang.NullPointerException
> >     at java.util.Hashtable.get(Hashtable.java:333)
> >     at net.nutch.ipc.Client.getConnection(Client.java:273)
> >     at net.nutch.ipc.Client.call(Client.java:248)
> >     at net.nutch.searcher.DistributedSearch$Client.getSummary(DistributedSearch.java:389)
> >     at net.nutch.searcher.NutchBean.getSummary(NutchBean.java:119)



-------------------------------------------------------
This SF.Net email is sponsored by: Oracle 10g
Get certified on the hottest thing ever to hit the market... Oracle 10g. 
Take an Oracle 10g class now, and we'll give you the exam FREE.
http://ads.osdn.com/?ad_id=3149&alloc_id=8166&op=click
_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
