I basically did 4 crawls of equal URLs, then put segments 1 & 2 on server 1 and segments 3 & 4 on server 2, each with a merged index.
This is basically what I have on each server:

[EMAIL PROTECTED] nutch]$ pwd
/home/nutch
[EMAIL PROTECTED] nutch]$ cd segments/
[EMAIL PROTECTED] segments]$ ls
20040429050904  20040430103038  index
[EMAIL PROTECTED] segments]$ ls 20040429050904/
fetcher  fetcher_content  fetcher.done  fetcher_text  fetchlist  index  index.done
[EMAIL PROTECTED] segments]$

I then just start the server:

bin/nutch server 6969 segments/

And it says:

[EMAIL PROTECTED] nutch]$ bin/nutch server 6969 segments/
040504 190945 10 opening merged index in /home/nutch/segments/index
040504 190945 11 Server listener on port 6969: starting
040504 190945 12 Server handler on 6969: starting
040504 190945 13 Server handler on 6969: starting
040504 190945 14 Server handler on 6969: starting
040504 190945 15 Server handler on 6969: starting
040504 190945 17 Server handler on 6969: starting
040504 190945 16 Server handler on 6969: starting
040504 190945 18 Server handler on 6969: starting
040504 190945 19 Server handler on 6969: starting
040504 190945 21 Server handler on 6969: starting
040504 190945 20 Server handler on 6969: starting

I may just kill the merged index and see if it finds the segments' individual indexes.

-byron

--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> Have you perhaps (re)moved a segment directory after it was indexed, or
> somehow not kept the segments with the index?  From the backtrace below,
> it looks like a hit's segment directory does not exist on a server.
> The segment name is indexed with each document, so that, after indexes
> are merged, each still knows the name of the directory that contains
> its summary and cache data.  (I don't think you included all of the
> server log data.  There should be a line before the "starting" message
> indicating where it is finding the index.)
>
> Each search server should be started in a directory with a subdirectory
> named 'segments' containing all segments that the server is to search,
> complete with 'fetcher', 'fetcher_content' and 'fetcher_text'
> directories, and either:
>
> 1. a subdirectory named 'index' containing the merged index; or
> 2. an 'index' directory in each segment.
>
> If both exist, the merged index is used.
>
> (In fact, you don't really need to keep things quite so coordinated.
> All that's really required is that some server has a segment directory
> for every indexed document.)
>
> Doug
>
> Byron Miller wrote:
> > java.lang.NullPointerException
> >         at java.util.Hashtable.get(Hashtable.java:333)
> >         at net.nutch.ipc.Client.getConnection(Client.java:273)
> >         at net.nutch.ipc.Client.call(Client.java:248)
> >         at net.nutch.searcher.DistributedSearch$Client.getSummary(DistributedSearch.java:389)
> >         at net.nutch.searcher.NutchBean.getSummary(NutchBean.java:119)
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: Oracle 10g
> Get certified on the hottest thing ever to hit the market... Oracle 10g.
> Take an Oracle 10g class now, and we'll give you the exam FREE.
> http://ads.osdn.com/?ad_id=3149&alloc_id=8166&op=click
> _______________________________________________
> Nutch-developers mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/nutch-developers
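
For anyone wanting to check a server against the layout Doug describes above, here is a minimal sketch in plain sh. The function name check_layout is made up for illustration and is not part of Nutch. It only tests the stricter, "coordinated" layout (fetcher data in every segment, plus either a merged index or a per-segment index), not the looser rule that some server merely needs a segment directory for every indexed document.

```shell
#!/bin/sh
# Sketch only: check_layout is a hypothetical helper, not a Nutch command.
# It checks one 'segments' directory against the layout in the quoted reply.
check_layout() {
    segments_dir="$1"
    status=0
    merged=no

    if [ ! -d "$segments_dir" ]; then
        echo "missing segments directory: $segments_dir"
        return 1
    fi

    # Option 1 in the reply: a merged index at segments/index.
    if [ -d "$segments_dir/index" ]; then
        merged=yes
        echo "merged index: $segments_dir/index"
    fi

    for seg in "$segments_dir"/*/; do
        [ -d "$seg" ] || continue
        seg="${seg%/}"
        name="${seg##*/}"
        # The merged index sits beside the segments; it is not a segment.
        [ "$name" = "index" ] && continue

        # Every segment needs its fetcher data for summaries and cache.
        for sub in fetcher fetcher_content fetcher_text; do
            if [ ! -d "$seg/$sub" ]; then
                echo "segment $name: missing $sub"
                status=1
            fi
        done

        # Option 2 in the reply: a per-segment index when nothing is merged.
        if [ "$merged" = no ] && [ ! -d "$seg/index" ]; then
            echo "segment $name: no index and no merged index"
            status=1
        fi
    done
    return $status
}
```

With the function loaded, running `check_layout segments` from /home/nutch before `bin/nutch server 6969 segments/` should return non-zero if a segment has lost its fetcher directories or has no index to fall back on.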
