On Tue, 05 Feb 2008, John Mendenhall wrote:

> -----
> Merging 14 segments to /var/nutch/crawl/mergesegs_dir/20080201220906
> SegmentMerger: adding /var/nutch/crawl/segments/20080128132506
> SegmentMerger: adding ...
> SegmentMerger: using segment data from: content crawl_generate crawl_fetch crawl_parse parse_data parse_text
> task_0001_m_000075_0: Exception in thread "main" java.net.SocketTimeoutException: timed out waiting for rpc response
> task_0001_m_000075_0:   at org.apache.hadoop.ipc.Client.call(Client.java:473)
> task_0001_m_000075_0:   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:163)
> task_0001_m_000075_0:   at org.apache.hadoop.mapred.$Proxy0.reportDiagnosticInfo(Unknown Source)
> task_0001_m_000075_0:   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1454)
> task_0001_m_000080_0: Exception in thread "main" java.net.SocketException: Socket closed
> task_0001_m_000080_0:   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
> task_0001_m_000080_0:   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> task_0001_m_000080_0:   at org.apache.hadoop.ipc.Client$Connection$2.write(Client.java:189)
> task_0001_m_000080_0:   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> task_0001_m_000080_0:   at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> task_0001_m_000080_0:   at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> task_0001_m_000080_0:   at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:324)
> task_0001_m_000080_0:   at org.apache.hadoop.ipc.Client.call(Client.java:461)
> task_0001_m_000080_0:   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:163)
> task_0001_m_000080_0:   at org.apache.hadoop.mapred.$Proxy0.reportDiagnosticInfo(Unknown Source)
> task_0001_m_000080_0:   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1454)
> task_0001_m_000072_1: log4j:WARN No appenders could be found for logger (org.apache.hadoop.ipc.Client).
> task_0001_m_000072_1: log4j:WARN Please initialize the log4j system properly.
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>         at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:590)
>         at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:638)
> -----
>
> nutch mergesegs returns with a status code of 1.
>
> I have tried looking at why the log4j warning is happening.
> All other runs seem fine. Log4j seems to be set up for all
> other instances where it is needed.
>
> Where do I need to look to find out why nutch mergesegs is
> crashing?
>
> Why is log4j not finding the log4j.properties file?
> The nutch script in nutch/bin already adds the conf
> dir to the class path.
>
> Thanks in advance for any assistance you can provide.
>
> JohnM
I modified the configuration to use less memory and rebooted all of the
servers. Then I reran the index and it worked.

I currently have 3 servers, one serving as both master and slave. Each
has a different amount of memory available, and each has a different
processor type. What is the rule of thumb for setting the heap size,
and the child process heap sizes, on each server?

Thanks!

JohnM

-- 
john mendenhall
[EMAIL PROTECTED]
surf utopia
internet services
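
P.S. For the archives, the settings I believe are involved here (in the
Hadoop config files that Nutch keeps in its conf dir) are HADOOP_HEAPSIZE
in hadoop-env.sh, which sizes the Hadoop daemons, and mapred.child.java.opts
in hadoop-site.xml, which sizes each map/reduce child JVM. The values below
are only placeholders to show the shape of the config, not what I am running:

-----
# conf/hadoop-env.sh (read on each node, so slaves with different RAM can differ)
# Heap for the Hadoop daemons (namenode, datanode, jobtracker, tasktracker), in MB.
export HADOOP_HEAPSIZE=1000
-----

-----
<!-- conf/hadoop-site.xml (overrides hadoop-default.xml, also read per node) -->
<property>
  <name>mapred.child.java.opts</name>
  <!-- JVM options passed to each map/reduce child task; -Xmx is the per-task heap. -->
  <value>-Xmx200m</value>
</property>
-----

Since each tasktracker reads its own copy of these files, the values can be
tuned per machine to match the memory actually available on that box.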