I raised the heap size to 2 GB for each child task in "mapred.child.java.opts" and the segment merging succeeded.
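For reference, here is roughly what the change looks like (a minimal sketch; depending on your Hadoop version the property goes in conf/hadoop-site.xml or conf/mapred-site.xml, and the exact -Xmx value is just what worked for me):

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx2048m</value> <!-- 2 GB heap per map/reduce child JVM -->
    </property>

The "failed to report status for 1200 seconds" kill itself is governed by mapred.task.timeout (in milliseconds), but raising that would presumably only have postponed the kill; the reduce seems to have been stalling for lack of heap during the merge.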

Justin Yao wrote:
Hi,

I encountered an error when trying to merge segments using the latest nightly build of Nutch.
I have 3 Hadoop nodes, all running CentOS 5.2.

Every time I try to merge segments with the command:

"nutch mergesegs crawl/MERGEDsegments -dir crawl/segments",

it fails with the error message:

Task attempt: /default-rack/10.9.17.206
Cleanup Attempt: /default-rack/10.9.17.206

"Task attempt_200903161037_0001_r_000003_0 failed to report status for 1200 seconds. Killing!"

Then another child task is launched, and later I get this error message:

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/justin/crawl/MERGEDsegments/20090316143643/crawl_generate/part-00003 for DFSClient_attempt_200903161037_0001_r_000003_1 on client 10.6.180.2 because current leaseholder is trying to recreate file.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1055)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:998)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:301)
    at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)

    at org.apache.hadoop.ipc.Client.call(Client.java:697)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy1.create(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.create(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2585)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:454)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:190)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:487)
    at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1074)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:397)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:306)
    at org.apache.nutch.segment.SegmentMerger$SegmentOutputFormat$1.ensureSequenceFile(SegmentMerger.java:252)
    at org.apache.nutch.segment.SegmentMerger$SegmentOutputFormat$1.write(SegmentMerger.java:211)
    at org.apache.nutch.segment.SegmentMerger$SegmentOutputFormat$1.write(SegmentMerger.java:194)
    at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:410)
    at org.apache.nutch.segment.SegmentMerger.reduce(SegmentMerger.java:479)
    at org.apache.nutch.segment.SegmentMerger.reduce(SegmentMerger.java:113)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:436)
    at org.apache.hadoop.mapred.Child.main(Child.java:158)


Here is the namenode log:

2009-03-16 17:03:20,794 WARN hdfs.StateChange - DIR* NameSystem.startFile: failed to create file /user/justin/crawl/MERGEDsegments/20090316143643/crawl_generate/part-00003 for DFSClient_attempt_200903161037_0001_r_000003_1 on client 10.6.180.2 because current leaseholder is trying to recreate file.
2009-03-16 17:04:20,798 WARN hdfs.StateChange - DIR* NameSystem.startFile: failed to create file /user/justin/crawl/MERGEDsegments/20090316143643/crawl_generate/part-00003 for DFSClient_attempt_200903161037_0001_r_000003_1 on client 10.6.180.2 because current leaseholder is trying to recreate file.
2009-03-16 17:05:20,803 WARN hdfs.StateChange - DIR* NameSystem.startFile: failed to create file /user/justin/crawl/MERGEDsegments/20090316143643/crawl_generate/part-00003 for DFSClient_attempt_200903161037_0001_r_000003_1 on client 10.6.180.2 because current leaseholder is trying to recreate file.

I checked the processes on the Hadoop nodes: the failed reduce process was never killed and kept running.

I've tried to merge segments several times, and it always fails with the same error.

Has anyone encountered this problem before? Is there a way to avoid it? Any suggestion would be appreciated.

Thanks,

--
Justin Yao
Snooth
o: 646.723.4328
c: 718.662.6362
jus...@snooth.com

Snooth -- Over 2 million ratings and counting...
