The Chukwa collector was attempting to rename an HDFS file whose handle no 
longer existed on the namenode.
There are three possible causes for this exception:

1. Someone deleted /chukwa/logs/2011102008_s0281.hostsud.net.chukwa.
2. The namenode was restarted but the collector was not, so the collector's 
file handle no longer matched the namenode's state.
3. The connection to the namenode was lost, so the namenode does not know 
about /chukwa/logs/2011102008_s0281.hostsud.net.chukwa. (Unlikely)

In the past, the Chukwa collector bailed out on any HDFS error.  We removed 
that logic from the trunk version of SeqFileWriter.java, so this bug is fixed 
in trunk.  I recommend taking a look at the trunk version; it is more stable 
than Chukwa 0.4.
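
To illustrate the difference between bailing out and carrying on, here is a 
minimal sketch. It is not the actual SeqFileWriter code: RotateSketch, Sink, 
and SinkFactory are hypothetical stand-ins, and no Hadoop classes are used. It 
only shows the general idea of logging a failed close during rotation and 
opening a fresh sink file instead of exiting the process.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical sketch; none of these names come from Chukwa or Hadoop.
class RotateSketch {
    /** Stand-in for an open HDFS output stream. */
    interface Sink extends Closeable {}

    /** Stand-in for whatever creates the next sink file. */
    interface SinkFactory {
        Sink open() throws IOException;
    }

    private final SinkFactory factory;
    private Sink current;      // null until the first successful open
    int closeFailures = 0;     // counts close() calls that threw

    RotateSketch(SinkFactory factory) {
        this.factory = factory;
    }

    /** Close the current sink (if any) and open a new one; never exits. */
    void rotate() {
        if (current != null) {
            try {
                current.close();
            } catch (IOException e) {
                // e.g. a lease error from the namenode: log it and abandon
                // the old handle rather than shutting the collector down.
                closeFailures++;
                System.err.println("rotate: close failed: " + e.getMessage());
            }
            current = null;
        }
        try {
            current = factory.open();
        } catch (IOException e) {
            // Leave current as null; the next rotate() call retries the open.
            System.err.println("rotate: open failed: " + e.getMessage());
        }
    }

    boolean hasOpenSink() {
        return current != null;
    }

    public static void main(String[] args) {
        // Every sink produced here fails on close, simulating a lost lease.
        SinkFactory failing = () -> () -> {
            throw new IOException("No lease on file");
        };
        RotateSketch writer = new RotateSketch(failing);
        writer.rotate(); // opens the first sink
        writer.rotate(); // close fails, but a new sink is still opened
        System.out.println("close failures: " + writer.closeFailures);
        System.out.println("has open sink: " + writer.hasOpenSink());
    }
}
```

The trade-off in a design like this is that data buffered in the abandoned 
file may be lost, but the collector keeps accepting data from agents.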

regards,
Eric

On Oct 25, 2011, at 1:46 AM, IvyTang wrote:

> Our Chukwa collector crashed.
> And the log showed:
> 
> 
> 2011-10-20 04:08:27,847 WARN Timer-817 SeqFileWriter - Got an exception in 
> rotate
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
> /chukwa/logs/2011102008_s0281.hostsud.net.chukwa File does not exist. [Lease. 
>  Holder: DFSClient_395554495, pendingcreates: 1]
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1490)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1481)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1536)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1524)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:665)
>         at sun.reflect.GeneratedMethodAccessor1374.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1416)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1412)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1410)
> 
>         at org.apache.hadoop.ipc.Client.call(Client.java:1104)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
>         at $Proxy0.complete(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at $Proxy0.complete(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3558)
>         at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3472)
>         at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>         at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>         at 
> org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter.rotate(SeqFileWriter.java:199)
>         at 
> org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter$1.run(SeqFileWriter.java:240)
>         at java.util.TimerThread.mainLoop(Timer.java:512)
>         at java.util.TimerThread.run(Timer.java:462)
> 2011-10-20 04:08:27,848 FATAL Timer-817 SeqFileWriter - IO Exception in 
> rotate. Exiting!
> 2011-10-20 04:08:27,851 WARN Shutdown SeqFileWriter - cannot rename dataSink 
> file:/chukwa/logs/2011102008_s0281.hostsud.net.chukwa
> java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:232)
>         at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:606)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.rename(DistributedFileSystem.java:224)
>         at 
> org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter.close(SeqFileWriter.java:327)
>         at 
> org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter.close(SocketTeeWriter.java:268)
>         at 
> org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter.close(PipelineStageWriter.java:46)
>         at 
> org.apache.hadoop.chukwa.datacollection.collector.servlet.ServletCollector.destroy(ServletCollector.java:227)
>         at 
> org.mortbay.jetty.servlet.ServletHolder.destroyInstance(ServletHolder.java:315)
>         at 
> org.mortbay.jetty.servlet.ServletHolder.doStop(ServletHolder.java:286)
>         at 
> org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:64)
>         at 
> org.mortbay.jetty.servlet.ServletHandler.doStop(ServletHandler.java:170)
>         at 
> org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:64)
>         at 
> org.mortbay.jetty.handler.HandlerWrapper.doStop(HandlerWrapper.java:142)
>         at 
> org.mortbay.jetty.servlet.SessionHandler.doStop(SessionHandler.java:124)
>         at 
> org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:64)
>         at 
> org.mortbay.jetty.handler.HandlerWrapper.doStop(HandlerWrapper.java:142)
>         at 
> org.mortbay.jetty.handler.ContextHandler.doStop(ContextHandler.java:569)
>         at 
> org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:64)
>         at 
> org.mortbay.jetty.handler.HandlerWrapper.doStop(HandlerWrapper.java:142)
>         at org.mortbay.jetty.Server.doStop(Server.java:281)
>         at 
> org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:64)
>         at org.mortbay.jetty.Server$ShutdownHookThread.run(Server.java:559)
> 
> What does this mean?
> 
> -- 
> Best regards,
> 
> Ivy Tang
> 
> 
> 
