[ https://issues.apache.org/jira/browse/MAPREDUCE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291492#comment-13291492 ]

Jason Lowe commented on MAPREDUCE-4298:
---------------------------------------

This occurred again on one of our clusters.  It turns out I was mistaken 
earlier: the file descriptor ulimit for our nodemanager daemons is set to 
32768, not 8192.  Fortunately, this time we were able to examine some 
nodemanagers that had leaked numerous file descriptors but had not fallen 
over yet.
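
As a rough illustration of that kind of inspection, the duplicated-descriptor 
pattern shows up immediately if you tally the targets of the process's 
/proc/<pid>/fd symlinks.  The hypothetical helper below (not part of Hadoop, 
requires Java 7+ and Linux) is just the Java equivalent of 
"ls -l /proc/<pid>/fd" piped through sort and uniq -c:

{noformat}
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical diagnostic helper: counts how many of a process's open
// descriptors resolve to the same target (file, socket, pipe, ...).
public class FdTally {
  public static void main(String[] args) throws Exception {
    String pid = args.length > 0 ? args[0] : "self";
    File[] fds = new File("/proc/" + pid + "/fd").listFiles();
    if (fds == null) {
      System.err.println("cannot read /proc/" + pid + "/fd");
      return;
    }
    Map<String, Integer> counts = new HashMap<String, Integer>();
    for (File fd : fds) {
      try {
        // Each entry in /proc/<pid>/fd is a symlink to the open resource.
        String target =
            Files.readSymbolicLink(Paths.get(fd.getPath())).toString();
        Integer c = counts.get(target);
        counts.put(target, c == null ? 1 : c + 1);
      } catch (Exception e) {
        // the descriptor may have been closed while we were scanning
      }
    }
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      if (e.getValue() > 1) {
        System.out.println(e.getValue() + "\t" + e.getKey());
      }
    }
  }
}
{noformat}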

Almost all of the file descriptors were referencing map outputs for the 
shuffle, often with hundreds of descriptors open to the same file.  
Interestingly, almost all of the map files corresponded to just one job.  
Examining the NM log around the time that job ran, I found numerous exceptions 
showing that things had not gone smoothly during the shuffle for that job.  
For example:

{noformat}
 [New I/O server worker #1-5] java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:100)
        at sun.nio.ch.IOUtil.write(IOUtil.java:56)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
        at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$PooledSendBuffer.transferTo(SocketSendBufferPool.java:239)
        at org.jboss.netty.channel.socket.nio.NioWorker.write0(NioWorker.java:470)
        at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:388)
        at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:137)
        at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
        at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:68)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:253)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleDownstream(ChunkedWriteHandler.java:123)
        at org.jboss.netty.channel.Channels.write(Channels.java:611)
        at org.jboss.netty.channel.Channels.write(Channels.java:578)
        at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendMapOutput(ShuffleHandler.java:477)
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:397)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:144)
        at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:116)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndfireMessageReceived(ReplayingDecoder.java:523)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:507)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:444)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:350)
        at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:281)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:201)
        at org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:46)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
{noformat}

Looking closer at the job, I could see that it had run with 15000 maps and 
2000 reduces.  Hundreds of the reducers had failed after running out of heap 
space during the shuffle phase, which led to broken pipe and connection reset 
errors on the nodemanagers trying to serve shuffle data to those reducers when 
they died.

I was able to reproduce the broken pipe issue and step through the code with a 
debugger.  Normally the file descriptor is closed by adding a listener to the 
ChannelFuture returned when the map data is written, and that future's 
operationComplete() callback closes the file.  However, when there is an I/O 
error sending the shuffle header, Netty closes down the channel automatically 
(and we also explicitly close it in a channel exception handler).  By the time 
we try to write the map file data to the channel, the channel is already 
closed.  I was able to see that if we write to a closed channel, the 
ChannelFuture's operationComplete() callback is never invoked.  No 
operationComplete() callback means we leak the file descriptor for the map 
file.  If multiple map files are being sent to the reducer, we leak multiple 
file descriptors for the same error.
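
To make that pattern concrete, below is a minimal sketch of the 
close-the-file-from-the-write-future approach described above.  It is not the 
actual ShuffleHandler code: the class name, method shape, and the use of a 
plain DefaultFileRegion are simplifications.  The key point is that the map 
file's descriptor is only released inside the listener attached to the write 
future, so a future that never completes means a leaked descriptor.

{noformat}
import java.io.RandomAccessFile;

import org.jboss.netty.channel.Channel;
import org.jboss.netty.channel.ChannelFuture;
import org.jboss.netty.channel.ChannelFutureListener;
import org.jboss.netty.channel.DefaultFileRegion;

// Illustrative sketch only, not the real ShuffleHandler implementation.
public class MapOutputSender {

  // Writes one map output spill to the channel.  The file descriptor is
  // released only from the write future's operationComplete() callback.
  public ChannelFuture sendMapOutput(Channel ch, RandomAccessFile spill,
      long offset, long length) {
    final DefaultFileRegion region =
        new DefaultFileRegion(spill.getChannel(), offset, length);

    ChannelFuture writeFuture = ch.write(region);
    writeFuture.addListener(new ChannelFutureListener() {
      public void operationComplete(ChannelFuture future) {
        // Releases the underlying FileChannel and its file descriptor.
        // With the Netty 3.2.3.Final behavior described above, this callback
        // is never invoked when the write was issued against an
        // already-closed channel, so the descriptor leaks (NETTY-374).
        region.releaseExternalResources();
      }
    });
    return writeFuture;
  }
}
{noformat}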

I searched around and discovered this is a known issue in Netty 3.2.3.Final, 
the version we're currently using.  See 
https://issues.jboss.org/browse/NETTY-374.  It's fixed in version 3.2.4.Final.
                
> NodeManager crashed after running out of file descriptors
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-4298
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4298
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.3
>            Reporter: Jason Lowe
>
> A node on one of our clusters fell over because it ran out of open file 
> descriptors.  Log details with stack traceback to follow.
