Hey there,

We're in the process of upgrading our Hadoop cluster from 2.2.1 to 2.7.2
and currently testing 2.7.2 in our pre-prod/backup cluster. We're seeing a
lot of active NameNode failovers (sometimes as often as every 30 minutes),
especially when we're running DistCp to copy data from our production
cluster for users to test with. We had seen similar failovers occasionally
while running 2.2.1, but not nearly as often (once every month or two).
We haven't been able to verify it's the exact same root cause on 2.2.1,
since the files/logs have rolled over since the last time it happened.

So here's the chain of events we've found so far. Hoping someone can
provide further direction.

The standby NameNode's checkpointing process succeeds locally and issues
the image PUT request in TransferFsImage.uploadImage. The active NameNode
finishes downloading the fsimage.ckpt file, but when it tries to issue the
fos.getChannel().force(true) call in TransferFsImage.receiveFile it seems
to get stuck in native code. The standby NameNode then gets a
SocketTimeoutException 60 seconds after the last modification time we see
in the "stat" output for the fsimage.ckpt file that the active NameNode
pulled down.
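
For reference, the tail end of that upload path boils down to roughly the
following (a condensed sketch of what we believe receiveFile is doing, not
the actual source; the 60 seconds also seems to line up with what we think
is the default dfs.image.transfer.timeout of 60000 ms, though we haven't
confirmed that's the timer firing):

import java.io.File;
import java.io.FileOutputStream;

// Condensed sketch (not the Hadoop source): the fsimage.ckpt bytes are
// already on disk, and force(true) asks the kernel to sync data + metadata
// before the HTTP response goes back to the standby.
public class ReceiveFileSketch {
    public static void main(String[] args) throws Exception {
        try (FileOutputStream fos = new FileOutputStream(new File("fsimage.ckpt"))) {
            // ... stream the PUT request body into fos ...
            fos.getChannel().force(true);   // the active NN's servlet thread is stuck here in force0()
        }
    }
}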

Right after this happens (~30 seconds after the last modification to the
fsimage.ckpt file) we see a similar issue with the edit log roll. The
standby NameNode's EditLogTailer triggers the rolling of the edit log on
the active NameNode. We see the active NameNode enter its rollEditLog
process, and then either the endCurrentLogSegment call gets stuck in
EditLogFileOutputStream.close on the fc.truncate(fc.position()) call, or
the startLogSegment call gets stuck in EditLogFileOutputStream.flushAndSync
on the fc.force(true) call. Both get stuck in the native code. Looking at
the last modification time in the "stat" output of the edits file, the
standby NameNode's RPC call times out 20 seconds later.
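
For anyone following along without the source handy, the two calls named
above reduce to this pattern (a condensed illustration against a
FileChannel, not the actual Hadoop code):

import java.nio.channels.FileChannel;

// Condensed illustration (not the Hadoop source) of the two FileChannel
// calls on the current edits file that we see hang.
class EditLogRollSketch {
    // endCurrentLogSegment -> EditLogFileOutputStream.close()
    static void closeSegment(FileChannel fc) throws Exception {
        fc.truncate(fc.position());   // trim the preallocated tail; hangs in truncate0()
        fc.close();
    }

    // startLogSegment -> EditLogFileOutputStream.flushAndSync()
    static void startSegment(FileChannel fc) throws Exception {
        fc.force(true);               // fsync the new segment; hangs in force0()
    }
}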

The rollEditLog call ends up holding the FSNamesystem write lock (fsLock),
which causes all other RPC calls to pile up waiting for read locks until
ZKFC's health monitor times out and signals for the NameNode to be killed.
We patched the SshFenceByTcpPort code to issue a kill -3 to get a thread
dump before it kills the active NameNode.
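
To make the pile-up concrete, here's a toy model of the locking (our own
illustration, not Hadoop code): one handler holds the writer side of the
lock while the fsync/truncate hangs, and every other handler blocks on the
reader side until ZKFC gives up:

import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model of the pile-up (not Hadoop code).
class FsLockPileUp {
    static final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();

    // rollEditLog path (one IPC handler thread)
    static void rollEditLog() throws InterruptedException {
        fsLock.writeLock().lock();
        try {
            Thread.sleep(60_000);       // stands in for the stuck fc.force()/fc.truncate()
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    // every other RPC handler thread
    static void anyOtherRpc() {
        fsLock.readLock().lock();       // blocks until the writer releases; handlers
        try {                           // pile up and the ZKFC health check times out
            // ... serve the request ...
        } finally {
            fsLock.readLock().unlock();
        }
    }
}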

We're running CentOS 6 (kernel 2.6.32) with ext4 filesystems mounted with
noatime. The fsimage file is typically ~7.2GB and the edits files are
typically ~1MB-2MB. The cluster running 2.7.2 has 256 nodes. We're running
JDK 1.8.0_92 (and compiled against it, with a few JDK8-specific patches).

The relevant stacks showing the FileChannel code stuck in native code are
below. I can also provide the full thread dumps and any relevant configs
if needed.

We tried looking in JIRA and online but didn't see anything directly
related. Any insight as to whether this is a bug in Hadoop or a side effect
of something else? When the cluster is mostly idle, everything seems fine.
Our dev/test clusters haven't had any issues with the upgrade, but they're
10 nodes or fewer and carry little load.

Thanks!
Joey

Example of both getting stuck in force calls:

"641242166@qtp-1147805316-11" #869 daemon prio=5 os_prio=0
tid=0x00007fd9c3bc4800 nid=0x2f37 runnable [0x00007fb7f6c8d000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
        at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
        at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
        at
org.apache.hadoop.hdfs.server.namenode.TransferFsImage.receiveFile(TransferFsImage.java:530)
        at
org.apache.hadoop.hdfs.server.namenode.TransferFsImage.handleUploadImageRequest(TransferFsImage.java:132)
        at
org.apache.hadoop.hdfs.server.namenode.ImageServlet$2.run(ImageServlet.java:488)
        at
org.apache.hadoop.hdfs.server.namenode.ImageServlet$2.run(ImageServlet.java:458)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at
org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:457)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:595)
        at
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:554)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1243)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:843)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)


"IPC Server handler 111 on 8020" #166 daemon prio=5 os_prio=0
tid=0x00007fd9c3449000 nid=0x600d runnable [0x00007fb7fb1af000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
        at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
        at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
        at
org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream.flushAndSync(EditLogFileOutputStream.java:206)
        at
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at
org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream.create(EditLogFileOutputStream.java:122)
        at
org.apache.hadoop.hdfs.server.namenode.FileJournalManager.startLogSegment(FileJournalManager.java:115)
        - locked <0x00007fc671be42b0> (a
org.apache.hadoop.hdfs.server.namenode.FileJournalManager)
        at
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalAndStream.startLogSegment(JournalSet.java:107)
        at
org.apache.hadoop.hdfs.server.namenode.JournalSet$3.apply(JournalSet.java:222)
        at
org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at
org.apache.hadoop.hdfs.server.namenode.JournalSet.startLogSegment(JournalSet.java:219)
        at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1237)
        - locked <0x00007fba94879c00> (a
org.apache.hadoop.hdfs.server.namenode.FSEditLog)
        at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1206)
        - locked <0x00007fba94879c00> (a
org.apache.hadoop.hdfs.server.namenode.FSEditLog)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1294)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:5832)
        at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1121)
        at
org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:142)
        at
org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12025)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)


Examples of rollEditLog getting stuck in the truncate call and
TransferFsImage getting stuck in the force call:

"281983328@qtp-1147805316-22" #1085 daemon prio=5 os_prio=0
tid=0x0000000061637000 nid=0x58a4 runnable [0x00007f5da10cd000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
        at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
        at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
        at
org.apache.hadoop.hdfs.server.namenode.TransferFsImage.receiveFile(TransferFsImage.java:530)
        at
org.apache.hadoop.hdfs.server.namenode.TransferFsImage.handleUploadImageRequest(TransferFsImage.java:132)
        at
org.apache.hadoop.hdfs.server.namenode.ImageServlet$2.run(ImageServlet.java:488)
        at
org.apache.hadoop.hdfs.server.namenode.ImageServlet$2.run(ImageServlet.java:458)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at
org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:457)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:595)
        at
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:554)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1243)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:843)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)


"IPC Server handler 41 on 8020" #97 daemon prio=5 os_prio=0
tid=0x00007f5dbbdc6800 nid=0x611 runnable [0x00007f5de0ad0000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcherImpl.truncate0(Native Method)
        at
sun.nio.ch.FileDispatcherImpl.truncate(FileDispatcherImpl.java:80)
        at sun.nio.ch.FileChannelImpl.truncate(FileChannelImpl.java:357)
        - locked <0x00007f7bf1c4e220> (a java.lang.Object)
        at
org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream.close(EditLogFileOutputStream.java:157)
        at
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalAndStream.closeStream(JournalSet.java:115)
        at
org.apache.hadoop.hdfs.server.namenode.JournalSet$4.apply(JournalSet.java:235)
        at
org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at
org.apache.hadoop.hdfs.server.namenode.JournalSet.finalizeLogSegment(JournalSet.java:231)
        at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1274)
        - locked <0x00007f60487f17f8> (a
org.apache.hadoop.hdfs.server.namenode.FSEditLog)
        at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1203)
        - locked <0x00007f60487f17f8> (a
org.apache.hadoop.hdfs.server.namenode.FSEditLog)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1294)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:5832)
        at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1121)
        at
org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:142)
        at
org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12025)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
