Hi all,

Our HDFS cluster runs in HA mode, which depends on QuorumJournalNodes. We previously used CMS with a 120 GB heap for the NameNode. When we increased the heap to 180 GB and enabled G1GC on one of the NameNodes, it crashed roughly 10 minutes after it became active.
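(The switch amounts to something like the following in hadoop-env.sh. This is an illustrative sketch only: the exact file layout and the -Xms/-Xmx split are assumptions, and our full G1 flag list is at the end of this mail.)

    # hadoop-env.sh (illustrative sketch, not our exact file)
    # Before: CMS with a 120 GB heap
    #   export HADOOP_NAMENODE_OPTS="-Xms120g -Xmx120g -XX:+UseConcMarkSweepGC ${HADOOP_NAMENODE_OPTS}"
    # After: G1GC with a 180 GB heap (full flag list below)
    export HADOOP_NAMENODE_OPTS="-Xms180g -Xmx180g -XX:+UseG1GC ${HADOOP_NAMENODE_OPTS}"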
The crash was caused by a FATAL error: the NameNode timed out writing to all of the JournalNodes. However, no GC pauses were detected at that time, and resource usage (CPU, disk/network I/O) was low. For group mapping we use ShellBasedUnixGroupsMapping, so the bottleneck is unlikely to be connections to an LDAP server. Here are the NameNode logs:

2019-01-09 18:54:07,788 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2019-01-09 18:54:10,042 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 6001 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:11,043 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 7003 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:12,045 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 8004 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:13,046 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 9005 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:14,047 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 10007 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:15,049 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 11008 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:16,050 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 12009 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:17,050 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 13010 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:18,052 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 14011 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:19,053 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 15012 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:20,053 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 16013 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:21,055 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 17014 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:22,056 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 18015 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:23,056 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19016 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2019-01-09 18:54:24,041 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [10.88.131.30:8485, 10.88.133.30:8485, 10.88.135.29:8485], stream=QuorumOutputStream starting at txid 46729238851))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
    at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
    at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
    at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:641)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3363)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:677)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:213)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:485)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
2019-01-09 18:54:24,042 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting QuorumOutputStream starting at txid 46729238851
2019-01-09 18:54:24,045 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2019-01-09 18:54:42,494 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at metrics-hadoop-warehousestore-nn01.server.hulu.com/10.88.135.13
************************************************************/

The only JvmPauseMonitor complaint appeared well before the timeout:

2019-01-09 18:50:44,772 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1997ms
No GCs detected

On the JournalNodes, no warning logs appear in that period. It seems the NameNode failed to connect to any of the JournalNodes. Here are the GC options we set for G1GC:

-XX:+UseG1GC -XX:MaxGCPauseMillis=400 -XX:ParallelGCThreads=32 -XX:ConcGCThreads=8 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/hadoop-hdfs/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=5M -verbose:gc

Could anyone give some thoughts?

Thanks,
Quanlong Huang
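PS: For reference, the "timeout=20000 ms" in the logs matches the QJM write timeout, dfs.qjournal.write-txns.timeout.ms, which defaults to 20000 ms. As a sketch (assuming it is run on the NameNode host with the same client configuration), the effective value can be checked with:

    # Print the configured QJM write timeout; if the key is reported
    # as missing, the hard-coded default of 20000 ms applies.
    hdfs getconf -confKey dfs.qjournal.write-txns.timeout.ms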