[ https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716414#comment-14716414 ]
He Xiaoqiao commented on HDFS-8973:
-----------------------------------

Thanks, Kanaka. Before logging stopped, everything looked fine; there was neither a WARN nor an ERROR. After that, GC info kept being printed to the .out file for about 5 minutes and also looked normal, apart from "log4j:ERROR Failed to flush writer"; there is no other useful information. I suspect multiple threads use the same log4j appender, and when the log file is rolled, one thread closes the stream while another thread keeps writing to it, so an exception is thrown and that thread is interrupted. Once all NameNode threads are interrupted this way, the main thread exits. A minimal sketch of this suspected race is appended at the end of this message.

> NameNode exit without any exception log
> ---------------------------------------
>
>                 Key: HDFS-8973
>                 URL: https://issues.apache.org/jira/browse/HDFS-8973
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: He Xiaoqiao
>            Priority: Critical
>
> The NameNode process exits without any useful WARN/ERROR log. After the .log file output stops, the .out file continues to show GC logs for about 5 minutes. At the moment the .log output stops, the .out file prints the following ERROR, which may be a hint; it seems to be caused by a log4j error.
> {code:title=namenode.out|borderStyle=solid}
> log4j:ERROR Failed to flush writer,
> java.io.IOException: 错误的文件描述符 (Bad file descriptor)
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:318)
> at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
> at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
> at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
> at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
> at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
> at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
> at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
> at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
> at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
> at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
> at org.apache.log4j.Category.callAppenders(Category.java:206)
> at org.apache.log4j.Category.forcedLog(Category.java:391)
> at org.apache.log4j.Category.log(Category.java:856)
> at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
> at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
> at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> {code}
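For illustration, here is a minimal, self-contained sketch of the suspected race, under the assumption that it boils down to a flush-after-close on a shared stream (the class name FlushAfterCloseRace and the file race.log are made up for the example; this is not NameNode or log4j code). One thread keeps writing and flushing through the same OutputStreamWriter/FileOutputStream stack that log4j's QuietWriter sits on in the trace above, while another thread closes the underlying stream, as a concurrent rollover or external log rotation might. Once the descriptor is gone, the flush fails inside FileOutputStream.writeBytes with an IOException (the exact message, e.g. "Bad file descriptor" vs. "Stream Closed", depends on the JDK and OS).

{code:title=FlushAfterCloseRace.java|borderStyle=solid}
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class FlushAfterCloseRace {
    public static void main(String[] args) throws Exception {
        // Same writer stack as in the trace above:
        // OutputStreamWriter -> StreamEncoder -> FileOutputStream.
        final FileOutputStream out = new FileOutputStream("race.log");
        final Writer writer = new OutputStreamWriter(out);

        Thread logging = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        writer.write("INFO BlockManager: addStoredBlock ...\n");
                        writer.flush(); // fails once the descriptor is closed below
                    }
                } catch (IOException e) {
                    // log4j's QuietWriter would surface this as
                    // "log4j:ERROR Failed to flush writer,"
                    e.printStackTrace();
                }
            }
        });
        logging.start();

        Thread.sleep(100);
        out.close();    // simulates the stream being closed under the writer,
                        // e.g. by a concurrent rollover or external log rotation
        logging.join();
    }
}
{code}

Note, though, that in log4j 1.2 QuietWriter catches this IOException and only reports it through its ErrorHandler (which is exactly the "Failed to flush writer" line in namenode.out), so the flush failure by itself should not interrupt the calling RPC handler thread; whether the handler threads really exit this way is the part of the hypothesis that still needs to be confirmed.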