[ https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716414#comment-14716414 ]

He Xiaoqiao commented on HDFS-8973:
-----------------------------------

Thanks, Kanaka. Before logging stopped, everything looked fine: neither WARN nor 
ERROR occurred. After that point the process kept printing GC info to the .out 
file for about 5 minutes, which also looked normal, except for "log4j:ERROR 
Failed to flush writer"; there was no other useful information.
I suspect multiple threads are using the same log4j handler: when the log file 
is rolled, one thread closes the stream while another thread keeps writing to 
it, so an exception is thrown and the thread is interrupted. Once all NameNode 
threads are interrupted, the main thread exits.
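
To make that hypothesis concrete, below is a minimal, hypothetical sketch (not taken from the NameNode code; the file name, logger name and size limits are made up) of the pattern I suspect: many threads logging through one shared RollingFileAppender while a small MaxFileSize forces frequent rollover, which closes and reopens the underlying writer. Whether this small program actually reproduces the "Failed to flush writer" error is not confirmed.

{code:title=RolloverRaceSketch.java|borderStyle=solid}
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;
import org.apache.log4j.RollingFileAppender;

public class RolloverRaceSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical setup: one shared RollingFileAppender with a tiny
        // MaxFileSize so rollover (close + reopen of the writer) happens often.
        RollingFileAppender appender = new RollingFileAppender(
                new PatternLayout("%d %p %m%n"), "/tmp/rollover-race.log", true);
        appender.setMaxFileSize("10KB");
        appender.setMaxBackupIndex(2);

        final Logger log = Logger.getLogger("rollover.race.test");
        log.addAppender(appender);

        // Several threads write through the same appender, standing in for
        // the NameNode handler threads that all log via the same logger.
        for (int i = 0; i < 8; i++) {
            final int id = i;
            new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < 100000; n++) {
                        log.info("thread " + id + " message " + n);
                    }
                }
            }).start();
        }
    }
}
{code}

In the NameNode the writing threads would be the IPC handler threads (Server$Handler) calling BlockManager.logAddStoredBlock, as in the stack trace below.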

> NameNode exit without any exception log
> ---------------------------------------
>
>                 Key: HDFS-8973
>                 URL: https://issues.apache.org/jira/browse/HDFS-8973
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: He Xiaoqiao
>            Priority: Critical
>
> The namenode process exits without any useful WARN/ERROR log. After the .log 
> file output stops, the .out file keeps showing GC logs for about 5 minutes. 
> When the .log file output stops, the .out file prints the following ERROR, 
> which may hint at the cause; it seems to be caused by a log4j ERROR.
> {code:title=namenode.out|borderStyle=solid}
> log4j:ERROR Failed to flush writer,
> java.io.IOException: Bad file descriptor
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:318)
>         at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
>         at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
>         at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
>         at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
>         at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
>         at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
>         at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
>         at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
>         at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
>         at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
>         at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
>         at org.apache.log4j.Category.callAppenders(Category.java:206)
>         at org.apache.log4j.Category.forcedLog(Category.java:391)
>         at org.apache.log4j.Category.log(Category.java:856)
>         at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)