[ 
https://issues.apache.org/jira/browse/HDFS-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752639#comment-13752639
 ] 

Arpit Gupta commented on HDFS-5140:
-----------------------------------

Here is the stack trace from the standby namenode:

{code}
2013-08-28 08:58:45,519 INFO  hdfs.StateChange (FSNamesystem.java:reportStatus(4677)) - STATE* Safe mode extension entered.
The reported blocks 833 has reached the threshold 1.0000 of total blocks 833. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 29 seconds.
2013-08-28 08:58:45,524 ERROR namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(203)) - Encountered exception on operation CloseOp [length=0, inodeId=0, path=/user/hrt_qa/ha-loadgenerator/100-threads/dir3/dir2/dir5/dir4/dir2/dir1/hostname63, replication=3, mtime=1377680236411, atime=1377680236320, blockSize=134217728, blocks=[blk_1073940431_205511], permissions=hrt_qa:hrt_qa:rw-r--r--, clientName=, clientMachine=, opCode=OP_CLOSE, txid=1141116]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:640)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.checkMode(FSNamesystem.java:4521)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.incrementSafeBlockCount(FSNamesystem.java:4568)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$1900(FSNamesystem.java:4275)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.incrementSafeBlockCount(FSNamesystem.java:4854)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:596)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:608)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:621)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:696)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:372)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:198)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:111)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:733)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:456)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:292)
2013-08-28 08:58:45,597 FATAL ha.EditLogTailer (EditLogTailer.java:doWork(328)) - Unknown error encountered while tailing edits. Shutting down standby NN.
java.io.IOException: Failed to apply edit log operation CloseOp [length=0, inodeId=0, path=/user/hrt_qa/ha-loadgenerator/100-threads/dir3/dir2/dir5/dir4/dir2/dir1/hostname63, replication=3, mtime=1377680236411, atime=1377680236320, blockSize=134217728, blocks=[blk_1073940431_205511], permissions=hrt_qa:hrt_qa:rw-r--r--, clientName=, clientMachine=, opCode=OP_CLOSE, txid=1141116]: error unable to create new native thread
        at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:204)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:111)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:733)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:456)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:292)
2013-08-28 08:58:45,636 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
{code}
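
The OutOfMemoryError above is raised from Thread.start() inside SafeModeInfo.checkMode(), which, together with the issue title, suggests a new monitor thread is started each time checkMode() runs during the safe mode extension. As a rough illustration only, here is a minimal sketch of guarding thread creation so that repeated calls start at most one monitor. The names SafeModeGuard, startedCount() and leaveSafeMode() are hypothetical; this is not the FSNamesystem code and not necessarily how the issue should be fixed.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the safe mode bookkeeping; not the HDFS class.
final class SafeModeGuard {
    private Thread monitor;   // guarded by "this"
    private int started;      // number of monitor threads ever started, guarded by "this"
    private final CountDownLatch done = new CountDownLatch(1);

    /** Starts a monitor thread only if none has been started yet. */
    synchronized boolean checkMode() {
        if (monitor != null) {
            return false;     // a monitor already exists, do not spawn another
        }
        monitor = new Thread(() -> {
            try {
                done.await(); // placeholder for "wait until safe mode can be left"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "safemode-monitor-sketch");
        monitor.setDaemon(true);
        monitor.start();
        started++;
        return true;
    }

    synchronized int startedCount() {
        return started;
    }

    /** Lets the monitor thread finish. */
    void leaveSafeMode() {
        done.countDown();
    }
}
```

With a guard like this, calling checkMode() once per applied edit would still leave a single monitor thread, rather than one thread per call.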
                
> Too many safemode monitor threads being created in the standby namenode
> -----------------------------------------------------------------------
>
>                 Key: HDFS-5140
>                 URL: https://issues.apache.org/jira/browse/HDFS-5140
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.1.0-beta
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>            Priority: Blocker
>
> While running the namenode load generator with 100 threads for 10 minutes, the 
> namenode was being failed over every 2 minutes.
> The standby namenode shut itself down because it ran out of memory and was not 
> able to create another thread.
> When we searched for 'Safe mode extension entered' in the standby log, it was 
> present 55000+ times.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
