[jira] [Commented] (HDFS-5225) datanode keeps logging the same 'is no longer in the dataset' message over and over again

Hadoop QA (JIRA) Wed, 25 Sep 2013 15:30:12 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778181#comment-13778181
 ]


Hadoop QA commented on HDFS-5225:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12605069/HDFS-5225.2.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

                  
org.apache.hadoop.hdfs.server.datanode.TestMultipleNNDataBlockScanner
                  org.apache.hadoop.hdfs.web.TestWebHDFS

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5036//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5036//console

This message is automatically generated.
                
> datanode keeps logging the same 'is no longer in the dataset' message over 
> and over again
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-5225
>                 URL: https://issues.apache.org/jira/browse/HDFS-5225
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.1.1-beta
>            Reporter: Roman Shaposhnik
>            Assignee: Tsuyoshi OZAWA
>            Priority: Blocker
>         Attachments: HDFS-5225.1.patch, HDFS-5225.2.patch, 
> HDFS-5225-reproduce.1.txt
>
>
> I was running the usual Bigtop testing on 2.1.1-beta RC1 with the following 
> configuration: 4 nodes fully distributed cluster with security on.
> All of a sudden my DN ate up all of the space in /var/log logging the 
> following message repeatedly:
> {noformat}
> 2013-09-18 20:51:12,046 INFO 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
> BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369 is no longer 
> in the dataset
> {noformat}
> It wouldn't answer to a jstack and jstack -F ended up being useless.
> Here's what I was able to find in the NameNode logs regarding this block ID:
> {noformat}
> fgrep -rI 'blk_1073742189' hadoop-hdfs-namenode-ip-10-224-158-152.log
> 2013-09-18 18:03:16,972 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: 
> /user/jenkins/testAppendInputWedSep18180222UTC2013/test4.fileWedSep18180222UTC2013._COPYING_.
>  BP-1884637155-10.224.158.152-1379524544853 
> blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], 
> ReplicaUnderConstruction[10.34.74.206:1004|RBW], 
> ReplicaUnderConstruction[10.224.158.152:1004|RBW]]}
> 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.224.158.152:1004 is added to 
> blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], 
> ReplicaUnderConstruction[10.34.74.206:1004|RBW], 
> ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0
> 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.34.74.206:1004 is added to 
> blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], 
> ReplicaUnderConstruction[10.34.74.206:1004|RBW], 
> ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0
> 2013-09-18 18:03:17,224 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.83.107.80:1004 is added to 
> blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], 
> ReplicaUnderConstruction[10.34.74.206:1004|RBW], 
> ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0
> 2013-09-18 18:03:17,899 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369,
>  newGenerationStamp=1370, newLength=1048576, newNodes=[10.83.107.80:1004, 
> 10.34.74.206:1004, 10.224.158.152:1004], 
> clientName=DFSClient_NONMAPREDUCE_-450304083_1)
> 2013-09-18 18:03:17,904 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369)
>  successfully to 
> BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370
> 2013-09-18 18:03:18,540 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370,
>  newGenerationStamp=1371, newLength=2097152, newNodes=[10.83.107.80:1004, 
> 10.34.74.206:1004, 10.224.158.152:1004], 
> clientName=DFSClient_NONMAPREDUCE_-450304083_1)
> 2013-09-18 18:03:18,548 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370)
>  successfully to 
> BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1371
> 2013-09-18 18:03:26,150 INFO BlockStateChange: BLOCK* addToInvalidates: 
> blk_1073742189_1371 10.83.107.80:1004 10.34.74.206:1004 10.224.158.152:1004 
> 2013-09-18 18:03:26,847 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> InvalidateBlocks: ask 10.34.74.206:1004 to delete [blk_1073742178_1359, 
> blk_1073742183_1362, blk_1073742184_1363, blk_1073742186_1366, 
> blk_1073742188_1368, blk_1073742189_1371]
> 2013-09-18 18:03:29,848 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> InvalidateBlocks: ask 10.224.158.152:1004 to delete [blk_1073742177_1353, 
> blk_1073742178_1359, blk_1073742179_1355, blk_1073742180_1356, 
> blk_1073742181_1358, blk_1073742182_1361, blk_1073742185_1364, 
> blk_1073742187_1367, blk_1073742188_1368, blk_1073742189_1371]
> 2013-09-18 18:03:29,848 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> InvalidateBlocks: ask 10.83.107.80:1004 to delete [blk_1073742177_1353, 
> blk_1073742178_1359, blk_1073742179_1355, blk_1073742180_1356, 
> blk_1073742181_1358, blk_1073742182_1361, blk_1073742183_1362, 
> blk_1073742184_1363, blk_1073742185_1364, blk_1073742186_1366, 
> blk_1073742187_1367, blk_1073742188_1368, blk_1073742189_1371]
> {noformat}
> This seems to suggest that the block was successfully deleted, but then DN 
> got into a death spiral inside of a scanner.
> I can keep the cluster running for a few days if anybody is willing to take a 
> look. Ask me for creds via a personal email.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HDFS-5225) datanode keeps logging the same 'is no longer in the dataset' message over and over again

Reply via email to