[jira] [Commented] (HDFS-13638) DataNode Can't replicate block because NameNode thinks the length is 9223372036854775807

2018-06-07 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505239#comment-16505239
 ] 

Wei-Chiu Chuang commented on HDFS-13638:


Thanks [~hexiaoqiao] for the hint; sorry it took a while to circle back.
The relevant code is here. Too bad I lost the NameNode log.
{code:java|title=BlockManager#removeBlock()}
  public void removeBlock(BlockInfo block) {
    assert namesystem.hasWriteLock();
    // No need to ACK blocks that are being removed entirely
    // from the namespace, since the removal of the associated
    // file already removes them from the block map below.
    block.setNumBytes(BlockCommand.NO_ACK);
{code}

Quoting HDFS-10453:
bq. (2) FSNamesystem#delete invoked to delete blocks then clear the reference 
in blocksmap, needReplications, etc. the block's NumBytes will set 
NO_ACK(Long.MAX_VALUE) which is used to indicate that the block deletion does 
not need explicit ACK from the node. 

If this is the case, then there's no need to worry about the DataNode marking 
the replica as corrupt due to its recorded length being 9223372036854775807.
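To make the mechanism concrete: a minimal, hypothetical sketch of the sentinel logic described above, in which a NameNode-recorded length equal to {{Long.MAX_VALUE}} (the value of {{BlockCommand.NO_ACK}}) marks a block scheduled for deletion rather than a genuine length mismatch. The class and method names below are illustrative only, not the actual HDFS implementation.
{code:java|title=Illustrative sketch of the NO_ACK sentinel check}
public class NoAckSentinelCheck {
  // Mirrors the value of BlockCommand.NO_ACK in HDFS.
  static final long NO_ACK = Long.MAX_VALUE;

  /** True if the NameNode-recorded length marks a pending deletion. */
  static boolean isPendingDeletion(long recordedNumBytes) {
    return recordedNumBytes == NO_ACK;
  }

  /** Hypothetical decision a DataNode-side check could make. */
  static String replicationDecision(long onDiskLen, long recordedLen) {
    if (isPendingDeletion(recordedLen)) {
      // Block is being removed from the namespace; no ACK expected,
      // so the length mismatch is benign.
      return "skip";
    }
    if (onDiskLen < recordedLen) {
      return "corrupt";
    }
    return "replicate";
  }

  public static void main(String[] args) {
    // The case from the log: on-disk 175085 vs recorded Long.MAX_VALUE.
    System.out.println(replicationDecision(175085L, Long.MAX_VALUE));
    System.out.println(replicationDecision(175085L, 175085L));
  }
}
{code}
Under this reading, the WARN in the issue description is the DataNode reporting a mismatch against a sentinel value that was never meant to be compared as a real length.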

> DataNode Can't replicate block because NameNode thinks the length is 
> 9223372036854775807
> 
>
> Key: HDFS-13638
> URL: https://issues.apache.org/jira/browse/HDFS-13638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Priority: Major
>
> I occasionally find the following warning in CDH clusters, but haven't 
> figured out why. Thought I'd better raise the issue anyway.
> {quote}
> 2018-05-29 09:15:58,092 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Can't replicate block 
> BP-725378529-10.0.0.8-1410027444173:blk_13276745777_1112363330268 because 
> on-disk length 175085 is shorter than NameNode recorded length 
> 9223372036854775807
> {quote}
> In fact, 9223372036854775807 = Long.MAX_VALUE.
> I chased it through the HDFS codebase but didn't find where this length could come from.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13638) DataNode Can't replicate block because NameNode thinks the length is 9223372036854775807

2018-05-29 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494689#comment-16494689
 ] 

He Xiaoqiao commented on HDFS-13638:


[~jojochuang], I met this problem on Apache Hadoop 2.7.1; IIUC, it may be 
related to HDFS-10453, FYI.



