[ 
https://issues.apache.org/jira/browse/HADOOP-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790922#action_12790922
 ] 

Raghu Angadi commented on HADOOP-4103:
--------------------------------------

> If I run the command twice in succession within 10 seconds, each run shows 
> a different value: sometimes 20, sometimes 48, etc.

Does this always happen? 

But if you are seeing this only now and then, it is expected. Note that missing 
blocks are detected only by the replication monitor, which iterates once every 
few (5?) minutes. For an accurate count, you could use the new RPC you added in 
another JIRA.
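
To illustrate why two runs a few seconds apart can disagree, here is a minimal 
sketch of a count that is refreshed only by a periodic scan. The class and 
method names are hypothetical and are not taken from the Hadoop source; the 
sketch only assumes that the reported value is whatever the most recent scan 
computed.

```java
import java.util.List;

// Hypothetical sketch: names are illustrative, not Hadoop's actual classes.
class CachedMissingBlockCount {
    // Value published by the most recent scan; 0 until the first scan runs.
    private int lastCount = 0;

    // Called by a periodic monitor (assumed here to run every few minutes).
    // Each list element is the number of live replicas for one block.
    synchronized void scan(List<Integer> liveReplicasPerBlock) {
        int missing = 0;
        for (int replicas : liveReplicasPerBlock) {
            if (replicas == 0) {
                missing++;
            }
        }
        lastCount = missing;
    }

    // Returns the result of the last scan; nothing is recomputed on demand.
    synchronized int getMissingCount() {
        return lastCount;
    }
}
```

Two calls to getMissingCount() return the same number unless a scan completes 
between them, in which case the second call reflects the newer scan.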

It is certainly not because of locking. The unlocked part only does max(int1, 
int2). There are no other consistency requirements on the returned value, so a 
volatile int won't help. 
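
The shape of the code described above can be sketched roughly as follows. The 
class, field, and method names are assumptions made for illustration and do not 
reproduce the actual namenode code. The sketch relies only on the fact that a 
single int read is atomic in Java, so the unlocked method cannot see a torn 
value.

```java
// Hypothetical sketch: names are assumptions, not the actual Hadoop code.
class TwoCounterView {
    private final Object lock = new Object();
    private int countA; // first counter, updated under the lock
    private int countB; // second counter, updated under the lock

    // Writers update both counters while holding the lock.
    void update(int a, int b) {
        synchronized (lock) {
            countA = a;
            countB = b;
        }
    }

    // Unlocked read: the only work done outside the lock is max(int1, int2).
    int current() {
        return Math.max(countA, countB);
    }
}
```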

> Alert for missing blocks
> ------------------------
>
>                 Key: HADOOP-4103
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4103
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.17.2
>            Reporter: Christian Kunz
>            Assignee: Raghu Angadi
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4103-branch-20.patch, HADOOP-4103.patch, 
> HADOOP-4103.patch, HADOOP-4103.patch, HADOOP-4103.patch
>
>
> A large number of datanodes were marked dead because network problems caused 
> heartbeat timeouts, although the datanodes themselves were fine.
> Many processes started to fail because of the corrupted filesystem.
> To catch and diagnose such problems faster, the namenode should 
> detect the corruption automatically and provide a way to alert operations. At 
> a minimum, it should show the corruption in the GUI.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
