[ https://issues.apache.org/jira/browse/HADOOP-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506680 ]
Hairong Kuang commented on HADOOP-1486:
---------------------------------------
> The ReplicationMonitor thread catches all types of exceptions, logs them,
> sleeps for 5 seconds, and then continues from the beginning.
This solution makes sure that the ReplicationMonitor thread does not go away
when a RuntimeException occurs. But could it leave the namenode in an
inconsistent state? What if ReplicationMonitor is in the middle of updating
some data structures when the exception occurs? If this is possible, option 1
might be a safer solution.
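
For reference, the catch-log-sleep-continue approach quoted above can be sketched roughly as follows. This is only an illustration, not the code in catchThrowable.patch: the class ResilientMonitor, the Work interface, and the configurable retry delay are hypothetical names made up for this sketch. Note that it does nothing to restore any data structures left half-updated by a failed pass, which is exactly the concern raised above.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a ReplicationMonitor-style loop; not the actual Hadoop code. */
class ResilientMonitor implements Runnable {

    /** One pass of monitor work (hypothetical interface for this sketch). */
    interface Work {
        void runOnce() throws Exception;
    }

    private final Work work;
    private final long retryDelayMillis;   // the comment above mentions 5 seconds
    private final AtomicInteger failures = new AtomicInteger();
    private volatile boolean running = true;

    ResilientMonitor(Work work, long retryDelayMillis) {
        this.work = work;
        this.retryDelayMillis = retryDelayMillis;
    }

    @Override
    public void run() {
        while (running) {
            try {
                work.runOnce();
            } catch (InterruptedException ie) {
                // Treat interruption as a shutdown request rather than an error.
                Thread.currentThread().interrupt();
                return;
            } catch (Throwable t) {
                // Catch everything so the thread survives, log, then back off.
                // Caveat: state modified before the failure is NOT rolled back.
                failures.incrementAndGet();
                System.err.println("Monitor iteration failed: " + t);
                try {
                    Thread.sleep(retryDelayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    void stop() {
        running = false;
    }

    int failureCount() {
        return failures.get();
    }
}
```

With this shape, a pass that throws IllegalArgumentException is counted and logged, and the loop resumes with the next pass after the delay instead of ending the thread.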
> ReplicationMonitor thread goes away
> ------------------------------------
>
> Key: HADOOP-1486
> URL: https://issues.apache.org/jira/browse/HADOOP-1486
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.12.3
> Reporter: Koji Noguchi
> Priority: Blocker
> Fix For: 0.14.0
>
> Attachments: catchThrowable.patch
>
>
> Saw many over/under replicated blocks in fsck output.
> .out file showed
> Exception in thread "[EMAIL PROTECTED]" java.lang.IllegalArgumentException: Unexpected non-existing data node: /99.9.99.0/99.9.99.42:99999
>     at org.apache.hadoop.net.NetworkTopology.checkArgument(NetworkTopology.java:379)
>     at org.apache.hadoop.net.NetworkTopology.isOnSameRack(NetworkTopology.java:424)
>     at org.apache.hadoop.dfs.FSNamesystem$ReplicationTargetChooser.chooseTarget(FSNamesystem.java:2853)
>     at org.apache.hadoop.dfs.FSNamesystem$ReplicationTargetChooser.chooseTarget(FSNamesystem.java:2816)
>     at org.apache.hadoop.dfs.FSNamesystem.pendingTransfers(FSNamesystem.java:2658)
>     at org.apache.hadoop.dfs.FSNamesystem.computeDatanodeWork(FSNamesystem.java:1774)
>     at org.apache.hadoop.dfs.FSNamesystem$ReplicationMonitor.run(FSNamesystem.java:1723)
>     at java.lang.Thread.run(Thread.java:619)
> (same as HADOOP-1232)
> And, jstack showed no ReplicationMonitor thread.