[
https://issues.apache.org/jira/browse/HADOOP-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508912
]
Doug Cutting commented on HADOOP-1486:
--------------------------------------
> If we create a new instance of the NameNode within the same JVM, then the GC
> process might take a while before the memory situation stabilizes.
That's possible, I suppose, but it's also possible that the GC would handle
this well. GC time is often proportional to the amount of non-garbage (live
data), which would be small on restart.
> Is it ok if I exit the namenode-jvm completely and leave it to the
> administrator to restart the namenode if necessary?
Sure, that'd be okay. But if the namenode auto-restarts slowly, the admin can
always kill and restart it manually, so I don't see the harm in attempting an
auto-restart. Restarting slowly isn't worse than being down, is it? So my
instinct would be to try auto-restarting. It shouldn't cause data loss and
might well help in many cases, so why not?
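
To make the auto-restart idea concrete, here is a minimal sketch of an in-JVM restart loop. It is not Hadoop code: the class RestartingSupervisor, the method runWithRestarts, and the maxRestarts limit are all hypothetical names chosen for illustration, and a real namenode restart would also have to rebuild its state.

```java
/** Hypothetical supervisor for illustration only; not part of Hadoop. */
final class RestartingSupervisor {

    private RestartingSupervisor() {}

    /**
     * Runs the service and restarts it in the same JVM whenever it fails,
     * up to maxRestarts times. Returns the number of restarts performed.
     */
    static int runWithRestarts(Runnable service, int maxRestarts) {
        int restarts = 0;
        while (true) {
            try {
                service.run();
                return restarts; // the service finished normally
            } catch (Throwable t) {
                if (restarts >= maxRestarts) {
                    // Give up and leave the decision to the administrator.
                    throw new IllegalStateException(
                        "giving up after " + restarts + " restarts", t);
                }
                restarts++;
                System.err.println("service failed (" + t + "), restart #" + restarts);
            }
        }
    }
}
```

The cap on restarts is an assumption of this sketch. It keeps a persistently failing service from looping forever, at which point the manual kill-and-restart path still applies.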
> ReplicationMonitor thread goes away
> ------------------------------------
>
> Key: HADOOP-1486
> URL: https://issues.apache.org/jira/browse/HADOOP-1486
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.12.3
> Reporter: Koji Noguchi
> Assignee: dhruba borthakur
> Priority: Blocker
> Fix For: 0.14.0
>
> Attachments: catchThrowable2.patch
>
>
> Saw many over/under replicated blocks in fsck output.
> .out file showed
> Exception in thread "[EMAIL PROTECTED]" java.lang.IllegalArgumentException: Unexpected non-existing data node: /99.9.99.0/99.9.99.42:99999
>     at org.apache.hadoop.net.NetworkTopology.checkArgument(NetworkTopology.java:379)
>     at org.apache.hadoop.net.NetworkTopology.isOnSameRack(NetworkTopology.java:424)
>     at org.apache.hadoop.dfs.FSNamesystem$ReplicationTargetChooser.chooseTarget(FSNamesystem.java:2853)
>     at org.apache.hadoop.dfs.FSNamesystem$ReplicationTargetChooser.chooseTarget(FSNamesystem.java:2816)
>     at org.apache.hadoop.dfs.FSNamesystem.pendingTransfers(FSNamesystem.java:2658)
>     at org.apache.hadoop.dfs.FSNamesystem.computeDatanodeWork(FSNamesystem.java:1774)
>     at org.apache.hadoop.dfs.FSNamesystem$ReplicationMonitor.run(FSNamesystem.java:1723)
>     at java.lang.Thread.run(Thread.java:619)
> (same as HADOOP-1232)
> And, jstack showed no ReplicationMonitor thread.
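
The trace above shows an unchecked exception escaping from the monitor's run() method, which ends the thread for good. One way to keep such a thread alive is to catch Throwable around each iteration of its loop. The attachment name catchThrowable2.patch hints at something along these lines, but the patch contents are not shown here, so the sketch below is only an illustration. ResilientMonitor, its iterations bound, and failureCount are hypothetical; the real monitor presumably loops for as long as the namesystem is running.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical monitor loop for illustration only; not the Hadoop ReplicationMonitor. */
final class ResilientMonitor implements Runnable {
    private final Runnable work;
    private final int iterations; // bounded here so the example terminates
    private final AtomicInteger failures = new AtomicInteger();

    ResilientMonitor(Runnable work, int iterations) {
        this.work = work;
        this.iterations = iterations;
    }

    @Override
    public void run() {
        for (int i = 0; i < iterations; i++) {
            try {
                work.run();
            } catch (Throwable t) {
                // Log and continue: one bad iteration must not end the thread.
                failures.incrementAndGet();
                System.err.println("monitor iteration failed: " + t);
            }
        }
    }

    /** Number of iterations that threw. */
    int failureCount() {
        return failures.get();
    }
}
```

With this shape, an IllegalArgumentException like the one in the trace would be logged and counted, and the next iteration would still run.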