[ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665668#comment-13665668
 ] 

Aaron T. Myers commented on HDFS-3934:
--------------------------------------

The latest patch looks pretty good to me, Colin. A few comments:

# Recommend you add @VisibleForTesting annotation to DataNode#getXferPort.
# Recommend you change this exception message to include the full entry string, 
not just the port substring that couldn't be parsed, and to state explicitly 
that it was the port component that failed to parse:
{code}
+          throw new IOException("invalid number format when parsing " +
+              portStr, e);
{code}
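For example, something along these lines (wording is just a sketch; I'm assuming 
the full entry text is available in a variable like {{entry}}):
{code}
throw new IOException("unable to parse the port number in hosts file entry \"" +
    entry + "\"", e);
{code}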
The fact that it was an invalid number, and what that number was, will already 
be contained in the NumberFormatException's message.
# Recommend you add something to this warning message to make it clear that 
this is expected when the DN registration name feature is in use, and that it 
was encountered while reading an include or exclude file:
{code}
+        LOG.warn("unknown host " + prefix, e);
{code}
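Perhaps something like this (suggested wording only; {{prefix}} as in the 
existing code):
{code}
LOG.warn("could not resolve address " + prefix + " while reading an include " +
    "or exclude file; this is expected if the DataNode registration name " +
    "feature is in use", e);
{code}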
# In HostFileManager.EntrySet#find(Entry), since right after this we 
unconditionally return null, you can condense this code into a single {{if 
(...)}} condition that first checks "{{ceil != null}}":
{code}
+        if (ceil == null) {
+          return null;
+        }
+        if (ceil.getValue().getIdentifier().equals(
+              toFind.getIdentifier())) {
+          return ceil.getValue();
+        }
{code}
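i.e. something like:
{code}
if ((ceil != null) &&
    ceil.getValue().getIdentifier().equals(toFind.getIdentifier())) {
  return ceil.getValue();
}
return null;
{code}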
# In HostFileManager.MutableEntrySet#add(DatanodeID), are we guaranteed that 
datanodeID.getXferPort() >= 0? Perhaps we should assert that?
# Perhaps make HostFileManager.EntrySet.index protected?
# I see the purpose of delaying the throwing of the errors in 
HostFileManager#refresh, but you might want to add a comment explaining it, 
since it's not super obvious. I'd also recommend adding something to the log 
messages in that method along the lines of "failed to read exclude file, 
continuing to use previous list of excluded nodes" to make it clear what 
happens in this case.
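For example (suggested wording only; the variable names here are placeholders, 
not the ones in the patch):
{code}
LOG.warn("failed to read the exclude file " + excludePath +
    "; continuing to use the previous list of excluded nodes", ioe);
{code}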
# Perhaps I'm missing something, but why have separate classes for EntrySet and 
MutableEntrySet? The only time we use just the plain EntrySet is for the 
initial empty sets, so it seems like we should have a single class.
# Seems like you should do an s/DataNode/NameNode/g in this comment:
{code}
+    // These entries will be de-duped by the DataNode, since they refer
+    // to the same IP address + port combo.
{code}
# I don't think this code is doing anything in 
TestDecommission#testDuplicateHostEntries:
{code}
+    info = client.datanodeReport(DatanodeReportType.DEAD);
{code}
# Seems like you could replace all of this code with just three asserts: two 
calls to Map#containsKey(...), and one check that Map#size() == 2:
{code}
+    Iterator<Map.Entry<String, DatanodeInfo>> iter =
+            deadByXferAddr.entrySet().iterator();
+    boolean foundPort1 = false, foundPort2 = false;
+    while (iter.hasNext()) {
+      Map.Entry<String, DatanodeInfo> entry = iter.next();
+      DatanodeInfo dn = entry.getValue();
+      if (dn.getXferPort() == port1) {
+        foundPort1 = true;
+        iter.remove();
+      } else if (dn.getXferPort() == port2) {
+        foundPort2 = true;
+        iter.remove();
+      }
+    }
+    Assert.assertTrue("did not find a dead entry with port " + port1,
+        foundPort1);
+    Assert.assertTrue("did not find a dead entry with port " + port2,
+        foundPort2);
+    Assert.assertTrue(deadByXferAddr.isEmpty());
{code}
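Roughly like the following (a sketch; I'm assuming the map keys are the xfer 
addresses corresponding to {{port1}}/{{port2}}, held in hypothetical variables 
{{addr1}}/{{addr2}} computed earlier in the test):
{code}
Assert.assertEquals(2, deadByXferAddr.size());
Assert.assertTrue("did not find a dead entry with port " + port1,
    deadByXferAddr.containsKey(addr1));
Assert.assertTrue("did not find a dead entry with port " + port2,
    deadByXferAddr.containsKey(addr2));
{code}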
# I like that you make a copy of the Configuration object in 
testIncludeByRegistrationName, and recommend you do the same in 
testDuplicateHostsEntries, just to minimize the likelihood of inter-test 
interference.

I'll be +1 once these are addressed.

Daryn (or anyone who's intending to review this) - please do so shortly. I'll 
be committing this soon after Colin posts a patch addressing these comments 
unless I hear from someone else.
                
> duplicative dfs_hosts entries handled wrong
> -------------------------------------------
>
>                 Key: HDFS-3934
>                 URL: https://issues.apache.org/jira/browse/HDFS-3934
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.1-alpha
>            Reporter: Andy Isaacson
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
> HDFS-3934.003.patch, HDFS-3934.004.patch, HDFS-3934.005.patch, 
> HDFS-3934.006.patch, HDFS-3934.007.patch, HDFS-3934.008.patch, 
> HDFS-3934.010.patch, HDFS-3934.011.patch, HDFS-3934.012.patch, 
> HDFS-3934.013.patch, HDFS-3934.014.patch
>
>
> A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
> hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
> after the NN restarts because {{getDatanodeListForReport}} does not handle 
> such a "pseudo-duplicate" correctly:
> # the "Remove any nodes we know about from the map" loop no longer has the 
> knowledge to remove the spurious entries
> # the "The remaining nodes are ones that are referenced by the hosts files" 
> loop does not do hostname lookups, so does not know that the IP and hostname 
> refer to the same host.
> Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
> the JSP output:  The *Node* column shows ":50010" as the nodename, with HTML 
> markup {{<a 
> href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&amp;dir=%2F&amp;nnaddr=172.29.97.196:8020";
>  title="172.29.97.216:50010">:50010</a>}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
