[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-25 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778257#comment-13778257
 ] 

Junping Du commented on HDFS-5208:
--

Thanks, Colin, for reviewing the patch again. The Jenkins test run has finished 
and passes now.

 Only clear network location cache on specific nodes if invalid 
 NetworkTopology happens
 --

 Key: HDFS-5208
 URL: https://issues.apache.org/jira/browse/HDFS-5208
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Junping Du
Assignee: Junping Du
 Attachments: HDFS-5208-v1.patch, HDFS-5208-v2.patch


 After HDFS-4521, once a DN is registered with invalid networktopology, all 
 cached rack info in DNSToSwitchMapping will be cleared. We should only clear 
 cache on specific nodes.
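
For illustration, here is a minimal sketch of the difference between clearing 
the whole cache and clearing only specific nodes. The class below is a 
hypothetical stand-in, not the Hadoop implementation; only the method name 
reloadCachedMappings is taken from the patch discussed in this thread, and the 
map-backed cache is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a caching rack resolver; not the Hadoop class. */
final class RackCacheSketch {
  // Maps a node name (IP or hostname) to its resolved rack.
  private final Map<String, String> cache = new ConcurrentHashMap<>();

  /** Record a resolved rack for a node name. */
  void put(String name, String rack) {
    cache.put(name, rack);
  }

  /** Return the cached rack, or null if the name is not cached. */
  String get(String name) {
    return cache.get(name);
  }

  /** Behavior described in the issue: drop every cached entry. */
  void reloadCachedMappings() {
    cache.clear();
  }

  /** Proposed behavior: drop only the entries for the given names. */
  void reloadCachedMappings(List<String> names) {
    for (String name : names) {
      if (name != null) {  // ConcurrentHashMap rejects null keys
        cache.remove(name);
      }
    }
  }
}
```

With the selective variant, one node registering with an invalid topology 
leaves the cached rack info of every other node untouched.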

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-24 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776086#comment-13776086
 ] 

Junping Du commented on HDFS-5208:
--

Hi Colin, I added new comments to HDFS-5237; thanks for your comments there. 
IMO, HDFS-5237 shouldn't block this JIRA, because no registration name (only 
the IP) goes into the cache backed by CachedDNSToSwitchMapping, due to the 
following code in DatanodeManager.resolveNetworkLocation(DatanodeID):
{code}
if (dnsToSwitchMapping instanceof CachedDNSToSwitchMapping) {
  names.add(node.getIpAddr());
} else {
  names.add(node.getHostName());
}
{code}
What do you think?
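
To make the consequence of that branch concrete, here is a small sketch under 
stated assumptions: the class, its boolean flag, and the rack passed to 
register() are hypothetical stand-ins for the real mapping and topology 
script. It only shows that, when the mapping caches, the IP is the sole key 
that ends up in the cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of which name gets cached at registration time. */
final class CachedKeySketch {
  private final Map<String, String> cache = new HashMap<>();
  private final boolean cached;  // stands in for the instanceof check

  CachedKeySketch(boolean cached) {
    this.cached = cached;
  }

  /** Mirrors the quoted branch: IP for a caching mapping, else the claimed host name. */
  List<String> namesToResolve(String ipAddr, String hostName) {
    List<String> names = new ArrayList<>(1);
    names.add(cached ? ipAddr : hostName);
    return names;
  }

  /** Resolve a registering node; only a caching mapping remembers the result. */
  String register(String ipAddr, String hostName, String rack) {
    for (String name : namesToResolve(ipAddr, hostName)) {
      if (cached) {
        cache.put(name, rack);
      }
    }
    return rack;
  }

  /** Keys currently held in the cache. */
  Set<String> cachedKeys() {
    return cache.keySet();
  }
}
```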



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-24 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776547#comment-13776547
 ] 

Colin Patrick McCabe commented on HDFS-5208:


{code}
-  dnsToSwitchMapping.reloadCachedMappings();
+  List<String> invalidNodeNames = new ArrayList<String>(1);
+  // clear cache for nodes in IP or Hostname
+  invalidNodeNames.add(nodeReg.getIpAddr());
+  invalidNodeNames.add(nodeReg.getHostName());
+  dnsToSwitchMapping.reloadCachedMappings(invalidNodeNames);
{code}

Can we also add something like this?
{code}
+  invalidNodeNames.add(nodeReg.getPeerHostName());
{code}

It seems like the datanode could be known by any one of those three: IP 
address, registration name, or hostname.

After that change, I don't see any reason why this shouldn't work.  What kind 
of testing have you done?



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-24 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777004#comment-13777004
 ] 

Junping Du commented on HDFS-5208:
--

Thanks for the review, Colin. The v2 patch incorporates your comments. As for 
testing, TestNetworkTopology shows that the cache is cleaned up correctly for a 
node (identified by registration name) that registers with a faulty topology.



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777113#comment-13777113
 ] 

Hadoop QA commented on HDFS-5208:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604930/HDFS-5208-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5032//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5032//console

This message is automatically generated.



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-20 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773076#comment-13773076
 ] 

Junping Du commented on HDFS-5208:
--

Filed HDFS-5237 to track the node registration name issue and uploaded a demo 
patch there. However, as the demo patch shows, many tests are affected when 
MiniDFSCluster ignores the specified host name. Hi [~cmccabe], from your 
comments above it seems there were earlier discussions about simpler ways to 
fake having different hostnames. It would be great if you could share more 
info on this.



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772100#comment-13772100
 ] 

Colin Patrick McCabe commented on HDFS-5208:


Sounds good to me.



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770910#comment-13770910
 ] 

Junping Du commented on HDFS-5208:
--

Hi Colin, I think you are right that DatanodeID is created in the DN heartbeat 
to the NN for registration, and that its hostName comes from the 
dfs.datanode.hostname setting, which can be any string; it is the DNS name only 
if this setting is not set.
However, the following code in resolveNetworkLocation(), called by 
DatanodeManager.registerDatanode(), means that only IPs are cached through DN 
registration, doesn't it? 
{code}
if (dnsToSwitchMapping instanceof CachedDNSToSwitchMapping) {
  names.add(node.getIpAddr());
} else {
  names.add(node.getHostName());
}
{code} 
Actually, I am now worried about the non-cached case. Even if the topology 
script resolves the user-specified hostName to the correct network location 
(rack) and the node is registered in the NetworkTopology tree under it, the 
topology still has to be resolved later from the node's IP (as in 
DatanodeManager.sortLocatedBlocks()), which means the script must contain both 
the user-specified hostName and the IP for each node. IMO, this is unnecessary 
and confusing. Thoughts?



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-18 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771504#comment-13771504
 ] 

Colin Patrick McCabe commented on HDFS-5208:


Unfortunately, {{DatanodeID#getHostName}} doesn't actually return the hostname. 
 It returns {{DatanodeID#hostName}}, which is either the registration name (if 
it was specified) or the hostname (if it was not.)

I think this feature is mainly used in unit tests to fake up having different 
hostnames-- something we could probably do in a much simpler way.  We've 
discussed creating a JIRA to remove it before-- maybe it's time.



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771529#comment-13771529
 ] 

Junping Du commented on HDFS-5208:
--

Thanks for the comments, Colin!
bq. Unfortunately, DatanodeID#getHostName doesn't actually return the hostname. 
It returns DatanodeID#hostName, which is either the registration name (if it 
was specified) or the hostname (if it was not.)
So we are saying the same thing, aren't we? DatanodeID is created in the DN 
heartbeat to the NN for registration, and its hostName comes from the 
dfs.datanode.hostname setting, which can be any string; it is the DNS name only 
if this setting is not set. :)
bq. I think this feature is mainly used in unit tests to fake up having 
different hostnames-- something we could probably do this in a much simpler 
way. We've discussed creating a JIRA to remove it before-- maybe it's time.
I agree. I will file a separate JIRA and work on it later. So we may simply do 
the two things below:
1. Remove the dfs.datanode.hostname config and all of its usages.
2. Make sure DatanodeID#hostName is the node's actual hostname (DNS name).
Anything else to address?




[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769605#comment-13769605
 ] 

Colin Patrick McCabe commented on HDFS-5208:


If you look in DatanodeID, a base class of DatanodeRegistration, you'll find 
this:
{code}
  private String ipAddr; // IP address
  private String hostName;   // hostname claimed by datanode
  private String peerHostName; // hostname from the actual connection
{code}

{{ipAddr}} is the IP address as a string.  {{hostName}} is the registration 
name.  {{peerHostName}} is the hostname.

The registration name can't be resolved by DNS.  In fact, it is completely 
fake.  But it will be added to {{DNSToSwitchMapping}} if a datanode asks for it 
to be added.

So really you need to clear all of them, plus all of them including a colon and 
port.
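
As a rough sketch of what "all of them, plus all of them including a colon and 
port" could look like: the helper below is hypothetical and not part of the 
patch. Its name, its parameters, and the choice of a single port value are 
assumptions made only for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper that lists every cache key a datanode might be stored under. */
final class CacheKeySketch {
  /**
   * Returns each non-empty identifier both bare and as "name:port",
   * de-duplicated and in insertion order.
   */
  static Set<String> candidateKeys(String ipAddr, String hostName,
                                   String peerHostName, int port) {
    Set<String> keys = new LinkedHashSet<>();
    for (String name : new String[] {ipAddr, hostName, peerHostName}) {
      if (name == null || name.isEmpty()) {
        continue;  // an identifier may be unset
      }
      keys.add(name);
      keys.add(name + ":" + port);
    }
    return keys;
  }
}
```

The resulting set could then be handed to a selective cache clear, so that no 
stale entry survives under any of the forms listed above.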



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-16 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768510#comment-13768510
 ] 

Colin Patrick McCabe commented on HDFS-5208:


It's a little more complicated than just clearing IP address and hostname.

Nodes can be stored by registration name, hostname:port, hostname, IP 
address, IP address:port.

It would be nice to file a JIRA to get rid of registration name.



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-16 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769013#comment-13769013
 ] 

Junping Du commented on HDFS-5208:
--

Thanks for the comments, Colin! I checked all callers of 
DNSToSwitchMapping.resolve() again (the only way to fill the cache) but didn't 
find any place that resolves a registration name. Also, DN registration calls 
resolveNetworkLocation() in DatanodeManager, which only resolves the IpAddr or 
hostname. Am I missing something here?



[jira] [Commented] (HDFS-5208) Only clear network location cache on specific nodes if invalid NetworkTopology happens

2013-09-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767653#comment-13767653
 ] 

Hadoop QA commented on HDFS-5208:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12603196/HDFS-5208-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4974//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4974//console

This message is automatically generated.
