[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-07-16 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415606#comment-13415606
 ] 

Eli Collins commented on HDFS-3150:
---

Making this a top-level issue since unbreaking multihoming is really orthogonal 
to HDFS-3140. 

> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 1.1.0
>
> Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt
>
>
> Per the document attached to HADOOP-8198, this is just for branch-1, and 
> unbreaks DN multihoming. The datanode can be configured to listen on a bond, 
> or on all interfaces, by specifying the wildcard in the dfs.datanode.*.address 
> configuration options; however, per HADOOP-6867 only the source address of the 
> registration is exposed to clients. HADOOP-985 made clients access datanodes 
> by IP, primarily to avoid the latency of a DNS lookup; this had the side 
> effect of breaking DN multihoming. To fix it, let's add back the 
> option for Datanodes to be accessed by hostname. This can be done by:
> # Modifying the primary field of the Datanode descriptor to be the hostname, 
> or 
> # Modifying Client/Datanode <-> Datanode access to use the hostname field 
> instead of the IP
> I'd like to go with approach #2 as it does not require making an incompatible 
> change to the client protocol, and is much less invasive. It limits the 
> scope of modification to just the places where clients and Datanodes connect, vs 
> changing all uses of Datanode identifiers.
> New client and Datanode configuration options are introduced:
> - {{dfs.client.use.datanode.hostname}} indicates whether all client-to-datanode 
> connections should use the datanode hostname (as clients outside the cluster may 
> not be able to route to the IP)
> - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should 
> use hostnames when connecting to other Datanodes for data transfer
> If the configuration options are not used, there is no change in the current 
> behavior.
> I'm doing something similar to #1 btw in trunk in HDFS-3144 - refactoring the 
> use of DatanodeID to use the right field (IP, IP:xferPort, hostname, etc) 
> based on the context the ID is being used in, vs always using the IP:xferPort 
> as the Datanode's name, and using the name everywhere.
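
As a rough illustration of approach #2, the sketch below picks the hostname or the IP at the point of connection, driven by a boolean flag that defaults to false so that behavior is unchanged when the option is unset. This is not the patch itself: {{DatanodeIdSketch}} is a hypothetical stand-in for the real datanode descriptor, {{java.util.Properties}} stands in for the Hadoop configuration object, and the addresses are made up. Only the two option names and the {{getName(boolean)}} call shape come from this thread.

```java
import java.net.InetSocketAddress;
import java.util.Properties;

/** Hypothetical stand-in for a datanode descriptor; not the real DatanodeID class. */
final class DatanodeIdSketch {
    final String ipAddr;
    final String hostName;
    final int xferPort;

    DatanodeIdSketch(String ipAddr, String hostName, int xferPort) {
        this.ipAddr = ipAddr;
        this.hostName = hostName;
        this.xferPort = xferPort;
    }

    /** Reads a boolean option; an unset option yields false, i.e. connect by IP as before. */
    static boolean useHostname(Properties conf, String key) {
        return Boolean.parseBoolean(conf.getProperty(key, "false"));
    }

    /** Returns "hostname:port" when the flag is set, otherwise "ip:port". */
    String getName(boolean viaHostname) {
        return (viaHostname ? hostName : ipAddr) + ":" + xferPort;
    }

    /** Builds the address to connect to without triggering a DNS lookup here. */
    InetSocketAddress toSocketAddress(boolean viaHostname) {
        return InetSocketAddress.createUnresolved(viaHostname ? hostName : ipAddr, xferPort);
    }

    public static void main(String[] args) {
        DatanodeIdSketch dn = new DatanodeIdSketch("10.0.0.5", "dn1.example.com", 50010);

        Properties conf = new Properties();
        // Option unset: connect by IP, same as the current behavior.
        boolean byHost = useHostname(conf, "dfs.client.use.datanode.hostname");
        System.out.println(dn.getName(byHost)); // prints 10.0.0.5:50010

        // Option set: the client connects by hostname instead.
        conf.setProperty("dfs.client.use.datanode.hostname", "true");
        byHost = useHostname(conf, "dfs.client.use.datanode.hostname");
        System.out.println(dn.getName(byHost)); // prints dn1.example.com:50010
    }
}
```

The same selection would apply to Datanode-to-Datanode transfers, keyed on {{dfs.datanode.use.datanode.hostname}} rather than the client option.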

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-04-06 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248578#comment-13248578
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3150:
--

Eli, I can understand that it is easy to make mistakes when getting busy. 
Simply relax and maybe slow down a little bit. I might have given you a hard 
time, although it was not my intention. 

> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 1.1.0
>
> Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt
>
>




[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-04-05 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248014#comment-13248014
 ] 

Eli Collins commented on HDFS-3150:
---

Nicholas, this was a simple misunderstanding. Todd was +1 modulo the variable 
name and log message; I thought he had actually +1'd on the jira but was 
mistaken (I've had a lot of patches in flight recently). We obviously intend to 
honor the bylaws.

> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 1.1.0
>
> Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt
>
>




[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-04-05 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247955#comment-13247955
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3150:
--

Eli, this is not a question about the quality of the patch but about whether we 
should honor the bylaws.

> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 1.1.0
>
> Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt
>
>




[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-04-05 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247362#comment-13247362
 ] 

Eli Collins commented on HDFS-3150:
---

@Suresh, @Nicholas - if you have a specific suggestion for something that needs 
to be addressed in this patch, let me know.

> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 1.1.0
>
> Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt
>
>




[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-04-05 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247326#comment-13247326
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3150:
--

I did not know that the +1 could come after commit.

> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 1.1.0
>
> Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt
>
>




[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-04-05 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247071#comment-13247071
 ] 

Todd Lipcon commented on HDFS-3150:
---

Sorry, I should have said "+1 assuming these changes are addressed" in my above 
comment. Since Eli addressed my comments, here's my official +1 for the patch.

> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 1.1.0
>
> Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt
>
>




[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-04-04 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247020#comment-13247020
 ] 

Suresh Srinivas commented on HDFS-3150:
---

Given there are some discussions happening around +1s from committers, it is 
probably a good idea to wait for a +1. Should we also keep the release manager 
posted about this change? I generally post an email to hdfs/common dev about 
this kind of change.

> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 1.1.0
>
> Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt
>
>




[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

2012-03-31 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243614#comment-13243614
 ] 

Todd Lipcon commented on HDFS-3150:
---

Mostly looks good, just some nits:

{code}
+LOG.info("Opened streaming server at " + tmpPort);
{code}
This isn't the terminology used elsewhere. "Data transfer server" or "data 
transceiver server" is better


{code}
 // Connect to backup machine
+final String dnName = targets[0].getName(connectToDnViaHostname);
{code}
I think it's better to call this {{mirrorName}} or {{mirrorAddrString}}


{code}
+  final String dnName = proxySource.getName(connectToDnViaHostname);
+  InetSocketAddress proxyAddr = NetUtils.createSocketAddr(dnName);
{code}
Similar here -- {{proxyDnName}} or {{proxyAddrString}}
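
To show why a name like {{proxyAddrString}} reads better than {{dnName}}, here is a minimal sketch in which the variable visibly holds a "host:port" string that is then turned into a socket address. It assumes nothing about the patch beyond the two quoted lines: {{AddrStringSketch.parseAddr}} is a hypothetical stand-in for {{NetUtils.createSocketAddr}} built on the JDK only, and the hostname is made up.

```java
import java.net.InetSocketAddress;

/** Hypothetical helper; a JDK-only stand-in for parsing a "host:port" string. */
final class AddrStringSketch {

    /** Splits "host:port" on the last colon and returns an unresolved address. */
    static InetSocketAddress parseAddr(String addrString) {
        int colon = addrString.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Expected host:port, got: " + addrString);
        }
        String host = addrString.substring(0, colon);
        int port = Integer.parseInt(addrString.substring(colon + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        // The name says what the string is: the address of the proxy source.
        final String proxyAddrString = "dn2.example.com:50010";
        InetSocketAddress proxyAddr = parseAddr(proxyAddrString);
        System.out.println(proxyAddr.getHostString() + " " + proxyAddr.getPort());
        // prints dn2.example.com 50010
    }
}
```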


> Add option for clients to contact DNs via hostname in branch-1
> --
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, hdfs client
>Reporter: Eli Collins
>Assignee: Eli Collins
> Attachments: hdfs-3150-b1.txt
>
>