[
https://issues.apache.org/jira/browse/HDFS-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685962#comment-16685962
]
Konstantin Shvachko edited comment on HDFS-14017 at 11/14/18 1:06 AM:
----------------------------------------------------------------------
For the record: based on my recent evaluation of the state of the art, my
assumptions in the comment above were incorrect. Here is how IPFailoverPP is
configured:
{code:java}
// Client uses only these two lines from core-site.xml
fs.defaultFS = virtual-address-nn.in.com:8020
dfs.client.failover.proxy.provider.virtual-address-nn.in.com =
o.a.h...IPFailoverProxyProvider
// Standard HA configuration for the NameNode in hdfs-site.xml
dfs.nameservices = my-cluster
dfs.ha.namenodes.my-cluster = nn1, nn2
dfs.namenode.rpc-address.my-cluster.nn1 = physical-address-ha1.in.com:8020
dfs.namenode.rpc-address.my-cluster.nn2 = physical-address-ha2.in.com:8020
{code}
From HDFS-6334 I understand IPFPP was intentionally made to look like it talks
to a single NameNode, which looks hacky now. We have multiple NameNodes, and
the proxy provider controls which NN each call is directed to, so using the
NN's logical name (aka nameserviceID) seems the right way for newly developed
proxy providers. We should still support the current style for IPFPP for
backward compatibility, so be it.
For ORPPwithIPF we still need to know the virtual address for NameNode
failover. I suggest we add a new parameter for that, extending the config
above:
{code:java}
dfs.client.failover.ipfailover.virtual-address.my-cluster =
virtual-address-nn.in.com:8020
{code}
So the ORPP part will use {{dfs.nameservices}} to obtain the physical
addresses of the NNs, and the IPF part will instantiate IPFPP based on the
{{dfs.client.failover.ipfailover.virtual-address}} parameter.
And we can still support traditional IPFPP (without Observer) using the
current {{core-site.xml}} configuration.
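As a rough sketch of the IPF side, resolving the proposed key could look like
the following. A plain {{Map}} stands in for Hadoop's {{Configuration}} to
keep the example self-contained, and the class and method names are purely
illustrative, not part of any existing API:
{code:java}
import java.net.InetSocketAddress;
import java.util.Map;

public class VirtualAddressResolver {
  // Proposed key prefix from the config example above.
  static final String VIRTUAL_ADDRESS_PREFIX =
      "dfs.client.failover.ipfailover.virtual-address";

  // Looks up the IP-failover virtual address for a given nameserviceID.
  static InetSocketAddress getVirtualAddress(Map<String, String> conf,
                                             String nsId) {
    String addr = conf.get(VIRTUAL_ADDRESS_PREFIX + "." + nsId);
    if (addr == null) {
      throw new IllegalArgumentException(
          "No IP-failover virtual address configured for nameservice " + nsId);
    }
    String[] parts = addr.split(":", 2);
    return InetSocketAddress.createUnresolved(parts[0],
        Integer.parseInt(parts[1]));
  }

  public static void main(String[] args) {
    Map<String, String> conf = Map.of(
        "dfs.client.failover.ipfailover.virtual-address.my-cluster",
        "virtual-address-nn.in.com:8020");
    InetSocketAddress vip = getVirtualAddress(conf, "my-cluster");
    // prints virtual-address-nn.in.com 8020
    System.out.println(vip.getHostName() + " " + vip.getPort());
  }
}
{code}
The real implementation would read the key from {{Configuration}} and hand the
resolved address to the IPFPP factory, while the ORPP side keeps using the
per-nameservice RPC addresses.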
> ObserverReadProxyProviderWithIPFailover should work with HA configuration
> -------------------------------------------------------------------------
>
> Key: HDFS-14017
> URL: https://issues.apache.org/jira/browse/HDFS-14017
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-14017-HDFS-12943.001.patch,
> HDFS-14017-HDFS-12943.002.patch, HDFS-14017-HDFS-12943.003.patch,
> HDFS-14017-HDFS-12943.004.patch, HDFS-14017-HDFS-12943.005.patch,
> HDFS-14017-HDFS-12943.006.patch, HDFS-14017-HDFS-12943.008.patch,
> HDFS-14017-HDFS-12943.009.patch, HDFS-14017-HDFS-12943.010.patch
>
>
> Currently {{ObserverReadProxyProviderWithIPFailover}} extends
> {{ObserverReadProxyProvider}}, and the only difference is changing the proxy
> factory to use {{IPFailoverProxyProvider}}. However this is not enough
> because when the constructor of {{ObserverReadProxyProvider}} is called via
> super(...), the following line:
> {code:java}
> nameNodeProxies = getProxyAddresses(uri,
> HdfsClientConfigKeys.DFS_NAMENODE_RPC_ADDRESS_KEY);
> {code}
> will try to resolve all the configured NN addresses to perform configured
> failover. But in the case of IPFailover, this does not really apply.
>
> A second, closely related issue is about delegation tokens. For example, in
> the current IPFailover setup, say we have a virtual host nn.xyz.com, which
> points to either of two physical nodes, nn1.xyz.com or nn2.xyz.com. In
> current HDFS, there is always only one DT being exchanged, which has
> hostname nn.xyz.com. The server only issues this DT, and the client only
> knows the host nn.xyz.com, so all is good. But with Observer reads, even
> with IPFailover, the client will no longer contact nn.xyz.com, but will
> actively reach out to nn1.xyz.com and nn2.xyz.com. During this process, the
> current code will look for a DT associated with hostname nn1.xyz.com or
> nn2.xyz.com, which is different from the DT given by the NN, causing token
> authentication to fail. This happens in
> {{AbstractDelegationTokenSelector#selectToken}}. The new IPFailover proxy
> provider will need to resolve this as well.
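The token mismatch described in the issue can be illustrated with a minimal
sketch. The client's token cache is keyed by service address, so a lookup
under a physical hostname misses a token issued under the virtual hostname;
the host names below are the illustrative ones from the description, and the
{{Map}} is just a stand-in for the real token cache consulted by
{{AbstractDelegationTokenSelector#selectToken}}:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class TokenSelectionSketch {
  public static void main(String[] args) {
    // Stand-in for the client's token cache, keyed by service address.
    Map<String, String> tokens = new HashMap<>();
    tokens.put("nn.xyz.com:8020", "DT-issued-under-virtual-host");

    // Pre-Observer behavior: the client only contacts the virtual host,
    // so selection by service name succeeds.
    System.out.println(tokens.get("nn.xyz.com:8020"));

    // With Observer reads the client contacts physical hosts directly;
    // lookup by the physical service name misses the cached token,
    // which is the authentication failure described above.
    System.out.println(tokens.get("nn1.xyz.com:8020")); // null
  }
}
{code}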