[GitHub] [hadoop] goiri commented on a change in pull request #1480: HDFS-14857 FS operations fail in HA mode: DataNode fails to connect to NameNode

2019-10-07 Thread GitBox
goiri commented on a change in pull request #1480: HDFS-14857 FS operations 
fail in HA mode: DataNode fails to connect to NameNode
URL: https://github.com/apache/hadoop/pull/1480#discussion_r331732619
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
 ##
 @@ -93,4 +92,34 @@ public synchronized void close() throws IOException {
   public boolean useLogicalURI() {
     return true;
   }
+
+  /**
+   * Resets the NameNode proxy address in case it's stale
+   */
+  protected void resetProxyAddress(List<NNProxyInfo<T>> proxies, int index) {
+    try {
+      stopProxy(proxies.get(index).proxy);
+      InetSocketAddress oldAddress = proxies.get(index).getAddress();
+      InetSocketAddress address = NetUtils.createSocketAddr(
+          oldAddress.getHostName() + ":" + oldAddress.getPort());
+      LOG.debug("oldAddress {}, newAddress {}", oldAddress, address);
+      proxies.set(index, new NNProxyInfo(address));
+    } catch (Exception e) {
+      throw new RuntimeException("Could not refresh NN address", e);
 
 Review comment:
   In particular this runtime exception. Can we test it?
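
One way to exercise that exception path is to force the address lookup to fail and assert on the wrapped exception. The sketch below is standalone Java rather than Hadoop code: `RefreshSketch`, `refreshAddress`, and the injected `Callable` resolver are hypothetical stand-ins for `resetProxyAddress` and `NetUtils.createSocketAddr`; only the message string is taken from the patch.

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-in for the try/catch block in resetProxyAddress. */
final class RefreshSketch {
  private RefreshSketch() {}

  /**
   * Runs the resolver and wraps any failure in a RuntimeException,
   * mirroring the catch block in the patch. The resolver parameter is an
   * assumption, added so that a test can inject a failure.
   */
  static <A> A refreshAddress(Callable<A> resolver) {
    try {
      return resolver.call();
    } catch (Exception e) {
      throw new RuntimeException("Could not refresh NN address", e);
    }
  }
}
```

A test can then pass a resolver that throws and check both the message and the preserved cause. In the real class the failure would more likely have to be injected through the resolver the test installs.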


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] goiri commented on a change in pull request #1480: HDFS-14857 FS operations fail in HA mode: DataNode fails to connect to NameNode

2019-10-07 Thread GitBox
goiri commented on a change in pull request #1480: HDFS-14857 FS operations 
fail in HA mode: DataNode fails to connect to NameNode
URL: https://github.com/apache/hadoop/pull/1480#discussion_r331732613
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
 ##
 @@ -93,4 +92,34 @@ public synchronized void close() throws IOException {
   public boolean useLogicalURI() {
     return true;
   }
+
+  /**
+   * Resets the NameNode proxy address in case it's stale
+   */
+  protected void resetProxyAddress(List<NNProxyInfo<T>> proxies, int index) {
+    try {
+      stopProxy(proxies.get(index).proxy);
+      InetSocketAddress oldAddress = proxies.get(index).getAddress();
+      InetSocketAddress address = NetUtils.createSocketAddr(
+          oldAddress.getHostName() + ":" + oldAddress.getPort());
+      LOG.debug("oldAddress {}, newAddress {}", oldAddress, address);
+      proxies.set(index, new NNProxyInfo(address));
+    } catch (Exception e) {
+      throw new RuntimeException("Could not refresh NN address", e);
+    }
+  }
+
+  protected void stopProxy(T proxy) {
+    if (proxy != null) {
+      if (proxy instanceof Closeable) {
+        try {
+          ((Closeable)proxy).close();
+        } catch(IOException e) {
+          throw new RuntimeException("Could not close proxy", e);
 
 Review comment:
   Do we want to fail the whole thing? We should have a unit test for this.
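
If a failed close should not abort the failover, one alternative is to report the error and carry on. This is a minimal standalone sketch under that assumption, not the patch's behavior: `ProxyCloser` and `stopQuietly` are hypothetical names, and `System.err` stands in for the class logger.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical helper; not part of Hadoop. */
final class ProxyCloser {
  private ProxyCloser() {}

  /**
   * Closes the proxy if it is Closeable. Returns false when close() threw
   * an IOException, true otherwise (including for null or non-Closeable
   * proxies, where there is nothing to close).
   */
  static boolean stopQuietly(Object proxy) {
    if (proxy instanceof Closeable) {
      try {
        ((Closeable) proxy).close();
      } catch (IOException e) {
        // Report and continue: the caller replaces this proxy right after.
        System.err.println("Could not close proxy: " + e.getMessage());
        return false;
      }
    }
    return true;
  }
}
```

Returning a boolean keeps the outcome observable, so a unit test can cover the failing-close case without relying on an exception.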





[GitHub] [hadoop] goiri commented on a change in pull request #1480: HDFS-14857 FS operations fail in HA mode: DataNode fails to connect to NameNode

2019-10-04 Thread GitBox
goiri commented on a change in pull request #1480: HDFS-14857 FS operations 
fail in HA mode: DataNode fails to connect to NameNode
URL: https://github.com/apache/hadoop/pull/1480#discussion_r331732588
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
 ##
 @@ -61,7 +63,10 @@ public ConfiguredFailoverProxyProvider(Configuration conf, URI uri,
   }
 
   @Override
-  public  void performFailover(T currentProxy) {
+  public void performFailover(T currentProxy) {
+    //reset the IP address in case  the stale IP was the cause for failover
+    LOG.info("Resetting cached proxy: " + currentProxyIndex);
 
 Review comment:
   Use the parameterized logger style with {} placeholders instead of string concatenation.





[GitHub] [hadoop] goiri commented on a change in pull request #1480: HDFS-14857 FS operations fail in HA mode: DataNode fails to connect to NameNode

2019-10-04 Thread GitBox
goiri commented on a change in pull request #1480: HDFS-14857 FS operations 
fail in HA mode: DataNode fails to connect to NameNode
URL: https://github.com/apache/hadoop/pull/1480#discussion_r331732560
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientFailover.java
 ##
 @@ -398,4 +402,50 @@ public void testIPFailoverProxyProviderLogicalUri() throws Exception {
         HAUtil.useLogicalUri(config, nnUri));
   }
 
+  /**
+   * Test HDFS-14857 FS operations fail in HA mode: DataNode fails to connect to NameNode
+   */
+  @Test
+  public void testIpAddressResetOnPerformFailover() throws Exception {
+    NetUtilsTestResolver resolver = NetUtilsTestResolver.install();
+    resolver.addResolvedHost("nn1.b.", "1.1.1.1");
+    resolver.addResolvedHost("nn2.b.", "2.2.2.2");
+    conf.set(HdfsClientConfigKeys.DFS_NAMESERVICES, "nmnode-0");
+    conf.set("dfs.ha.namenodes.nmnode-0", "nn1,nn2");
+    conf.set("dfs.namenode.rpc-address.nmnode-0.nn1", "nn1.b:9000");
+    conf.set("dfs.namenode.rpc-address.nmnode-0.nn2", "nn2.b:9000");
+    conf.set("dfs.client.failover.proxy.provider.nmnode-0",
+        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
+
+    URI uri  = new URI("hdfs://nmnode-0");
+    ConfiguredFailoverProxyProvider proxyProvider = (ConfiguredFailoverProxyProvider)NameNodeProxiesClient
+        .createFailoverProxyProvider(conf, uri, ClientProtocol.class,
+            false, null);
+    assertNotNull(proxyProvider);
+
+    AbstractNNFailoverProxyProvider.NNProxyInfo proxyInfo =
+        (AbstractNNFailoverProxyProvider.NNProxyInfo)proxyProvider.getProxy();
+    assertNotNull(proxyInfo);
+    assertEquals(proxyInfo.getAddress().getAddress(), resolver.getByExactName(proxyInfo.getAddress().getHostName()));
 
 Review comment:
   I triggered Yetus in the JIRA; there are a bunch of style check failures here
(e.g., lines over 80 characters).





[GitHub] [hadoop] goiri commented on a change in pull request #1480: HDFS-14857 FS operations fail in HA mode: DataNode fails to connect to NameNode

2019-09-20 Thread GitBox
goiri commented on a change in pull request #1480: HDFS-14857 FS operations 
fail in HA mode: DataNode fails to connect to NameNode
URL: https://github.com/apache/hadoop/pull/1480#discussion_r326462143
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
 ##
 @@ -62,6 +64,11 @@ public ConfiguredFailoverProxyProvider(Configuration conf, URI uri,
 
   @Override
   public  void performFailover(T currentProxy) {
+    if(conf.getBoolean(RESET_PROXY_ON_FAILOVER, false)) {
 
 Review comment:
   Can we have a unit test?





[GitHub] [hadoop] goiri commented on a change in pull request #1480: HDFS-14857 FS operations fail in HA mode: DataNode fails to connect to NameNode

2019-09-20 Thread GitBox
goiri commented on a change in pull request #1480: HDFS-14857 FS operations 
fail in HA mode: DataNode fails to connect to NameNode
URL: https://github.com/apache/hadoop/pull/1480#discussion_r326462034
 
 

 ##
 File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
 ##
 @@ -36,6 +36,8 @@
 public class ConfiguredFailoverProxyProvider<T> extends
     AbstractNNFailoverProxyProvider<T> {
 
+  public static final String RESET_PROXY_ON_FAILOVER = "dfs.client.failover.proxy.provider.reset-proxy-on-failure";
 
 Review comment:
   This should go to the config file.
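
Assuming "the config file" means the shared client configuration keys (for example `HdfsClientConfigKeys`) rather than the provider class, the key and its default could be declared together roughly as follows. `ClientKeysSketch`, the `_KEY`/`_DEFAULT` constant names, and the use of `java.util.Properties` in place of Hadoop's `Configuration` are illustrative assumptions; the key string and the `false` default are copied from the patch.

```java
import java.util.Properties;

/** Hypothetical holder for the key; stands in for a shared keys class. */
final class ClientKeysSketch {
  private ClientKeysSketch() {}

  // Key string as it appears in the patch under review.
  static final String RESET_PROXY_ON_FAILOVER_KEY =
      "dfs.client.failover.proxy.provider.reset-proxy-on-failure";
  // Default matching the getBoolean(..., false) call in the patch.
  static final boolean RESET_PROXY_ON_FAILOVER_DEFAULT = false;

  /** Reads the flag, falling back to the default when the key is unset. */
  static boolean resetProxyOnFailover(Properties conf) {
    String value = conf.getProperty(RESET_PROXY_ON_FAILOVER_KEY);
    return value == null
        ? RESET_PROXY_ON_FAILOVER_DEFAULT
        : Boolean.parseBoolean(value.trim());
  }
}
```

Keeping the key next to its default means callers no longer repeat the literal `false` at each read site.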

