[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360989#comment-15360989 ]

He Xiaoqiao commented on HDFS-7858:
-----------------------------------

Hi [~asuresh], when I apply this patch to 2.7.1, submitting a job throws the following exceptions:

{quote}
2016-07-01 17:45:37,497 WARN [pool-9-thread-2] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : java.nio.channels.ClosedByInterruptException
2016-07-01 17:45:37,542 WARN [pool-10-thread-2] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : java.nio.channels.ClosedByInterruptException
2016-07-01 17:45:37,571 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled
2016-07-01 17:45:37,572 WARN [pool-11-thread-2] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : java.nio.channels.ClosedByInterruptException
2016-07-01 17:45:37,573 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Recovery is enabled. Will try to recover from previous life on best effort basis.
2016-07-01 17:45:37,633 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [viewfs://ha/]
2016-07-01 17:45:37,698 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history file is at viewfs://ha/hadoop-yarn/staging/yarn/.staging/job_1467365572539_3212/job_1467365572539_3212_1.jhist
2016-07-01 17:45:37,713 WARN [main] org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider: Invocation returned exception on [nn1host/ip:port]
2016-07-01 17:45:37,716 WARN [pool-12-thread-2] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
2016-07-01 17:45:37,717 WARN [main] org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider: Invocation returned exception on [nn2host/ip:port]
2016-07-01 17:45:37,725 WARN [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Unable to parse prior job history, aborting recovery
MultiException[{java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException, java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException, }]
    at org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider$RequestHedgingInvocationHandler.invoke(RequestHedgingProxyProvider.java:133)
    at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1226)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:303)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:269)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:261)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1526)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:303)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:299)
    at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:161)
    at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.open(ChRootedFileSystem.java:257)
    at org.apache.hadoop.fs.viewfs.ViewFileSystem.open(ViewFileSystem.java:423)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:788)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.getPreviousJobHistoryStream(MRAppMaster.java:1199)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.parsePreviousJobHistory(MRAppMaster.java:1203)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.processRecovery(MRAppMaster.java:1175)
    at org.apache.hadoop.mapreduce.v2.app.MRApp
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906384#comment-14906384 ]

Tsz Wo Nicholas Sze commented on HDFS-7858:
-------------------------------------------

Never mind. Thanks for the response.

> Improve HA Namenode Failover detection on the client
> ----------------------------------------------------
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Labels: BB2015-05-TBR
> Fix For: 2.8.0
>
> Attachments: HDFS-7858.1.patch, HDFS-7858.10.patch,
> HDFS-7858.10.patch, HDFS-7858.11.patch, HDFS-7858.12.patch,
> HDFS-7858.13.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch,
> HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, HDFS-7858.7.patch,
> HDFS-7858.8.patch, HDFS-7858.9.patch
>
>
> In an HA deployment, clients are configured with the hostnames of both the
> Active and Standby Namenodes. A client first tries one of the NNs
> (non-deterministically); if that NN is the standby, it tells the client to
> retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the standby is
> undergoing some GC or is otherwise busy, the client might not get a response
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Use hedged RPCs to simultaneously call multiple configured NNs to decide
> which is the active Namenode.
> 2) Subsequent calls will invoke the previously successful NN.
> 3) On failover of the currently active NN, the remaining NNs will be invoked
> to decide which is the new active.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
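The three-step hedged-RPC approach in the issue description can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the code of RequestHedgingProxyProvider: the names HedgedInvoker, Target and Outcome are hypothetical, a Target stands in for an RPC proxy to one Namenode, and any exception from the remembered target is treated as a possible failover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of hedged invocation; not the Hadoop implementation. */
final class HedgedInvoker<T> {

  /** Stand-in for an RPC proxy to a single Namenode (assumed name). */
  interface Target<T> {
    T call() throws Exception;
  }

  /** Pairs a result with the target that produced it. */
  private static final class Outcome<T> {
    final Target<T> target;
    final T value;

    Outcome(Target<T> target, T value) {
      this.target = target;
      this.value = value;
    }
  }

  private final List<Target<T>> targets;
  private volatile Target<T> active; // step 2: the previously successful target

  HedgedInvoker(List<Target<T>> targets) {
    if (targets.isEmpty()) {
      throw new IllegalArgumentException("at least one target is required");
    }
    this.targets = new ArrayList<Target<T>>(targets);
  }

  T invoke() throws Exception {
    Target<T> cached = active;
    if (cached != null) {
      try {
        return cached.call(); // step 2: reuse the known-good target
      } catch (Exception e) {
        active = null; // step 3: assume failover and hedge again
      }
    }
    ExecutorService pool = Executors.newFixedThreadPool(targets.size());
    try {
      CompletionService<Outcome<T>> done =
          new ExecutorCompletionService<Outcome<T>>(pool);
      for (final Target<T> t : targets) {
        done.submit(() -> new Outcome<T>(t, t.call())); // step 1: call all at once
      }
      List<Throwable> failures = new ArrayList<Throwable>();
      for (int i = 0; i < targets.size(); i++) {
        try {
          Outcome<T> first = done.take().get(); // next call to finish
          active = first.target;
          return first.value;
        } catch (ExecutionException e) {
          failures.add(e.getCause()); // e.g. a standby rejecting the call
        }
      }
      throw new IllegalStateException("all targets failed: " + failures);
    } finally {
      pool.shutdownNow(); // interrupt calls that are still in flight
    }
  }
}
```

The sketch collects every failure before giving up, so one slow or rejecting target does not mask a healthy one. A production client would likely distinguish standby rejections from other errors rather than treating them all alike.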
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904584#comment-14904584 ]

Arun Suresh commented on HDFS-7858:
-----------------------------------

[~szetszwo], unfortunately I do not remember the specifics, but I think it went into minutes.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901642#comment-14901642 ]

Tsz Wo Nicholas Sze commented on HDFS-7858:
-------------------------------------------

> ... then those clients might not get a response soon enough to try the other NN.

[~asuresh], do you recall how long you have seen the client wait? I may have hit a similar problem recently.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644543#comment-14644543 ]

Hudson commented on HDFS-7858:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2216 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2216/])
HDFS-7858. Improve HA Namenode Failover detection on the client. (asuresh) (Arun Suresh: rev 030fcfa99c345ad57625486eeabedebf2fd4411f)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/RequestHedgingProxyProvider.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithNFS.md
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/MultiException.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

> Improve HA Namenode Failover detection on the client
> ----------------------------------------------------
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Labels: BB2015-05-TBR
> Fix For: 2.8.0
>
> Attachments: HDFS-7858.1.patch, HDFS-7858.10.patch,
> HDFS-7858.10.patch, HDFS-7858.11.patch, HDFS-7858.12.patch,
> HDFS-7858.13.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch,
> HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, HDFS-7858.7.patch,
> HDFS-7858.8.patch, HDFS-7858.9.patch
>
>
> In an HA deployment, clients are configured with the hostnames of both the
> Active and Standby Namenodes. A client first tries one of the NNs
> (non-deterministically); if that NN is the standby, it tells the client to
> retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the standby is
> undergoing some GC or is otherwise busy, the client might not get a response
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since ZooKeeper is already used as the failover controller, the clients
> could talk to ZK and find out which is the active Namenode before contacting
> it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when
> there is a failover, so they do not have to query ZK every time to find out
> the active NN.
> 3) Clients can also cache the last active NN in the user's home directory
> (~/.lastNN) so that short-lived clients can try that Namenode first before
> querying ZK.
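The client-side cache from the earlier ZooKeeper-based proposal (the ~/.lastNN file) can be illustrated with a small sketch. It covers only the cache part and makes several assumptions that the thread does not confirm: the class name LastActiveCache, the one-line file format, and the method names are all hypothetical, and no ZooKeeper lookup is shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a "last active NN" cache file; not Hadoop code. */
final class LastActiveCache {
  private final Path file; // e.g. a path such as ~/.lastNN, resolved by the caller

  LastActiveCache(Path file) {
    this.file = file;
  }

  /** Records the Namenode that most recently answered as active. */
  void remember(String namenode) throws IOException {
    Files.write(file, namenode.getBytes(StandardCharsets.UTF_8));
  }

  /** Returns the configured Namenodes with the cached one, if any, moved first. */
  List<String> orderCandidates(List<String> configured) {
    List<String> ordered = new ArrayList<String>(configured);
    String cached = read();
    // Only reorder when the cached entry is still one of the configured NNs.
    if (cached != null && ordered.remove(cached)) {
      ordered.add(0, cached);
    }
    return ordered;
  }

  /** Reads the cached name, or returns null if the file is missing or unreadable. */
  private String read() {
    try {
      if (!Files.isRegularFile(file)) {
        return null;
      }
      return new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
    } catch (IOException e) {
      return null; // a bad cache must never block the client
    }
  }
}
```

A stale or corrupt cache entry only changes the order in which Namenodes are tried, so the worst case is the same as having no cache at all.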
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644532#comment-14644532 ]

Hudson commented on HDFS-7858:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #267 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/267/])
HDFS-7858. Improve HA Namenode Failover detection on the client. (asuresh) (Arun Suresh: rev 030fcfa99c345ad57625486eeabedebf2fd4411f)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/MultiException.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithNFS.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/RequestHedgingProxyProvider.java
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644432#comment-14644432 ]

Hudson commented on HDFS-7858:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #259 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/259/])
HDFS-7858. Improve HA Namenode Failover detection on the client. (asuresh) (Arun Suresh: rev 030fcfa99c345ad57625486eeabedebf2fd4411f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRequestHedgingProxyProvider.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/MultiException.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/RequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithNFS.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644318#comment-14644318 ]

Hudson commented on HDFS-7858:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2197 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2197/])
HDFS-7858. Improve HA Namenode Failover detection on the client. (asuresh) (Arun Suresh: rev 030fcfa99c345ad57625486eeabedebf2fd4411f)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/MultiException.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/RequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithNFS.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRequestHedgingProxyProvider.java
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644217#comment-14644217 ]

Hudson commented on HDFS-7858:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #1000 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1000/])
HDFS-7858. Improve HA Namenode Failover detection on the client. (asuresh) (Arun Suresh: rev 030fcfa99c345ad57625486eeabedebf2fd4411f)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithNFS.md
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/MultiException.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/RequestHedgingProxyProvider.java
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644206#comment-14644206 ]

Hudson commented on HDFS-7858:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #270 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/270/])
HDFS-7858. Improve HA Namenode Failover detection on the client. (asuresh) (Arun Suresh: rev 030fcfa99c345ad57625486eeabedebf2fd4411f)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/MultiException.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/RequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithNFS.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643911#comment-14643911 ]

Hudson commented on HDFS-7858:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #8231 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8231/])
HDFS-7858. Improve HA Namenode Failover detection on the client. (asuresh) (Arun Suresh: rev 030fcfa99c345ad57625486eeabedebf2fd4411f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/RequestHedgingProxyProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithNFS.md
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/MultiException.java
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643902#comment-14643902 ]

Arun Suresh commented on HDFS-7858:
-----------------------------------

The test case error was due to a timing issue. I modified the test case to ensure that it doesn't happen.

Committed to trunk and branch-2.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643869#comment-14643869 ]

Hadoop QA commented on HDFS-7858:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 22m 33s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | javac | 7m 38s | The applied patch generated 2 additional warning messages. |
| {color:green}+1{color} | javadoc | 9m 51s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 2m 57s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 2s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 20s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 158m 53s | Tests failed in hadoop-hdfs. |
| | | 233m 4s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestRequestHedgingProxyProvider |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747474/HDFS-7858.13.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 3572ebd |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/artifact/patchprocess/diffJavacWarnings.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/console |

This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643603#comment-14643603 ]

Hadoop QA commented on HDFS-7858:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 21m 42s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | javac | 7m 31s | The applied patch generated 1 additional warning messages. |
| {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 2m 59s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 1s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 19s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests | 22m 20s | Tests failed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 162m 20s | Tests failed in hadoop-hdfs. |
| | | 235m 17s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.ipc.TestRPC |
| | hadoop.fs.TestLocalFsFCStatistics |
| | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
| | hadoop.hdfs.server.namenode.ha.TestRequestHedgingProxyProvider |
| | hadoop.hdfs.server.namenode.TestFsck |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747399/HDFS-7858.12.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / f36835f |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/artifact/patchprocess/diffJavacWarnings.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/console |

This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643339#comment-14643339 ]

Jing Zhao commented on HDFS-7858:

Thanks for working on this, [~asuresh]! The latest patch looks good to me. +1. Also agree with [~arpitagarwal] that we can keep testing and improving this since this is currently not default.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643102#comment-14643102 ]

Arpit Agarwal commented on HDFS-7858:

I am +1 on the v11 patch. {{RequestHedgingProxyProvider}} is disabled by default, so the remaining issues can be addressed separately to avoid spinning on this forever. :-)

# One optimization with your new approach: in the common HA case with two NameNodes, after {{performFailover}} is called, {{toIgnore}} will be non-null. We don't need to create a thread pool/completion service; we can simply send the request to the single proxy in the caller's thread.
# The TODO is not technically a TODO. We can just document this property in the class Javadoc: it can block indefinitely and depends on the caller implementing a timeout.
# A couple of documentation nitpicks:
## _The two implementations which currently ships_ -> _The two implementations which currently ship_
## _so use these_ -> _so use one of these unless you are using a custom proxy provider_

Will hold off committing in case [~jingzhao] has any further comments. Thanks for working on this, [~asuresh].
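The single-proxy optimization suggested in the comment above can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the code from any HDFS-7858 patch: the class name {{SingleProxyFastPath}}, the {{invoke}} signature, and the use of plain {{Callable}} objects in place of NameNode proxies are all assumptions made for the example. Only the idea (skip the thread pool when exactly one candidate remains after ignoring the failed proxy) comes from the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch; names and signatures are illustrative assumptions. */
final class SingleProxyFastPath {

  /**
   * Invokes the request on every target except {@code toIgnore} and returns
   * the first successful result. With a single remaining target the call is
   * made directly in the caller's thread, without a pool.
   */
  static <T> T invoke(Map<String, Callable<T>> targets, String toIgnore) {
    Map<String, Callable<T>> candidates = new LinkedHashMap<>(targets);
    if (toIgnore != null) {
      candidates.remove(toIgnore);
    }
    if (candidates.isEmpty()) {
      throw new IllegalStateException("no proxies left to try");
    }
    if (candidates.size() == 1) {
      // Fast path: one proxy left, so there is nothing to hedge against.
      try {
        return candidates.values().iterator().next().call();
      } catch (Exception e) {
        throw new IllegalStateException("invocation failed", e);
      }
    }
    // Hedged path: send the request to all candidates concurrently.
    ExecutorService pool = Executors.newFixedThreadPool(candidates.size());
    try {
      CompletionService<T> completed = new ExecutorCompletionService<>(pool);
      for (Callable<T> call : candidates.values()) {
        completed.submit(call);
      }
      Exception last = null;
      for (int i = 0; i < candidates.size(); i++) {
        try {
          return completed.take().get();
        } catch (ExecutionException e) {
          last = e; // this proxy failed; wait for the next one to finish
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting", e);
        }
      }
      throw new IllegalStateException("all proxies failed", last);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    Map<String, Callable<String>> targets = new LinkedHashMap<>();
    targets.put("nn1", () -> "nn1 answered on " + Thread.currentThread().getName());
    targets.put("nn2", () -> "nn2 answered on " + Thread.currentThread().getName());
    // After a failover away from nn1, only nn2 remains and is called directly.
    System.out.println(invoke(targets, "nn1"));
  }
}
```

Note that neither path in this sketch applies a timeout of its own; as the second point in the comment says, a blocked call is expected to be bounded by the caller.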
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641780#comment-14641780 ]

Hadoop QA commented on HDFS-7858:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 19m 52s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 32s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 2m 59s | Site still builds. |
| {color:green}+1{color} | checkstyle | 1m 39s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 21s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 21s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 160m 51s | Tests failed in hadoop-hdfs. |
| | | 231m 36s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate |
| | hadoop.hdfs.TestDistributedFileSystem |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747199/HDFS-7858.11.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 156f24e |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11840/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11840/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11840/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11840/console |

This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641570#comment-14641570 ]

Hadoop QA commented on HDFS-7858:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 21m 50s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 2m 59s | Site still builds. |
| {color:green}+1{color} | checkstyle | 1m 59s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 4m 24s | The patch appears to introduce 2 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 27s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 163m 44s | Tests failed in hadoop-hdfs. |
| | | 237m 6s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestDistributedFileSystem |
| | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
| Timed out tests | org.apache.hadoop.hdfs.server.mover.TestStorageMover |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747173/HDFS-7858.10.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / adcf5dd |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11838/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11838/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11838/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11838/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11838/console |

This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641238#comment-14641238 ]

Hadoop QA commented on HDFS-7858:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 21m 46s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 3m 0s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 2s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 4m 26s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 16s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 158m 42s | Tests failed in hadoop-hdfs. |
| | | 231m 57s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestDistributedFileSystem |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747095/HDFS-7858.10.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / d19d187 |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11833/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11833/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11833/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11833/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11833/console |

This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641067#comment-14641067 ]

Jing Zhao commented on HDFS-7858:

bq. I need this to get name of the proxy which was successful (so i can key into the targetProxies map). CallResult catches the exception and sets it as the result.

Yeah, I did notice the exception has been captured by CallResult. But maybe we can use a future-->proxy map here? In this way we do not need to have a wrapper class like CallResult so maybe the code can be further simplified.
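The future-to-proxy map suggested in the comment above might look something like the sketch below. It is only an illustration under stated assumptions: {{FutureToProxyMap}} and {{firstSuccess}} are hypothetical names, and plain {{Callable}} objects stand in for the proxied NameNode calls. It relies on the documented behaviour that {{ExecutorCompletionService.submit}} returns the same {{Future}} object that {{take}} later hands back, so the future itself can serve as the map key and no result wrapper is needed.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; not taken from the HDFS-7858 patch. */
final class FutureToProxyMap {

  /**
   * Runs all calls concurrently and returns (proxy name, result) for the
   * first call that completes without throwing.
   */
  static <T> Map.Entry<String, T> firstSuccess(Map<String, Callable<T>> targets) {
    if (targets.isEmpty()) {
      throw new IllegalStateException("no proxies to try");
    }
    ExecutorService pool = Executors.newFixedThreadPool(targets.size());
    try {
      CompletionService<T> completed = new ExecutorCompletionService<>(pool);
      // Remember which proxy each future belongs to.
      Map<Future<T>, String> proxyByFuture = new HashMap<>();
      for (Map.Entry<String, Callable<T>> target : targets.entrySet()) {
        proxyByFuture.put(completed.submit(target.getValue()), target.getKey());
      }
      for (int i = 0; i < targets.size(); i++) {
        Future<T> done = completed.take();
        try {
          T value = done.get();
          return new AbstractMap.SimpleImmutableEntry<>(proxyByFuture.get(done), value);
        } catch (ExecutionException e) {
          // This proxy failed (e.g. it is the standby); wait for the next one.
        }
      }
      throw new IllegalStateException("no proxy returned successfully");
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    Map<String, Callable<String>> targets = new LinkedHashMap<>();
    targets.put("nn1", () -> { throw new IllegalStateException("standby"); });
    targets.put("nn2", () -> "block locations");
    Map.Entry<String, String> winner = firstSuccess(targets);
    System.out.println(winner.getKey() + " -> " + winner.getValue());
  }
}
```

The name of the winning proxy is what a provider would need in order to remember which NameNode answered successfully.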
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640857#comment-14640857 ]

Jing Zhao commented on HDFS-7858:

Thanks again for updating the patch, [~asuresh]! Some minor comments on the latest patch:
# Do we need the latch in {{RequestHedgingInvocationHandler#invoke}}?
# I'm not sure if we need requestTimeout. Client/DN now already sets their specific socket timeout for their connection to NameNode thus it seems redundant to have an extra 2 min timeout when polling the CompletionService.
# We can use the ExecutionException thrown by {{callResultFuture.get()}} to get the exception thrown by the invocation.
# Maybe we should use debug/trace here?
{code}
+        LOG.info("Invocation successful on ["
+            + callResultFuture.get().name + "]");
{code}
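The third point above, recovering the invocation's own exception from the {{ExecutionException}} thrown by {{Future.get()}}, can be illustrated with a small sketch. The class and method names here are hypothetical and are not part of the patch. The extra unwrapping of {{InvocationTargetException}} is an assumption about how reflective calls ({{Method.invoke}}) would surface their failures inside the future.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

/** Hypothetical sketch; names are illustrative assumptions. */
final class UnwrapInvocationFailure {

  /**
   * Returns the exception thrown by the task behind {@code future}, or
   * {@code null} if the task completed normally.
   */
  static Throwable invocationFailure(Future<?> future) {
    try {
      future.get();
      return null;
    } catch (ExecutionException e) {
      // Future.get() wraps whatever the task threw in an ExecutionException.
      Throwable cause = e.getCause();
      // A reflective Method.invoke adds one more wrapper around the real error.
      if (cause instanceof InvocationTargetException && cause.getCause() != null) {
        cause = cause.getCause();
      }
      return cause;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return e;
    }
  }

  public static void main(String[] args) {
    FutureTask<Object> failing = new FutureTask<>(() -> {
      throw new IOException("Operation category READ is not supported in state standby");
    });
    failing.run(); // run inline so the example needs no thread pool
    System.out.println(invocationFailure(failing));
  }
}
```

With this approach the task does not need to catch its own exception and store it in a result object; the future already carries it.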
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640395#comment-14640395 ]

Hadoop QA commented on HDFS-7858:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 26m 54s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 1s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 9m 44s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 11m 53s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 27s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 3m 37s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 39s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 45s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 43s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 5m 33s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests | 25m 29s | Tests failed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 161m 41s | Tests failed in hadoop-hdfs. |
| | | 250m 31s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.ha.TestZKFailoverController |
| | hadoop.hdfs.TestAppendSnapshotTruncate |
| | hadoop.hdfs.server.namenode.ha.TestRequestHedgingProxyProvider |
| | hadoop.hdfs.TestDistributedFileSystem |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.ha.TestDelegationTokensWithHA |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12746976/HDFS-7858.9.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / e202efa |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11826/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11826/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11826/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11826/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11826/console |

This message was automatically generated.
> Improve HA Namenode Failover detection on the client
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Labels: BB2015-05-TBR
> Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch,
> HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch,
> HDFS-7858.7.patch, HDFS-7858.8.patch, HDFS-7858.9.patch
>
> In an HA deployment, clients are configured with the hostnames of both the
> Active and Standby Namenodes. Clients will first try one of the NNs
> (non-deterministically), and if it is a standby NN, it will tell the
> client to retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the standby is
> undergoing some GC / is busy, then those clients might not get a response
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since Zookeeper is already used as the failover controller, the clients
> could talk to ZK and find out which is the active namenode before contacting
> it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when
> there is a failover, so they do not have to query ZK every time to find out the
> active NN.
> 3) Clients can also cache the last active NN in the user's home directory
> (~/.lastNN) so that short-lived clients can try that Namenode first before
> querying ZK.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640152#comment-14640152 ]

Arun Suresh commented on HDFS-7858:

bq. in case of failover of HA, only one request will be invoked (SNN) in hedged invocations. Am I right?

Yup, although in the case of more than 2 NNs, the subsequent request will be hedged to ALL remaining NNs except the NN the client has just failed over from.

bq. This way I feel both ConfiguredFailoverProxyProvider and RequestHedgingProxyProvider work same way, except at the very first time. ..

Yup, apart from the condition mentioned above.

bq. ..if no. of proxies to try to are more than 2 then RequestHedgingProxyProvider will be best.

Yup. Now that HDFS-6440 is resolved, I am hoping RequestHedgingProxyProvider will become the default. It is also useful when there is a large number of ad hoc clients (MR jobs), where many of the calls are one-time calls. RequestHedgingProxyProvider ensures that these tasks don't have to wait for a timed-out request / exception from a failed NN before failing over to the SNN.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640133#comment-14640133 ]

Vinayakumar B commented on HDFS-7858:

I have a small question here. I believe all client operations can succeed only against the Active NameNode.

In the current {{ConfiguredFailoverProxyProvider}}, the client needs to try both nodes only at the beginning, when it initializes, and only if the standby comes first. During failover, if the ANN goes down and the SNN has not yet failed over, the client has to try the previous ANN again and then come back to the current SNN to check for the failover one more time. Once a successful proxy is found, all subsequent requests go there.

In the proposed {{RequestHedgingProxyProvider}}, there is no failed proxy only at the beginning; at that time hedged requests go to both NNs. During failover, the current failed proxy (the previous ANN) is ignored for hedged requests, i.e. in case of failover of HA, only one request will be invoked (SNN) in hedged invocations. Am I right?

This way I feel both {{ConfiguredFailoverProxyProvider}} and {{RequestHedgingProxyProvider}} work the same way, except at the very first time. And yes, if the number of proxies to try is more than 2, then {{RequestHedgingProxyProvider}} will be best.

Am I missing something here?
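The target-selection behaviour discussed in the two comments above (hedge to every NN on the first call, stick to the last successful proxy afterwards, and after a failure hedge to everything except the NN that just failed) can be sketched as follows. This is only an illustration of the behaviour as described in the thread, not code from the HDFS-7858 patch; the class name {{HedgingTargets}}, the method {{select}} and the use of plain strings for NameNodes are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not taken from the HDFS-7858 patch. */
class HedgingTargets {
  /**
   * Returns the NameNodes the next call should be sent to.
   *
   * @param all         every configured NameNode
   * @param lastSuccess the NameNode that answered last time, or null if unknown
   * @param failed      the NameNode that just failed, or null if none
   */
  static List<String> select(List<String> all, String lastSuccess, String failed) {
    if (lastSuccess != null && !lastSuccess.equals(failed)) {
      // Steady state: a known-good proxy exists, so only that one is used.
      List<String> single = new ArrayList<>();
      single.add(lastSuccess);
      return single;
    }
    // First call, or right after a failure: hedge to every NameNode
    // except the one that just failed.
    List<String> targets = new ArrayList<>();
    for (String nn : all) {
      if (!nn.equals(failed)) {
        targets.add(nn);
      }
    }
    return targets;
  }
}
```

With two NNs the post-failover list has a single entry, which matches the observation that both providers then behave alike; with three or more NNs the remaining ones are all tried in parallel.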
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640114#comment-14640114 ]

Hadoop QA commented on HDFS-7858:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 29m 21s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 9m 34s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 12m 13s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 30s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 3m 33s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 36s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 1s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 59s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 5m 36s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests | 24m 55s | Tests failed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 0m 25s | Tests failed in hadoop-hdfs. |
| | | 91m 30s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.io.retry.TestFailoverProxy |
| Failed build | hadoop-hdfs |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12746923/HDFS-7858.8.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 02c0181 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11821/artifact/patchprocess/whitespace.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11821/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11821/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11821/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11821/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11821/console |

This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640012#comment-14640012 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 21m 48s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 58s | Site still builds. | | {color:green}+1{color} | checkstyle | 2m 2s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 4m 25s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 21m 31s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 171m 16s | Tests failed in hadoop-hdfs. 
| | | | 243m 36s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.io.retry.TestFailoverProxy | | | hadoop.hdfs.TestDFSInotifyEventInputStream | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.TestDFSClientFailover | | | hadoop.hdfs.server.namenode.ha.TestDelegationTokensWithHA | | | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover | | | hadoop.hdfs.server.namenode.ha.TestFailoverWithBlockTokensEnabled | | | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication | | | hadoop.hdfs.server.namenode.ha.TestDNFencing | | | hadoop.hdfs.server.namenode.ha.TestHAMetrics | | | hadoop.hdfs.server.namenode.ha.TestXAttrsWithHA | | | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes | | | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestHAFsck | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | | hadoop.hdfs.server.namenode.ha.TestHASafeMode | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | | | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions | | | hadoop.hdfs.TestEncryptionZonesWithHA | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12746923/HDFS-7858.8.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / ab3197c | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11817/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11817/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11817/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11817/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11817/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11817/console | This message was automatically generated. > Improve HA Namenode Failover detection on the client > > > Key: HDFS-7858 > URL: https://issues.apache.org/jira/browse/HDFS-7858 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > Labels: BB2015-05-TBR > Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, > HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, > HDFS-7858.7.patch, HDFS-7858.8.patch > > > In an HA deployment, Clients are configured with the hostnames of both the > Active and Standby Namenodes.Clients will first try one of the NNs > (non-deterministically) and if its a st
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639890#comment-14639890 ]

Hadoop QA commented on HDFS-7858:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 22m 3s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 39s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | site | 2m 59s | Site still builds. |
| {color:green}+1{color} | checkstyle | 2m 4s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 1s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 21s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 25s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 161m 5s | Tests failed in hadoop-hdfs. |
| | | 234m 42s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestDistributedFileSystem |
| | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12746903/HDFS-7858.7.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 1d3026e |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11816/artifact/patchprocess/whitespace.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11816/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11816/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11816/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11816/console |

This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639676#comment-14639676 ]

Jing Zhao commented on HDFS-7858:

bq. So, in the case of the ReqHedgingProxy, performFailover will be called only if ALL the proxies have failed (with retry/failover_and_retry.. )

But I guess if "the original successfulProxy is not null", this try uses only that single proxy, so the failover_and_retry applies only to it?

bq. Oh.. I was thinking we should keep trunk Java 7 compilable ?

Java 7 does not require the explicit type argument; Java 6 does.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637663#comment-14637663 ]

Arpit Agarwal commented on HDFS-7858:

Thanks for updating the patch Arun. The MultiException approach looks like a good alternative to refactoring RetryPolicy.

A few comments:
# I didn't understand the call to {{super.performFailover}} in {{RequestHedgingProxyProvider#getProxy}}.
# The documentation in HDFSHighAvailabilityWithQJM.md and HDFSHighAvailabilityWithNFS.md should be updated as it states _The only implementation which currently ships with Hadoop is the ConfiguredFailoverProxyProvider_. Okay to do this in a separate Jira.
# Agree with Jing's suggestion to use a {{CompletionService}}.

Also we should file a task to make {{RequestHedgingProxyProvider}} the default eventually.

Nitpicks:
# {{getDelayMillis}} javadoc is wrong.
# {{successfullproxy}} should be {{successfulproxy}}.
# _new LinkedList_ - explicit type argument redundant.
# _static interface ProxyFactory_ - static is redundant.
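The last two nitpicks above concern Java language rules rather than HDFS itself, and a small sketch may make them concrete. The class {{ProxyHolder}}, its methods, and the shape of {{ProxyFactory}} below are assumptions invented for the example; only the name {{ProxyFactory}} and the use of {{LinkedList}} come from the review comment.

```java
import java.util.LinkedList;
import java.util.List;

/** Hypothetical holder class; only the two language points are of interest. */
class ProxyHolder {
  // A member interface is implicitly static, so writing
  // "static interface ProxyFactory" adds nothing.
  interface ProxyFactory<T> {
    T createProxy(String address);
  }

  // Java 7 diamond operator: the type argument is inferred from the
  // declared type, so "new LinkedList<String>()" is redundant.
  private final List<String> addresses = new LinkedList<>();

  void add(String address) {
    addresses.add(address);
  }

  /** Creates one proxy per registered address, in insertion order. */
  <T> List<T> createAll(ProxyFactory<T> factory) {
    List<T> proxies = new LinkedList<>();
    for (String address : addresses) {
      proxies.add(factory.createProxy(address));
    }
    return proxies;
  }
}
```

On Java 6 the diamond form does not compile, which is why the follow-up discussion turns on whether trunk still needs to build with that version.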
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637494#comment-14637494 ]

Jing Zhao commented on HDFS-7858:

Thanks for updating the patch, [~asuresh]! The current approach looks good to me. Some quick comments about the patch:
# In {{RequestHedgingInvocationHandler#invoke}}, instead of polling the tasks every 10ms, can we use {{CompletionService}} here?
# For {{RequestHedgingProxyProvider#performFailover}}, if the original successfulProxy is not null, we can exclude it from the next retry.
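To illustrate the first suggestion above, here is a minimal sketch of how a hedged call could wait on a {{CompletionService}} instead of polling futures on a timer. It is written against the standard {{java.util.concurrent}} API only and is not the patch's implementation; the class {{HedgedCall}} and the method {{firstSuccess}} are hypothetical names, and the sketch uses Java 7 syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration only; not taken from the HDFS-7858 patch. */
class HedgedCall {
  /**
   * Runs all calls in parallel and returns the first successful result.
   * If every call fails, the last ExecutionException seen is rethrown.
   */
  static <T> T firstSuccess(List<Callable<T>> calls) throws Exception {
    if (calls.isEmpty()) {
      throw new IllegalArgumentException("no calls to hedge");
    }
    ExecutorService pool = Executors.newFixedThreadPool(calls.size());
    CompletionService<T> completed = new ExecutorCompletionService<>(pool);
    List<Future<T>> pending = new ArrayList<>();
    try {
      for (Callable<T> call : calls) {
        pending.add(completed.submit(call));
      }
      ExecutionException lastFailure = null;
      for (int i = 0; i < calls.size(); i++) {
        try {
          // take() blocks until some task has finished, so no sleep loop is needed.
          return completed.take().get();
        } catch (ExecutionException e) {
          lastFailure = e; // this target failed; wait for the next one
        }
      }
      throw lastFailure;
    } finally {
      // Cancel whatever is still running once a result (or total failure) is known.
      for (Future<T> f : pending) {
        f.cancel(true);
      }
      pool.shutdownNow();
    }
  }
}
```

The design point is that {{take()}} hands back futures in completion order, so the caller reacts to the fastest NN as soon as it answers rather than up to one polling interval later.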
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632744#comment-14632744 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 44s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 27s | The applied patch generated 11 new checkstyle issues (total was 426, now 436). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 23s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 22m 26s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 160m 22s | Tests failed in hadoop-hdfs. 
| | | | 227m 53s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745991/HDFS-7858.6.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 176131f | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11748/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11748/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11748/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11748/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11748/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11748/console | This message was automatically generated. > Improve HA Namenode Failover detection on the client > > > Key: HDFS-7858 > URL: https://issues.apache.org/jira/browse/HDFS-7858 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Arun Suresh > Labels: BB2015-05-TBR > Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, > HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch > > > In an HA deployment, Clients are configured with the hostnames of both the > Active and Standby Namenodes.Clients will first try one of the NNs > (non-deterministically) and if its a standby NN, then it will respond to the > client to retry the request on the other Namenode. 
> If the client happens to talk to the Standby first, and the standby is > undergoing some GC / is busy, then those clients might not get a response > soon enough to try the other NN. > Proposed approach to solve this: > 1) Since ZooKeeper is already used as the failover controller, the clients > could talk to ZK and find out which is the active namenode before contacting > it. > 2) Long-lived DFSClients would have a ZK watch configured which fires when > there is a failover so they do not have to query ZK every time to find out the > active NN > 3) Clients can also cache the last active NN in the user's home directory > (~/.lastNN) so that short-lived clients can try that Namenode first before > querying ZK -- This message was sent by Atlassian JIRA (v6.3.4#6332)
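The caching idea in the last item of the proposal (remember the last active NN in ~/.lastNN and try it first) could look roughly like the sketch below. This is only an illustration under stated assumptions: the class `LastActiveCache`, its method names, and the one-hostname-per-file format are hypothetical and do not come from any of the attached patches.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of Hadoop or of any HDFS-7858 patch. */
final class LastActiveCache {
  private LastActiveCache() {}

  /** Records the NameNode that most recently answered as active (assumed format: one hostname). */
  static void remember(Path cacheFile, String activeHost) throws IOException {
    Files.write(cacheFile, activeHost.getBytes(StandardCharsets.UTF_8));
  }

  /**
   * Returns the configured NameNodes with the cached host moved to the front.
   * A missing, unreadable, or stale cache leaves the configured order unchanged.
   */
  static List<String> orderCandidates(List<String> configured, Path cacheFile) {
    List<String> ordered = new ArrayList<>(configured);
    try {
      String cached =
          new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
      // Only trust the cache if it names a NameNode that is still configured.
      if (ordered.remove(cached)) {
        ordered.add(0, cached);
      }
    } catch (IOException e) {
      // No usable cache: fall through with the configured order.
    }
    return ordered;
  }
}
```

A short-lived client would call `orderCandidates` with its configured NameNodes and a path such as ~/.lastNN, try the hosts in the returned order, and call `remember` once one of them answers as active. The cache is treated purely as a hint, so a wrong entry costs one extra attempt rather than a failure.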
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632690#comment-14632690 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 6s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 7m 34s | The applied patch generated 1 additional warning messages. | | {color:red}-1{color} | javadoc | 9m 35s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 24s | The applied patch generated 6 new checkstyle issues (total was 7, now 12). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 23s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 22m 31s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 160m 43s | Tests failed in hadoop-hdfs. 
| | | | 226m 4s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745987/HDFS-7858.5.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 176131f | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11747/artifact/patchprocess/diffJavacWarnings.txt | | javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/11747/artifact/patchprocess/diffJavadocWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11747/artifact/patchprocess/diffcheckstylehadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11747/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11747/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11747/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11747/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11747/console | This message was automatically generated. 
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631806#comment-14631806 ] Arpit Agarwal commented on HDFS-7858: - Thanks [~asuresh], I will review your patch by next week.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630673#comment-14630673 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 51s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 7m 39s | The applied patch generated 1 additional warning messages. | | {color:red}-1{color} | javadoc | 9m 45s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 50s | The applied patch generated 6 new checkstyle issues (total was 7, now 12). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 4m 26s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 21m 29s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 177m 1s | Tests failed in hadoop-hdfs. 
| | | | 243m 49s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-common | | Failed unit tests | hadoop.io.retry.TestFailoverProxy | | | hadoop.ipc.TestIPC | | | hadoop.io.retry.TestRetryProxy | | | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes | | | hadoop.hdfs.server.namenode.ha.TestHASafeMode | | | hadoop.hdfs.TestEncryptionZonesWithHA | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions | | | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestXAttrsWithHA | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | | | hadoop.hdfs.server.namenode.TestCheckpoint | | | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover | | | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.TestGetBlocks | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | | hadoop.hdfs.server.namenode.ha.TestFailoverWithBlockTokensEnabled | | | hadoop.hdfs.server.namenode.TestBlockUnderConstruction | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestDFSInotifyEventInputStream | | | hadoop.hdfs.TestDFSClientFailover | | | hadoop.hdfs.server.namenode.ha.TestHAFsck | | | hadoop.hdfs.server.namenode.ha.TestDNFencing | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745705/HDFS-7858.4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 0bda84f | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11733/artifact/patchprocess/diffJavacWarnings.txt | | javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/11733/artifact/patchprocess/diffJavadocWarnings.txt | | checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11733/artifact/patchprocess/diffcheckstylehadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11733/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11733/artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11733/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11733/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11733/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11733/console | This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626969#comment-14626969 ] Arun Suresh commented on HDFS-7858: --- [~arpitagarwal], apologies for sitting on this. I was trying to refactor this as per [~jingzhao]'s suggestion (replacing RetryInvocationHandler with RequestHedgingInvocationHandler). Unfortunately, that turned out to have a more far-reaching impact: technically, request hedging is different from retry, so the whole policy framework etc. would need to be refactored. If everyone is OK with the current approach, we can punt the larger refactoring to another JIRA, and I can incorporate [~arpitagarwal]'s suggestion (skip the standby for subsequent requests) and provide a quick patch.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626963#comment-14626963 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12702886/HDFS-7858.3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 979c9ca | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11704/console | This message was automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626950#comment-14626950 ] Arpit Agarwal commented on HDFS-7858: - Hi [~asuresh], were you thinking of posting an updated patch? The overall approach looks good. One comment from a quick look: RequestHedgingProxyProvider sends all requests to both NNs. Should it skip the standby for subsequent requests?
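The question above (hedge the first request across both NNs, then skip the standby afterwards) can be illustrated with a small standalone sketch. It uses only java.util.concurrent and is not the code of RequestHedgingProxyProvider; the class `StickyHedger` and all of its members are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch; the names here are not taken from the patch. */
final class StickyHedger<T> {
  /** Pairs a result with the index of the target that produced it. */
  private static final class Indexed<V> {
    final int index;
    final V value;
    Indexed(int index, V value) { this.index = index; this.value = value; }
  }

  private final List<Callable<T>> targets;
  private volatile int lastWinner = -1; // index of the target that last answered

  StickyHedger(List<Callable<T>> targets) {
    this.targets = new ArrayList<>(targets);
  }

  /** Index of the target that served the most recent successful call, or -1. */
  int lastWinner() {
    return lastWinner;
  }

  T invoke() throws Exception {
    int sticky = lastWinner;
    if (sticky >= 0) {
      try {
        return targets.get(sticky).call(); // skip the other targets entirely
      } catch (Exception e) {
        lastWinner = -1; // the remembered target failed; hedge again below
      }
    }
    return hedge();
  }

  /** Sends the call to every target in parallel and returns the first success. */
  private T hedge() throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(targets.size());
    try {
      CompletionService<Indexed<T>> done = new ExecutorCompletionService<>(pool);
      for (int i = 0; i < targets.size(); i++) {
        final int index = i;
        done.submit(() -> new Indexed<>(index, targets.get(index).call()));
      }
      Exception last = null;
      for (int i = 0; i < targets.size(); i++) {
        try {
          Indexed<T> first = done.take().get();
          lastWinner = first.index;
          return first.value;
        } catch (ExecutionException e) {
          last = e; // this target failed; wait for the next one to finish
        }
      }
      throw last; // every target failed
    } finally {
      pool.shutdownNow(); // interrupt the calls that lost the race
    }
  }
}
```

In this sketch only the first call, or the first call after a failure of the remembered target, pays the cost of contacting every target. Note that `shutdownNow()` interrupts the losing calls, so a real client would have to tolerate interrupt-related exceptions from those in-flight requests.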
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511949#comment-14511949 ] Chris Nauroth commented on HDFS-7858: - This is very interesting. Thanks for working on it, Arun! bq. Yup, QJM ensures only 1 namenode can write, but fencing is still recommended since there is still a possibility of the stale reads from the old Active NN before going down (I am hoping this will not be too much of an issue) I don't think the patch introduces any new problems here. If two NameNodes think they are active, there is already a risk of reads being served by the wrong node.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388750#comment-14388750 ] Arun Suresh commented on HDFS-7858: --- Thanks for the review [~jingzhao]. bq. I'm thinking that how about providing RequestHedgingInvocationHandler as a replacement of RetryInvocationHandler? We need to add the retry logic into RequestHedgingInvocationHandler but the whole layer may look more clean. Yup, that makes sense. Let me take a shot at refactoring, and I will post an updated patch shortly.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387816#comment-14387816 ] Jing Zhao commented on HDFS-7858: - Thanks for working on this, [~asuresh]. One concern is where we should put the new logic. It looks like the current patch wraps things in the following way: {{RequestHedgingInvocationHandler}} --> proxy returned by {{RequestHedgingProxyProvider#getProxy}} --> {{RetryInvocationHandler}} I'm not sure if this is the best way to go. {{RetryInvocationHandler}} has its own logic for retry and failover, which is usually based on the type of the exception thrown by the invocation. With the new design, the exception caught by {{RetryInvocationHandler}} is identified based on the exceptions thrown by all the targets inside of {{RequestHedgingInvocationHandler}}. Since different targets may return different exceptions, it looks like we cannot guarantee that {{RetryInvocationHandler}} finally gets the exception from the correct target. How about providing {{RequestHedgingInvocationHandler}} as a replacement for {{RetryInvocationHandler}}? We would need to add the retry logic into {{RequestHedgingInvocationHandler}}, but the whole layer may look cleaner.
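The concern above is that a hedging layer collects one exception per target, while an outer retry layer decides what to do from a single exception type. The sketch below makes that choice explicit. It is an assumption-laden illustration, not the behavior of any patch on this issue: `HedgedFailure`, `StandbyLikeException`, and the selection rule (prefer any failure other than a standby reply) are all hypothetical.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch of choosing one exception out of several per-target failures. */
final class HedgedFailure {
  private HedgedFailure() {}

  /** Hypothetical stand-in for a "this node is in standby" reply. */
  static final class StandbyLikeException extends IOException {
    StandbyLikeException(String message) {
      super(message);
    }
  }

  /**
   * Picks the single exception to rethrow to an outer retry layer.
   * Rule assumed here: a standby reply is expected noise when hedging,
   * so the first failure of any other kind takes precedence.
   */
  static Exception pickForRetryLayer(List<Exception> perTarget) {
    Exception fallback = null;
    for (Exception e : perTarget) {
      if (!(e instanceof StandbyLikeException)) {
        return e;
      }
      fallback = e;
    }
    // Every target replied "standby", or the list was empty (null).
    return fallback;
  }
}
```

Whatever rule is chosen, the outer layer sees only the picked exception, so its retry or failover decision depends on this selection. That coupling is one reason folding the retry logic into the hedging handler, as suggested above, may be cleaner than stacking the two.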
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387190#comment-14387190 ] Arun Suresh commented on HDFS-7858: --- Ping [~atm], [~jingzhao], [~bikassaha]: I was wondering if I might get a review for the current patch.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362020#comment-14362020 ] Arun Suresh commented on HDFS-7858: --- [~kasha], bq. one is not required to configure a fencing mechanism when using QJM ? Yup, QJM ensures only 1 namenode can write, but fencing is still recommended since there is still a possibility of the stale reads from the old Active NN before going down (I am hoping this will not be too much of an issue) bq. it would be nice to make the solution here accessible to YARN as well. The current patch extends the {{ConfiguredFailoverProxyProvider}} in the HDFS code base. The {{ConfiguredRMFailoverProxyProvider}} looks like it belongs to the same class hierarchy, so it shouldn't be too hard. But like you mentioned, if YARN is not deployed with {{ZKRMStateStore}}, there is a possibility of split-brain, which leads me to think: wouldn't it be nice to incorporate QJM and JNs into a YARN deployment? Thoughts?
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361536#comment-14361536 ] Karthik Kambatla commented on HDFS-7858: If possible, it would be nice to make the solution here accessible to YARN as well. Simultaneously connecting to all the masters (NNs in HDFS and RMs in YARN) might work most of the time. How do we plan to handle a split-brain? In YARN, we don't use an explicit fencing mechanism. IIRR, one is not required to configure a fencing mechanism when using QJM?
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349609#comment-14349609 ] Hadoop QA commented on HDFS-7858: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12702886/HDFS-7858.3.patch against trunk revision 952640f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.balancer.TestBalancer Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9752//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9752//console This message is automatically generated. 
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347533#comment-14347533 ]

Arun Suresh commented on HDFS-7858:

This test case failure seems unrelated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346880#comment-14346880 ]

Hadoop QA commented on HDFS-7858:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12702433/HDFS-7858.2.patch
against trunk revision 3560180.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.server.namenode.TestFileTruncate

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9730//console

This message is automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346668#comment-14346668 ]

Hadoop QA commented on HDFS-7858:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12702429/HDFS-7858.2.patch
against trunk revision 3560180.

{color:red}-1 patch{color}. Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9727//console

This message is automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344503#comment-14344503 ]

Jing Zhao commented on HDFS-7858:

bq. Optimistically try to connect to both configured NNs simultaneously

I like this better than letting clients connect to the JNs. JNs are in a critical code path for writing the editlog, and failing to write to a quorum of JNs can cause the NN to kill itself, so we may need to be more careful about letting all clients connect to them directly.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344311#comment-14344311 ]

Hadoop QA commented on HDFS-7858:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12701992/HDFS-7858.1.patch
against trunk revision ca1c00b.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:
    org.apache.hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9699//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9699//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9699//console

This message is automatically generated.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343889#comment-14343889 ]

Aaron T. Myers commented on HDFS-7858:

Hey folks, sorry to come into this discussion so late.

Given that some folks choose to use HDFS HA without auto failover at all, and thus without ZKFCs or ZK in sight, I think we should target any solution to this problem to work without ZK. I'm also a little leery of using a cache file, as I'm afraid of thundering herd effects (if the file is in HDFS or in a home dir which is network mounted), and also don't like the fact that in a large cluster all users on all machines might need to populate this cache file.

As such, I'd propose that we pursue either of the following two options:

# Optimistically try to connect to both configured NNs simultaneously, thus allowing that one (the standby) may take a while to respond, but also expecting that the active will always respond rather promptly. This is similar to Kihwal's suggestion.
# Have the client connect to the JNs to determine which NN is likely the active. In my experience, even those who don't use automatic failover basically always use the QJM. I think those that continue to use NFS-based HA are very few and far between.

Thoughts?
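As a rough illustration of the first option above (contact every configured NN at once and take whichever answers successfully first), here is a minimal sketch in plain Java. It is not the code from any of the attached patches: `HedgedCall` and `firstSuccess` are hypothetical names, and ordinary `Callable`s stand in for the per-NameNode RPC proxies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper: not part of Hadoop, only a sketch of the idea.
final class HedgedCall {

    /** Runs all calls concurrently and returns the first successful result. */
    static <T> T firstSuccess(List<Callable<T>> calls) {
        ExecutorService pool = Executors.newFixedThreadPool(calls.size());
        CompletionService<T> completed = new ExecutorCompletionService<>(pool);
        try {
            for (Callable<T> call : calls) {
                completed.submit(call);
            }
            List<Throwable> failures = new ArrayList<>();
            for (int i = 0; i < calls.size(); i++) {
                try {
                    // take() blocks until the next call finishes, in completion order.
                    return completed.take().get();
                } catch (ExecutionException e) {
                    // This target failed (for example, it is the standby); wait for the next one.
                    failures.add(e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            IllegalStateException allFailed = new IllegalStateException("all targets failed");
            failures.forEach(allFailed::addSuppressed);
            throw allFailed;
        } finally {
            // Interrupt whichever calls are still running (for example, a slow standby).
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the two NNs: a standby that is slow to answer and a prompt active.
        Callable<String> slowStandby = () -> { Thread.sleep(5000); return "standby"; };
        Callable<String> active = () -> "active";
        System.out.println(firstSuccess(List.of(slowStandby, active)));
    }
}
```

The caller never waits on the slow target once another target has answered. The cost is that every request opens connections to all configured NNs, which is the trade-off being weighed in this thread.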
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341849#comment-14341849 ]

Bikas Saha commented on HDFS-7858:

What are examples of long-lived clients? How many such clients would there be in a large busy cluster? Will they be setting watches on ZK?

bq. Adding a cached entry to user's home dir to pick last active NN. If entry is not present, the client picks the Standby from the configuration.

This seems like a reasonable improvement to the current scheme, which will allow a client to connect to the current active directly (even though it may be listed later in the NN names list).

Please do keep in mind that ZK is just a notifier in the leader election scheme. The real control lies in the FailoverController, which is pluggable. A different FailoverController may not use ZK. The status of the master flag may be invalid or empty while the FailoverController is fencing the old master and bringing up the new master.

Getting configuration from ZK is related but probably orthogonal. The entire config for HDFS could be downloaded from ZK based on a well-known HDFS service name.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341411#comment-14341411 ]

Arun Suresh commented on HDFS-7858:

[~bikassaha], you make a very valid point. I guess the situation you mentioned can be alleviated as follows:

Considering that a client knows a priori both the Active and the Standby, what if we do the following: if the locally cached active namenode entry has become unavailable, then yes, there will be an initial surge of requests to the failed NN, but the client can directly retry against the Standby without consulting ZK. ZK connections will happen only in the following cases:

# If no cached entry is present in the user's home directory.
# Long-lived clients.

Also, I was thinking that maybe we could break this into 2 separate JIRAs:

# Adding a cached entry to the user's home dir to pick the last active NN. If the entry is not present, the client picks the Standby from the configuration. There is no ZK involvement in this; it only brings some determinism to which namenode is picked first.
# Have another JIRA to add the ZK client optimization. In addition to the ZK watch feature for long-lived clients, this could bring further benefits, such as having only the logical nameservice name in the Configuration. Namenodes, when they start up, would register under a ZNode, and clients would find out the actual URIs of the Active and Standby directly from ZK (like HBase clients). Short-lived clients would then first query ZK, find the active and standby NN URIs and cache them (rather than reading them from the Configuration), so that subsequent client invocations do not hit ZK.
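The cached-entry idea discussed above can be sketched as follows. This is only an illustration under stated assumptions: `LastActiveCache`, `order` and `remember` are hypothetical names, the cache file is assumed to hold a single `host:port` string, and a missing or stale entry simply falls back to the configured order (the comment above suggests a different fallback).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Hadoop, only a sketch of the idea.
final class LastActiveCache {

    /** Returns the configured NNs, with the cached entry (if still configured) moved to the front. */
    static List<String> order(List<String> configured, Path cacheFile) {
        List<String> ordered = new ArrayList<>(configured);
        try {
            String cached = new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
            // Only honor the cache if it names one of the configured NNs.
            if (ordered.remove(cached)) {
                ordered.add(0, cached);
            }
        } catch (IOException e) {
            // Missing or unreadable cache file: keep the configured order (assumed fallback).
        }
        return ordered;
    }

    /** Records the NN that last answered as active; the cache is best effort. */
    static void remember(String activeNn, Path cacheFile) {
        try {
            Files.write(cacheFile, activeNn.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Failing to write the cache must not fail the client.
        }
    }

    public static void main(String[] args) {
        // ~/.lastNN is the location proposed in the issue description.
        Path cache = Paths.get(System.getProperty("user.home"), ".lastNN");
        System.out.println(order(List.of("nn1host:8020", "nn2host:8020"), cache));
    }
}
```

Because the cache is only a hint, a stale entry costs one failed attempt before the client moves on to the next configured NN. It does not address the thundering-herd concern raised elsewhere in the thread for network-mounted home directories.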
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341399#comment-14341399 ]

Bikas Saha commented on HDFS-7858:

bq. The client will proceed to connect to that NN first (thereby removing non-determinism from the current scheme).. and will most probably succeed. It will contact ZK only if the connection was unsuccessful..

Yes. It will most probably succeed. But when will it not succeed? When that NN has failed over or has crashed, right? Which means that every time a known primary NN becomes unavailable, there will be a surge of failed connections to it (from cached entries that point to it), and then these connections will be redirected to ZK.

As a proxy for the number of connections, consider MR jobs, where every Map task running on every machine has a DFS client to read from HDFS and every Reduce task on every machine has a DFS client to write to HDFS. MR tasks are typically short-lived clients.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341375#comment-14341375 ]

Arun Suresh commented on HDFS-7858:

[~kihwal], [~bikassaha], I understand your concerns over the use of ZK. But consider the following:

# Most DFSClients are cached (since the Filesystem objects are generally cached). Thus long-lived clients will probably not have more than 1-2 persistent connections to ZK.
# Short-lived clients will first check if there is a cached entry (possibly in the home directory, something like ~/.lastNN) that contains the last accessed active NN. The client will proceed to connect to that NN first (thereby removing non-determinism from the current scheme) and will most probably succeed. It will contact ZK only if the connection was unsuccessful, and we can limit this to just a ping (not a watch registration) so the connection is not persistent.
# A client that has connected to a NN without the need for a ZK connection can continue to NOT talk to ZK until either:
## the client dies, in which case no ZK connection is ever made, or
## a configurable time has passed (maybe an hour), after which it is established that it is a long-lived client.

Do you think this might reduce the total number of connections to ZK at any point in time?
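The three rules above amount to a small decision procedure for when a client touches ZK at all. The sketch below restates them in Java; `ZkContactPolicy`, `Action` and `decide` are hypothetical names introduced for illustration, and no ZooKeeper API is involved.

```java
import java.time.Duration;

// Hypothetical helper: restates the proposed rules, not an implementation from the patch.
final class ZkContactPolicy {

    enum Action { NONE, ONE_SHOT_QUERY, REGISTER_WATCH }

    /**
     * @param cachedNnReachable whether the NN from the local cache answered
     * @param clientAge         how long this client has been alive
     * @param longLivedAfter    configurable threshold (the comment suggests about an hour)
     */
    static Action decide(boolean cachedNnReachable, Duration clientAge, Duration longLivedAfter) {
        if (!cachedNnReachable) {
            // Rule 2: only on failure, and only a non-persistent query rather than a watch.
            return Action.ONE_SHOT_QUERY;
        }
        if (clientAge.compareTo(longLivedAfter) >= 0) {
            // Rule 3: the client has proven to be long-lived, so a watch is worth its cost.
            return Action.REGISTER_WATCH;
        }
        // Short-lived client with a working cached NN: never talk to ZK.
        return Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, Duration.ofMinutes(5), Duration.ofHours(1)));
    }
}
```

Under this policy the number of persistent ZK connections is bounded by the number of clients that outlive the threshold, but a failed active NN still sends every affected short-lived client to ZK at once.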
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341336#comment-14341336 ]

Bikas Saha commented on HDFS-7858:

We need to be careful about how many clients can be supported by ZK (either pinging for info or watchers). ZK is typically a shared service with YARN/HBase etc.
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340846#comment-14340846 ]

Kihwal Lee commented on HDFS-7858:

ZK may not scale to support thousands of clients. I think it would be better to use a more aggressive timeout and a proper retry policy to get around such problems.
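To make the "aggressive timeout plus retry policy" alternative concrete, here is a minimal sketch: each attempt gets a short deadline, targets are tried in rotation, and the wait between attempts grows exponentially up to a cap. `FailoverRetry`, `call` and `backoffMillis` are hypothetical names, a plain `Function` stands in for the NN RPC, and the numbers in `main` are illustrative, not Hadoop defaults.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical helper: not part of Hadoop, only a sketch of the idea.
final class FailoverRetry {

    /** Delay after the given 0-based attempt: base * 2^attempt, capped at maxMillis. */
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis << Math.min(attempt, 20);
        return Math.min(delay, maxMillis);
    }

    /** Tries the targets in rotation, giving each attempt at most timeoutMillis. */
    static <T> T call(List<String> targets, Function<String, T> rpc, long timeoutMillis,
                      int maxAttempts, long baseMillis, long maxMillis) {
        ExecutorService pool = Executors.newCachedThreadPool();
        RuntimeException last = new IllegalStateException("no attempts made");
        try {
            for (int attempt = 0; attempt < maxAttempts; attempt++) {
                String target = targets.get(attempt % targets.size());
                Callable<T> task = () -> rpc.apply(target);
                Future<T> future = pool.submit(task);
                try {
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    // Too slow (for example, a standby in GC) or failed: give up on this target.
                    future.cancel(true);
                    last = new IllegalStateException("attempt " + attempt + " on " + target + " failed", e);
                }
                if (attempt + 1 < maxAttempts) {
                    Thread.sleep(backoffMillis(attempt, baseMillis, maxMillis));
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            last = new IllegalStateException("interrupted during failover", e);
        } finally {
            pool.shutdownNow();
        }
        throw last;
    }

    public static void main(String[] args) {
        Function<String, String> rpc = nn -> {
            if (nn.equals("nn1host")) {
                throw new IllegalStateException("standby");
            }
            return "ok from " + nn;
        };
        System.out.println(call(List.of("nn1host", "nn2host"), rpc, 500, 4, 100, 2000));
    }
}
```

This keeps ZK out of the client path entirely. How quickly a client reaches the active NN then depends on how short the per-attempt timeout can be made without misclassifying a healthy but busy active as unavailable.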