Todd Lipcon has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9615 )
Change subject: KUDU-2343. java: properly reconnect to new leader master after failover
......................................................................

KUDU-2343. java: properly reconnect to new leader master after failover

This fixes the "fake" location information returned in response to a
ConnectToMaster RPC to include a distinct "fake UUID" for each master.
Previously, we were using an empty string for the UUID of the masters.
This caused collisions in the ConnectionCache, which is keyed by server
UUIDs.

The fake UUID added by this patch matches the fake UUID already in use
by AsyncKuduClient.newMasterRpcProxy. This should allow us to share the
RPC connection between the ConnectToMaster RPCs and the subsequent
GetTableLocation RPCs, which is also a benefit for latency after a
failover or on a fresh client. Additionally, this will help with
various log messages that previously would print an empty UUID string.

A prior version of this patch solved the problem by changing the key
for the ConnectionCache to be based on IP address, which has other
benefits in terms of future support for servers changing their DNS
resolution at runtime. However, since this patch is intended for
backport into prior releases, this simpler approach is taken for now.
A TODO is added for the longer-term idea.

An existing test which tested killing a master now runs in a second
mode which restarts the master. This reproduced the bug prior to the
fix.

This patch also cleans up that test somewhat - it was doing some buggy
logic to attempt to kill more than one tablet server, but in fact just
called "killTabletServer" three times on the same one. Killing three
tablet servers never made sense, either, since the table in the test
only had three replicas. Neither did it make sense to start six tablet
servers for the test.
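For reviewers who want to see the collision in isolation, here is a minimal sketch of a cache keyed by server UUID. It is a toy model, not the actual ConnectionCache: the class name, the connect() helper, the "conn->" strings, and the "master-" + host:port UUID format are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class UuidKeyedCacheSketch {
  // Hypothetical fake-UUID format; the real one is whatever
  // AsyncKuduClient.newMasterRpcProxy uses.
  static String fakeMasterUuid(String hostPort) {
    return "master-" + hostPort;
  }

  // Toy cache lookup: reuse the connection cached under this UUID,
  // or create one to hostPort if none is cached yet.
  static String connect(Map<String, String> cache, String uuid, String hostPort) {
    return cache.computeIfAbsent(uuid, k -> "conn->" + hostPort);
  }

  public static void main(String[] args) {
    // Before the fix (modelled): every master has the empty UUID, so the
    // second lookup returns the connection made for the first master.
    Map<String, String> broken = new HashMap<>();
    connect(broken, "", "m1:7051");
    System.out.println(connect(broken, "", "m2:7051"));

    // With a distinct fake UUID per master, each master gets its own entry.
    Map<String, String> fixed = new HashMap<>();
    connect(fixed, fakeMasterUuid("m1:7051"), "m1:7051");
    System.out.println(connect(fixed, fakeMasterUuid("m2:7051"), "m2:7051"));
  }
}
```

In this model the first println shows a connection to m1 even though m2 was requested, which is the kind of stale reuse that would keep a client from reaching a new leader master. The second println shows a connection to m2.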

Change-Id: I36f96c6712800e398ed46887d97d4b09fd993b04
Reviewed-on: http://gerrit.cloudera.org:8080/9612
Reviewed-by: Alexey Serbin <aser...@cloudera.com>
Tested-by: Todd Lipcon <t...@apache.org>
(cherry picked from commit 487a21476ba3551b3a7ec98cf96f772a495f31fb)
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/ConnectToClusterResponse.java
M java/kudu-client/src/main/java/org/apache/kudu/client/ConnectionCache.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestConnectionCache.java
6 files changed, 65 insertions(+), 19 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9615/1
--
To view, visit http://gerrit.cloudera.org:8080/9615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.7.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I36f96c6712800e398ed46887d97d4b09fd993b04
Gerrit-Change-Number: 9615
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <t...@apache.org>