[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17124 )

Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
..

[java] KUDU-3213: try at different server on TABLET_NOT_RUNNING

Prior to this patch, if a tablet server were quiescing for a prolonged period, scan requests could time out, complaining that the tablet server is quiescing, but without ever retrying the scan at another tablet server. This is because tablet servers will return TABLET_NOT_RUNNING to clients when attempting a scan while quiescing. The behavior in the C++ client is that the location is then blacklisted and the request is retried elsewhere. The behavior in the Java client, though, is that the same location is retried until failure.

This patch addresses this by treating TABLET_NOT_RUNNING errors in the Java client as we would for TABLET_NOT_FOUND, which is actually quite similar to the handling for TABLET_NOT_RUNNING in the C++ client: the location is invalidated for further attempts, and the request is retried elsewhere.

Why not just have quiescing tablet servers return TABLET_NOT_FOUND, then? TABLET_NOT_FOUND errors in the C++ client actually have some behavior not present in the Java client: a tablet whose location is invalidated with TABLET_NOT_FOUND in the C++ client will be required to be looked up again, requiring a round trip to the master. This behavior doesn't exist in the Java client, so I thought it easiest to piggyback on TABLET_NOT_FOUND handling for now.

Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Reviewed-on: http://gerrit.cloudera.org:8080/17124 Tested-by: Kudu Jenkins Reviewed-by: Alexey Serbin --- M java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java M java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java 3 files changed, 59 insertions(+), 4 deletions(-) Approvals: Kudu Jenkins: Verified Alexey Serbin: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120)
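The retry behavior described in the commit message above can be illustrated with a small, self-contained Java sketch. This is not the code from RpcProxy.java: the class `TabletRetrySketch`, the `ErrorCode` enum, and the `scan` and `shouldInvalidateLocation` helpers are hypothetical names invented for illustration, and the real client's location cache and RPC plumbing are not modelled. It only shows the idea of treating TABLET_NOT_RUNNING like TABLET_NOT_FOUND: drop the failing location and retry at another replica.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class TabletRetrySketch {
  /** Hypothetical subset of tablet server error codes, for illustration only. */
  enum ErrorCode { OK, TABLET_NOT_FOUND, TABLET_NOT_RUNNING, OTHER }

  /** Returns true when the client should stop using this location and try another. */
  static boolean shouldInvalidateLocation(ErrorCode code) {
    // Assumption based on the commit message: both codes are handled the same way.
    return code == ErrorCode.TABLET_NOT_FOUND || code == ErrorCode.TABLET_NOT_RUNNING;
  }

  /**
   * Tries each known replica location in order. Returns the name of the server
   * that served the scan, or null if no location could serve it.
   */
  static String scan(List<String> locations, Function<String, ErrorCode> sendScan) {
    List<String> candidates = new ArrayList<>(locations);
    while (!candidates.isEmpty()) {
      String server = candidates.get(0);
      ErrorCode code = sendScan.apply(server);
      if (code == ErrorCode.OK) {
        return server;
      }
      if (shouldInvalidateLocation(code)) {
        // Invalidate this location for further attempts and retry elsewhere.
        candidates.remove(0);
        continue;
      }
      // Other errors are outside the scope of this sketch.
      return null;
    }
    return null;
  }

  public static void main(String[] args) {
    List<String> locations = Arrays.asList("ts-1", "ts-2", "ts-3");
    // Pretend that ts-1 is quiescing and answers every scan with TABLET_NOT_RUNNING.
    String served = scan(locations,
        s -> s.equals("ts-1") ? ErrorCode.TABLET_NOT_RUNNING : ErrorCode.OK);
    System.out.println("scan served by " + served);
  }
}
```

Without the TABLET_NOT_RUNNING check in `shouldInvalidateLocation`, the sketch would give up at the quiescing server instead of moving on, which loosely mirrors the pre-patch Java client retrying the same location until the request timed out.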
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17124 ) Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. Patch Set 5: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/17124/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java: http://gerrit.cloudera.org:8080/#/c/17124/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@120 PS4, Line 120: rver.waitFor()); > We don't, and we might not. But without proper handling of quiescing server I see. Thank you for the explanation. I think it's good enough if it fails in some non-negligible amount of runs. SGTM. -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Sat, 06 Mar 2021 07:42:51 + Gerrit-HasComments: Yes
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17124 ) Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/17124/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java: http://gerrit.cloudera.org:8080/#/c/17124/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@96 PS4, Line 96: set some partitioning though). > nit: would it make sense to use hash partitioning instead? Otherwise, how The partitioning isn't important here other than the fact that Kudu complains if there's none set. Added a comment. http://gerrit.cloudera.org:8080/#/c/17124/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@120 PS4, Line 120: rver.waitFor()); > nit: how do we know it's so, indeed? Could it happen that the scanner alwa We don't, and we might not. But without proper handling of quiescing servers, this test at least fails a non-negligible amount of the time. -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 02 Mar 2021 19:58:35 + Gerrit-HasComments: Yes
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Grant Henke, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17124 to look at the new patch set (#5). Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING Prior to this patch, if a tablet server were quiescing for a prolonged period, scan requests could time out, complaining that the tablet server is quiescing, but without ever retrying the scan at another tablet server. This is because tablet servers will return TABLET_NOT_RUNNING to clients when attempting a scan while quiescing. The behavior in the C++ client is that the location is then blacklisted and the request is retried elsewhere. The behavior in the Java client, though, is that the same location is retried until failure. This patch addresses this by treating TABLET_NOT_RUNNING errors in the Java client as we would for TABLET_NOT_FOUND, which is actually quite similar to the handling for TABLET_NOT_RUNNING in the C++ client: the location is invalidated for further attempts, and the request is retried elsewhere. Why not just have quiescing tablet servers return TABLET_NOT_FOUND, then? TABLET_NOT_FOUND errors in the C++ client actually have some behavior not present in the Java client: a tablet whose location is invalidated with TABLET_NOT_FOUND in the C++ client will be required to be looked up again, requiring a round trip to the master. This behavior doesn't exist in the Java client, so I thought it easiest to piggyback on TABLET_NOT_FOUND handling for now. 
Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 --- M java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java M java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java 3 files changed, 59 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/24/17124/5 -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17124 ) Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. Patch Set 4: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/17124/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java: http://gerrit.cloudera.org:8080/#/c/17124/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@96 PS4, Line 96: setRangePartitionColumns(Collections.singletonList("key")) nit: would it make sense to use hash partitioning instead? Otherwise, how do we know that the quiesced tablet server hosts the replica that contains the necessary data? If it's so even with a range-partitioned table, it would be great if you could add a small comment explaining why it's so. Thanks! http://gerrit.cloudera.org:8080/#/c/17124/4/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@120 PS4, Line 120: if the scan goes to the quiescing server nit: how do we know it's so, indeed? Could it happen that the scanner always hits only non-quiesced servers? -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 02 Mar 2021 02:52:22 + Gerrit-HasComments: Yes
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17124 ) Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 02 Mar 2021 00:03:12 + Gerrit-HasComments: No
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17124 ) Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/17124/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java: http://gerrit.cloudera.org:8080/#/c/17124/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@118 PS3, Line 118: assertEquals(0, quiesceTserver.waitFor()); : > why do we need to call waitFor() twice here? Done -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 26 Feb 2021 22:10:42 + Gerrit-HasComments: Yes
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Hello Attila Bukor, Kudu Jenkins, Grant Henke, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17124 to look at the new patch set (#4). Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING Prior to this patch, if a tablet server were quiescing for a prolonged period, scan requests could time out, complaining that the tablet server is quiescing, but without ever retrying the scan at another tablet server. This is because tablet servers will return TABLET_NOT_RUNNING to clients when attempting a scan while quiescing. The behavior in the C++ client is that the location is then blacklisted and the request is retried elsewhere. The behavior in the Java client, though, is that the same location is retried until failure. This patch addresses this by treating TABLET_NOT_RUNNING errors in the Java client as we would for TABLET_NOT_FOUND, which is actually quite similar to the handling for TABLET_NOT_RUNNING in the C++ client: the location is invalidated for further attempts, and the request is retried elsewhere. Why not just have quiescing tablet servers return TABLET_NOT_FOUND, then? TABLET_NOT_FOUND errors in the C++ client actually have some behavior not present in the Java client: a tablet whose location is invalidated with TABLET_NOT_FOUND in the C++ client will be required to be looked up again, requiring a round trip to the master. This behavior doesn't exist in the Java client, so I thought it easiest to piggyback on TABLET_NOT_FOUND handling for now. 
Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 --- M java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java M java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java 3 files changed, 56 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/24/17124/4 -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17124 ) Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/17124/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java: http://gerrit.cloudera.org:8080/#/c/17124/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@118 PS3, Line 118: quiesceTserver.waitFor(); : assertEquals(0, quiesceTserver.waitFor() why do we need to call waitFor() twice here? -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 26 Feb 2021 20:25:15 + Gerrit-HasComments: Yes
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17124 ) Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. Patch Set 3: Verified+1 Unrelated failure of HmsConfigurations/MasterFailoverTest.TestMasterUUIDResolution/1 -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 26 Feb 2021 03:22:25 + Gerrit-HasComments: No
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Andrew Wong has removed a vote on this change. Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Hello Attila Bukor, Kudu Jenkins, Grant Henke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17124 to look at the new patch set (#3). Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING Prior to this patch, if a tablet server were quiescing for a prolonged period, scan requests could time out, complaining that the tablet server is quiescing, but without ever retrying the scan at another tablet server. This is because tablet servers will return TABLET_NOT_RUNNING to clients when attempting a scan while quiescing. The behavior in the C++ client is that the location is then blacklisted and the request is retried elsewhere. The behavior in the Java client, though, is that the same location is retried until failure. This patch addresses this by treating TABLET_NOT_RUNNING errors in the Java client as we would for TABLET_NOT_FOUND, which is actually quite similar to the handling for TABLET_NOT_RUNNING in the C++ client: the location is invalidated for further attempts, and the request is retried elsewhere. Why not just have quiescing tablet servers return TABLET_NOT_FOUND, then? TABLET_NOT_FOUND errors in the C++ client actually have some behavior not present in the Java client: a tablet whose location is invalidated with TABLET_NOT_FOUND in the C++ client will be required to be looked up again, requiring a round trip to the master. This behavior doesn't exist in the Java client, so I thought it easiest to piggyback on TABLET_NOT_FOUND handling for now. 
Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 --- M java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java M java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java 3 files changed, 57 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/24/17124/3 -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Hello Attila Bukor, Kudu Jenkins, Grant Henke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17124 to look at the new patch set (#2). Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING Prior to this patch, if a tablet server were quiescing for a prolonged period, scan requests could time out, complaining that the tablet server is quiescing, but without ever retrying the scan at another tablet server. This is because tablet servers will return TABLET_NOT_RUNNING to clients when attempting a scan while quiescing. The behavior in the C++ client is that the location is then blacklisted and the request is retried elsewhere. The behavior in the Java client, though, is that the same location is retried until failure. This patch addresses this by treating TABLET_NOT_RUNNING errors in the Java client as we would for TABLET_NOT_FOUND, which is actually quite similar to the handling for TABLET_NOT_RUNNING in the C++ client: the location is invalidated for further attempts, and the request is retried elsewhere. Why not just have quiescing tablet servers return TABLET_NOT_FOUND, then? TABLET_NOT_FOUND errors in the C++ client actually have some behavior not present in the Java client: a tablet whose location is invalidated with TABLET_NOT_FOUND in the C++ client will be required to be looked up again, requiring a round trip to the master. This behavior doesn't exist in the Java client, so I thought it easiest to piggyback on TABLET_NOT_FOUND handling for now. 
Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 --- M java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java M java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java 3 files changed, 57 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/24/17124/2 -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17124 ) Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING .. [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING Prior to this patch, if a tablet server were quiescing for a prolonged period, scan requests could time out, complaining that the tablet server is quiescing, but without ever retrying the scan at another tablet server. This is because tablet servers will return TABLET_NOT_RUNNING to clients when attempting a scan while quiescing. The behavior in the C++ client is that the location is then blacklisted and the request is retried elsewhere. The behavior in the Java client, though, is that the same location is retried until failure. This patch addresses this by treating TABLET_NOT_RUNNING errors in the Java client as we would for TABLET_NOT_FOUND, which is actually quite similar to the handling for TABLET_NOT_RUNNING in the C++ client: the location is invalidated for further attempts, and the request is retried elsewhere. Why not just have quiescing tablet servers return TABLET_NOT_FOUND, then? TABLET_NOT_FOUND errors in the C++ client actually have some behavior not present in the Java client: a tablet whose location is invalidated with TABLET_NOT_FOUND in the C++ client will be required to be looked up again, requiring a round trip to the master. This behavior doesn't exist in the Java client, so I thought it easiest to piggyback on TABLET_NOT_FOUND handling for now. 
Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 --- M java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java M java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java 3 files changed, 57 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/24/17124/1 -- To view, visit http://gerrit.cloudera.org:8080/17124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1 Gerrit-Change-Number: 17124 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong