[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. IMPALA-5167: Reduce the number of Kudu clients created (FE) Creating Kudu clients is very expensive as each will fetch metadata from the Kudu master, so we should minimize the number of Kudu clients that get created. This patch stores a map from Kudu master addressed to Kudu clients in KuduUtil to be used across the FE and catalog. Another patch has already addressed the BE. Future work will consider providing a way to invalidate the stored Kudu clients in case something goes wrong (IMPALA-5685) This relies on two changes on the Kudu side: one that clears non-covered range entries from the client's cache on table open (d07ecd6ded01201c912d2e336611a6a941f48d98), and one that automatically refreshes auth tokens when they expire (603c1578c78c0377ffafdd9c427ebfd8a206bda3). This patch disables some tests that no longer work as they relied on Kudu metadata loading operations timing out, but since we're reusing clients the metadata is already loaded when the test is run. Testing: - Ran a stress test on a 10 node cluster: scan of a small Kudu table, 1000 concurrent queries, load on the Kudu master was reduced signficantly, from ~50% cpu to ~5%. (with the BE changes included) - Ran the Kudu e2e tests. - Manually ran a test with concurrent INSERTs and 'ALTER TABLE ADD PARTITION' (which is affected by the Kudu side change mentiond above) and verified correctness. Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Reviewed-on: http://gerrit.cloudera.org:8080/6898 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java M testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test M testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test 6 files changed, 55 insertions(+), 31 deletions(-) Approvals: Impala Public Jenkins: Verified Thomas Tauber-Marshall: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/907/ -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 5: GVO failure was unrelated. I filed: IMPALA-5692 -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/901/ -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/901/ -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 5: Code-Review+2 GVO failed due to IMPALA-5686. Rebased to include the fix for it. -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/897/ -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/897/ -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6898/4/testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test File testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test: Line 1: # TODO: enable this once we have a way to invalidate kudu clients (IMPALA-5685) A few notes here: - Moved to the top as the test file parser ignores 'headers'. Seemed like the cleanest way to disable it. - I renamed the table here as the test previously wasn't actually testing what its supposed to - it was just returning the cached failed table load from the first query. - The other tests in the file still work as the second and third tests are related to table creation situations, where the table's metadata won't have been loaded by the previous query. -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Hello Impala Public Jenkins, Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6898 to look at the new patch set (#4). Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. IMPALA-5167: Reduce the number of Kudu clients created (FE) Creating Kudu clients is very expensive as each will fetch metadata from the Kudu master, so we should minimize the number of Kudu clients that get created. This patch stores a map from Kudu master addressed to Kudu clients in KuduUtil to be used across the FE and catalog. Another patch has already addressed the BE. Future work will consider providing a way to invalidate the stored Kudu clients in case something goes wrong (IMPALA-5685) This relies on two changes on the Kudu side: one that clears non-covered range entries from the client's cache on table open (d07ecd6ded01201c912d2e336611a6a941f48d98), and one that automatically refreshes auth tokens when they expire (603c1578c78c0377ffafdd9c427ebfd8a206bda3). This patch disables some tests that no longer work as they relied on Kudu metadata loading operations timing out, but since we're reusing clients the metadata is already loaded when the test is run. Testing: - Ran a stress test on a 10 node cluster: scan of a small Kudu table, 1000 concurrent queries, load on the Kudu master was reduced signficantly, from ~50% cpu to ~5%. (with the BE changes included) - Ran the Kudu e2e tests. - Manually ran a test with concurrent INSERTs and 'ALTER TABLE ADD PARTITION' (which is affected by the Kudu side change mentiond above) and verified correctness. Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e --- M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java M testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test M testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test 6 files changed, 55 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/6898/4 -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 3: > So the GVO failed because there's a custom cluster test that runs > impala with a very short (1ms) timeout for Kudu operations and then > runs two queries, each of which it expects to timeout while > retrieving table metadata. > > Since we reuse clients now, the metadata ends up being fetched by > the first query (even though the operation times out, it apparently > still completes afterwards) and then the second query doesn't time > out since the metadata is already preset (both tests currently > touch the same table, but changing the second one to a different > table doesn't seem to fix the problem, presumably the Kudu client > loads more tables than explicitly requested). > > Some possible solutions: > - Only run one of the queries. This would cause some coverage to be > lost (we could replace it by adding a test that checks for timeout > behavior in the BE, which I don't think currently exists, and would > work since the BE and FE still use different clients) > - Run both of the tests in separate custom_cluster tests. > custom_cluster tests are expensive (this one runs for ~20s) so it > seems like a lot of test overhead for little benefit. > - Provide some way to invalidate the Kudu clients, eg. have > 'invalidate metadata' clear the list of clients in KuduUtil. We may > want to do something like this anyways since in the current version > of the patch once a KuduClient is created for a particular Kudu > master, that client will live on as long as the impalad does, and > if something goes wrong (eg. the automatic token refresh that the > Kudu team implemented doesn't work for some reason) you're > basically screwed and won't be able to do anything with that Kudu > master without restarting the impalad. Good catch, yes that seems like the test won't work after your change. Let's do this: 1) Change the tests (I think you'll have to update both kudu-timeouts-impalad and kudu-timeouts-catalogd) to only run the first for now. I agree it's not a huge amount of lost coverage. Just comment out the rest and leave a TODO to add them back, w/ a reference to the JIRA in #2. 2) File a JIRA to provide some way to invalidate the Kudu clients. It's a bit different from HMS metadata since it's not really per table, it's per Kudu server. I'm not sure yet if this is something that will be worthwhile to expose for general use (e.g. a well documented stmt like 'invalidate metadata'), or if we should try to do something that would be more covert, e.g. a web endpoint (hacky). -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 3: So the GVO failed because there's a custom cluster test that runs impala with a very short (1ms) timeout for Kudu operations and then runs two queries, each of which it expects to timeout while retrieving table metadata. Since we reuse clients now, the metadata ends up being fetched by the first query (even though the operation times out, it apparently still completes afterwards) and then the second query doesn't time out since the metadata is already preset (both tests currently touch the same table, but changing the second one to a different table doesn't seem to fix the problem, presumably the Kudu client loads more tables than explicitly requested). Some possible solutions: - Only run one of the queries. This would cause some coverage to be lost (we could replace it by adding a test that checks for timeout behavior in the BE, which I don't think currently exists, and would work since the BE and FE still use different clients) - Run both of the tests in separate custom_cluster tests. custom_cluster tests are expensive (this one runs for ~20s) so it seems like a lot of test overhead for little benefit. - Provide some way to invalidate the Kudu clients, eg. have 'invalidate metadata' clear the list of clients in KuduUtil. We may want to do something like this anyways since in the current version of the patch once a KuduClient is created for a particular Kudu master, that client will live on as long as the impalad does, and if something goes wrong (eg. the automatic token refresh that the Kudu team implemented doesn't work for some reason) you're basically screwed and won't be able to do anything with that Kudu master without restarting the impalad. -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 3: Verified-1 Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/872/ -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 3: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/872/ -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/6898/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java File fe/src/main/java/org/apache/impala/util/KuduUtil.java: PS2, Line 60: > is this needed? Done -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has uploaded a new patch set (#3). Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. IMPALA-5167: Reduce the number of Kudu clients created (FE) Creating Kudu clients is very expensive as each will fetch metadata from the Kudu master, so we should minimize the number of Kudu clients that get created. This patch stores a map from Kudu master addressed to Kudu clients in KuduUtil to be used across the FE and catalog. Another patch has already addressed the BE. This relies on two changes on the Kudu side: one that clears non-covered range entries from the client's cache on table open (d07ecd6ded01201c912d2e336611a6a941f48d98), and one that automatically refreshes auth tokens when they expire (603c1578c78c0377ffafdd9c427ebfd8a206bda3). Testing: - Ran a stress test on a 10 node cluster: scan of a small Kudu table, 1000 concurrent queries, load on the Kudu master was reduced signficantly, from ~50% cpu to ~5%. (with the BE changes included) - Ran the Kudu e2e tests. - Manually ran a test with concurrent INSERTs and 'ALTER TABLE ADD PARTITION' (which is affected by the Kudu side change mentiond above) and verified correctness. Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e --- M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java 4 files changed, 41 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/6898/3 -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6898/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java File fe/src/main/java/org/apache/impala/util/KuduUtil.java: PS2, Line 60: import com.google.common.collect.Sets; is this needed? -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6898/1/fe/src/main/java/org/apache/impala/util/KuduUtil.java File fe/src/main/java/org/apache/impala/util/KuduUtil.java: Line 84: public static KuduClient getKuduClient(String kuduMasters) { > how does this deal with invalidated clients? is there some way to detect th The change has now gone in on the Kudu side to automatically refresh invalidated clients (KUDU-2013). -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has uploaded a new patch set (#2). Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. IMPALA-5167: Reduce the number of Kudu clients created (FE) Creating Kudu clients is very expensive as each will fetch metadata from the Kudu master, so we should minimize the number of Kudu clients that get created. This patch stores a map from Kudu master addressed to Kudu clients in KuduUtil to be used across the FE and catalog. Another patch has already addressed the BE. This relies on two changes on the Kudu side: one that clears non-covered range entries from the client's cache on table open (d07ecd6ded01201c912d2e336611a6a941f48d98), and one that automatically refreshes auth tokens when they expire (603c1578c78c0377ffafdd9c427ebfd8a206bda3). Testing: - Ran a stress test on a 10 node cluster: scan of a small Kudu table, 1000 concurrent queries, load on the Kudu master was reduced signficantly, from ~50% cpu to ~5%. (with the BE changes included) - Ran the Kudu e2e tests. - Manually ran a test with concurrent INSERTs and 'ALTER TABLE ADD PARTITION' (which is affected by the Kudu side change mentiond above) and verified correctness. Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e --- M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java 4 files changed, 42 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/6898/2 -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6898/1/fe/src/main/java/org/apache/impala/util/KuduUtil.java File fe/src/main/java/org/apache/impala/util/KuduUtil.java: Line 84: return kuduClients_.get(kuduMasters); how does this deal with invalidated clients? is there some way to detect that programmatically (call some cheap function) and then re-create the cache entry? -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5167: Reduce the number of Kudu clients created (FE)
Thomas Tauber-Marshall has uploaded a new change for review. http://gerrit.cloudera.org:8080/6898 Change subject: IMPALA-5167: Reduce the number of Kudu clients created (FE) .. IMPALA-5167: Reduce the number of Kudu clients created (FE) Creating Kudu clients is very expensive as each will fetch metadata from the Kudu master, so we should minimize the number of Kudu clients that get created. This patch stores a map from Kudu master addressed to Kudu clients in KuduUtil to be used across the FE and catalog. Another patch will address the BE. This relies on a change on the Kudu side that clears non-covered range entries from the client's cache on table open (d07ecd6ded01201c912d2e336611a6a941f48d98). Testing: - Ran a stress test on a 10 node cluster: scan of a small Kudu table, 1000 concurrent queries, load on the Kudu master was reduced signficantly, from ~50% cpu to ~5%. (with the BE changes included) - Ran the Kudu e2e tests. - Manually ran a test with concurrent INSERTs and 'ALTER TABLE ADD PARTITION' (which is affected by the Kudu side change mentiond above) and verified correctness. Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e --- M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java 4 files changed, 41 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/6898/1 -- To view, visit http://gerrit.cloudera.org:8080/6898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall