[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. IMPALA-6671: Change wait for sync ddl timeout When skip locked tables from topic updates is enabled (topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread waiting for a topic update during a sync ddl execution to terminate earlier. This is because the waitForSyncDdlVersion currently statically counts the number of instances of topic updates against a maximum and bails out when the maximum is reached. With topic update thread skipping locked tables, this number of instances of topic updates is more likely to hit the maximum attempt limit. This commit changes the logic so that when locked tables are skipped from topic updates, the sync ddl operation waits until a configurable timeout. This timeout value is specified in seconds using the configuration max_wait_time_for_sync_ddl_s. The default value of this configuration is 0 which means it waits indefinitely. This makes sure that only the sync ddl queries on a table which has been locked wait for extended durations while the other unrelated sync ddl queries can finish appropriately. Also this commit changes the default values of following flags which were introduced in the earlier patch for IMPALA-6671. The current default value of topic_update_tbl_max_wait_time_ms of 500ms is too low and may skip the locked tables more aggressively than needed. The new defaults were set based on analysis of a real world deployment. topic_update_tbl_max_wait_time_ms = 12 catalog_max_lock_skipped_topic_updates = 3 Testing: 1. Ran existing test test_topic_update_frequency. 2. Added a new test for the newly added flag. Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Reviewed-on: http://gerrit.cloudera.org:8080/17253 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M tests/custom_cluster/test_topic_update_frequency.py 6 files changed, 86 insertions(+), 5 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 05:46:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8510/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 00:24:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7048/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 00:04:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 00:04:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 6: Code-Review+2 Carrying the +2 from Quanlong from earlier. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 00:04:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Vihang Karajgaonkar has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. IMPALA-6671: Change wait for sync ddl timeout When skip locked tables from topic updates is enabled (topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread waiting for a topic update during a sync ddl execution to terminate earlier. This is because the waitForSyncDdlVersion currently statically counts the number of instances of topic updates against a maximum and bails out when the maximum is reached. With topic update thread skipping locked tables, this number of instances of topic updates is more likely to hit the maximum attempt limit. This commit changes the logic so that when locked tables are skipped from topic updates, the sync ddl operation waits until a configurable timeout. This timeout value is specified in seconds using the configuration max_wait_time_for_sync_ddl_s. The default value of this configuration is 0 which means it waits indefinitely. This makes sure that only the sync ddl queries on a table which has been locked wait for extended durations while the other unrelated sync ddl queries can finish appropriately. Also this commit changes the default values of following flags which were introduced in the earlier patch for IMPALA-6671. The current default value of topic_update_tbl_max_wait_time_ms of 500ms is too low and may skip the locked tables more aggressively than needed. The new defaults were set based on analysis of a real world deployment. topic_update_tbl_max_wait_time_ms = 12 catalog_max_lock_skipped_topic_updates = 3 Testing: 1. Ran existing test test_topic_update_frequency. 2. Added a new test for the newly added flag. Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M tests/custom_cluster/test_topic_update_frequency.py 6 files changed, 86 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/6 -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 5: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094 PS4, Line 3094: long maxNumAttempts = 5; > "Now when skipping locked table is enabled, more rounds of topic updates co Ah, I see. Just realized the numSkippedUpdatesLockContention_ of a table entry only increases when the table's LastVersionSeenByTopicUpdate changes. Then we really can't estimate how many times the topic update thread would skip a locked table version. Thanks for the explanation! http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3160 PS4, Line 3160: It checks > nit: remove this? remove this? http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3168 PS4, Line 3168: timeoutSecs == 0 > I think we should return false for negative values as well. Some users may set this to -1 thinking -1 means disabling it. So I think negative values should be taken care as well. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 23:52:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094 PS4, Line 3094: long maxNumAttempts = 5; > Ah, I see. After thinking this more clear, I think my idea should be changi "Now when skipping locked table is enabled, more rounds of topic updates could be sent in the time range between t1-a and t1-b. So 'maxNumAttempts' is reached quickly if it's still 5" Yes, that is exactly the problem that this patch is trying to address. However, in my opinion the number of topic updates which will skip the locked table is non-deterministic and depends on how long the table is locked. The maxSkippedUpdatesLockContention_ doesn't simply count the number of topic updates which skip the table. E.g. if between t1-a and t1-b topic update thread started delta computation 10 times and skipped the table 10 times because ddl2 is holding the lock, it would still be counted as 1 skippedUpdate because it is the same lock operation. If you read the description of catalog_max_lock_skipped_topic_updates it says "This limit only applies to distinct lock operations which block the topic update thread". I think I should update the comment with a more detailed comment. Hope this makes it clearer. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 23:18:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094 PS4, Line 3094: long maxNumAttempts = 5; > Unfortunately, I don't think this approach would work. maxSkippedUpdatesLoc Ah, I see. After thinking this more clear, I think my idea should be changing 'maxNumAttempts' from 5 to 5 * (maxSkippedUpdatesLockContention_ + 1). Please verify my understanding. Let's say there are 6 concurrent DDLs (ddl1, ddl2, ..., ddl6) on table 'foo' and both with sync_ddl=true. And let's say they finishes in this order: t0: ddl1 done, go into waitForSyncDdlVersion() t1-a: catalog topic update thread start collecting updates t1-b: ddl2 done, bump the table version, go into waitForSyncDdlVersion() t1-c: catalog topic update thread finish collecting updates and skip the table since its version is out of scope t2-a: catalog topic update thread start t2-b: ddl3 done, bump the table version, go into waitForSyncDdlVersion() t2-c: catalog topic update thread finish collecting updates and skip the table since its version is out of scope ... t5-a: catalog topic update thread start t5-b: ddl6 done, bump the table version, go into waitForSyncDdlVersion() t5-c: catalog topic update thread finish collecting updates and skip the table since its version is out of scope t6: waitForSyncDdlVersion() for ddl1 returns since 'maxNumAttempts' is reached. This is the expected behavior before IMPALA-6671. Now when skipping locked table is enabled, more rounds of topic updates could be sent in the time range between t1-a and t1-b. So 'maxNumAttempts' is reached quickly if it's still 5. There are at most maxSkippedUpdatesLockContention_ updates can be sent between t1-a and t1-b, the same for t2, t3, ..., t5. So I'm thinking setting maxNumAttempts to 5 * (maxSkippedUpdatesLockContention_ + 1) can achieve the pre-IMPALA-6671 purpose. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 06 Apr 2021 02:24:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8501/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 05 Apr 2021 21:19:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Vihang Karajgaonkar has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. IMPALA-6671: Change wait for sync ddl timeout When skip locked tables from topic updates is enabled (topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread waiting for a topic update during a sync ddl execution to terminate earlier. This is because the waitForSyncDdlVersion currently statically counts the number of instances of topic updates against a maximum and bails out when the maximum is reached. With topic update thread skipping locked tables, this number of instances of topic updates is more likely to hit the maximum attempt limit. This commit changes the logic so that when locked tables are skipped from topic updates, the sync ddl operation waits until a configurable timeout. This timeout value is specified in seconds using the configuration max_wait_time_for_sync_ddl_s. The default value of this configuration is 0 which means it waits indefinitely. This makes sure that only the sync ddl queries on a table which has been locked wait for extended durations while the other unrelated sync ddl queries can finish appropriately. Also this commit changes the default values of following flags which were introduced in the earlier patch for IMPALA-6671. The current default value of topic_update_tbl_max_wait_time_ms of 500ms is too low and may skip the locked tables more aggressively than needed. The new defaults were set based on analysis of a real world deployment. topic_update_tbl_max_wait_time_ms = 12 catalog_max_lock_skipped_topic_updates = 3 Testing: 1. Ran existing test test_topic_update_frequency. 2. Added a new test for the newly added flag. Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M tests/custom_cluster/test_topic_update_frequency.py 6 files changed, 88 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/5 -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 5 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@92 PS4, Line 92: DEFINE_int32(max_wait_time_for_sync_ddl_s, 0, "Maximum time (in seconds) until " > Do you have a recommendation on what this should be when topic_update_tbl_m Do you mean what should the recommendation for the least value of it? It is hard to predict. Ideally, the minimum time required for this would be the 2*(time required for one topic update delta calculation). The default value of 0 is good for most cases when topic_update_tbl_max_wait_time_ms is enabled, unless there are any issues and user has to override it manually. http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@94 PS4, Line 94: will > nit: will wait? Done http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094 PS4, Line 3094: long maxNumAttempts = 5; > I think another solution is increasing maxNumAttempts with maxSkippedUpdate Unfortunately, I don't think this approach would work. maxSkippedUpdatesLockContention_ is not just simply counts the number of topic updates due to the lock, but number of distinct lock operations which are skipped by topic update thread. Otherwise, it would be not work since we would just be delaying the onset of the problem. Hence we cannot rely on maxSkippedUpdatesLockContention_ since the topic updates skipped will almost always be much higher than maxSkippedUpdatesLockContention_. Hope that makes sense. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 05 Apr 2021 20:59:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 02 Apr 2021 05:37:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 4: (5 comments) Thanks for this work! The patch makes sense to me. http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@92 PS4, Line 92: DEFINE_int32(max_wait_time_for_sync_ddl_s, 0, "Maximum time (in seconds) until " Do you have a recommendation on what this should be when topic_update_tbl_max_wait_time_ms is enabled? I think it should be larger than the topic update frequency since we at least will wait for that time. http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@94 PS4, Line 94: will nit: will wait? http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094 PS4, Line 3094: long maxNumAttempts = 5; I think another solution is increasing maxNumAttempts with maxSkippedUpdatesLockContention_, then maybe we don't need introducing a new timeout flag which may be hard to tune. The rationale is that we will at most skip the table maxSkippedUpdatesLockContention_ times comparing to the legacy behavior. What do you think? http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3160 PS4, Line 3160: It checks nit: remove this? http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3168 PS4, Line 3168: timeoutSecs == 0 I think we should return false for negative values as well. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 02 Apr 2021 02:14:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8497/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 02 Apr 2021 00:11:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8496/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 02 Apr 2021 00:03:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7042/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 01 Apr 2021 23:51:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Vihang Karajgaonkar has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. IMPALA-6671: Change wait for sync ddl timeout When skip locked tables from topic updates is enabled (topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread waiting for a topic update during a sync ddl execution to terminate earlier. This is because the waitForSyncDdlVersion currently statically counts the number of instances of topic updates against a maximum and bails out when the maximum is reached. With topic update thread skipping locked tables, this number of instances of topic updates is more likely to hit the maximum attempt limit. This commit changes the logic so that when locked tables are skipped from topic updates, the sync ddl operation waits until a configurable timeout. This timeout value is specified in seconds using the configuration max_wait_time_for_sync_ddl_s. The default value of this configuration is 0 which means it waits indefinitely. This makes sure that only the sync ddl queries on a table which has been locked wait for extended durations while the other unrelated sync ddl queries can finish appropriately. Also this commit changes the default values of following flags which were introduced in the earlier patch for IMPALA-6671. The current default value of topic_update_tbl_max_wait_time_ms of 500ms is too low and may skip the locked tables more aggressively than needed. The new defaults were set based on analysis of a real world deployment. topic_update_tbl_max_wait_time_ms = 12 catalog_max_lock_skipped_topic_updates = 3 Testing: 1. Ran existing test test_topic_update_frequency. 2. Added a new test for the newly added flag. Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M tests/custom_cluster/test_topic_update_frequency.py 6 files changed, 88 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/4 -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout
Vihang Karajgaonkar has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17253 ) Change subject: IMPALA-6671: Change wait for sync ddl timeout .. IMPALA-6671: Change wait for sync ddl timeout When skip locked tables from topic updates is enabled (topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread waiting for a topic update during a sync ddl execution to terminate earlier. This is because the waitForSyncDdlVersion currently statically counts the number of instances of topic updates against a maximum and bails out when the maximum is reached. With topic update thread skipping locked tables, this number of instances of topic updates is more likely to hit the maximum attempt limit. This commit changes the logic so that when locked tables are skipped from topic updates, the sync ddl operation waits until a configurable timeout. This timeout value is specified in seconds using the configuration max_wait_time_for_sync_ddl_s. The default value of this configuration is 0 which means it waits indefinitely. This makes sure that only the sync ddl queries on a table which has been locked wait for extended durations while the other unrelated sync ddl queries can finish appropriately. Also this commit changes the default values of following flags which were introduced in the earlier patch for IMPALA-6671. The current default value of topic_update_tbl_max_wait_time_ms of 500ms is too low and may skip the locked tables more aggressively than needed. The new defaults were set based on analysis of a real world deployment. topic_update_tbl_max_wait_time_ms = 12 catalog_max_lock_skipped_topic_updates = 3 Testing: 1. Ran existing test test_topic_update_frequency. 2. Added a new test for the newly added flag. Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M tests/custom_cluster/test_topic_update_frequency.py 6 files changed, 88 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/3 -- To view, visit http://gerrit.cloudera.org:8080/17253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528 Gerrit-Change-Number: 17253 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang