[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..

IMPALA-6671: Change wait for sync ddl timeout

When skip locked tables from topic updates is enabled
(topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread
waiting for a topic update during a sync ddl execution to terminate
earlier. This is because the waitForSyncDdlVersion currently statically
counts the number of instances of topic updates against a maximum
and bails out when the maximum is reached. With topic update thread
skipping locked tables, this number of instances of topic updates
is more likely to hit the maximum attempt limit.

This commit changes the logic so that when locked tables are skipped from
topic updates, the sync ddl operation waits until a configurable
timeout. This timeout value is specified in seconds using the configuration
max_wait_time_for_sync_ddl_s. The default value of this configuration is 0
which means it waits indefinitely. This makes sure that only the sync ddl
queries on a table which has been locked wait for extended durations while
the other unrelated sync ddl queries can finish appropriately.

Also this commit changes the default values of following flags
which were introduced in the earlier patch for IMPALA-6671.

The current default value of topic_update_tbl_max_wait_time_ms
of 500ms is too low and may skip the locked tables more
aggressively than needed. The new defaults were set based on
analysis of a real world deployment.

topic_update_tbl_max_wait_time_ms = 12
catalog_max_lock_skipped_topic_updates = 3

Testing:
1. Ran existing test test_topic_update_frequency.
2. Added a new test for the newly added flag.

Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Reviewed-on: http://gerrit.cloudera.org:8080/17253
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M tests/custom_cluster/test_topic_update_frequency.py
6 files changed, 86 insertions(+), 5 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 07 Apr 2021 05:46:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8510/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 07 Apr 2021 00:24:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7048/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 07 Apr 2021 00:04:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 07 Apr 2021 00:04:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 6: Code-Review+2

Carrying the +2 from Quanlong from earlier.


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 07 Apr 2021 00:04:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..

IMPALA-6671: Change wait for sync ddl timeout

When skip locked tables from topic updates is enabled
(topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread
waiting for a topic update during a sync ddl execution to terminate
earlier. This is because the waitForSyncDdlVersion currently statically
counts the number of instances of topic updates against a maximum
and bails out when the maximum is reached. With topic update thread
skipping locked tables, this number of instances of topic updates
is more likely to hit the maximum attempt limit.

This commit changes the logic so that when locked tables are skipped from
topic updates, the sync ddl operation waits until a configurable
timeout. This timeout value is specified in seconds using the configuration
max_wait_time_for_sync_ddl_s. The default value of this configuration is 0
which means it waits indefinitely. This makes sure that only the sync ddl
queries on a table which has been locked wait for extended durations while
the other unrelated sync ddl queries can finish appropriately.

Also this commit changes the default values of following flags
which were introduced in the earlier patch for IMPALA-6671.

The current default value of topic_update_tbl_max_wait_time_ms
of 500ms is too low and may skip the locked tables more
aggressively than needed. The new defaults were set based on
analysis of a real world deployment.

topic_update_tbl_max_wait_time_ms = 12
catalog_max_lock_skipped_topic_updates = 3

Testing:
1. Ran existing test test_topic_update_frequency.
2. Added a new test for the newly added flag.

Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M tests/custom_cluster/test_topic_update_frequency.py
6 files changed, 86 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/6
--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 5: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094
PS4, Line 3094: long maxNumAttempts = 5;
> "Now when skipping locked table is enabled, more rounds of topic updates co
Ah, I see. Just realized the numSkippedUpdatesLockContention_ of a table entry 
only increases when the table's LastVersionSeenByTopicUpdate changes. Then we 
really can't estimate how many times the topic update thread would skip a 
locked table version.

Thanks for the explanation!


http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3160
PS4, Line 3160: It checks
> nit: remove this?
remove this?


http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3168
PS4, Line 3168: timeoutSecs == 0
> I think we should return false for negative values as well.
Some users may set this to -1 thinking -1 means disabling it. So I think 
negative values should be taken care as well.



--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 06 Apr 2021 23:52:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-06 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094
PS4, Line 3094: long maxNumAttempts = 5;
> Ah, I see. After thinking this more clear, I think my idea should be changi
"Now when skipping locked table is enabled, more rounds of topic updates could 
be sent in the time range between t1-a and t1-b. So 'maxNumAttempts' is reached 
quickly if it's still 5"

Yes, that is exactly the problem that this patch is trying to address. However, 
in my opinion the number of topic updates which will skip the locked table is 
non-deterministic and depends on how long the table is locked. The 
maxSkippedUpdatesLockContention_ doesn't simply count the number of topic 
updates which skip the table. E.g. if between t1-a and t1-b topic update thread 
started delta computation 10 times and skipped the table 10 times because ddl2 
is holding the lock, it would still be counted as 1 skippedUpdate because it is 
the same lock operation. If you read the description of 
catalog_max_lock_skipped_topic_updates it says "This limit only applies to 
distinct lock operations which block the topic update thread". I think I should 
update the comment with a more detailed comment. Hope this makes it clearer.



--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 06 Apr 2021 23:18:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094
PS4, Line 3094: long maxNumAttempts = 5;
> Unfortunately, I don't think this approach would work. maxSkippedUpdatesLoc
Ah, I see. After thinking this more clear, I think my idea should be changing 
'maxNumAttempts' from 5 to 5 * (maxSkippedUpdatesLockContention_ + 1). Please 
verify my understanding. Let's say there are 6 concurrent DDLs (ddl1, ddl2, 
..., ddl6) on table 'foo' and both with sync_ddl=true. And let's say they 
finishes in this order:

t0: ddl1 done, go into waitForSyncDdlVersion()
t1-a: catalog topic update thread start collecting updates
t1-b: ddl2 done, bump the table version, go into waitForSyncDdlVersion()
t1-c: catalog topic update thread finish collecting updates and skip the table 
since its version is out of scope
t2-a: catalog topic update thread start
t2-b: ddl3 done, bump the table version, go into waitForSyncDdlVersion()
t2-c: catalog topic update thread finish collecting updates and skip the table 
since its version is out of scope
...
t5-a: catalog topic update thread start
t5-b: ddl6 done, bump the table version, go into waitForSyncDdlVersion()
t5-c: catalog topic update thread finish collecting updates and skip the table 
since its version is out of scope
t6: waitForSyncDdlVersion() for ddl1 returns since 'maxNumAttempts' is reached.

This is the expected behavior before IMPALA-6671. Now when skipping locked 
table is enabled, more rounds of topic updates could be sent in the time range 
between t1-a and t1-b. So 'maxNumAttempts' is reached quickly if it's still 5. 
There are at most maxSkippedUpdatesLockContention_ updates can be sent between 
t1-a and t1-b, the same for t2, t3, ..., t5. So I'm thinking setting 
maxNumAttempts to 5 * (maxSkippedUpdatesLockContention_ + 1) can achieve the 
pre-IMPALA-6671 purpose.



--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 06 Apr 2021 02:24:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8501/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 05 Apr 2021 21:19:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-05 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..

IMPALA-6671: Change wait for sync ddl timeout

When skip locked tables from topic updates is enabled
(topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread
waiting for a topic update during a sync ddl execution to terminate
earlier. This is because the waitForSyncDdlVersion currently statically
counts the number of instances of topic updates against a maximum
and bails out when the maximum is reached. With topic update thread
skipping locked tables, this number of instances of topic updates
is more likely to hit the maximum attempt limit.

This commit changes the logic so that when locked tables are skipped from
topic updates, the sync ddl operation waits until a configurable
timeout. This timeout value is specified in seconds using the configuration
max_wait_time_for_sync_ddl_s. The default value of this configuration is 0
which means it waits indefinitely. This makes sure that only the sync ddl
queries on a table which has been locked wait for extended durations while
the other unrelated sync ddl queries can finish appropriately.

Also this commit changes the default values of following flags
which were introduced in the earlier patch for IMPALA-6671.

The current default value of topic_update_tbl_max_wait_time_ms
of 500ms is too low and may skip the locked tables more
aggressively than needed. The new defaults were set based on
analysis of a real world deployment.

topic_update_tbl_max_wait_time_ms = 12
catalog_max_lock_skipped_topic_updates = 3

Testing:
1. Ran existing test test_topic_update_frequency.
2. Added a new test for the newly added flag.

Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M tests/custom_cluster/test_topic_update_frequency.py
6 files changed, 88 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/5
--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-05 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@92
PS4, Line 92: DEFINE_int32(max_wait_time_for_sync_ddl_s, 0, "Maximum time (in 
seconds) until "
> Do you have a recommendation on what this should be when topic_update_tbl_m
Do you mean what should the recommendation for the least value of it? It is 
hard to predict. Ideally, the minimum time required for this would be the 
2*(time required for one topic update delta calculation).

The default value of 0 is good for most cases when 
topic_update_tbl_max_wait_time_ms is enabled, unless there are any issues and 
user has to override it manually.


http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@94
PS4, Line 94: will
> nit: will wait?
Done


http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094
PS4, Line 3094: long maxNumAttempts = 5;
> I think another solution is increasing maxNumAttempts with maxSkippedUpdate
Unfortunately, I don't think this approach would work. 
maxSkippedUpdatesLockContention_ is not just simply counts the number of topic 
updates due to the lock, but number of distinct lock operations which are 
skipped by topic update thread. Otherwise, it would be not work since we would 
just be delaying the onset of the problem. Hence we cannot rely on 
maxSkippedUpdatesLockContention_ since the topic updates skipped will almost 
always be much higher than maxSkippedUpdatesLockContention_. Hope that makes 
sense.



--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 05 Apr 2021 20:59:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 02 Apr 2021 05:37:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-01 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 4:

(5 comments)

Thanks for this work! The patch makes sense to me.

http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@92
PS4, Line 92: DEFINE_int32(max_wait_time_for_sync_ddl_s, 0, "Maximum time (in 
seconds) until "
Do you have a recommendation on what this should be when 
topic_update_tbl_max_wait_time_ms is enabled? I think it should be larger than 
the topic update frequency since we at least will wait for that time.


http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@94
PS4, Line 94: will
nit: will wait?


http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094
PS4, Line 3094: long maxNumAttempts = 5;
I think another solution is increasing maxNumAttempts with 
maxSkippedUpdatesLockContention_, then maybe we don't need introducing a new 
timeout flag which may be hard to tune. The rationale is that we will at most 
skip the table maxSkippedUpdatesLockContention_ times comparing to the legacy 
behavior.

What do you think?


http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3160
PS4, Line 3160: It checks
nit: remove this?


http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3168
PS4, Line 3168: timeoutSecs == 0
I think we should return false for negative values as well.



--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 02 Apr 2021 02:14:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8497/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 02 Apr 2021 00:11:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8496/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 3
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 02 Apr 2021 00:03:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7042/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 01 Apr 2021 23:51:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-01 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..

IMPALA-6671: Change wait for sync ddl timeout

When skip locked tables from topic updates is enabled
(topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread
waiting for a topic update during a sync ddl execution to terminate
earlier. This is because the waitForSyncDdlVersion currently statically
counts the number of instances of topic updates against a maximum
and bails out when the maximum is reached. With topic update thread
skipping locked tables, this number of instances of topic updates
is more likely to hit the maximum attempt limit.

This commit changes the logic so that when locked tables are skipped from
topic updates, the sync ddl operation waits until a configurable
timeout. This timeout value is specified in seconds using the configuration
max_wait_time_for_sync_ddl_s. The default value of this configuration is 0
which means it waits indefinitely. This makes sure that only the sync ddl
queries on a table which has been locked wait for extended durations while
the other unrelated sync ddl queries can finish appropriately.

Also this commit changes the default values of following flags
which were introduced in the earlier patch for IMPALA-6671.

The current default value of topic_update_tbl_max_wait_time_ms
of 500ms is too low and may skip the locked tables more
aggressively than needed. The new defaults were set based on
analysis of a real world deployment.

topic_update_tbl_max_wait_time_ms = 12
catalog_max_lock_skipped_topic_updates = 3

Testing:
1. Ran existing test test_topic_update_frequency.
2. Added a new test for the newly added flag.

Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M tests/custom_cluster/test_topic_update_frequency.py
6 files changed, 88 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/4
--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-01 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..

IMPALA-6671: Change wait for sync ddl timeout

When skip locked tables from topic updates is enabled
(topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread
waiting for a topic update during a sync ddl execution to terminate
earlier. This is because the waitForSyncDdlVersion currently statically
counts the number of instances of topic updates against a maximum
and bails out when the maximum is reached. With topic update thread
skipping locked tables, this number of instances of topic updates
is more likely to hit the maximum attempt limit.

This commit changes the logic so that when locked tables are skipped from
topic updates, the sync ddl operation waits until a configurable
timeout. This timeout value is specified in seconds using the configuration
max_wait_time_for_sync_ddl_s. The default value of this configuration is 0
which means it waits indefinitely. This makes sure that only the sync ddl
queries on a table which has been locked wait for extended durations while
the other unrelated sync ddl queries can finish appropriately.

Also this commit changes the default values of following flags
which were introduced in the earlier patch for IMPALA-6671.

The current default value of topic_update_tbl_max_wait_time_ms
of 500ms is too low and may skip the locked tables more
aggressively than needed. The new defaults were set based on
analysis of a real world deployment.

topic_update_tbl_max_wait_time_ms = 12
catalog_max_lock_skipped_topic_updates = 3

Testing:
1. Ran existing test test_topic_update_frequency.
2. Added a new test for the newly added flag.

Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M tests/custom_cluster/test_topic_update_frequency.py
6 files changed, 88 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/3
--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 3
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang