Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16549 )

Change subject: IMPALA-6671: Skip locked tables from topic updates
......................................................................


Patch Set 6:

(12 comments)

This is really what we need in catalog-v1! I still need to look into it deeper.

http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@a456
PS6, Line 456:
nit: keep the blank line


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@209
PS6, Line 209: maxSkippedUpdatesLockContention
nit: maxSkippedUpdatesLockContention_


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@210
PS6, Line 210: //v
nit: needs a space and capitalizes 'v'.


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@211
PS6, Line 211: topicUpdateTblLockMaxWaitTimeMs
nit: topicUpdateTblLockMaxWaitTimeMs_


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@212
PS6, Line 212: //D
nit: needs a space


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1309
PS6, Line 1309: we
nit: "We". I think we always capitalize the first char of comments. It'd be 
neat to keep this convention.


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1315
PS6, Line 1315: else {
              :       if
nit: we can merge this else-if


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372
PS6, Line 1372: tryLock is false.
nit: The other reason is non hdfs tables are updated quickly so won't suffer 
the problem


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2682
PS6, Line 2682:     LOG.debug(String
              :         .format("%s the pending catalog version of table %s 
from %s to %s%s.",
              :             (success ? "Bumped up" : "Attempt to bump up"), 
getFullName(),
              :             expectedPendingVersion, newCatalogVersion, success 
? "" : " failed"));
nit: Don't need String.format

    LOG.debug("{} the pending catalog version of table {} from {} to {}{}.",
        (success ? "Bumped up" : "Attempt to bump up"), getFullName(),
        expectedPendingVersion, newCatalogVersion, success ? "" : " failed");


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java
File fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java:

http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java@67
PS6, Line 67: number
ni: Number


http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@665
PS6, Line 665:       newCatalogVersion = 
catalog_.updateTblPendingVersion((HdfsTable) tbl);
Could you explain what error can happen if we don't set the pending version 
here? Can we just call catalog_.incrementAndGetCatalogVersion() to get the 
newCatalogVersion? Let's say the last or currently on-going catalog topic 
update is in version range (fromVersion, toVersion]. IIUC, the invariant we 
need is

  max(newCatalogVersion, pendingVersion) > toVersion

When we finish updaing the table, we set its catalog version to 
max(newCatalogVersion, pendingVersion). It will fit into the version range of 
the next catalog update since it's larger than toVersion.

Here we get a large enough 'newCatalogVersion' to keep this invariant. If the 
catalog update thread is collecting a version range with toVersion > our 
newCatalogVersion, it's its duty to maintain this invariant. So only the 
catalog update thread is required to update the pendingVersion. Then we won't 
have contentions on updaing the pendingVersion.

If I'm wrong, then I think we should do the same (i.e. call 
catalog_.updateTblPendingVersion()) for the REFRESH code path in 
CatalogServiceCatalog#reloadTable() as well: 
https://github.com/apache/impala/blob/227e43f48147c8725100ddc05521bae07ee9becd/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2304


http://gerrit.cloudera.org:8080/#/c/16549/6/tests/metadata/test_topic_update_frequency.py
File tests/metadata/test_topic_update_frequency.py:

http://gerrit.cloudera.org:8080/#/c/16549/6/tests/metadata/test_topic_update_frequency.py@20
PS6, Line 20: class TestTopicUpdateFrequency(ImpalaTestSuite):
I think we need a test on the new flag. Some tests may need to be only run in 
exhaustive mode if they take long time.



--
To view, visit http://gerrit.cloudera.org:8080/16549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655
Gerrit-Change-Number: 16549
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Shant Hovsepian <sh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Oct 2020 07:44:45 +0000
Gerrit-HasComments: Yes

Reply via email to