Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16549 )
Change subject: IMPALA-6671: Skip locked tables from topic updates ...................................................................... Patch Set 6: (12 comments) This is really what we need in catalog-v1! I still need to look into it deeper. http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@a456 PS6, Line 456: nit: keep the blank line http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@209 PS6, Line 209: maxSkippedUpdatesLockContention nit: maxSkippedUpdatesLockContention_ http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@210 PS6, Line 210: //v nit: needs a space and capitalizes 'v'. http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@211 PS6, Line 211: topicUpdateTblLockMaxWaitTimeMs nit: topicUpdateTblLockMaxWaitTimeMs_ http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@212 PS6, Line 212: //D nit: needs a space http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1309 PS6, Line 1309: we nit: "We". I think we always capitalize the first char of comments. It'd be neat to keep this convention. http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1315 PS6, Line 1315: else { : if nit: we can merge this else-if http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372 PS6, Line 1372: tryLock is false. nit: The other reason is non hdfs tables are updated quickly so won't suffer the problem http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2682 PS6, Line 2682: LOG.debug(String : .format("%s the pending catalog version of table %s from %s to %s%s.", : (success ? "Bumped up" : "Attempt to bump up"), getFullName(), : expectedPendingVersion, newCatalogVersion, success ? "" : " failed")); nit: Don't need String.format LOG.debug("{} the pending catalog version of table {} from {} to {}{}.", (success ? "Bumped up" : "Attempt to bump up"), getFullName(), expectedPendingVersion, newCatalogVersion, success ? "" : " failed"); http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java File fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java: http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java@67 PS6, Line 67: number ni: Number http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@665 PS6, Line 665: newCatalogVersion = catalog_.updateTblPendingVersion((HdfsTable) tbl); Could you explain what error can happen if we don't set the pending version here? Can we just call catalog_.incrementAndGetCatalogVersion() to get the newCatalogVersion? Let's say the last or currently on-going catalog topic update is in version range (fromVersion, toVersion]. IIUC, the invariant we need is max(newCatalogVersion, pendingVersion) > toVersion When we finish updaing the table, we set its catalog version to max(newCatalogVersion, pendingVersion). It will fit into the version range of the next catalog update since it's larger than toVersion. Here we get a large enough 'newCatalogVersion' to keep this invariant. If the catalog update thread is collecting a version range with toVersion > our newCatalogVersion, it's its duty to maintain this invariant. So only the catalog update thread is required to update the pendingVersion. Then we won't have contentions on updaing the pendingVersion. If I'm wrong, then I think we should do the same (i.e. call catalog_.updateTblPendingVersion()) for the REFRESH code path in CatalogServiceCatalog#reloadTable() as well: https://github.com/apache/impala/blob/227e43f48147c8725100ddc05521bae07ee9becd/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2304 http://gerrit.cloudera.org:8080/#/c/16549/6/tests/metadata/test_topic_update_frequency.py File tests/metadata/test_topic_update_frequency.py: http://gerrit.cloudera.org:8080/#/c/16549/6/tests/metadata/test_topic_update_frequency.py@20 PS6, Line 20: class TestTopicUpdateFrequency(ImpalaTestSuite): I think we need a test on the new flag. Some tests may need to be only run in exhaustive mode if they take long time. -- To view, visit http://gerrit.cloudera.org:8080/16549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 Gerrit-Change-Number: 16549 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Shant Hovsepian <sh...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Comment-Date: Fri, 23 Oct 2020 07:44:45 +0000 Gerrit-HasComments: Yes