Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17697 )
Change subject: IMPALA-10801: Check the latest compaction Id before serving ACID table ...................................................................... Patch Set 14: (2 comments) Thanks for the clarification. I think we should have a more detailed comment. Otherwise the patch looks great to me. I will +2 the patch once the comments are updated as suggested below. http://gerrit.cloudera.org:8080/#/c/17697/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17697/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2194 PS14, Line 2194: AcidUtils.compare((HdfsTable) existingTbl, : updatedTbl.getValidWriteIds(), tableId) : >= 0) > No, it is not guaranteed. In this case, the table is coming from a full tab I synced up with Yu-Wen offline. Based on the discussion, I understand the reasoning of this part of the change better. With this change now it is possible that when a ACID table is refreshed the version number is bumped up but it still keeps the ValidWriteIdList to the previous ValidWriteIdList because refresh was due to compaction. Hence the assumption which replaceTableIfUnchanged makes that any time a table is modified it moves to a later validWriteIdList than previous does not hold anymore. Since this is not very easy to understand for someone who doesn't know how ValidWriteIdLists are used here, I think it would be good to add a more comprehensive comment. Something similar to below (feel free to modify as appropriate): For ACID tables, we also compare the writeIdList with existing table's and return existing table when its writeIdList is more recent. This check is needed in addition to the catalogVersion check because it is possible that when table's files are refreshed to account for compaction, the table's version number is higher but ValidWriteIdList is still the same as before refresh. This could cause this method to not update the table when the updatedTbl has a higher ValidWriteIdList if we just rely on catalog version comparison which would break the reload on stale ValidWriteIdList logic. http://gerrit.cloudera.org:8080/#/c/17697/15/fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java File fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java: http://gerrit.cloudera.org:8080/#/c/17697/15/fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java@598 PS15, Line 598: .get(0).file_descriptors; nit, this comment can be removed now. -- To view, visit http://gerrit.cloudera.org:8080/17697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86a112a77980fef7f6238978bc9668a65262101e Gerrit-Change-Number: 17697 Gerrit-PatchSet: 14 Gerrit-Owner: Yu-Wen Lai <yu-wen....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Sourabh Goyal <soura...@cloudera.com> Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Reviewer: Yu-Wen Lai <yu-wen....@cloudera.com> Gerrit-Comment-Date: Tue, 10 Aug 2021 19:12:42 +0000 Gerrit-HasComments: Yes