Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13334 )
Change subject: acid: Filter unwanted files based on ACID state. ...................................................................... Patch Set 14: (6 comments) http://gerrit.cloudera.org:8080/#/c/13334/12/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/13334/12/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@206 PS12, Line 206: loadedFds_ = AcidUtils.filterFilesForAcidState(loadedFds_, : relativeDirPaths); nit: fits to one line http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java File fe/src/main/java/org/apache/impala/util/AcidUtils.java: http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@48 PS14, Line 48: (?:/.*)? I have deltas like this produced by compaction: base_0000002_v0000108 +Same as line 54. http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@54 PS14, Line 54: // Optional path suffix. : "(?:/.*)?"); Is this optional? We should only match files inside directories as far as I understand. http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@105 PS14, Line 105: public boolean test(String dirPath) { To support upgraded tables this should also accept any path without "/", so where it is a file directly inside the table/partition directory. http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@170 PS14, Line 170: if (baseWriteId == maxBaseWriteId) { : validDescriptors.add(fd); We will go to this branch with any file if getMaxBaseId() didn't found any base directory, so maxBaseWriteId == Long.MIN_VALUE. This is not necessarily a problem, as we should accept all valid deltas in this case, but it seems more logical to add them in the next branch. http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@182 PS14, Line 182: // If the table was not transaction table then return all fds. : if (validDescriptors.isEmpty()) { : return fileDescriptors; : } Can't we enter this if the table was truncated, but the old deltas are still there? -- To view, visit http://gerrit.cloudera.org:8080/13334 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf0aeb36e10c827ead59ed7f67e731199394fe8e Gerrit-Change-Number: 13334 Gerrit-PatchSet: 14 Gerrit-Owner: Sudhanshu Arora <sudhan...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Sudhanshu Arora <sudhan...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Wed, 22 May 2019 19:28:47 +0000 Gerrit-HasComments: Yes