Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13334 )

Change subject: acid: Filter unwanted files based on ACID state.
......................................................................


Patch Set 14:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13334/12/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/13334/12/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@206
PS12, Line 206:         loadedFds_ = 
AcidUtils.filterFilesForAcidState(loadedFds_,
              :           relativeDirPaths);
nit: fits to one line


http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java
File fe/src/main/java/org/apache/impala/util/AcidUtils.java:

http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@48
PS14, Line 48: (?:/.*)?
I have deltas like this produced by compaction:
base_0000002_v0000108

+Same as line 54.


http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@54
PS14, Line 54:        // Optional path suffix.
             :        "(?:/.*)?");
Is this optional? We should only match files inside directories as far as I 
understand.


http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@105
PS14, Line 105:     public boolean test(String dirPath) {
To support upgraded tables this should also accept any path without "/", so 
where it is a file directly inside the table/partition directory.


http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@170
PS14, Line 170:       if (baseWriteId == maxBaseWriteId) {
              :         validDescriptors.add(fd);
We will go to this branch with any file if getMaxBaseId() didn't found any base 
directory, so maxBaseWriteId ==  Long.MIN_VALUE. This is not necessarily a 
problem, as we should accept all valid deltas in this case, but it seems more 
logical to add them in the next branch.


http://gerrit.cloudera.org:8080/#/c/13334/14/fe/src/main/java/org/apache/impala/util/AcidUtils.java@182
PS14, Line 182:     // If the table was not transaction table then return all 
fds.
              :     if (validDescriptors.isEmpty()) {
              :       return fileDescriptors;
              :     }
Can't we enter this if the table was truncated, but the old deltas are still 
there?



--
To view, visit http://gerrit.cloudera.org:8080/13334
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf0aeb36e10c827ead59ed7f67e731199394fe8e
Gerrit-Change-Number: 13334
Gerrit-PatchSet: 14
Gerrit-Owner: Sudhanshu Arora <sudhan...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <sudhan...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 May 2019 19:28:47 +0000
Gerrit-HasComments: Yes

Reply via email to