Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17075 )

Change subject: IMPALA-10494: Making use of the min/max column stats to improve 
min/max filters
......................................................................


Patch Set 28: Code-Review+1

(2 comments)

Just a nit and a comment. Should be able to +2 after that.

http://gerrit.cloudera.org:8080/#/c/17075/27/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/17075/27/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@260
PS27, Line 260:       }
> Looks like mixture of files of different format (like Parquet and ORC at the 
> same time) is not allowed.
This may not be accurate. In HdfsScanNode's computeScanRangeLocation(), we
examine the file format of each partition and record it, and only clear a flag
if a partition is not Parquet. It does not assert or return an error.
Here's the snippet in HdfsScanNode.java:
      fileFormats_.add(partition.getFileFormat());
      if (!isParquetBased(partition.getFileFormat())) {
        allParquet = false;
      }
However, for statistics, as I mentioned before (and you seem to agree), having
at least one Parquet partition is sufficient to trigger the min-max filter
checks, since it does not affect the correctness of results.
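
To make the distinction concrete, here is a minimal standalone sketch, not the actual Impala code. The FileFormat enum, the Summary holder, and summarize() are hypothetical stand-ins, and the plain equality check only approximates isParquetBased(). It shows that mixed formats are tolerated and that "all Parquet" and "at least one Parquet" are separate conditions:

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class FileFormatScan {
  // Hypothetical stand-in for the real file format enum.
  enum FileFormat { PARQUET, ORC, TEXT }

  // Hypothetical holder for what a scan over the partitions would record.
  static final class Summary {
    final Set<FileFormat> fileFormats = EnumSet.noneOf(FileFormat.class);
    boolean allParquet = true;   // cleared by the first non-Parquet partition
    boolean anyParquet = false;  // set by the first Parquet partition
  }

  // Records every format seen and updates the two flags. Mixed formats are
  // accepted: nothing here asserts or throws.
  static Summary summarize(List<FileFormat> partitionFormats) {
    Summary s = new Summary();
    for (FileFormat format : partitionFormats) {
      s.fileFormats.add(format);
      if (format == FileFormat.PARQUET) {
        s.anyParquet = true;
      } else {
        s.allParquet = false;
      }
    }
    return s;
  }
}
```

In this sketch, a table with one Parquet and one ORC partition ends up with allParquet == false and anyParquet == true, and the second flag is the condition I am suggesting for the statistics path.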


http://gerrit.cloudera.org:8080/#/c/17075/28/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/17075/28/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@387
PS28, Line 387:   /*
Can you remove this method?



--
To view, visit http://gerrit.cloudera.org:8080/17075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df
Gerrit-Change-Number: 17075
Gerrit-PatchSet: 28
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 31 Mar 2021 21:39:27 +0000
Gerrit-HasComments: Yes
