Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12974 )

Change subject: IMPALA-7608: Estimate row count from file size when no stats 
available
......................................................................


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1198
PS8, Line 1198:       // estimated row count by ESTIMATED_COMPRESSION_FACTOR.
Should we estimate the compression factor differently depending on the file 
format? eg text I would guess would have a much lower compression factor than 
parquet. Might be interesting to look at some emprical data here since changing 
it later could change plans, and seems not too hard to do a little research



--
To view, visit http://gerrit.cloudera.org:8080/12974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic414121c8df0d5222e4aeea096b5365beb04568a
Gerrit-Change-Number: 12974
Gerrit-PatchSet: 8
Gerrit-Owner: Fang-Yu Rao <fangyu....@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <prog...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Fri, 17 May 2019 04:44:57 +0000
Gerrit-HasComments: Yes

Reply via email to