Alex Behm has posted comments on this change.

Change subject: IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.
......................................................................


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6527/8/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

PS8, Line 65: ProcessSplit() will issue the files' scan ranges
            :   // and those ranges will need scanner threads, so no files are 
marked completed yet.
> hmm, is that stale now? i guess technically not since this now happens in G
Good catch. I think the comment is misleading. Changed it to refer to GetNextInternal().


http://gerrit.cloudera.org:8080/#/c/6527/8/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

PS8, Line 133: ProcessSplit
> what's the deal with making this non-pure? oh, I guess (most) scanners now 
Happy to address this, but let's discuss approaches first. 

Options:
* Add a new virtual function that is a no-op for all scanners except Parquet, 
where it does a runtime filter check.
* Move the runtime filter stats and related functions like 
HdfsParquetScanner::CheckFiltersEffectiveness() into HdfsScanner and just do 
the runtime filter check for all scanners, even though that check is useless 
for non-Parquet scanners.
* Other ideas?
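
To make the first option concrete, here is a rough sketch of the shape it could take. Everything in it is a hypothetical stand-in and not the real Impala code: ScannerSketch, SequenceScannerSketch, ParquetScannerSketch, CheckFilters(), and the counters are invented for illustration, and the loop in ProcessSplit() is only an assumption about how a shared, non-pure driver might call GetNextInternal().

```cpp
// Hypothetical stand-in for HdfsScanner; not the real Impala class.
class ScannerSketch {
 public:
  explicit ScannerSketch(int batches) : batches_left_(batches) {}
  virtual ~ScannerSketch() = default;

  // Shared, non-pure driver: calls GetNextInternal() until no batches remain
  // and invokes the filter hook after every batch.
  int ProcessSplit() {
    int rows = 0;
    while (batches_left_ > 0) {
      rows += GetNextInternal();
      CheckFilters();  // No-op unless a subclass overrides it.
    }
    return rows;
  }

  int filter_checks() const { return filter_checks_; }

 protected:
  // Toy batch producer: consumes one batch and reports one row.
  virtual int GetNextInternal() {
    --batches_left_;
    return 1;
  }

  // Option 1: the hook defaults to doing nothing.
  virtual void CheckFilters() {}

  int batches_left_;
  int filter_checks_ = 0;
};

// Stand-in for the Avro/RC/Seq scanners: inherits the no-op hook.
class SequenceScannerSketch : public ScannerSketch {
 public:
  using ScannerSketch::ScannerSketch;
};

// Stand-in for the Parquet scanner: the only one that overrides the hook.
class ParquetScannerSketch : public ScannerSketch {
 public:
  using ScannerSketch::ScannerSketch;

 protected:
  // Placeholder for a real effectiveness check; here it only counts calls.
  void CheckFilters() override { ++filter_checks_; }
};
```

With this shape, the non-Parquet scanners pay only for an empty virtual call per batch. The second option would instead move the counter and the check itself into the base class, so every scanner runs it.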


-- 
To view, visit http://gerrit.cloudera.org:8080/6527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: anujphadke <apha...@cloudera.com>
Gerrit-HasComments: Yes
