Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
......................................................................


Patch Set 11:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc@2223
PS10, Line 2223: c.
> Could you please explain where do we filter out the rows in the merged micr
We don't need to re-filter after step 3. I will explain it in comment.


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc
File be/src/exec/scratch-tuple-batch-test.cc:

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc@67
PS10, Line 67: scratch_batch->num_tuples = BATCH_
> I wonder if we can add two more tests for the following situations.
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h
File be/src/exec/scratch-tuple-batch.h:

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@29
PS10, Line 29: ScratchMicroBatch
> May need a cstr to properly init these fields.
Using aggregate initialisers instead of constructor accepting arguments as we 
need default constructor too. Plus we don't want many function calls on hot 
path (GetMicroBatches).


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@171
PS10, Line 171:   /// bits set are used to create micro batches. Micro batches 
that differ by less than
> nit (or micro batches).
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@176
PS10, Line 176: present
> nit.
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@178
PS10, Line 178: batch
> nit. batch_idx may be a better name in this method.
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@203
PS10, Line 203:                     << "should be true";
              :     /// Add the last micro batch which was b
> nit. An alternative is the following, which is more robust.
We can avoid that extra branch and condition and also extra condition on client 
side to handle 0 being returned, as it is anyways going to be dead code and 
also mentioned as precondition for method. DCHECK is to ensure that 
precondition and in future this dead code doesn't get activated.


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h@50
PS10, Line 50: PARQUET_LATE_MATERIALIZATION_THRE
> nit: PARQUET_LATE_MATERIALIZATION_THRESHOLD?
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift@701
PS10, Line 701:   ENABLE_ASYNC_DDL_EXECUTION = 136
> nit. -1 to turn off the feature.
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift@554
PS10, Line 554:   137: optional bool enable_async_ddl_execution = true;
> nit. -1 to turn off the feature.
Done



--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 11
Gerrit-Owner: Amogh Margoor <amarg...@gmail.com>
Gerrit-Reviewer: Amogh Margoor <amarg...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 10:30:16 +0000
Gerrit-HasComments: Yes

Reply via email to