Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 )
Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. ...................................................................... Patch Set 11: (10 comments) http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc@2223 PS10, Line 2223: c. > Could you please explain where do we filter out the rows in the merged micr We don't need to re-filter after step 3. I will explain it in comment. http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc File be/src/exec/scratch-tuple-batch-test.cc: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc@67 PS10, Line 67: scratch_batch->num_tuples = BATCH_ > I wonder if we can add two more tests for the following situations. Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h File be/src/exec/scratch-tuple-batch.h: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@29 PS10, Line 29: ScratchMicroBatch > May need a cstr to properly init these fields. Using aggregate initialisers instead of constructor accepting arguments as we need default constructor too. Plus we don't want many function calls on hot path (GetMicroBatches). http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@171 PS10, Line 171: /// bits set are used to create micro batches. Micro batches that differ by less than > nit (or micro batches). Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@176 PS10, Line 176: present > nit. Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@178 PS10, Line 178: batch > nit. batch_idx may be a better name in this method. Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@203 PS10, Line 203: << "should be true"; : /// Add the last micro batch which was b > nit. An alternative is the following, which is more robust. We can avoid that extra branch and condition and also extra condition on client side to handle 0 being returned, as it is anyways going to be dead code and also mentioned as precondition for method. DCHECK is to ensure that precondition and in future this dead code doesn't get activated. http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h@50 PS10, Line 50: PARQUET_LATE_MATERIALIZATION_THRE > nit: PARQUET_LATE_MATERIALIZATION_THRESHOLD? Done http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift@701 PS10, Line 701: ENABLE_ASYNC_DDL_EXECUTION = 136 > nit. -1 to turn off the feature. Done http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift File common/thrift/Query.thrift: http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift@554 PS10, Line 554: 137: optional bool enable_async_ddl_execution = true; > nit. -1 to turn off the feature. Done -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor <amarg...@gmail.com> Gerrit-Reviewer: Amogh Margoor <amarg...@gmail.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Mon, 25 Oct 2021 10:30:16 +0000 Gerrit-HasComments: Yes