Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 )
Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator ...................................................................... Patch Set 15: (4 comments) Hi Qifan, I only have some minor comments on this patch. Thank you very much for working on this! http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG@23 PS15, Line 23: arrival time of min/max filters Is it true that the logic added in ScanNode::WaitForRuntimeFilters() also applies for Bloom filters? Namely, the logged 'end' over there records the time when we are done waiting for all filters in 'filter_ctxs_' whether or not there might be some Bloom filters. If the logic added also applies for Bloom filters, is there any particular reason why we do not tackle the case when a runtime filter is a Bloom filter in this patch? I will also paste the questions above at ScanNode::WaitForRuntimeFilters() for easy reference. http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG@27 PS15, Line 27: relate nit: related http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/exec/scan-node.cc@239 PS15, Line 239: end Is it true that the logic added here also applies for Bloom filters? Namely, the logged 'end' here records the time when we are done waiting for all filters in 'filter_ctxs_' whether or not there might be some Bloom filters. If the logic added also applies for Bloom filters, is there any particular reason why we do not tackle the case when a runtime filter is a Bloom filter in this patch? http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/runtime/coordinator.cc@652 PS15, Line 652: // Also add the min/max value for partitioned joins, when all updates are available : // or the filter is disabled due to being always true. I was wondering whether the comment would be clearer if we rephrase this sentence as the following although it seems a bit verbose. Please also let me know if my understanding is correct. Thanks! For partitioned joins, add the actual min/max values when all updates are available and the filter is neither always true nor always false. Add "AlwasyTrue" if at least one received filter is always true. Add "AlwaysFalse" when the aggregated filter is always false after all updates are received. All other cases are considered "PartialUpdates". -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 15 Gerrit-Owner: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Comment-Date: Sun, 11 Apr 2021 19:04:02 +0000 Gerrit-HasComments: Yes