Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
......................................................................


Patch Set 8:

(11 comments)

I think this is a nice improvement. Looking forward to the performance results!

BTW, Sorry for leaving lots of code style comments..

http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/codegen/gen_ir_descriptions.py
File be/src/codegen/gen_ir_descriptions.py:

http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/codegen/gen_ir_descriptions.py@235
PS8, Line 235:   ["BOOL_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala16BoolMinMaxFilter10AlwaysTrueEv"],
             :   ["TINYINT_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala19TinyIntMinMaxFilter10AlwaysTrueEv"],
             :   ["SMALLINT_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala20SmallIntMinMaxFilter10AlwaysTrueEv"],
             :   ["INT_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala15IntMinMaxFilter10AlwaysTrueEv"],
             :   ["BIGINT_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala18BigIntMinMaxFilter10AlwaysTrueEv"],
             :   ["FLOAT_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala17FloatMinMaxFilter10AlwaysTrueEv"],
             :   ["DOUBLE_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala18DoubleMinMaxFilter10AlwaysTrueEv"],
             :   ["STRING_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala18StringMinMaxFilter10AlwaysTrueEv"],
             :   ["TIMESTAMP_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala21TimestampMinMaxFilter10AlwaysTrueEv"],
             :   ["DATE_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala16DateMinMaxFilter10AlwaysTrueEv"],
             :   ["DECIMAL_MIN_MAX_FILTER_ALWAYSTRUE", 
"_ZNK6impala19DecimalMinMaxFilter10AlwaysTrueEv"],
Just curious, what errors will we encounter if we don't make the AlwaysTrue() 
method virtual and use _ZNK6impala12MinMaxFilter10AlwaysTrueEv directly?


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/filter-context.cc
File be/src/exec/filter-context.cc:

http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/filter-context.cc@281
PS8, Line 281:
nit: redundant blank line


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/filter-context.cc@512
PS8, Line 512:   if (computed_ratio) {
             :     *computed_ratio = ratio;
             :   }
nit: our code style prefers collapsing simple if-statement to one line

  if (computed_ratio) *computed_ratio = ratio;


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/partitioned-hash-join-builder.cc@329
PS8, Line 329:   DetermineUsefulnessForMinmaxFilters();
Should we codegen FilterContext::ShouldRejectFilterBasedOnColumnStats() if we 
want to call this for each batch (e.g. eliminate the switch branch in it in 
codegen)?


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/partitioned-hash-join-builder.cc@335
PS8, Line 335:   if (filter_ctxs_.size() == 0) {
             :     return;
             :   }
nit: our code style prefer collapsing simple if-statement to one line

  if (filter_ctxs_.size() == 0) return;


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/partitioned-hash-join-builder.cc@342
PS8, Line 342:       if (ctx.local_min_max_filter->AlwaysTrue()) {
             :         continue;
             :       }
nit: our code style prefer collapsing simple if-statement to one line

    if (ctx.local_min_max_filter->AlwaysTrue()) continue;


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/partitioned-hash-join-builder.cc@349
PS8, Line 349: auto
nit: should we use "const auto&" instead?


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/partitioned-hash-join-builder.cc@351
PS8, Line 351:
nit: our code style uses 4 spaces indention here


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/partitioned-hash-join-builder.cc@355
PS8, Line 355:         if (min_ratio > ratio) {
             :           min_ratio = ratio;
             :         }
nit: our code style prefer collapsing simple if-statement to one line

  if (min_ratio > ratio) min_ratio = ratio;


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/exec/partitioned-hash-join-builder.cc@992
PS8, Line 992: auto
nit: should we use "const auto&" instead?


http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/runtime/runtime-filter-ir.cc
File be/src/runtime/runtime-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17295/8/be/src/runtime/runtime-filter-ir.cc@35
PS8, Line 35:  && !filter->AlwaysTrue()
Should we move this into LIKELY()?



--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 8
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 21 Apr 2021 02:58:04 +0000
Gerrit-HasComments: Yes

Reply via email to