Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/14859 )
Change subject: IMPALA-4224: execute separate join builds fragments ...................................................................... Patch Set 35: (19 comments) Only looked at the backend part. Will take a look at the frontend and tests next. Mostly found nits, rest looks good http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.h File be/src/exec/blocking-join-node.h: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.h@243 PS35, Line 243: a nit: the integrated join builder http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.h File be/src/exec/join-builder.h: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.h@138 PS35, Line 138: HandoffToProbe nit: how about HandoffToProbeAndWait? http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/join-builder.h File be/src/exec/join-builder.h: http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/join-builder.h@137 PS33, Line 137: stuck nit: extra word? http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/join-builder.cc File be/src/exec/join-builder.cc: http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/join-builder.cc@39 PS33, Line 39: JoinBuilder::~JoinBuilder() {} maybe DCHECK probe_refcount_ and outstanding_probes_ == 0 http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/join-builder.cc@68 PS33, Line 68: or nit: extra word http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/join-builder.cc@94 PS33, Line 94: { nit: dont need enclosing brackets http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@a504 PS35, Line 504: did you miss adding this back? (here and below) http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@61 PS35, Line 61: embedded might be worth adding a DCHECK for all the embedded specific methods http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@278 PS35, Line 278: as nit: is http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@637 PS35, Line 637: was nit: ways http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/partitioned-hash-join-builder.h@456 PS33, Line 456: need one additional buffer for the input while repartitioning the build or probe. : /// For NAAJ, we need 3 additional buffers for 'null_awar nit: update comment http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/partitioned-hash-join-builder.h@636 PS33, Line 636: nit: typo http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.cc@611 PS35, Line 611: if (buffer_pool_client_ != probe_client) { nit: how about we use is_separate_build_ as the if condition and then DCHECK(buffer_pool_client_ != probe_client). Or the other way around. Just so that we can clearly separate code for is_separate_build_ cases throughout the file. http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/partitioned-hash-join-node.cc@78 PS33, Line 78: RETURN_IF_ERROR(PhjBuilderConfig::CreateConfig(state, tnode_->node_id, : tnode_->hash_join_node.join_op, &build_row_desc(), eq_join_conjuncts, : tnode_->runtime_filters, tnode_->hash_join_node.hash_seed, &phj_builder_config)); this can be optional now based on UseSeparateBuild http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/exec/partitioned-hash-join-node.cc@723 PS33, Line 723: output_unmatched_batch_->TransferResourceOwnership(out_batch); : output_unmatched_batch_->Reset(); why is this done before and after? http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/runtime/coordinator.cc@363 PS33, Line 363: ms, num_inst this is kind of confusing because in this case the sink produces the filter and not the dest join node http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/14859/33/be/src/runtime/runtime-state.cc@249 PS33, Line 249: nit: extra line http://gerrit.cloudera.org:8080/#/c/14859/33/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/14859/33/common/thrift/PlanNodes.thrift@367 PS33, Line 367: optional should this be required instead? http://gerrit.cloudera.org:8080/#/c/14859/33/common/thrift/PlanNodes.thrift@617 PS33, Line 617: 28 nit: would it might make more sense to have hash_join_node or nested_loop_join_node nested inside TJoinNode -- To view, visit http://gerrit.cloudera.org:8080/14859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58 Gerrit-Change-Number: 14859 Gerrit-PatchSet: 35 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Thu, 06 Feb 2020 01:27:50 +0000 Gerrit-HasComments: Yes