Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14859 )
Change subject: IMPALA-4224: execute separate join builds fragments ...................................................................... Patch Set 35: (15 comments) http://gerrit.cloudera.org:8080/#/c/14859/35//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14859/35//COMMIT_MSG@38 PS35, Line 38: accordingly. > Something lost between edits. Done http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.h File be/src/exec/blocking-join-node.h: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.h@63 PS35, Line 63: separate build fragment > nit: maybe "separate but co-located build fragment"? I feel it's worth to e Done http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.h@243 PS35, Line 243: a > nit: the integrated join builder Done http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.cc File be/src/exec/blocking-join-node.cc: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.cc@196 PS35, Line 196: if (!open_called_) { > nit: maybe an early return would make it a bit more readable Done http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.h File be/src/exec/join-builder.h: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.h@54 PS35, Line 54: class JoinBuilder : public DataSink { > Maybe you could add a timeline, or sequence diagram-like thingy about the i Done http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.h@138 PS35, Line 138: HandoffToProbe > nit: how about HandoffToProbeAndWait? Done (combined with Zoltan's suggestion) http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.cc File be/src/exec/join-builder.cc: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.cc@90 PS35, Line 90: HandoffToProbe > nit: HandoffToProbes (plural)? Done http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@61 PS35, Line 61: embedded > might be worth adding a DCHECK for all the embedded specific methods I did a pass over the methods and I think I already had DCHECKs in all the places that made sense - there are various methods specific to the separate build. These constructors/factory methods imply whether it's embedded or not. Maybe I missed something thouhg? http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@278 PS35, Line 278: as > nit: is Done http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@637 PS35, Line 637: was > nit: ways Done http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.cc@611 PS35, Line 611: if (buffer_pool_client_ != probe_client) { > nit: how about we use is_separate_build_ as the if condition and then DCHEC Good point, I like the suggestion. I updated the class comment to explain the invariant then added in DCHECKs. http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@305 PS35, Line 305: Data sinks > nit: Build sinks? Done http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/Planner.java@115 PS35, Line 115: createDistributedPlan > Yeah the naming is a bit confusing. I think there are two sets of concepts Done http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/Planner.java@128 PS35, Line 128: // MT_DOP > 0 is not supported by default for plans with base table joins or table : // sinks: we only allow MT_DOP > 0 with such plans if --unlock_mt_dop=true is : // specified. We allow single node plans with mt_dop since there is no actual : // parallelism. > nit: maybe the comment should placed in useParallelPlan() since it refers t I think parts of this comment apply here - I simplified and shortened it. http://gerrit.cloudera.org:8080/#/c/14859/35/tests/query_test/test_join_queries.py File tests/query_test/test_join_queries.py: http://gerrit.cloudera.org:8080/#/c/14859/35/tests/query_test/test_join_queries.py@35 PS35, Line 35: > Would it add coverage if we keep 1 as well? I think it's worthwhile, but I was somewhat concerned about runtime. I've added it to exhaustive only. -- To view, visit http://gerrit.cloudera.org:8080/14859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58 Gerrit-Change-Number: 14859 Gerrit-PatchSet: 35 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Fri, 07 Feb 2020 00:51:43 +0000 Gerrit-HasComments: Yes