Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-3586: Implement union passthrough
......................................................................


Patch Set 7:

(40 comments)

http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

Line 145:   DCHECK(child(0)->row_desc().IsPrefixOfEquivalent(row_desc()));
> This should be IsPrefixOf() because we sanity checking the row descriptors 
Done


http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/exec/union-node.cc
File be/src/exec/union-node.cc:

Line 115:   // passthrough case, the child can be closed right away because the 
row batch received
> the child can be closed right away -> the child was already closed?
Done


Line 116:   // from the child is copied (more details below).
> accuracy: the row batch wasn't copied
Done


Line 121:   if (child_idx_ < children_.size() && 
isChildPassThrough(child_idx_)) {
> Suggest comment:
Done


Line 122:     // If the rows from the current child can be passed through, the 
physical row layout
> This comment doesn't seem to add anything, I suggest removing it.
Replaced this with your suggestion.


Line 131:     // It may be possible that the row batch is not empty, so we save 
the previous value.
> More details would be helpful. If the batch has rows at this point, I'd ima
Added a dcheck instead. Time made this suggestion in patch 4.


Line 148:       // 'needs_deep_copy' lets us safely close the child in the next 
GetNext() call.
> style: 'needs_deep_copy' is not a visible variable here, suggest just sayin
Done


Line 154: 
> DCHECK that child_idx_ is not passthrough here
Done


http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/exec/union-node.h
File be/src/exec/union-node.h:

Line 67:   // Which children can be passed through, without being materialized.
> ... without evaluating and materializing their exprs.
Done


http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

Line 412:   /// Return true if the physical layout of this descriptor matches 
the physical layout
> matches that of other_desc
Done


Line 413:   /// of other_desc, but not necessarily ids.
> bot not necessarily the id.
Done


Line 565:   /// of other_desc, but not necessarily ids.
> but not necessarily the ids
Done


http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/service/query-options.h
File be/src/service/query-options.h:

Line 38:       TImpalaQueryOptions::DISABLE_UNION_PASSTHROUGH + 1);\
> I tend to prefer ENABLE_UNION_PASSTHROUGH. To me positive phrasing is a lit
We have both positive and negative like DISABLE_CODEGEN and 
ENABLE_EXPR_REWRITES. I agree that ENABLE is simpler and easier to think about. 
(We should rename all DISABLE options.)


http://gerrit.cloudera.org:8080/#/c/5816/7/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
File fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java:

Line 227:   public boolean isEquivalent(SlotDescriptor other) {
> Unfortunately, the term 'equivalent' already has a different meaning in the
Done


http://gerrit.cloudera.org:8080/#/c/5816/7/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java:

Line 616:   public List<Expr> getUnionNodeResultExprs() {
> getUnionResultExprs()
Done


http://gerrit.cloudera.org:8080/#/c/5816/7/fe/src/main/java/org/apache/impala/planner/UnionNode.java
File fe/src/main/java/org/apache/impala/planner/UnionNode.java:

Line 47:  * a child has an identical tuple layout as the output of the union 
node, the
> ... as the output of the union node, and the child only has naked SlotRefs 
Done


Line 57:   protected final List<Expr> resultExprs_;
> unionResultExprs_ to make distinguish it from the resultExprLists_ and such
Done


Line 73:   // If false, no batches from child nodes would be passed through in 
the backend.
> Comment should describe what this flag is. Also you mean "true" right?
Done


Line 76:   // Indicates for which child nodes batches can be passed through in 
the backend.
> remove "in the backend" (it's clear that execution happens there)
Done


Line 81:   protected UnionNode(PlanNodeId id, TupleId tupleId) {
> Is this c'tor still needed? If not, remove.
Yes, it's used if we are creating a constant node. (with no children)


Line 89:                       List<Expr> resultExprs, boolean isInSubplan) {
> indentation, unionResultExprs
Done


Line 182:    * Compute whether we can pass through rows without materializing 
for the given child.
> Can combine/simplify like this:
Done. I don't think it's necessary to describe the passthrough conditions here. 
The implementation makes it clear.


Line 189:       Analyzer analyzer, List<TupleId> childTupleIds, List<Expr> 
childExprList) {
> childExprList -> childResultExprs
Done


Line 190:     if (analyzer.getQueryOptions().isDisable_union_passthrough()) 
return false;
> seems clearer to move this into init() and not execute any of the passthrou
We need to construct a list of booleans to indicate if the child can be passed 
through. We would have to then construct a list of false in init() if 
passthrough is disabled. I think it's simpler if we keep it the way it is.


Line 193:     // TODO: Remove this as part of IMPALA-4179.
> Move TODO to the class comment
This TODO seems a little out of place in the class comment. Won't we have to 
give additional information there for this comment to make sense.


Line 194:     if (isInSubplan_) return false;
> same here, seems easier to move this check into init()
Same as above. It's weird to special case adding a false to the list.


Line 205:     // Verify that the union node has one slot for every expression.
> union node -> union tuple descriptor
Done


Line 212:     if (resultExprs_.size() != 
childTupleDescriptor.getSlots().size()) return false;
> I don't think this tricky check is correct because it won't allow passthrou
Created a JIRA for handling advanced passthrough cases.


Line 218:       SlotRef unionSlot = resultExprs_.get(i).unwrapSlotRef(false);
> unionSlotRef, childSlotRef
Done


Line 220:       if (!unionTupleDescriptor.getSlots().get(i).isMaterialized()) 
continue;
> move above the unwrapSlotRef() calls
Done


Line 221:       
Preconditions.checkState(unionSlot.getDesc().getParent().getId().equals(tupleId_));
> Don't think we need this check, but something like this would be good:
Done


Line 223:         Preconditions.checkState(!(childSlot instanceof SlotRef));
> No need for this check
Done


Line 262:     // Compute which nodes can be passed through.
> which child nodes
Done


Line 266:       // Check that if the child outputs a single tuple, then it's 
not nullable. Tuple
> move into computePassThrough
Done


http://gerrit.cloudera.org:8080/#/c/5816/7/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 329: # IMPALA-3586: The operand with the Kudu scan cannot be passed 
through because it's not
> because id is not-nullable (primary key)
Done


Line 346: select id from functional_kudu.alltypes
> do select *
With select *, passthrough doesn't get enabled. The layout of the union tuple 
is different that the layout of the child tuples.


http://gerrit.cloudera.org:8080/#/c/5816/7/testdata/workloads/functional-planner/queries/PlannerTest/union.test
File testdata/workloads/functional-planner/queries/PlannerTest/union.test:

Line 3104: # IMPALA-3678: Both union operands should produce rows with 
non-nullable slots which
> remove "should"
Done


Line 3124: # IMPALA-3678: The Union operands that contain a join should not be 
passed through,
> nice
thanks!


Line 3184: select COUNT(c.c_custkey), COUNT(v.tot_price)
> lowercase count for consistency
Done


http://gerrit.cloudera.org:8080/#/c/5816/7/testdata/workloads/functional-query/queries/QueryTest/union.test
File testdata/workloads/functional-query/queries/QueryTest/union.test:

Line 1126: select COUNT(c.c_custkey), COUNT(v.tot_price)
> lowercase count for consistency
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia8f6d5062724ba5b78174c3227a7a796d10d8416
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to