Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3586 (Part 1): Implement Union Pass Through
......................................................................


Patch Set 2:

(6 comments)

Need to look at fe and tests, but had some comments on the backend.

http://gerrit.cloudera.org:8080/#/c/5816/2/be/src/exec/union-node.cc
File be/src/exec/union-node.cc:

PS2, Line 122: isPassThrough
I think we would either pass child_idx_ into isPassThrough() or call it 
something like currentChildIsPassthrough() so that it's clear it refers to the 
child.


Line 123:     
DCHECK(child(child_idx_)->row_desc().Equals(row_batch->row_desc()));
Move this into the Open() branch so we don't execute it unnecessarily.


Line 126:       // We will not close any PassThrough children nodes here, they 
will be closed in
We need to think about the implications of this. This could increase resource 
consumption quite a bit in some cases. E.g. if you had a union of N hash join 
nodes, each of which had a limit applied, this could increase resource 
consumption Nx.

We need to fix the memory management model to allow early closing in cases like 
this, but that shouldn't block progress here.

The workaround we generally use is to set the MarkNeedsDeepCopy() flag on the 
last row batch. This forces all operators to either finish processing the batch 
(in the common case) or copy the data (in the case of nested loop join). It 
generally works ok except in subplans, where it can cause performance problems 
by forcing many tiny batches.

I think the best course would be to disable passthrough in subplans and set the 
MarkNeedsDeepCopy() flag on the last batch returned from each child. We do 
something like this in PartitionedAggregationNode::HandleOutputStrings().

I'm eventually going to get rid of MarkNeedsDeepCopy(), but using it here would 
flag it as one of the places we need to clean up.


http://gerrit.cloudera.org:8080/#/c/5816/2/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

PS2, Line 438: verify
Verify (caps)


PS2, Line 442: std::
Don't need std::, we automatically import it in common/names.h.


PS2, Line 442:  
Make these references with & to avoid copying the vector.


-- 
To view, visit http://gerrit.cloudera.org:8080/5816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia8f6d5062724ba5b78174c3227a7a796d10d8416
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to