Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
......................................................................


Patch Set 3:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6559/3/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

Line 110:     void* value, bool copy_strings) {
can you add a TODO that this needs to be codegen'd ?


http://gerrit.cloudera.org:8080/#/c/6559/3/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

PS3, Line 62: row
into memory owned by the row. If false, string data must remain valid while the 
row is being used.


http://gerrit.cloudera.org:8080/#/c/6559/2/be/src/exprs/kudu-partition-expr.cc
File be/src/exprs/kudu-partition-expr.cc:

PS2, Line 51: ast<KuduTableDes
> I haven't rebased in awhile and don't have this locally, and rebasing is pr
ok, can you put a TODO here and try to get to it before this gets checked in?


http://gerrit.cloudera.org:8080/#/c/6559/3/be/src/exprs/kudu-partition-expr.cc
File be/src/exprs/kudu-partition-expr.cc:

PS3, Line 55:   
RETURN_IF_ERROR(CreateKuduClient(kudu_table_desc->kudu_master_addresses(), 
&client));
let's use the query state kudu client


Line 84:   DCHECK(s.ok());
comment here and above why we're dchecking s.ok()


http://gerrit.cloudera.org:8080/#/c/6559/3/be/src/runtime/data-stream-sender.h
File be/src/runtime/data-stream-sender.h:

PS3, Line 109: kudu_
I think this may be better off avoiding mentioning kudu directly, e.g. calling 
this has_sink_partition_expr_

and adding a comment


PS3, Line 127: partition_expr_ctx_
this needs explanation as well.

Looking at the code in the .cc, it's weird having both this and 
partition_expr_ctxs_ 

How about calling this sink_partition_expr_ctx_ (with a comment) and the above 
to data_partition_expr_ctxs_

That's more in line with our changes to the thrift defs and separating logical 
vs. physical.


http://gerrit.cloudera.org:8080/#/c/6559/3/common/thrift/Exprs.thrift
File common/thrift/Exprs.thrift:

Line 134:   2: required list<Types.TPrimitiveType> types
nit newline after this


http://gerrit.cloudera.org:8080/#/c/6559/3/fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
File fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java:

PS3, Line 76:       if 
(!children_.get(i).getType().equals(partitionKeyTypes_.get(i))) {
            :         
children_.get(i).uncheckedCastTo(partitionKeyTypes_.get(i));
            :       }
@Marcel, is there a cleaner way to ensure the partition exprs are cast to the 
target table type?


http://gerrit.cloudera.org:8080/#/c/6559/3/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

PS3, Line 219:  TODO: consider making a cost based decision for Kudu tables.
             :     if (!insertStmt.hasShuffleHint()
             :         && !(insertStmt.getTargetTable() instanceof KuduTable)
we should address this before the release. it'd be good to start thinking about 
how to change this. probably we should skip partitioning/sorting if there's a 
values clause or a limit < 100 or so.


http://gerrit.cloudera.org:8080/#/c/6559/3/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

PS3, Line 540: !ctx_.isSingleNodeExec() || insertStmt.hasClusteredHint()
I don't understand this logic about the clustered hint.

I think we want to not care about clustered/sortby for Kudu tables, right? At 
least after this change that would be the behavior.

Why handle !isSingleNodeExec here but not for the not-kudu case?

If we are going to have different cases, we should have planner tests for them. 
Hopefully we can just simplify though.


-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to