Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables ......................................................................
Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/6559/3/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: Line 110: void* value, bool copy_strings) { can you add a TODO that this needs to be codegen'd ? http://gerrit.cloudera.org:8080/#/c/6559/3/be/src/exec/kudu-util.h File be/src/exec/kudu-util.h: PS3, Line 62: row into memory owned by the row. If false, string data must remain valid while the row is being used. http://gerrit.cloudera.org:8080/#/c/6559/2/be/src/exprs/kudu-partition-expr.cc File be/src/exprs/kudu-partition-expr.cc: PS2, Line 51: ast<KuduTableDes > I haven't rebased in awhile and don't have this locally, and rebasing is pr ok, can you put a TODO here and try to get to it before this gets checked in? http://gerrit.cloudera.org:8080/#/c/6559/3/be/src/exprs/kudu-partition-expr.cc File be/src/exprs/kudu-partition-expr.cc: PS3, Line 55: RETURN_IF_ERROR(CreateKuduClient(kudu_table_desc->kudu_master_addresses(), &client)); let's use the query state kudu client Line 84: DCHECK(s.ok()); comment here and above why we're dchecking s.ok() http://gerrit.cloudera.org:8080/#/c/6559/3/be/src/runtime/data-stream-sender.h File be/src/runtime/data-stream-sender.h: PS3, Line 109: kudu_ I think this may be better off avoiding mentioning kudu directly, e.g. calling this has_sink_partition_expr_ and adding a comment PS3, Line 127: partition_expr_ctx_ this needs explanation as well. Looking at the code in the .cc, it's weird having both this and partition_expr_ctxs_ How about calling this sink_partition_expr_ctx_ (with a comment) and the above to data_partition_expr_ctxs_ That's more in line with our changes to the thrift defs and separating logical vs. physical. http://gerrit.cloudera.org:8080/#/c/6559/3/common/thrift/Exprs.thrift File common/thrift/Exprs.thrift: Line 134: 2: required list<Types.TPrimitiveType> types nit newline after this http://gerrit.cloudera.org:8080/#/c/6559/3/fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java File fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java: PS3, Line 76: if (!children_.get(i).getType().equals(partitionKeyTypes_.get(i))) { : children_.get(i).uncheckedCastTo(partitionKeyTypes_.get(i)); : } @Marcel, is there a cleaner way to ensure the partition exprs are cast to the target table type? http://gerrit.cloudera.org:8080/#/c/6559/3/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: PS3, Line 219: TODO: consider making a cost based decision for Kudu tables. : if (!insertStmt.hasShuffleHint() : && !(insertStmt.getTargetTable() instanceof KuduTable) we should address this before the release. it'd be good to start thinking about how to change this. probably we should skip partitioning/sorting if there's a values clause or a limit < 100 or so. http://gerrit.cloudera.org:8080/#/c/6559/3/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: PS3, Line 540: !ctx_.isSingleNodeExec() || insertStmt.hasClusteredHint() I don't understand this logic about the clustered hint. I think we want to not care about clustered/sortby for Kudu tables, right? At least after this change that would be the behavior. Why handle !isSingleNodeExec here but not for the not-kudu case? If we are going to have different cases, we should have planner tests for them. Hopefully we can just simplify though. -- To view, visit http://gerrit.cloudera.org:8080/6559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com> Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-HasComments: Yes