Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3710: Kudu DML should ignore conflicts, pt2
......................................................................


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/4985/1/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

Line 119:     if (!dst_stats->__isset.kudu_stats) 
dst_stats->__set_kudu_stats(TKuduDmlStats());
> pass 0 in c'tor of TKuduDmlStats?
I can't bc there's no parameter, I think because I use optional fields.


Line 161:       ss << "NumRowErrors: " << stats.kudu_stats.num_row_errors << 
endl;
> General question: Is there an easy way to distinguish the different types o
It is unfortunate and probably not helpful enough, but I'm very worried about 
us presenting incorrect information to the user which I think is worse. The 
issue is that we can't trust the Kudu error codes enough. E.g. they re-use 
error codes for some different cases where I'm not confident enough that we'll 
always get it right. My goal is to work with them for the next release to get a 
more concrete contract. I was on the fence about having different Impala error 
codes at all, but I think it's worthwhile for de-duping the logs. If you think 
this is confusing, I'd prefer to just remove the warning logging w/ different 
error codes.


http://gerrit.cloudera.org:8080/#/c/4985/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 214:           COUNTER_ADD(num_row_errors_, 1);
> This is expensive right? How about we update a local counter and COUNTER_AD
Done


Line 305:     if (!e.IsNotFound() && !e.IsAlreadyPresent() && status.ok()) {
> Why not move to L325 into an else if?
Done


Line 310:       // Kudu does not have yet have a way to programmatically 
differentiate between 'row
> few words mixed up
Done


http://gerrit.cloudera.org:8080/#/c/4985/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 430: // TODO: Refactor to reflect usage by other DML statements.
> Do you intend to do that soon? The structures are a little confusing now.
Yeah I'll go ahead and do this if you think the strategy is otherwise OK.


Line 445:   // The latest observed Kudu timestamp reported by the KuduSession 
at this partition.
> is partition == table here?
Hm yeah I guess it's not great to call this a partition... Table I think might 
be confusing too since there are many of these inserting across the cluster for 
the same table. How about I just remove 'at this partition'?


-- 
To view, visit http://gerrit.cloudera.org:8080/4985
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4eb1ad91dc355ea51de261c3a14df0f9d28c879c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to