Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14677 )

Change subject: IMPALA-9137, IMPALA-9138: Mark failed RPCs as retryable and 
update blacklist
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14677/3/common/thrift/Common.thrift
File common/thrift/Common.thrift:

http://gerrit.cloudera.org:8080/#/c/14677/3/common/thrift/Common.thrift@1
PS3, Line 1: // Licensed to the Apache Software Foundation (ASF) under one
> Types.thrift contains a bunch of thrift structs that aren't relevant to Sta
Not really sure I see any benefit - I guess it very slightly reduces 
recompilation time when people make changes to Types.thrift that don't affect 
Status.thrift, though how frequent is that really, and its not really any 
cleaner imo since now we have to answer the question of what belongs in Common 
vs. Types, which isn't clear - and its nice to avoid unnecessary code churn, 
but obviously not that big of a deal.

Imo, if anything the refactoring that we need is to make the thrift and 
protobuf definitions of this stuff more consistent, i.e. add a status.proto 
instead of defining it in common.proto, but again not necessarily worth the 
code churn.


http://gerrit.cloudera.org:8080/#/c/14677/3/common/thrift/Status.thrift
File common/thrift/Status.thrift:

http://gerrit.cloudera.org:8080/#/c/14677/3/common/thrift/Status.thrift@48
PS3, Line 48:
> Agree those use cases are important. I think the current mechanism will wor
Agreed that a string->string map was a bad suggestion, just throwing stuff out 
there.

Idk about this RetryInfo approach, though. I'm concerned that it doesn't really 
map to the existing semantics of what a Status is (a small object that 
indicates success/failure with a message about the cause of failure), and that 
as a result going this direction is awkward and potentially error-prone.

For example, you seem to be assuming that at any random point in the code we 
can construct a Status object with a RetryInfo as part of it, and that will be 
bubbled up to become the ultimate query status that is returned to the 
coordinator.

However, its currently a common pattern for code to call some function, get an 
error status back, and then instead of actually returning that Status object it 
constructs a new Status object, usually with an error message that contains the 
error message of the original error Status. In a case like this, if the 
original Status has a RetryInfo it was probably lost when the new Status was 
created.

Of course, we can go through and make sure that all of the code paths where we 
generate RetryInfos do in fact bubble that Status up to become the overall 
query status, but again that seems error prone and means that the semantics of 
Status objects are sort of different in different places, i.e. sometimes a 
Status is just a potential indicator of some error that can be handled and 
discarded by the calling code, sometimes its an <Important_Query_Status> that 
must be preserved and returned to the coordinator.



--
To view, visit http://gerrit.cloudera.org:8080/14677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c
Gerrit-Change-Number: 14677
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Nov 2019 23:09:37 +0000
Gerrit-HasComments: Yes

Reply via email to