Alexey Serbin has posted comments on this change.

Change subject: KUDU-1034 client does not failover due to timeout
Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6924/1/src/kudu/rpc/retriable_rpc.h
File src/kudu/rpc/retriable_rpc.h:

Line 217: server_picker_->MarkServerFailed(server, result.status);
> In the case of a timeout, that still might be a good idea, no? It's not 100

I think a timeout is considered a non-retriable error. Given the current behavior of the client (it refreshes its meta-cache when it has no usable server for an RPC), marking a timed-out server as failed would do no harm.

Line 217: server_picker_->MarkServerFailed(server, result.status);
> OK now I remember the issue I had when looking at this issue previously. I

As I understand it, regardless of whether the client is aware of a particular replica or not, marking a server as failed will make the client switch to a different tablet server within the scope of this particular RPC. Also, most of the errors in this context (except, maybe, REPLICA_NOT_LEADER) have the semantics of 'mark the whole server out': connection timeout, server too busy, invalid authn token.

Yes, all the tablets on that server will be marked as non-accessible when using MarkServerFailed(), but the client will refresh its meta-cache if it ends up with no active replica, right? E.g., take a look at the handling of the REPLICA_NOT_LEADER error code. By the time the client marks the server as failed, that server might have become the leader. In that case the client will end up with no usable servers in its meta-cache and will refresh it to find the new leader.
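To illustrate the argument above, here is a toy C++ model of the "mark failed, then refresh when nothing is left" behavior. It is only a sketch under stated assumptions: ToyMetaCache, PickServer(), Refresh() and DemoRefreshCount() are hypothetical names made up for this example and are not Kudu's actual MetaCache or ServerPicker code. Only the name MarkServerFailed() is borrowed from the patch, and its signature is simplified here.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a client-side meta-cache (hypothetical, NOT Kudu's class).
class ToyMetaCache {
 public:
  using TabletMap = std::map<std::string, std::vector<std::string>>;

  // 'master_view' models what a refresh from the master would return:
  // tablet id -> servers hosting a replica of that tablet.
  explicit ToyMetaCache(TabletMap master_view)
      : master_view_(std::move(master_view)) {}

  // Marks the whole server as failed, so every tablet hosted on it
  // loses that replica until the next refresh.
  void MarkServerFailed(const std::string& server) {
    assert(!server.empty());
    failed_.insert(server);
  }

  // Returns the first replica not marked as failed. If none is usable,
  // refreshes once and retries; returns "" if the tablet is unknown or
  // has no replicas at all.
  std::string PickServer(const std::string& tablet) {
    auto it = master_view_.find(tablet);
    if (it == master_view_.end()) return "";
    for (int attempt = 0; attempt < 2; ++attempt) {
      for (const std::string& server : it->second) {
        if (failed_.count(server) == 0) return server;
      }
      Refresh();
    }
    return "";
  }

  int refresh_count() const { return refresh_count_; }

 private:
  // Models a meta-cache refresh: failure marks are dropped and the
  // replica list is taken from the master's view again.
  void Refresh() {
    failed_.clear();
    ++refresh_count_;
  }

  TabletMap master_view_;
  std::set<std::string> failed_;
  int refresh_count_ = 0;
};

// Every replica times out in turn; the next pick forces one refresh.
int DemoRefreshCount() {
  ToyMetaCache::TabletMap view;
  view["tablet-1"] = std::vector<std::string>{"ts-a", "ts-b"};
  ToyMetaCache cache(view);
  cache.MarkServerFailed(cache.PickServer("tablet-1"));  // ts-a timed out
  cache.MarkServerFailed(cache.PickServer("tablet-1"));  // ts-b timed out
  cache.PickServer("tablet-1");  // nothing usable left, so refresh
  return cache.refresh_count();
}
```

In this model, marking a server failed on a timeout costs at most one extra refresh, which is the "no harm" case argued above. Whether the real client behaves the same way for every error code would need to be checked against the actual meta-cache code.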
--
To view, visit http://gerrit.cloudera.org:8080/6924
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icfcece485e4053d921ffdc865612b3e7b9a992a3
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes