Mike Percy has posted comments on this change.

Change subject: Add a design doc for rpc retry/failover semantics
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2642/4/docs/design-docs/rpc-retry-and-failover.md
File docs/design-docs/rpc-retry-and-failover.md:

Line 119: each client request is
        : recorded by the consensus log almost as is so it wouldn't be problematic to additionally store
        : the client id and request seq no. so when a write is "consensus committed" all future handlers
        : of that write (future leaders) will automatically be able to identify the client and request
> Did you see the last rev? I've updated milestone 1 to make even clearer tha
Thanks, I was reading via email and so I missed your response to the general 
comments because it came in a 2nd reply.

The design is becoming clearer to me. One thing I think is still missing is a 
list of errors that the replay cache will *not* handle. One example: if the 
seqno is auto-assigned by the client library, then after a partial timeout, 
how can we manually retry the inserts with the same seqno?
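
To make the seqno concern concrete, here is a minimal sketch in Python. It is 
not Kudu's client API: `SketchClient`, `new_request`, and `retry_request` are 
hypothetical names, and the request is modeled as a plain dict. It only 
illustrates that re-issuing a write through an auto-assigning path produces a 
new (client id, seq no) pair, while a dedicated retry path would have to reuse 
the original pair.

```python
import itertools


class SketchClient:
    """Hypothetical client library that stamps requests with (client_id, seq_no)."""

    def __init__(self, client_id):
        self.client_id = client_id
        self._seq_nos = itertools.count(1)

    def new_request(self, payload):
        # Auto-assignment: every call consumes a fresh seq no.
        return {
            "client_id": self.client_id,
            "seq_no": next(self._seq_nos),
            "payload": payload,
        }

    def retry_request(self, original):
        # Assumed retry path: resend the same identity so a server-side
        # replay cache could recognize the request as a duplicate.
        return dict(original)


def request_key(request):
    """The identity a replay cache would use to detect duplicates."""
    return (request["client_id"], request["seq_no"])


client = SketchClient("client-a")
first = client.new_request("INSERT row 1")

# The application re-issues the insert after a timeout: new identity.
resubmitted = client.new_request("INSERT row 1")

# The library retries the original request: same identity.
retried = client.retry_request(first)
```

In this sketch the resubmitted request would look like a brand-new write to 
the server, which is the situation the question above is asking about.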

Likewise, say the cluster is rebooted. How can we maintain cache consistency 
across reboots when we are not durably storing the result of an operation?

Maybe neither of these situations is handled. If so, let's make sure to call it 
out in the design as a tradeoff.

I think we are also missing specific design goals: which errors, specifically, 
are we trying to handle? The most important case is writes, but writes have 
many corner cases. I think we want to handle the following:

1. The leader commits and then crashes; we retry on the new leader and get the 
   cached result.
2. We get a client timeout, retry, and get the operation result from the same 
   leader if the operation completed. (However, this second use case is 
   complicated by the seqno issue above.)
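
The two cases can be sketched roughly as follows. This is a toy model in 
Python, not the proposed implementation: `Replica`, `apply_committed`, and 
`handle_write` are hypothetical names, replication is a direct method call, 
and the sketch assumes every replica records the result in an in-memory cache 
when it applies a committed write. Nothing here is durable, so it does not 
address the reboot question raised earlier.

```python
class Replica:
    """Toy replica with an in-memory replay cache keyed by (client_id, seq_no)."""

    def __init__(self):
        self.rows = []
        self.replay_cache = {}

    def apply_committed(self, client_id, seq_no, row):
        # Apply a committed write and remember its result under the
        # request identity carried with the write.
        self.rows.append(row)
        result = "inserted " + row
        self.replay_cache[(client_id, seq_no)] = result
        return result

    def handle_write(self, client_id, seq_no, row, followers=()):
        # Leader-side handling: answer duplicates from the cache,
        # otherwise replicate to followers and apply locally.
        key = (client_id, seq_no)
        if key in self.replay_cache:
            return self.replay_cache[key]
        for follower in followers:
            follower.apply_committed(client_id, seq_no, row)
        return self.apply_committed(client_id, seq_no, row)


old_leader = Replica()
follower = Replica()

# Original write succeeds on the old leader.
first = old_leader.handle_write("client-a", 1, "row-1", followers=[follower])

# Case 2: the response was lost, so the client retries on the same leader.
same_leader_retry = old_leader.handle_write("client-a", 1, "row-1")

# Case 1: the old leader crashes, the follower takes over, the client retries.
new_leader_retry = follower.handle_write("client-a", 1, "row-1")
```

In both retries the write is answered from the cache and the row is applied 
only once per replica. Both retries rely on the client resending the same 
seq no.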


-- 
To view, visit http://gerrit.cloudera.org:8080/2642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idc2aa40486153b39724e1c9bd09c626b829274c6
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes
