Hi All, Per my discussion with Enrico, I wrote up a Google Doc with a Draft of a blog post for idempotent writes in Curator 5.2.0: https://docs.google.com/document/d/1867mkraRzat3fueYm5BR2ihzma4f_zgywqDH0msQLu0/edit?usp=sharing
If anyone has comments or suggestions, feel free to comment them on the Google Doc. Thanks, Josh On Tue, Jul 27, 2021 at 9:58 AM Enrico Olivelli (Jira) <[email protected]> wrote: > > [ > https://issues.apache.org/jira/browse/CURATOR-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388123#comment-17388123 > ] > > Enrico Olivelli commented on CURATOR-584: > ----------------------------------------- > > Great ! > > I usually use Medium [https://medium.com/@eolivelli] to post my blogs. > If you want to write up a Google Doc I will be happy to review it and help > you get it published (or co-authored) > > you can reach me out directly to my apache address [email protected], > or better we can interact on [[email protected]|mailto: > [email protected]] ([https://curator.apache.org/mailing-lists.html)] > > > Curator Client Fault Tolerance Extensions > > ----------------------------------------- > > > > Key: CURATOR-584 > > URL: https://issues.apache.org/jira/browse/CURATOR-584 > > Project: Apache Curator > > Issue Type: Improvement > > Reporter: Josh Slocum > > Assignee: Enrico Olivelli > > Priority: Minor > > Fix For: 5.2.0 > > > > Time Spent: 7h 40m > > Remaining Estimate: 0h > > > > Tl;dr My team at Indeed has developed ZooKeeper functionality to handle > stateful retrying of connectionloss for write operations, and we wanted to > reach out to discuss if this is something the Curator team may be > interested in incorporating. > > We initially reached out to the Zookeeper team ( > https://issues.apache.org/jira/browse/ZOOKEEPER-3927) but were redirected > to Curator as the better place to contribute them. The changes could be > relatively easily added as additional parameters and/or extensions of the > existing retry behavior in Curator's write operations. > > > > Hi Curator Devs, > > My team uses zookeeper extensively as part of a distributed key-value > store we've built at Indeed (think HBase replacement). Due to our > deployment setup co-locating our database daemons with our large hadoop > cluster, and the network-intensive nature of a lot of our compute jobs, we > were experiencing a large amount of transient ConnectionLoss issues. This > was especially problematic on important write operations, such as the > creation deletion of distributed locks/leases or updating distributed state > in the cluster. > > We saw that some existing zookeeper client wrappers handled retrying in > the presence of ConnectionLoss, but all of the ones we looked at (including > Curator) didn't allow for retrying writes wiith all of the proper state. > Consider the case of retrying a create. If the initial create had succeeded > on the server, but the client got connectionloss, the client would get a > NodeExists exception on the retried request, even though the znode was > created. This resulted in many issues. For the distributed lock/lease > example, to other nodes, it looked like the calling node had been > successful acquiring the "lock", and to the calling node, it appeared that > it was not able to acquire the "lock", which results in a deadlock. > > Curator has parameters that can modify the behavior upon retry, but > those were not sufficient. For example, create() has orSetData(), and > delete() has guaranteed(). > > To solve this, we implemented a set of "connection-loss tolerant > primitives" for the main types of write operations. They handle a > connection loss by retrying the operation in a loop, but upon error cases > in the retry, inspect the current state to see if it matches the case where > a previous round that got connectionloss actually succeeded. > > * createRetriable(String path, byte[] data) > > * setDataRetriable(String path, byte[] newData, int currentVersion) > > * deleteRetriable(String path, int currentVersion) > > * compareAndDeleteRetriable(String path, byte[] currentData, int > currentVersion) > > For example, in createRetriable, it will retry the create again on > connection loss. If the retried call gets a NodeExists exception, it will > check to see if (getData(path) == data and dataVersion == 0). If it does, > it assumes the first create succeeded and returns success, otherwise it > propagates the NodeExists exception. > > These primitives have allowed us to program our ZooKeeper layer as if > ConnectionLoss isn't a transient state we have to worry about, since they > have essentially the same guarantees as the non-retriable functions in the > zookeeper api do (with a slight difference in semantics). > > Because these behaviors could be relatively easily added to Curator as > additional parameters to the existing mechanisms, and (to my knowledge) > aren't implemented anywhere else, we think it could be a useful > contribution to the Curator project. If this isn't something that Curator > is interested in incorporating, Indeed may also consider open sourcing it > as a standalone library. > > > > -- > This message was sent by Atlassian Jira > (v8.3.4#803005) >
