This is no different from any other network-based system, like NFS. So we need some kind of logic on the client side to make a reasonable assumption. It may not be perfect for negative testing.
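For example, that client-side logic could look something like the sketch below. This is a minimal illustration only, not code from ZooKeeperClient; SafeCreate and createWithVerify are hypothetical names, and the read-back comparison only disambiguates if each writer's payload is unique (e.g. carries a writer token):

import java.util.Arrays;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

class SafeCreate {
    // Retry create() across ConnectionLoss, but disambiguate the resulting
    // NodeExists: it may be our own earlier attempt that actually landed.
    static void createWithVerify(ZooKeeper zk, String path, byte[] data)
            throws KeeperException, InterruptedException {
        boolean retried = false;
        while (true) {
            try {
                zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                          CreateMode.PERSISTENT);
                return;
            } catch (KeeperException.ConnectionLossException e) {
                retried = true;  // the create may or may not have been applied
            } catch (KeeperException.NodeExistsException e) {
                // If we retried and the node holds exactly the bytes we tried
                // to write, our first attempt almost certainly succeeded.
                if (retried
                        && Arrays.equals(data, zk.getData(path, false, null))) {
                    return;
                }
                throw e;  // a genuine conflict with another writer
            }
        }
    }
}

If the payloads are not distinguishable per writer, the only safe option is to surface the ambiguity to the caller, which is what plumbing the ConnectionLoss event through (as Sam suggests below) would enable.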
JV

On Mon, Jun 26, 2017 at 11:19 PM Sijie Guo <guosi...@gmail.com> wrote:

> Hi Sam,
>
> Let's assume there is no such retry logic. How do you expect to handle
> this situation?
>
> Can your application try to create a new ledger or catch the NodeExists
> exception?
>
> - Sijie
>
> On Mon, Jun 26, 2017 at 5:49 PM, Sam Just <sj...@salesforce.com> wrote:
>
> > Hmm, Curator seems to have essentially the same problem:
> > https://issues.apache.org/jira/browse/CURATOR-268
> > I'm not sure there's a good way to solve this transparently -- the
> > right answer is probably to plumb the ConnectionLoss event through
> > ZooKeeperClient for interested callers, audit the metadata users
> > where we depend on atomicity, and update each one to handle it
> > appropriately.
> > -Sam
> >
> > On Mon, Jun 26, 2017 at 4:34 PM, Sam Just <sj...@salesforce.com> wrote:
> >
> > > BookKeeper has a wrapper class for the ZooKeeper client called
> > > ZooKeeperClient. Its purpose appears to be to transparently perform
> > > retries when ZooKeeper returns ConnectionLoss on an operation due
> > > to a Disconnect event.
> > >
> > > The trouble is that a write which received a ConnectionLoss error
> > > as a return value may actually have succeeded. Once ZooKeeperClient
> > > retries, it will get back NodeExists and propagate that error to
> > > the caller, even though the write succeeded and the node did not in
> > > fact exist beforehand.
> > >
> > > It seems as though the same issue would hold for setData and delete
> > > calls which use the version argument -- you could get a spurious
> > > BadVersion.
> > >
> > > I believe I've reproduced the variant with a spurious NodeExists.
> > > It manifested as a spurious BKLedgerExistException when running
> > > against a 3-instance ZooKeeper cluster with dm-delay under the
> > > ZooKeeper instance storage to force Disconnect events by injecting
> > > write delays. This seems to make sense, as
> > > AbstractZkLedgerManager.createLedgerMetadata uses
> > > ZkUtils.asyncCreateFullPathOptimistic to create the metadata node
> > > and appears to depend on the atomicity of create to avoid two
> > > writers overwriting each other's nodes.
> > >
> > > AbstractZkLedgerManager.writeLedger would seem to have the same
> > > problem, given its dependence on setData with the version argument
> > > to avoid lost updates.
> > >
> > > Am I missing something in this analysis? It seems to me that this
> > > behavior could be genuinely problematic during periods of spotty
> > > connectivity to the ZooKeeper cluster.
> > >
> > > Thanks!
> > > -Sam
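For reference, the spurious-BadVersion variant Sam describes in the quoted thread could be handled along the same lines. This is a minimal sketch under the same caveats: SafeSetData and setDataWithVerify are hypothetical names, not BookKeeper or ZooKeeper API, and the version-plus-one check assumes no other writer produced an identical update in the same window:

import java.util.Arrays;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

class SafeSetData {
    // Retry setData() across ConnectionLoss, but treat a subsequent
    // BadVersion as success if the znode now holds exactly our bytes at
    // expectedVersion + 1, i.e. the "lost" first attempt went through.
    static Stat setDataWithVerify(ZooKeeper zk, String path, byte[] data,
                                  int expectedVersion)
            throws KeeperException, InterruptedException {
        boolean retried = false;
        while (true) {
            try {
                return zk.setData(path, data, expectedVersion);
            } catch (KeeperException.ConnectionLossException e) {
                retried = true;  // the update may have been applied already
            } catch (KeeperException.BadVersionException e) {
                if (retried) {
                    Stat stat = new Stat();
                    byte[] current = zk.getData(path, false, stat);
                    if (stat.getVersion() == expectedVersion + 1
                            && Arrays.equals(current, data)) {
                        return stat;  // our own first attempt won the race
                    }
                }
                throw e;  // a real concurrent update: surface BadVersion
            }
        }
    }
}

The idea is the same as for create: after a retry, an error that normally means "someone else wrote" may in fact be our own first attempt, so the client re-reads and compares before deciding whether to propagate the failure.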