[
https://issues.apache.org/jira/browse/CURATOR-584?focusedWorklogId=544082&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-544082
]
ASF GitHub Bot logged work on CURATOR-584:
------------------------------------------
Author: ASF GitHub Bot
Created on: 29/Jan/21 07:41
Start Date: 29/Jan/21 07:41
Worklog Time Spent: 10m
Work Description: eolivelli commented on a change in pull request #376:
URL: https://github.com/apache/curator/pull/376#discussion_r566630026
##########
File path:
curator-framework/src/main/java/org/apache/curator/framework/imps/SetDataBuilderImpl.java
##########
@@ -210,6 +230,47 @@ public Stat forPath(String path) throws Exception
return this;
}
+ private void backgroundCheckIdempotent(final CuratorFrameworkImpl client,
final OperationAndData<PathAndBytes> mainOperationAndData, final String path,
final Backgrounding backgrounding)
+ {
+ final AsyncCallback.DataCallback dataCallback = new
AsyncCallback.DataCallback()
+ {
+ @Override
+ public void processResult(int rc, String path, Object ctx, byte[]
data, Stat stat)
+ {
+ if ( rc == KeeperException.Code.OK.intValue() )
+ {
+ if ( failNextIdempotentCheckForTesting )
+ {
+ failNextIdempotentCheckForTesting = false;
+ rc =
KeeperException.Code.CONNECTIONLOSS.intValue();
+ }
+ else if ( !idempotentSetMatches(stat.getVersion(),
mainOperationAndData.getData().getData(), data) )
+ {
+ rc = KeeperException.Code.BADVERSION.intValue();
+ }
+ }
+ CuratorEvent event = new CuratorEventImpl(client,
CuratorEventType.SET_DATA, rc, path, null, ctx, stat, null, null, null, null,
null);
+ client.processBackgroundOperation(mainOperationAndData,
event);
+ }
+ };
+ BackgroundOperation<PathAndBytes> operation = new
BackgroundOperation<PathAndBytes>()
+ {
+ @Override
+ public void
performBackgroundOperation(OperationAndData<PathAndBytes> op) throws Exception
+ {
+ try
+ {
+ client.getZooKeeper().getData(path, false, dataCallback,
backgrounding.getContext());
+ }
+ catch ( KeeperException e )
+ {
+ // ignore
Review comment:
log ?
##########
File path:
curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java
##########
@@ -807,6 +826,53 @@ public void
performBackgroundOperation(OperationAndData<PathAndBytes> op) throws
client.queueOperation(new OperationAndData<>(operation, null, null,
null, null, null));
}
+ private void backgroundCheckIdempotent(final CuratorFrameworkImpl client,
final OperationAndData<PathAndBytes> mainOperationAndData, final String path,
final Backgrounding backgrounding)
+ {
+ final AsyncCallback.DataCallback dataCallback = new
AsyncCallback.DataCallback()
+ {
+ @Override
+ public void processResult(int rc, String path, Object ctx, byte[]
data, Stat stat)
+ {
+ if ( rc == KeeperException.Code.NONODE.intValue() )
+ {
+ client.queueOperation(mainOperationAndData); // try to
create it again
+ }
+ else
+ {
+ if ( rc == KeeperException.Code.OK.intValue() )
+ {
+ if ( failNextIdempotentCheckForTesting )
+ {
+ failNextIdempotentCheckForTesting = false;
+ rc =
KeeperException.Code.CONNECTIONLOSS.intValue();
+ }
+ else if ( !IdempotentUtils.matches(0,
mainOperationAndData.getData().getData(), stat.getVersion(), data) )
+ {
+ rc = KeeperException.Code.NODEEXISTS.intValue();
+ }
+ }
+ sendBackgroundResponse(rc, path, ctx, path, stat,
mainOperationAndData);
+ }
+ }
+ };
+ BackgroundOperation<PathAndBytes> operation = new
BackgroundOperation<PathAndBytes>()
+ {
+ @Override
+ public void
performBackgroundOperation(OperationAndData<PathAndBytes> op) throws Exception
+ {
+ try
+ {
+ client.getZooKeeper().getData(path, false, dataCallback,
backgrounding.getContext());
+ }
+ catch ( KeeperException e )
+ {
+ // ignore
Review comment:
log ?
##########
File path: pom.xml
##########
@@ -68,7 +68,7 @@
<!-- versions -->
<zookeeper-version>3.6.0</zookeeper-version>
- <maven-bundle-plugin-version>4.1.0</maven-bundle-plugin-version>
+ <maven-bundle-plugin-version>5.1.1</maven-bundle-plugin-version>
Review comment:
It should be fine to make the upgrade
##########
File path:
curator-framework/src/main/java/org/apache/curator/framework/imps/IdempotentUtils.java
##########
@@ -0,0 +1,53 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.curator.framework.imps;
+
+/*
+ * Utilties Class for idempotent operatons.
+ */
+class IdempotentUtils
+{
+
+ /**
+ * Returns whether the version and data currently in the node match what
would be expected in the idempotent retry case.
+ */
+ static boolean matches(int expectedVersion, byte[] expectedData, int
actualVersion, byte[] actualData)
+ {
+ return expectedVersion == actualVersion && matches(expectedData,
actualData);
+ }
+
+ /**
+ * Returns whether the data currently in the node matches what would be
expected in the idempotent retry case (ignoring version).
+ */
+ static boolean matches(byte[] expectedData, byte[] actualData)
+ {
+ if ( expectedData.length != actualData.length )
Review comment:
can we use java.util.Arrays.equals ?
##########
File path:
curator-framework/src/main/java/org/apache/curator/framework/imps/SetDataBuilderImpl.java
##########
@@ -51,6 +62,8 @@ public SetDataBuilderImpl(CuratorFrameworkImpl client,
Backgrounding backgroundi
this.backgrounding = backgrounding;
this.version = version;
this.compress = compress;
+ // TODO jslocum - this is only public for x-async api. Should I add
idempotent there too?
Review comment:
can we link a JIRA ?
##########
File path:
curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java
##########
@@ -87,6 +93,7 @@ public CreateBuilderImpl(CuratorFrameworkImpl client,
CreateMode createMode, Bac
this.createParentsAsContainers = createParentsAsContainers;
this.compress = compress;
this.setDataIfExists = setDataIfExists;
+ this.idempotent = false; // TODO should I add this to async framework
as well? Looks like that's the reason this is a public constructor
Review comment:
can we please remove TODO and create an issue ?
or we can leave here the TODO but put the link to the JIRA
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 544082)
Time Spent: 2h 50m (was: 2h 40m)
> Curator Client Fault Tolerance Extensions
> -----------------------------------------
>
> Key: CURATOR-584
> URL: https://issues.apache.org/jira/browse/CURATOR-584
> Project: Apache Curator
> Issue Type: Improvement
> Reporter: Josh Slocum
> Assignee: Jordan Zimmerman
> Priority: Minor
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> Tl;dr My team at Indeed has developed ZooKeeper functionality to handle
> stateful retrying of connectionloss for write operations, and we wanted to
> reach out to discuss if this is something the Curator team may be interested
> in incorporating.
> We initially reached out to the Zookeeper team
> (https://issues.apache.org/jira/browse/ZOOKEEPER-3927) but were redirected to
> Curator as the better place to contribute them. The changes could be
> relatively easily added as additional parameters and/or extensions of the
> existing retry behavior in Curator's write operations.
>
> Hi Curator Devs,
> My team uses zookeeper extensively as part of a distributed key-value store
> we've built at Indeed (think HBase replacement). Due to our deployment setup
> co-locating our database daemons with our large hadoop cluster, and the
> network-intensive nature of a lot of our compute jobs, we were experiencing a
> large amount of transient ConnectionLoss issues. This was especially
> problematic on important write operations, such as the creation deletion of
> distributed locks/leases or updating distributed state in the cluster.
> We saw that some existing zookeeper client wrappers handled retrying in the
> presence of ConnectionLoss, but all of the ones we looked at (including
> Curator) didn't allow for retrying writes wiith all of the proper state.
> Consider the case of retrying a create. If the initial create had succeeded
> on the server, but the client got connectionloss, the client would get a
> NodeExists exception on the retried request, even though the znode was
> created. This resulted in many issues. For the distributed lock/lease
> example, to other nodes, it looked like the calling node had been successful
> acquiring the "lock", and to the calling node, it appeared that it was not
> able to acquire the "lock", which results in a deadlock.
> Curator has parameters that can modify the behavior upon retry, but those
> were not sufficient. For example, create() has orSetData(), and delete() has
> guaranteed().
> To solve this, we implemented a set of "connection-loss tolerant primitives"
> for the main types of write operations. They handle a connection loss by
> retrying the operation in a loop, but upon error cases in the retry, inspect
> the current state to see if it matches the case where a previous round that
> got connectionloss actually succeeded.
> * createRetriable(String path, byte[] data)
> * setDataRetriable(String path, byte[] newData, int currentVersion)
> * deleteRetriable(String path, int currentVersion)
> * compareAndDeleteRetriable(String path, byte[] currentData, int
> currentVersion)
> For example, in createRetriable, it will retry the create again on connection
> loss. If the retried call gets a NodeExists exception, it will check to see
> if (getData(path) == data and dataVersion == 0). If it does, it assumes the
> first create succeeded and returns success, otherwise it propagates the
> NodeExists exception.
> These primitives have allowed us to program our ZooKeeper layer as if
> ConnectionLoss isn't a transient state we have to worry about, since they
> have essentially the same guarantees as the non-retriable functions in the
> zookeeper api do (with a slight difference in semantics).
> Because these behaviors could be relatively easily added to Curator as
> additional parameters to the existing mechanisms, and (to my knowledge)
> aren't implemented anywhere else, we think it could be a useful contribution
> to the Curator project. If this isn't something that Curator is interested in
> incorporating, Indeed may also consider open sourcing it as a standalone
> library.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)