[ https://issues.apache.org/jira/browse/ZOOKEEPER-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602873#comment-15602873 ]

Diego Ongaro commented on ZOOKEEPER-2619:
-----------------------------------------

Thanks for the quick replies.

bq. In your first example, you're mixing the synchronous and the asynchronous 
API. We don't actually enforce that an application uses one or the other, but 
we do not recommend to mix synchronous and asynchronous calls precisely because 
of the reasons you're raising.

In my three-line example, it doesn't matter whether the second create is 
synchronous (mixing) or asynchronous (not mixing). Isn't it fine to issue a 
series of asynchronous calls ending with a synchronous call?
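
To recap, that three-line example (pseudocode, reproduced from the description below) was:

{code}
  zk = new ZooKeeper(...)
  zk.createAsync("/data-23857", "...", callback)
  zk.createSync("/pointer", "/data-23857")
{code}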

Summarizing the *{{reenableOps()}}* proposal so far, we have:
 - A new configuration option (say, "requireReenableOps") for an application to 
opt into calling {{reenableOps()}}. This would preserve backwards compatibility 
for existing clients, so the default would be to keep today's behavior.
 - Upon a connection loss, the client library would still reconnect 
automatically and issue pings as needed. However, with the configuration option 
set, it would continue to fail all requests (both synchronous and asynchronous).
 - After the application calls {{reenableOps()}}, the client library would 
permit new requests to use the new connection.
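
In code, the flow might look roughly like this (a sketch only; {{requireReenableOps}}, {{reenableOps()}}, and {{runRecovery()}} are placeholder names from this thread, not existing API):

{code}
// Application opted in with requireReenableOps=true at construction time.
ZooKeeper zk = new ZooKeeper(connectString, sessionTimeout, watcher);
try {
    zk.create("/a", data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
} catch (KeeperException.ConnectionLossException e) {
    // The library still reconnects and pings on its own, but every new
    // request keeps failing until the application declares itself ready:
    runRecovery(zk);
    zk.reenableOps();  // proposed: new requests may now use the new connection
}
{code}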

This would work, but in terms of API design, I can think of a couple of 
drawbacks:

- If we consider an application that's spread across various modules and/or 
thread boundaries, it has to be carefully designed with respect to 
{{reenableOps()}}, as this is a global property. I think this is what [~breed] 
was getting at with "it affects everything using the zookeeper handle". 
Specifically, it is only safe to invoke {{reenableOps()}} once every 
module/thread is ready for it, and until that point, the application won't make 
much progress.

- Most asynchronous code will need to know whether it's running under 
requireReenableOps=false or requireReenableOps=true. This might complicate a 
reusable library's internals and API, which may need different behavior 
depending on the setting.
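
For example, a reusable library routine might end up looking like this (a sketch; {{requiresReenableOps()}} is a hypothetical query for the setting):

{code}
// A library helper that cannot know who owns the reenableOps() call.
void writeRecord(ZooKeeper zk, String path, byte[] data)
        throws KeeperException, InterruptedException {
    try {
        zk.setData(path, data, -1);
    } catch (KeeperException.ConnectionLossException e) {
        if (zk.requiresReenableOps()) {
            // Some other module owns the reenableOps() call; all this
            // library can do is surface the failure to its caller.
            throw e;
        }
        // Under today's behavior, retrying on the fresh connection
        // is at least an option.
        writeRecord(zk, path, data);
    }
}
{code}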

The *{{getConnection()}}* proposal doesn't have these problems. Some modules of 
an application can continue issuing synchronous calls on the ZooKeeper object, 
others can issue (deprecated) asynchronous calls on the ZooKeeper object, and 
yet others can use ZKConnection objects, moving to new connections at their own 
pace with no application-level synchronization.

bq. it sounds to me that getConnection() and reenableOps() are basically the 
same. right? or are you proposing that when you get a ZKConnection object you 
can invoke the zookeeper operations on that?

You would invoke the operations directly on the ZKConnection object.
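
For example (again, proposed API, not existing):

{code}
  conn = zk.getConnection()
  conn.createAsync("/data-23857", "...", callback)  // issued on this connection
  conn.createSync("/pointer", "/data-23857")        // same connection, so FIFO order holds
{code}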

I do like the idea of the library automatically reconnecting for pings, and we 
could apply that to the {{getConnection()}} proposal too:
- The library would reconnect automatically to issue pings, as it does now.
- {{ZooKeeper.getConnection()}} would return the current ZKConnection object, 
which wraps the current TCP connection.
- Upon a connection loss, all subsequent operations on the same ZKConnection 
would return a Connection Loss error. The library would internally reconnect 
automatically, but it would not make this TCP connection available via the old 
ZKConnection.
- Upon noticing a connection loss, the application would need to call 
{{ZooKeeper.getConnection()}} again to get the new ZKConnection object.
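
Putting that together, the application-side pattern would be something like this (sketch of the proposed API, with {{runRecovery()}} as a placeholder):

{code}
  conn = zk.getConnection()
  while (true) {
      try {
          conn.createSync("/pointer", "/data-23857")
          break
      } catch (ConnectionLossException e) {
          // Everything still pipelined on the old ZKConnection has failed or
          // will fail, so FIFO client order is preserved. Run recovery, then
          // pick up the connection the library has already re-established.
          runRecovery()
          conn = zk.getConnection()
      }
  }
{code}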

The ZKConnection object certainly needs to export the asynchronous methods that 
are in ZooKeeper today. I think it should also export the synchronous methods. 
I think these are both well-defined and useful in ZKConnection, as in my 
original example with {{createAsync()}} followed by {{createSync()}}.

I think the ZooKeeper class should continue to export non-deprecated 
synchronous methods as well. These would just be convenience wrappers that 
invoke {{getConnection()}} and then perform the synchronous operation on the 
returned ZKConnection. Much 
ZooKeeper code is entirely synchronous, and as such, does not depend on FIFO 
client order. I see no reason to force churn upon that code.
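
A sketch of such a wrapper, using today's synchronous {{create}} signature:

{code}
// Inside the ZooKeeper class: delegate to whichever connection is current.
public String create(String path, byte[] data, List<ACL> acl, CreateMode createMode)
        throws KeeperException, InterruptedException {
    return getConnection().create(path, data, acl, createMode);
}
{code}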

I still think the asynchronous calls on the ZooKeeper class should be 
deprecated, since applications cannot rely on FIFO client order when using 
them. At a minimum, these should have a very clear warning to this effect, and 
applications should be encouraged to get a ZKConnection and then make 
asynchronous calls on that.
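
For instance, the asynchronous {{create}} overload could carry something like:

{code}
/**
 * @deprecated FIFO client order is not guaranteed across the transparent
 * reconnects performed by this class. Call getConnection() and issue
 * asynchronous operations on the returned ZKConnection instead.
 */
@Deprecated
public void create(String path, byte[] data, List<ACL> acl, CreateMode createMode,
                   StringCallback cb, Object ctx) {
    getConnection().create(path, data, acl, createMode, cb, ctx);
}
{code}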


> Client library reconnecting breaks FIFO client order
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-2619
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2619
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Diego Ongaro
>
> According to the USENIX ATC 2010 
> [paper|https://www.usenix.org/conference/usenix-atc-10/zookeeper-wait-free-coordination-internet-scale-systems],
>  ZooKeeper provides "FIFO client order: all requests from a given client are 
> executed in the order that they were sent by the client." I believe 
> applications written using the Java client library are unable to rely on this 
> guarantee, and any current application that does so is broken. Other client 
> libraries are also likely to be affected.
>
> Consider this application, which is simplified from the algorithm described 
> on Page 4 (right column) of the paper:
> {code}
>   zk = new ZooKeeper(...)
>   zk.createAsync("/data-23857", "...", callback)
>   zk.createSync("/pointer", "/data-23857")
> {code}
>
> Assume an empty ZooKeeper database to begin with and no other writers. 
> Applying the above definition, if the ZooKeeper database contains /pointer, 
> it must also contain /data-23857.
>
> Now consider this series of unfortunate events:
> {code}
>   zk = new ZooKeeper(...)
>   // The library establishes a TCP connection.
>   zk.createAsync("/data-23857", "...", callback)
>   // The library/kernel closes the TCP connection because it times out, and
>   // the create of /data-23857 is doomed to fail with ConnectionLoss. Suppose
>   // that it never reaches the server.
>   // The library establishes a new TCP connection.
>   zk.createSync("/pointer", "/data-23857")
>   // The create of /pointer succeeds.
> {code}
>
> That's the problem: subsequent operations get assigned to the new connection 
> and succeed, while earlier operations fail.
>
> In general, I believe it's impossible to have a system with the following 
> three properties:
>  # FIFO client order for asynchronous operations,
>  # Failing operations when connections are lost, AND
>  # Transparently reconnecting when connections are lost.
>
> To argue this, consider an application that issues a series of pipelined 
> operations, then upon noticing a connection loss, issues a series of recovery 
> operations, repeating the recovery procedure as necessary. If a pipelined 
> operation fails, all subsequent operations in the pipeline must also fail. 
> Yet the client must also carry on eventually: the recovery operations cannot 
> be trivially failed forever. Unfortunately, the client library does not know 
> where the pipelined operations end and the recovery operations begin. At the 
> time of a connection loss, subsequent pipelined operations may or may not be 
> queued in the library; others might be upcoming in the application thread. If 
> the library re-establishes a connection too early, it will send pipelined 
> operations out of FIFO client order.
>
> I considered a possible workaround of having the client diligently check its 
> callbacks and watchers for connection loss events, and do its best to stop 
> the subsequent pipelined operations at the first sign of a connection loss. 
> In addition to being a large burden for the application, this does not solve 
> the problem all the time. In particular, if the callback thread is delayed 
> significantly (as can happen due to excessive computation or scheduling 
> hiccups), the application may not learn about the connection loss event until 
> after the connection has been re-established and after dependent pipelined 
> operations have already been transmitted over the new connection.
>
> I suggest the following API changes to fix the problem:
>  - Add a method ZooKeeper.getConnection() returning a ZKConnection object. 
> ZKConnection would wrap a TCP connection. It would include all synchronous 
> and asynchronous operations currently defined on the ZooKeeper class. Upon a 
> connection loss on a ZKConnection, all subsequent operations on the same 
> ZKConnection would return a Connection Loss error. Upon noticing the loss, 
> the client would need to call ZooKeeper.getConnection() again to get a working 
> ZKConnection object, and it would execute its recovery procedure on this new 
> connection.
>  - Deprecate all asynchronous methods on the ZooKeeper object. These are 
> unsafe to use if the caller assumes they're getting FIFO client order.
>  - No changes to the protocols or servers are required.
>
> I recognize this could cause a lot of code churn for both ZooKeeper and 
> projects that use it. On the other hand, the existing asynchronous calls in 
> applications should now be audited anyhow.
>
> The code affected by this issue may be difficult to contain:
>  - It likely affects all ZooKeeper client libraries that provide both 
> asynchronous operations and transparent reconnection. That's probably all 
> versions of the official Java client library, as well as most other client 
> libraries.
>  - It affects all applications using those libraries that depend on the FIFO 
> client order of asynchronous operations. I don't know how common that is, but 
> the paper implies that FIFO client order is important.
>  - Fortunately, the issue can only manifest itself when connections are lost 
> and transparently reestablished. In practice, it may also require a long 
> pipeline or a significant delay in the application thread while the library 
> establishes a new connection.
>  - In case you're wondering, this issue occurred to me while working on a new 
> client library for Go. I haven't seen this issue in the wild, but I was able 
> to produce it locally by placing sleep statements in a Java program and 
> closing its TCP connections.
>
> I'm new to this community, so I'm looking forward to the discussion. Let me 
> know if I can clarify any of the above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
