Re: Dealing with eventual consistency

2020-05-02 Thread Jordan Zimmerman
That part is true, but I'm not sure how much use it is. If there are multiple 
writers you can't know what the latest version is - there may be other servers 
writing whose writes haven't been seen yet. But, then, I probably don't completely 
understand the use case.

-JZ

> On May 2, 2020, at 9:35 AM, Scott Blum  wrote:
> 
> I don't follow... I'm saying that if server A wants to be absolutely sure a 
> write is visible to server B, it can grab the mzxid of the write, send it to 
> server B, and server B can ensure that its LastZxid >= that value before 
> doing the read.



Re: Dealing with eventual consistency

2020-05-02 Thread Scott Blum
I don't follow... I'm saying that if server A wants to be absolutely sure a
write is visible to server B, it can grab the mzxid of the write, send it to
server B, and server B can ensure that its LastZxid >= that value before
doing the read.
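
A minimal sketch of the check described above, under stated assumptions. `ZxidGate`, `ensureAtLeast`, and the `catchUp` callback are hypothetical names, not part of ZooKeeper or Curator. The `LongSupplier` stands in for something like the `getLastZxid()` accessor from the subclass suggested elsewhere in this thread, and `catchUp` stands in for whatever server B does to advance (for example a blocking wrapper around sync()).

```java
import java.util.function.LongSupplier;

// Hypothetical helper; not part of ZooKeeper or Curator.
final class ZxidGate {
    private final LongSupplier lastZxid; // assumed source of the connection's last seen zxid
    private final Runnable catchUp;      // assumed action that lets the connection advance

    ZxidGate(LongSupplier lastZxid, Runnable catchUp) {
        this.lastZxid = lastZxid;
        this.catchUp = catchUp;
    }

    /**
     * Returns true once the local connection has seen at least requiredZxid.
     * Runs catchUp only while behind, and at most maxAttempts times.
     */
    boolean ensureAtLeast(long requiredZxid, int maxAttempts) {
        for (int attempt = 0; lastZxid.getAsLong() < requiredZxid; attempt++) {
            if (attempt >= maxAttempts) {
                return false; // still behind; the caller decides whether to retry or fail
            }
            catchUp.run();
        }
        return true;
    }
}
```

Server A would send the mzxid of its write along with the request, and server B would call `ensureAtLeast(thatMzxid, n)` before reading. The comparison is only a lower bound: being the same, 10 ahead, or 1000 ahead all pass, and with multiple writers it says nothing about writes that server A itself has not seen.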


Re: Dealing with eventual consistency

2020-05-01 Thread Jordan Zimmerman
You can still miss a pending write on the client. Just because you read N from 
the ZNode doesn't mean that that's its real value on the leader. The only way 
in ZooKeeper to be certain is to do a write as those always take place on the 
leader.

-Jordan

> On May 1, 2020, at 5:51 PM, Scott Blum  wrote:
> 
> On Fri, May 1, 2020 at 6:00 PM David Smiley wrote:
> I don't think I'll use the "LastZxid" trick because we update parts of the ZK 
> tree with high frequency but not this one yet the Zxid would still soar 
> upwards.
> 
> I don't follow.. if server A writes a node, recording the mzxid associated 
> with that write, and passes it along to server B, then server B just needs to 
> be sure its LastZxid is >= the one that server A wrote.  Doesn't matter if 
> server B's LastZxid is the same, 10 ahead, or 1000 ahead.



Re: Dealing with eventual consistency

2020-05-01 Thread Scott Blum
On Fri, May 1, 2020 at 6:00 PM David Smiley 
wrote:

> I don't think I'll use the "LastZxid" trick because we update parts of the
> ZK tree with high frequency but not this one yet the Zxid would still soar
> upwards.
>

I don't follow.. if server A writes a node, recording the mzxid associated
with that write, and passes it along to server B, then server B just needs
to be sure its LastZxid is >= the one that server A wrote.  Doesn't matter
if server B's LastZxid is the same, 10 ahead, or 1000 ahead.


Re: Dealing with eventual consistency

2020-05-01 Thread Jordan Zimmerman
You can also know whether or not you have a recent version of a ZNode by using 
the version. Read the node and its Stat, save that (Curator's Cache has this) 
and then update the node using the version. All updates occur on the current 
ZooKeeper leader so you get an exception if the version doesn't match.
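
A minimal sketch of that read-then-conditional-update pattern. It uses an in-memory stand-in for a single znode, not the ZooKeeper client: `VersionedNode`, `Snapshot`, and this local `BadVersionException` are hypothetical names for illustration. In the real client the corresponding call is `setData(path, data, version)`, which fails with `KeeperException.BadVersionException` when the version does not match.

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory stand-in for one znode; NOT the ZooKeeper client API.
final class VersionedNode {
    /** Immutable data plus version, loosely analogous to a znode's data and Stat. */
    static final class Snapshot {
        final String data;
        final int version;
        Snapshot(String data, int version) { this.data = data; this.version = version; }
    }

    /** Thrown when the caller's expected version is stale. */
    static final class BadVersionException extends Exception {
        BadVersionException(int expected, int actual) {
            super("expected version " + expected + " but node is at " + actual);
        }
    }

    private final AtomicReference<Snapshot> current;

    VersionedNode(String initialData) {
        current = new AtomicReference<>(new Snapshot(initialData, 0));
    }

    Snapshot read() { return current.get(); }

    /** Applies the write only if expectedVersion matches the node's current version. */
    Snapshot setData(String newData, int expectedVersion) throws BadVersionException {
        Snapshot seen = current.get();
        Snapshot next = new Snapshot(newData, expectedVersion + 1);
        if (seen.version != expectedVersion || !current.compareAndSet(seen, next)) {
            throw new BadVersionException(expectedVersion, current.get().version);
        }
        return next;
    }
}
```

A caller that cached a `Snapshot` and later gets the exception knows its cached copy was stale and can re-read before retrying.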

Other than that, ZK is an eventually consistent system as you know. 

-JZ

> On May 1, 2020, at 5:00 PM, David Smiley  wrote:
> 
> Thanks for pointing out that conversation RE sync() "Consistency Guarantees". 
>  It's a shame sync() has that deficiency of not actually getting a quorum, 
> thus there's still an edge case.
> 
> I don't think I'll use the "LastZxid" trick because we update parts of the ZK 
> tree with high frequency but not this one yet the Zxid would still soar 
> upwards.
> 
> The Cache stuff is really nifty but still leaves an eventual consistency 
> issue.  Machine A needs to get machine B to do something predicated on the 
> cached thing having a version >= some value that Machine A knows.
> 
> I think I must just deal with the fact of passing on a version / xid even 
> though I said for the system in question it's ugly.  I need to make it look 
> pretty :-)
> 
> ~ David
> 
> 
> On Fri, May 1, 2020 at 11:46 AM Jordan Zimmerman wrote:
>> I'm aware a znode or perhaps pzxid (for a path of data) could be passed to 
>> the second machine.
> 
> It's available to the client. See: 
> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L810
> 
> You'd create your own ZooKeeper subclass - "cnxn" is a protected field. So, 
> something like:
> 
> public class MyZooKeeper extends ZooKeeper {
>   ...
> 
>   public long getLastZxid() {
>   return cnxn.getLastZxid();
>   }
> }
> 
>> But I'm worried about over-using sync() because I imagine it's not free.
> 
> TBH - I've never understood the sync() method. I always thought it was 
> useless but Alex Shraer wrote some ways that it can be useful. See: 
> https://mail-archives.apache.org/mod_mbox/zookeeper-dev/201908.mbox/thread?2 
> (search for "Consistency Guarantees").
> 
>> Imagine this is some immutable configuration data that is set once then used 
>> a lot by many ZK clients thereafter many thousands of times for days on end.
> 
> Note: this is how Facebook uses ZooKeeper. It's the backbone of their config 
> system. They have 10s of thousands (something like that) of read-only 
> observers that sit in front of their main ZK ensemble.
> 
>>  Does Curator facilitate this in any way?
> 
> This is what the various Cache recipes are for (Scott's TreeCache or the 
> upcoming CuratorCache). These recipes take care of pulling down the latest 
> versions of ZNodes for you.
> 
> -Jordan
> 
>> On May 1, 2020, at 12:43 AM, David Smiley wrote:
>> 
>> Hello,
>> 
>> I'm trying to come to grips with the ramifications of ZooKeeper's eventual 
>> consistency model and what mechanisms exist to help.
>> 
>> Imagine two machines that use ZK, one of which stores data in it then tells 
>> the other machine to do something that will require it to read what the 
>> first machine wrote.
>> 
>> ZK's docs warn about this:
>> https://github.com/apache/zookeeper/blob/master/zookeeper-docs/src/main/resources/markdown/zookeeperProgrammers.md#ch_zkGuarantees
>> .. and refer to a sync() method to help:
>> https://github.com/apache/zookeeper/blob/2e14a29cc6e58d9561e80b737a3168fbb1f752b4/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L3057
>> 
>> But I'm worried about over-using sync() because I imagine it's not free.  
>> For the scenario I have in mind, the vast majority of the time, the second 
>> machine will see the latest state because lots of time passes between the 
>> write and the read.  Imagine this is some immutable configuration data that 
>> is set once then used a lot by many ZK clients thereafter many thousands of 
>> times for days on end.
>> 
>> I'm aware a znode or perhaps pzxid (for a path of data) could be passed to 
>> the second machine... but for the system in question, this would be really 
>> ugly and I want to consider alternatives.  Besides, the data is organized 
>> into a tree that could have arbitrary nesting, so it's not clear to me that 
>> there's a single version for this anyway.
>> 
>> Scott Blum told me about how there's an increasing "zxid" for all state 

Re: Dealing with eventual consistency

2020-05-01 Thread David Smiley
Thanks for pointing out that conversation RE sync() "Consistency
Guarantees".  It's a shame sync() has that deficiency of not actually
getting a quorum, thus there's still an edge case.

I don't think I'll use the "LastZxid" trick because we update parts of the
ZK tree with high frequency, but not this one, yet the Zxid would still soar
upwards.

The Cache stuff is really nifty but still leaves an eventual consistency
issue.  Machine A needs to get machine B to do something predicated on the
cached thing having a version >= some value that Machine A knows.

I think I must just deal with the fact of passing on a version / xid even
though I said for the system in question it's ugly.  I need to make it look
pretty :-)

~ David


On Fri, May 1, 2020 at 11:46 AM Jordan Zimmerman 
wrote:

> I'm aware a znode or perhaps pzxid (for a path of data) could be passed to
> the second machine.
>
>
> It's available to the client. See:
> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L810
>
>
> You'd create your own ZooKeeper subclass - "cnxn" is a protected field.
> So, something like:
>
> public class MyZooKeeper extends ZooKeeper {
> ...
>
> public long getLastZxid() {
> return cnxn.getLastZxid();
> }
> }
>
> But I'm worried about over-using sync() because I imagine it's not free.
>
>
> TBH - I've never understood the sync() method. I always thought it was
> useless but Alex Shraer wrote some ways that it can be useful. See:
> https://mail-archives.apache.org/mod_mbox/zookeeper-dev/201908.mbox/thread?2
> (search for "Consistency Guarantees").
>
> Imagine this is some immutable configuration data that is set once then
> used a lot by many ZK clients thereafter many thousands of times for days
> on end.
>
>
> Note: this is how Facebook uses ZooKeeper. It's the backbone of their
> config system. They have 10s of thousands (something like that) of
> read-only observers that sit in front of their main ZK ensemble.
>
>  Does Curator facilitate this in any way?
>
>
> This is what the various Cache recipes are for (Scott's TreeCache or the
> upcoming CuratorCache). These recipes take care of pulling down the latest
> versions of ZNodes for you.
>
> -Jordan
>
> On May 1, 2020, at 12:43 AM, David Smiley  wrote:
>
> Hello,
>
> I'm trying to come to grips with the ramifications of ZooKeeper's eventual
> consistency model and what mechanisms exist to help.
>
> Imagine two machines that use ZK, one of which stores data in it then
> tells the other machine to do something that will require it to read what
> the first machine wrote.
>
> ZK's docs warn about this:
>
> https://github.com/apache/zookeeper/blob/master/zookeeper-docs/src/main/resources/markdown/zookeeperProgrammers.md#ch_zkGuarantees
>  .. and refer to a sync() method to help:
>
> https://github.com/apache/zookeeper/blob/2e14a29cc6e58d9561e80b737a3168fbb1f752b4/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L3057
>
> But I'm worried about over-using sync() because I imagine it's not free.
> For the scenario I have in mind, the vast majority of the time, the second
> machine will see the latest state because lots of time passes between the
> write and the read.  Imagine this is some immutable configuration data that
> is set once then used a lot by many ZK clients thereafter many thousands of
> times for days on end.
>
> I'm aware a znode or perhaps pzxid (for a path of data) could be passed to
> the second machine... but for the system in question, this would be really
> ugly and I want to consider alternatives.  Besides, the data is organized
> into a tree that could have arbitrary nesting, so it's not clear to me that
> there's a single version for this anyway.
>
> Scott Blum told me about how there's an increasing "zxid" for all state
> change in ZK.  I can see this on ZK's ClientCnxn.getLastZxid().  If I were
> to pass that zxid to the additional machines (ZK clients) from the first
> for basically all interactions (not too ugly for the system), how would the
> receiving machine use this to get in sync?  I'm guessing it could read its
> own connection zxid and if it's out of date then call sync()?  Does that
> make sense?  Is there another strategy to be recommended?  Does Curator
> facilitate this in any way?
>
> Thanks in advance!  I already searched this list for answers.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
>


Re: Dealing with eventual consistency

2020-05-01 Thread Jordan Zimmerman
> I'm aware a znode or perhaps pzxid (for a path of data) could be passed to 
> the second machine.

It's available to the client. See: 
https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L810
 

You'd create your own ZooKeeper subclass - "cnxn" is a protected field. So, 
something like:

public class MyZooKeeper extends ZooKeeper {
    ...

    // Expose the last zxid this client connection has seen.
    public long getLastZxid() {
        return cnxn.getLastZxid();
    }
}

> But I'm worried about over-using sync() because I imagine it's not free.

TBH - I've never understood the sync() method. I always thought it was useless 
but Alex Shraer wrote some ways that it can be useful. See: 
https://mail-archives.apache.org/mod_mbox/zookeeper-dev/201908.mbox/thread?2 
(search for "Consistency Guarantees").

> Imagine this is some immutable configuration data that is set once then used 
> a lot by many ZK clients thereafter many thousands of times for days on end.

Note: this is how Facebook uses ZooKeeper. It's the backbone of their config 
system. They have 10s of thousands (something like that) of read-only observers 
that sit in front of their main ZK ensemble.

>  Does Curator facilitate this in any way?

This is what the various Cache recipes are for (Scott's TreeCache or the 
upcoming CuratorCache). These recipes take care of pulling down the latest 
versions of ZNodes for you.

-Jordan

> On May 1, 2020, at 12:43 AM, David Smiley  wrote:
> 
> Hello,
> 
> I'm trying to come to grips with the ramifications of ZooKeeper's eventual 
> consistency model and what mechanisms exist to help.
> 
> Imagine two machines that use ZK, one of which stores data in it then tells 
> the other machine to do something that will require it to read what the first 
> machine wrote.
> 
> ZK's docs warn about this:
> https://github.com/apache/zookeeper/blob/master/zookeeper-docs/src/main/resources/markdown/zookeeperProgrammers.md#ch_zkGuarantees
> .. and refer to a sync() method to help:
> https://github.com/apache/zookeeper/blob/2e14a29cc6e58d9561e80b737a3168fbb1f752b4/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L3057
> 
> But I'm worried about over-using sync() because I imagine it's not free.  For 
> the scenario I have in mind, the vast majority of the time, the second 
> machine will see the latest state because lots of time passes between the 
> write and the read.  Imagine this is some immutable configuration data that 
> is set once then used a lot by many ZK clients thereafter many thousands of 
> times for days on end.
> 
> I'm aware a znode or perhaps pzxid (for a path of data) could be passed to 
> the second machine... but for the system in question, this would be really 
> ugly and I want to consider alternatives.  Besides, the data is organized 
> into a tree that could have arbitrary nesting, so it's not clear to me that 
> there's a single version for this anyway.
> 
> Scott Blum told me about how there's an increasing "zxid" for all state 
> change in ZK.  I can see this on ZK's ClientCnxn.getLastZxid().  If I were to 
> pass that zxid to the additional machines (ZK clients) from the first for 
> basically all interactions (not too ugly for the system), how would the 
> receiving machine use this to get in sync?  I'm guessing it could read its 
> own connection zxid and if it's out of date then call sync()?  Does that make 
> sense?  Is there another strategy to be recommended?  Does Curator facilitate 
> this in any way?
> 
> Thanks in advance!  I already searched this list for answers.
> 
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley 
> 


Re: Dealing with eventual consistency

2020-05-01 Thread Scott Blum
FWIW, I dug through the docs, a bit of the source code, and some
internets.  The amount of super clear information on this subject is
vanishingly small. :/

On Fri, May 1, 2020 at 1:43 AM David Smiley  wrote:

> Hello,
>
> I'm trying to come to grips with the ramifications of ZooKeeper's eventual
> consistency model and what mechanisms exist to help.
>
> Imagine two machines that use ZK, one of which stores data in it then
> tells the other machine to do something that will require it to read what
> the first machine wrote.
>
> ZK's docs warn about this:
>
> https://github.com/apache/zookeeper/blob/master/zookeeper-docs/src/main/resources/markdown/zookeeperProgrammers.md#ch_zkGuarantees
>  .. and refer to a sync() method to help:
>
> https://github.com/apache/zookeeper/blob/2e14a29cc6e58d9561e80b737a3168fbb1f752b4/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L3057
>
> But I'm worried about over-using sync() because I imagine it's not free.
> For the scenario I have in mind, the vast majority of the time, the second
> machine will see the latest state because lots of time passes between the
> write and the read.  Imagine this is some immutable configuration data that
> is set once then used a lot by many ZK clients thereafter many thousands of
> times for days on end.
>
> I'm aware a znode or perhaps pzxid (for a path of data) could be passed to
> the second machine... but for the system in question, this would be really
> ugly and I want to consider alternatives.  Besides, the data is organized
> into a tree that could have arbitrary nesting, so it's not clear to me that
> there's a single version for this anyway.
>
> Scott Blum told me about how there's an increasing "zxid" for all state
> change in ZK.  I can see this on ZK's ClientCnxn.getLastZxid().  If I were
> to pass that zxid to the additional machines (ZK clients) from the first
> for basically all interactions (not too ugly for the system), how would the
> receiving machine use this to get in sync?  I'm guessing it could read its
> own connection zxid and if it's out of date then call sync()?  Does that
> make sense?  Is there another strategy to be recommended?  Does Curator
> facilitate this in any way?
>
> Thanks in advance!  I already searched this list for answers.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>


Dealing with eventual consistency

2020-04-30 Thread David Smiley
Hello,

I'm trying to come to grips with the ramifications of ZooKeeper's eventual
consistency model and what mechanisms exist to help.

Imagine two machines that use ZK, one of which stores data in it then tells
the other machine to do something that will require it to read what the
first machine wrote.

ZK's docs warn about this:
https://github.com/apache/zookeeper/blob/master/zookeeper-docs/src/main/resources/markdown/zookeeperProgrammers.md#ch_zkGuarantees
 .. and refer to a sync() method to help:
https://github.com/apache/zookeeper/blob/2e14a29cc6e58d9561e80b737a3168fbb1f752b4/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L3057

But I'm worried about over-using sync() because I imagine it's not free.
For the scenario I have in mind, the vast majority of the time, the second
machine will see the latest state because lots of time passes between the
write and the read.  Imagine this is some immutable configuration data that
is set once then used a lot by many ZK clients thereafter many thousands of
times for days on end.

I'm aware a znode or perhaps pzxid (for a path of data) could be passed to
the second machine... but for the system in question, this would be really
ugly and I want to consider alternatives.  Besides, the data is organized
into a tree that could have arbitrary nesting, so it's not clear to me that
there's a single version for this anyway.

Scott Blum told me about how there's an increasing "zxid" for all state
change in ZK.  I can see this on ZK's ClientCnxn.getLastZxid().  If I were
to pass that zxid to the additional machines (ZK clients) from the first
for basically all interactions (not too ugly for the system), how would the
receiving machine use this to get in sync?  I'm guessing it could read its
own connection zxid and if it's out of date then call sync()?  Does that
make sense?  Is there another strategy to be recommended?  Does Curator
facilitate this in any way?

Thanks in advance!  I already searched this list for answers.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley