Sequence Number Generation With Zookeeper

2010-08-05 Thread Jonathan Holloway
Hi all,

I'm looking at using Zookeeper for distributed sequence number generation.
 What's the best way to do this currently?  Is there a particular recipe
available for this?

My ideas so far involve:
a) Creating a node with PERSISTENT_SEQUENTIAL and then deleting it - this gives
me a monotonically increasing number, but the sequence isn't contiguous.
b) Storing the sequence number in the data portion of a persistent node, then
updating it using the version number (i.e. optimistic locking).  The problem
with this is that under high load I'm assuming there'll be a lot of contention
and hence a lot of failed updates.  (Rough sketch of what I mean below.)
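
For (b), something like this untested sketch is roughly what I have in mind
(the /sequence path and the long encoding are arbitrary choices on my part):

import java.nio.ByteBuffer;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class OptimisticCounter {
    private static final String COUNTER_PATH = "/sequence";

    // Read-modify-write the counter; retry whenever another client wins the race.
    public static long nextSequence(ZooKeeper zk)
            throws KeeperException, InterruptedException {
        while (true) {
            Stat stat = new Stat();
            long current =
                ByteBuffer.wrap(zk.getData(COUNTER_PATH, false, stat)).getLong();
            byte[] next = ByteBuffer.allocate(8).putLong(current + 1).array();
            try {
                // Fails with BadVersionException if the node changed since we read it.
                zk.setData(COUNTER_PATH, next, stat.getVersion());
                return current + 1;
            } catch (KeeperException.BadVersionException e) {
                // Lost the race - loop round, re-read and try again.
            }
        }
    }
}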

What are your thoughts on the above?

Many thanks,
Jon.


Re: Sequence Number Generation With Zookeeper

2010-08-05 Thread Jonathan Holloway
Hi Ted,

Thanks for the comments.

I might have overlooked something here, but is it also possible to do the
following:

1. Create a PERSISTENT node
2. Have multiple clients set the data on the node, e.g.
   Stat stat = zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1);
3. Use the version number from stat.getVersion() as the sequence (obviously
   I'm limited to Integer.MAX_VALUE)
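
Something like this untested sketch is what I'm picturing (SEQUENCE here is
just a persistent znode created up front):

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class VersionSequence {
    private static final String SEQUENCE = "/sequence";  // persistent node, created beforehand
    private static final byte[] EMPTY = new byte[0];

    // Each successful setData bumps the znode version by exactly one, so the
    // version returned in the Stat doubles as the next sequence value.
    public static int nextSequence(ZooKeeper zk)
            throws KeeperException, InterruptedException {
        Stat stat = zk.setData(SEQUENCE, EMPTY, -1);  // -1 = don't check the expected version
        return stat.getVersion();
    }
}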

Are there any weird race conditions involved here which would mean that a
client would receive the wrong Stat object back?

Many thanks again,
Jon.

On 5 August 2010 16:09, Ted Dunning ted.dunn...@gmail.com wrote:

 (b)

 BUT:

 Sequential numbering is a special case of "now".  In systems with a large
 diameter, agreeing on "now" gets very expensive, and sequential numbering is
 a special case of that.  If there is a way to get away from this presumption
 of the need for sequential numbering, you will be miles better off.

 HOWEVER:

 ZK can do better than you suggest.  Incrementing a counter does involve
 potential contention, but you will very likely be able to get to pretty high
 rates before the optimistic locking begins to fail.  If you code your update
 with a few tries at full speed followed by some form of retry back-off, you
 should get pretty close to the best possible performance.

 You might also try building a lock with an ephemeral node before updating
 the counter.  I would expect that this will be slower than the back-off
 option, if only because it involves more transactions in ZK.  If you wanted
 to get too complicated for your own good, you could have a secondary
 strategy flag that is only sampled by all clients every few seconds and is
 updated whenever a client needs to back off more than, say, 5 steps.  If
 this flag has been updated recently, then clients should switch to the
 locking protocol.  You might even have several locks so that you don't
 exclude all other updaters, merely thin them out a bit.  This flagged
 strategy would run as fast as optimistic locking as long as optimistic
 locking is fast, and would limit the total number of transactions needed
 under very high load.

 On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway 
 jonathan.hollo...@gmail.com wrote:

  My ideas so far involve:
  a) Creating a node with PERSISTENT_SEQUENTIAL and then deleting it - this
  gives me a monotonically increasing number, but the sequence isn't
  contiguous.
  b) Storing the sequence number in the data portion of a persistent node,
  then updating it using the version number (i.e. optimistic locking).  The
  problem with this is that under high load I'm assuming there'll be a lot of
  contention and hence a lot of failed updates.
 
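
For the tries-at-full-speed-then-back-off idea, I'm imagining something along
these lines (untested sketch - the retry count and sleep times are plucked out
of the air):

import java.nio.ByteBuffer;
import java.util.Random;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class BackoffCounter {
    private static final String COUNTER_PATH = "/sequence";
    private static final int FAST_TRIES = 3;   // tries at full speed before backing off
    private static final Random RANDOM = new Random();

    public static long nextSequence(ZooKeeper zk)
            throws KeeperException, InterruptedException {
        for (int attempt = 0; ; attempt++) {
            Stat stat = new Stat();
            long current =
                ByteBuffer.wrap(zk.getData(COUNTER_PATH, false, stat)).getLong();
            try {
                zk.setData(COUNTER_PATH,
                           ByteBuffer.allocate(8).putLong(current + 1).array(),
                           stat.getVersion());
                return current + 1;
            } catch (KeeperException.BadVersionException lostRace) {
                if (attempt >= FAST_TRIES) {
                    // After a few full-speed tries, back off exponentially with jitter.
                    Thread.sleep(Math.min(200, (1L << (attempt - FAST_TRIES)) * 10)
                                 + RANDOM.nextInt(10));
                }
            }
        }
    }
}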

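And the simple form of the ephemeral-lock idea (without the strategy flag)
might look roughly like this - again just a sketch, with a crude sleep where a
watch on the lock node would be better:

import java.nio.ByteBuffer;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class LockedCounter {
    private static final String COUNTER_PATH = "/sequence";
    private static final String LOCK_PATH = "/sequence-lock";

    public static long nextSequence(ZooKeeper zk)
            throws KeeperException, InterruptedException {
        // Whoever manages to create the ephemeral node owns the counter;
        // the node disappears automatically if that client dies.
        while (true) {
            try {
                zk.create(LOCK_PATH, new byte[0],
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                break;
            } catch (KeeperException.NodeExistsException held) {
                Thread.sleep(10);  // someone else holds the lock
            }
        }
        try {
            long current =
                ByteBuffer.wrap(zk.getData(COUNTER_PATH, false, null)).getLong();
            zk.setData(COUNTER_PATH,
                       ByteBuffer.allocate(8).putLong(current + 1).array(), -1);
            return current + 1;
        } finally {
            zk.delete(LOCK_PATH, -1);
        }
    }
}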


Re: Sequence Number Generation With Zookeeper

2010-08-05 Thread Jonathan Holloway
Hi David,

We did discuss potentially doing this as well.  It would be nice to get some
recipes written up for Zookeeper in this area, if people think it's useful.
Were you thinking of submitting this back as a recipe?  If not, I could
potentially work on such a recipe instead.

Many thanks,
Jon.


 I just ran into this exact situation, and handled it like so:

 I wrote a library that uses the option (b) you described above.  Only
 instead of requesting a single sequence number, you request a block of them
 at a time from Zookeeper, and then locally use them up one by one from the
 block you retrieved.  Retrieving by block (e.g., by blocks of 1 at a
 time) eliminates the contention issue.

  Then, if you're finished assigning IDs from that block but still have a
  bunch of IDs left over, the library has another function to push back the
  unused IDs.  They'll then get pulled again in the next block retrieval.

 We don't actually have this code running in production yet, so I can't
 vouch for how well it works.  But the design was reviewed and given the
 thumbs up by the core developers on the team, and the implementation passes
 all my unit tests.

 HTH.  Feel free to email back with specific questions if you'd like more
 details.

 DR
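
For reference, this is roughly how I picture the block-allocation idea working
(just a sketch on my part - the class and znode names are made up, and the
push-back-unused-IDs function is left out):

import java.nio.ByteBuffer;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class BlockAllocator {
    private static final String COUNTER_PATH = "/sequence";

    private final ZooKeeper zk;
    private final int blockSize;
    private long next;    // next ID to hand out locally
    private long limit;   // first ID that is *not* in the current block

    public BlockAllocator(ZooKeeper zk, int blockSize) {
        this.zk = zk;
        this.blockSize = blockSize;
    }

    public synchronized long nextId() throws KeeperException, InterruptedException {
        if (next == limit) {
            fetchBlock();   // local block exhausted; claim another one
        }
        return next++;
    }

    // Claim a whole range [current, current + blockSize) with a single
    // optimistic update, so contention only happens once per block.
    private void fetchBlock() throws KeeperException, InterruptedException {
        while (true) {
            Stat stat = new Stat();
            long current =
                ByteBuffer.wrap(zk.getData(COUNTER_PATH, false, stat)).getLong();
            try {
                zk.setData(COUNTER_PATH,
                           ByteBuffer.allocate(8).putLong(current + blockSize).array(),
                           stat.getVersion());
                next = current;
                limit = current + blockSize;
                return;
            } catch (KeeperException.BadVersionException lostRace) {
                // Another client claimed a block first; re-read and retry.
            }
        }
    }
}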


ZKClient

2010-05-04 Thread Jonathan Holloway
I came across this project on Github

http://github.com/sgroschupf/zkclient

for working with the Zookeeper API.  Has anybody used it in the past?  Is it
a better way of interacting with
a Zookeeper cluster?

Many thanks,
Jon.


Re: ZKClient

2010-05-04 Thread Jonathan Holloway
It looks good.  Having written a client already myself, I'd rather use this
than have to roll my own each time.

Is there any reason why this isn't part of the Zookeeper trunk already?
It would make working with Zookeeper a bit easier (at least from my
perspective)...

Jon.


On 4 May 2010 12:57, Ted Dunning ted.dunn...@gmail.com wrote:

 This is used as part of katta, where it gets a fair bit of exercise at low
 update rates with small data.  It is used for managing the state of the
 search cluster.

 I don't think it has had much external review or use for purposes apart from
 katta.  Katta generally has pretty decent code, though.

 On Tue, May 4, 2010 at 12:39 PM, Jonathan Holloway 
 jonathan.hollo...@gmail.com wrote:

  I came across this project on Github
 
  http://github.com/sgroschupf/zkclient
 
  for working with the Zookeeper API.  Has anybody used it in the past?  Is
  it
  a better way of interacting with
  a Zookeeper cluster?
 
  Many thanks,
  Jon.