[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000695#comment-17000695
 ] 

Fangmin Lv commented on ZOOKEEPER-3619:
---------------------------------------

Thanks [~randgalt], we'll add you as a reviewer when it's ready. We can 
probably add a new client implementation of the semaphore to Curator based on 
this.

> Implement server side semaphore API to improve the efficiency and throughput 
> of coordination 
> ---------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3619
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3619
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: server
>    Affects Versions: 3.6.0
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>            Priority: Major
>             Fix For: 3.7.0
>
>
> The ZK API is designed to be simple, flexible and general, so it can serve 
> different scenarios such as coordination, member health tracking, metadata 
> storage, etc. But this generality has a cost: it leads to heavy and 
> inefficient client code for recipes like distributed lock and semaphore.
> Currently, the general client side semaphore implementation without waiting 
> time is:
>  # client A creates a sequential and ephemeral node N-1
>  # client B creates a sequential and ephemeral node N-2
>  # clients A and B each query all children and check whether they hold the 
> lock node with the smallest sequential id 
>  # since client A has the smaller sequential id, it's the semaphore owner 
> (assuming the semaphore value is 1)
>  # client B will delete its node, close the session, and probably try again 
> later from step 2
> Each contender issues 4 writes (create session, create lock, delete 
> lock, close session) and 1 read (get children), which is pretty heavy and 
> does not scale well.
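
The steps above can be sketched as a small in-memory simulation that only
counts operations. This is a rough illustration, not ZooKeeper client code:
FakeEnsemble, try_acquire and every method name below are hypothetical
stand-ins, and no real ZooKeeper API is called.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEnsemble:
    """Hypothetical in-memory stand-in for an ensemble; it only counts ops."""
    writes: int = 0
    reads: int = 0
    next_seq: int = 0
    children: dict = field(default_factory=dict)  # sequential id -> owner

    def create_session(self):
        self.writes += 1

    def close_session(self):
        self.writes += 1

    def create_sequential_ephemeral(self, owner):
        self.writes += 1
        seq = self.next_seq
        self.next_seq += 1
        self.children[seq] = owner
        return seq

    def get_children(self):
        self.reads += 1
        return sorted(self.children)

    def delete(self, seq):
        self.writes += 1
        del self.children[seq]


def try_acquire(zk, owner, permits=1):
    """One non-waiting attempt; True if `owner` ends up holding the semaphore."""
    zk.create_session()
    seq = zk.create_sequential_ephemeral(owner)
    winners = zk.get_children()[:permits]  # the smallest sequential ids win
    if seq in winners:
        return True
    # Losing contender: remove its node and close the session.
    zk.delete(seq)
    zk.close_session()
    return False


zk = FakeEnsemble()
a_won = try_acquire(zk, "A")
writes_before_b = zk.writes
b_won = try_acquire(zk, "B")
print(a_won, b_won, zk.writes - writes_before_b)  # True False 4
```

In this toy model the losing client B costs 4 writes and 1 read for a single
failed attempt, which matches the count given in the description.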
> We actually hit this issue internally for one heavy semaphore use case, and 
> we had to create dozens of ensembles to support its traffic.
> To make the semaphore recipe more efficient, we can move the semaphore 
> implementation to the server side, where the leader has all the context about 
> who will win the semaphore/lock at txn preparation time, so it can short 
> circuit and fail the contender directly without proposing and committing 
> those create/delete lock transactions.
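
A minimal sketch of the short-circuit idea, assuming the leader keeps the
current holders in memory. FakeLeader, acquire, release and proposed_txns are
hypothetical names for illustration only; the actual server-side API was still
being designed when this was written.

```python
class FakeLeader:
    """Toy leader: tracks holders in memory and counts proposed transactions."""

    def __init__(self, permits=1):
        self.permits = permits
        self.holders = set()
        self.proposed_txns = 0

    def acquire(self, session_id):
        # The decision is made while preparing the txn, before anything
        # is proposed to the quorum.
        if len(self.holders) >= self.permits:
            return False  # short circuit: nothing is proposed or committed
        self.proposed_txns += 1
        self.holders.add(session_id)
        return True

    def release(self, session_id):
        # Only an actual holder produces a transaction on release.
        if session_id in self.holders:
            self.proposed_txns += 1
            self.holders.remove(session_id)


leader = FakeLeader(permits=1)
print(leader.acquire("A"), leader.proposed_txns)  # True 1
print(leader.acquire("B"), leader.proposed_txns)  # False 1
```

In this model a failed contender adds no proposed transaction at all, whereas
the client side recipe pays for a create and a delete on every failed attempt.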
> To implement this, we need to add a new semaphore API, which is supposed to 
> replace client side lock, leader election (semaphore value 1), and general 
> semaphore use cases.
> We started to design and implement it recently. It will be based on another 
> big improvement we've almost finished and will soon upstream in 
> ZOOKEEPER-3594, which skips proposing requests with error transactions.
> Meanwhile, we'd like to hear some early feedback from the community about 
> this feature.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
