[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213521#comment-15213521
 ] 

Alexander Shraer commented on ZOOKEEPER-2024:
---------------------------------------------

Thanks for the comments, Flavio. 

Due to some personal circumstances, Kfir may not have time to run new
experiments any time soon. I can address your formatting comments, though;
would that be OK? Regarding latency, Kfir checked it at some point but doesn't
have recent data. It seems pretty clear that latency improves too, doesn't it?
Instead of being blocked by any write, each request can only be blocked by
previous writes in the same client session, so there should be a huge win for
latency as well. I don't see any reason why latency could become worse due to
this change.
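
To illustrate the latency argument, here is a minimal sketch of per-session
stalling. It is not the code from the patch: the class SessionStallSketch, the
Req record, and the submit/commit methods are hypothetical names, and it
assumes commits for a session arrive in the order its writes were submitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SessionStallSketch {
    /** Hypothetical request record; this is not ZooKeeper's Request class. */
    static final class Req {
        final long sessionId;
        final boolean isWrite;
        final String name;

        Req(long sessionId, boolean isWrite, String name) {
            this.sessionId = sessionId;
            this.isWrite = isWrite;
            this.name = name;
        }
    }

    // Requests buffered per session; a session has an entry only while
    // it has at least one uncommitted write.
    private final Map<Long, Deque<Req>> pending = new HashMap<>();

    /** Returns true if the request may run now, false if it was buffered. */
    boolean submit(Req r) {
        Deque<Req> q = pending.get(r.sessionId);
        if (q != null) {
            q.addLast(r); // session is stalled: queue behind its own write
            return false;
        }
        if (r.isWrite) {
            q = new ArrayDeque<>();
            q.addLast(r); // stall only this session until the commit arrives
            pending.put(r.sessionId, q);
            return false;
        }
        return true; // read from a session with no write in flight
    }

    /** Commits the oldest pending write of a session and releases the reads behind it. */
    List<Req> commit(long sessionId) {
        List<Req> released = new ArrayList<>();
        Deque<Req> q = pending.get(sessionId);
        if (q == null) {
            return released;
        }
        released.add(q.pollFirst()); // the committed write itself
        while (!q.isEmpty() && !q.peekFirst().isWrite) {
            released.add(q.pollFirst()); // reads up to the next write
        }
        if (q.isEmpty()) {
            pending.remove(sessionId);
        }
        return released;
    }

    public static void main(String[] args) {
        SessionStallSketch s = new SessionStallSketch();
        s.submit(new Req(1L, true, "write-from-session-1"));
        boolean readRuns = s.submit(new Req(2L, false, "read-from-session-2"));
        System.out.println("session 2 read runs immediately: " + readRuns);
    }
}
```

In this sketch a read from session 2 is never delayed by a write from session
1, which is the property the latency argument relies on.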

Synchronization: https://goo.gl/m1cINJ gives more details; please see the
pseudocode on the last page. The idea is that synchronization on new incoming
requests is now separate from synchronization on "thread pool is available to
process more requests", since we may have work to do even if there are no
"new" incoming requests.
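
As a rough illustration of keeping the two wait conditions separate (this is
not the pseudocode from the linked document, and every name in it is
hypothetical), one lock can carry two independent conditions: one signalled
when a request arrives, one signalled when the worker pool becomes idle.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class TwoConditionSketch {
    private final ReentrantLock lock = new ReentrantLock();
    // Signalled when a new request is queued.
    private final Condition requestArrived = lock.newCondition();
    // Signalled when the last busy worker finishes.
    private final Condition poolIdle = lock.newCondition();

    private final Queue<String> incoming = new ArrayDeque<>();
    private int busyWorkers = 0;

    void offer(String request) {
        lock.lock();
        try {
            incoming.add(request);
            requestArrived.signal();
        } finally {
            lock.unlock();
        }
    }

    void workerStarted() {
        lock.lock();
        try {
            busyWorkers++;
        } finally {
            lock.unlock();
        }
    }

    void workerFinished() {
        lock.lock();
        try {
            busyWorkers--;
            if (busyWorkers == 0) {
                poolIdle.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Blocks only on "a request is available". */
    String awaitRequest() {
        lock.lock();
        try {
            while (incoming.isEmpty()) {
                requestArrived.awaitUninterruptibly();
            }
            return incoming.poll();
        } finally {
            lock.unlock();
        }
    }

    /** Blocks only on "no worker is busy", independent of new requests. */
    void awaitIdlePool() {
        lock.lock();
        try {
            while (busyWorkers > 0) {
                poolIdle.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    int busyWorkers() {
        lock.lock();
        try {
            return busyWorkers;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        TwoConditionSketch t = new TwoConditionSketch();
        t.workerStarted();
        new Thread(t::workerFinished).start();
        t.awaitIdlePool(); // returns even though no new request arrived
        System.out.println("busy workers: " + t.busyWorkers());
    }
}
```

The point of the split is that a caller waiting in awaitIdlePool() is woken by
a finishing worker and does not depend on offer() ever being called.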

Waiting for an empty pool before and after a write: This is the same as before
the patch (check out the current condition for waking up the thread). Due to
synchronization issues in FinalRequestProcessor (potential races between
setting watches and updating data), commitRequestProcessor has always sent it
either many reads or a single write. We kept this in the patch. I think this
part can be made more efficient, e.g., by using read/write locks. We had
discussions about this offline with a few contributors, and I plan to open a
separate Jira for this.
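
For readers unfamiliar with the read/write lock idea, the following is a
generic sketch of the technique only. It is not part of the patch and not a
proposal for how FinalRequestProcessor should change; the class name and the
map standing in for the data tree are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ReadWriteLockSketch {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    // Hypothetical stand-in for the server's data; not ZooKeeper's DataTree.
    private final Map<String, String> data = new HashMap<>();

    /** Many readers may hold the read lock at the same time. */
    String read(String path) {
        rw.readLock().lock();
        try {
            return data.get(path);
        } finally {
            rw.readLock().unlock();
        }
    }

    /** A writer holds the lock exclusively, so it never overlaps a read. */
    void write(String path, String value) {
        rw.writeLock().lock();
        try {
            data.put(path, value);
        } finally {
            rw.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteLockSketch s = new ReadWriteLockSketch();
        s.write("/a", "1");
        System.out.println(s.read("/a"));
    }
}
```

Such a lock would give the same "many reads or a single write" exclusion
without requiring the whole pool to drain first, although whether that is
sufficient for the watch/data races mentioned above is an open question here.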

> Major throughput improvement with mixed workloads
> -------------------------------------------------
>
>                 Key: ZOOKEEPER-2024
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2024
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: quorum, server
>            Reporter: Kfir Lev-Ari
>            Assignee: Kfir Lev-Ari
>             Fix For: 3.5.3
>
>         Attachments: ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, 
> ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, 
> ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, 
> ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch
>
>
> The patch is applied to the commit processor, and solves two problems:
> 1. Stalling - once the commit processor encounters a local write request, it 
> stalls local processing of all sessions until it receives a commit of that 
> request from the leader. 
> In mixed workloads, this severely hampers performance as it does not allow 
> read-only sessions to proceed at faster speed than read-write ones.
> 2. Starvation - as long as there are read requests to process, older remote 
> committed write requests are starved. 
> This occurs due to a bug fix 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-1505) that forces processing 
> of local read requests before handling any committed write. The problem is 
> only manifested under high local read load. 
> Our solution solves these two problems. It improves throughput in mixed 
> workloads (in our tests, by up to 8x), and reduces latency, especially higher 
> percentiles (i.e., slowest requests). 
> The main idea is to separate sessions that inherently need to stall in order 
> to enforce order semantics, from ones that do not need to stall. To this end, 
> we add data structures for buffering and managing pending requests of stalled 
> sessions; these requests are moved out of the critical path to these data 
> structures, allowing continued processing of unaffected sessions. 
> Please see the docs:  
> 1) https://goo.gl/m1cINJ - includes a detailed description of the new commit 
> processor algorithm.
> 2) The attached patch implements our solution, and a collection of related 
> unit tests (https://reviews.apache.org/r/25160)
> 3) https://goo.gl/W0xDUP - performance results. 
> See also https://issues.apache.org/jira/browse/ZOOKEEPER-1609



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
