[jira] [Commented] (ZOOKEEPER-2024) Major throughput improvement with mixed workloads

Kfir Lev-Ari (JIRA) Sat, 27 Sep 2014 00:51:07 -0700

    [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150496#comment-14150496
 ]


Kfir Lev-Ari commented on ZOOKEEPER-2024:
-----------------------------------------

Please see the following doc for updated comparison results (throughput and 
latency graphs):
https://docs.google.com/document/d/1HXtgdYx1yJE4Bs02-Q4QCRjuAEVH3yOa4wgraSaR0dw/edit?usp=sharing
The logs can be found here:
https://docs.google.com/spreadsheets/d/1mLZP5FmAXTtgOf60BZkhxOmLzNFrT02wBWNQopx_7Dw/edit?usp=sharing
 

I've modified the generate load system test (the diff can be found at 
https://reviews.apache.org/r/25935/) to report latency separately for each 
type of client (MW vs. RO).
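
As a rough illustration of per-client-type latency reporting, the sketch below 
groups latency samples by a client-type label and reports a nearest-rank 
percentile for each group. The class LatencyByClientType and its methods are 
hypothetical and are not taken from the diff linked above; only the labels MW 
and RO come from this comment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the ZooKeeper system tests. */
final class LatencyByClientType {
    // Latency samples in milliseconds, keyed by client-type label (e.g. "MW", "RO").
    private final Map<String, List<Long>> samples = new HashMap<>();

    /** Records one latency sample for the given client type. */
    void record(String clientType, long latencyMillis) {
        samples.computeIfAbsent(clientType, k -> new ArrayList<>()).add(latencyMillis);
    }

    /** Nearest-rank percentile (0 < p <= 100) of the samples for one client type. */
    long percentile(String clientType, double p) {
        List<Long> sorted = new ArrayList<>(
                samples.getOrDefault(clientType, Collections.<Long>emptyList()));
        if (sorted.isEmpty()) {
            throw new IllegalArgumentException("no samples for " + clientType);
        }
        Collections.sort(sorted);
        // Nearest rank: the smallest sample with at least p% of samples at or below it.
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }

    public static void main(String[] args) {
        LatencyByClientType stats = new LatencyByClientType();
        long[] ro = {1, 2, 3, 4};     // made-up sample values
        long[] mw = {10, 30, 20};     // made-up sample values
        for (long v : ro) stats.record("RO", v);
        for (long v : mw) stats.record("MW", v);
        System.out.println("RO p50 = " + stats.percentile("RO", 50) + " ms");
        System.out.println("MW p99 = " + stats.percentile("MW", 99) + " ms");
    }
}
```

Keeping the samples separate per client type makes it possible to compare the 
slowest requests of each group rather than a single blended figure.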

> Major throughput improvement with mixed workloads
> -------------------------------------------------
>
>                 Key: ZOOKEEPER-2024
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2024
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: quorum, server
>            Reporter: Kfir Lev-Ari
>            Assignee: Kfir Lev-Ari
>         Attachments: ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, 
> ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch, 
> ZOOKEEPER-2024.patch, ZOOKEEPER-2024.patch
>
>
> The patch is applied to the commit processor, and solves two problems:
> 1. Stalling - once the commit processor encounters a local write request, it 
> stalls local processing of all sessions until it receives a commit of that 
> request from the leader. 
> In mixed workloads, this severely hampers performance because read-only 
> sessions cannot proceed any faster than read-write ones.
> 2. Starvation - as long as there are read requests to process, older remote 
> committed write requests are starved. 
> This occurs due to a bug fix 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-1505) that forces processing 
> of local read requests before handling any committed write. The problem 
> manifests only under high local read load. 
> Our solution solves these two problems. It improves throughput in mixed 
> workloads (in our tests, by up to 8x) and reduces latency, especially at the 
> higher percentiles (i.e., for the slowest requests). 
> The main idea is to separate sessions that inherently need to stall in order 
> to enforce order semantics, from ones that do not need to stall. To this end, 
> we add data structures for buffering and managing pending requests of stalled 
> sessions; these requests are moved out of the critical path to these data 
> structures, allowing continued processing of unaffected sessions. 
> In order to avoid starvation, our solution prioritizes committed write 
> requests over reads, and enforces fairness among read requests of sessions. 
> Please see the docs:  
> 1) 
> https://docs.google.com/document/d/1oXJiSt9VqL35hCYQRmFuC63ETd0F_g6uApzocgkFe3Y/edit?usp=sharing
>  - includes a detailed description of the new commit processor algorithm.
> https://issues.apache.org/jira/browse/ZOOKEEPER-2024
> 2) The attached patch implements our solution, and a collection of related 
> unit tests (https://reviews.apache.org/r/25160)
> 3) 
> https://docs.google.com/spreadsheets/d/11mmobkIf-0czIyEEwgytwqRme5OH8tmZcb4EBcsMZ_w/edit?usp=sharing
>  - new performance results.
> https://docs.google.com/spreadsheets/d/1vmdfsq4WLr92BQO-CGcualE0KhAtjIu3bCaVwYajLo8/edit?usp=sharing
>  - shows (old) performance results of running system tests on the patched ZK 
> using the patched system test from 
> https://issues.apache.org/jira/browse/ZOOKEEPER-2023. 
> See also https://issues.apache.org/jira/browse/ZOOKEEPER-1609
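
The stalling part of the description quoted above can be illustrated with a 
minimal sketch, assuming a much simplified, single-threaded model: a local 
write stalls only its own session, later requests of that session are buffered 
behind it, and reads of other sessions keep flowing. The class 
SessionStallSketch, its Request type, and the submit/commit methods are 
hypothetical names and are not ZooKeeper's CommitProcessor API. The sketch does 
not model the prioritization of committed writes or the fairness among reads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical, simplified model for illustration; not ZooKeeper's CommitProcessor. */
final class SessionStallSketch {
    /** A request belongs to a session; writes must wait for a commit. */
    static final class Request {
        final long sessionId;
        final String label;
        final boolean isWrite;

        Request(long sessionId, String label, boolean isWrite) {
            this.sessionId = sessionId;
            this.label = label;
            this.isWrite = isWrite;
        }
    }

    // Requests buffered behind an uncommitted write, keyed by stalled session.
    private final Map<Long, Queue<Request>> pending = new HashMap<>();
    // Labels of requests in the order they were applied, for inspection.
    private final List<String> applied = new ArrayList<>();

    /** Called when a request arrives from a local client. */
    void submit(Request r) {
        Queue<Request> q = pending.get(r.sessionId);
        if (q != null) {
            // Session is already stalled: buffer to preserve its request order.
            q.add(r);
        } else if (r.isWrite) {
            // Stall this session only; the write sits at the head of its queue.
            q = new ArrayDeque<>();
            q.add(r);
            pending.put(r.sessionId, q);
        } else {
            // Read on an unaffected session proceeds immediately.
            applied.add(r.label);
        }
    }

    /** Called when the commit for the session's oldest pending write arrives. */
    void commit(long sessionId) {
        Queue<Request> q = pending.get(sessionId);
        if (q == null) {
            return; // no local request is waiting on this session
        }
        applied.add(q.remove().label); // apply the committed write
        // Release buffered reads up to the next uncommitted write, if any.
        while (!q.isEmpty() && !q.peek().isWrite) {
            applied.add(q.remove().label);
        }
        if (q.isEmpty()) {
            pending.remove(sessionId); // session is no longer stalled
        }
    }

    List<String> applied() {
        return applied;
    }

    public static void main(String[] args) {
        SessionStallSketch cp = new SessionStallSketch();
        cp.submit(new Request(1L, "w1", true));   // session 1 stalls here
        cp.submit(new Request(1L, "r1", false));  // buffered behind w1
        cp.submit(new Request(2L, "r2", false));  // session 2 is not stalled
        cp.commit(1L);                            // releases w1, then r1
        System.out.println(cp.applied());         // prints [r2, w1, r1]
    }
}
```

In the usage above, the read of session 2 is applied before the write of 
session 1 commits, which is the behavior the stalling fix is described as 
enabling for read-only sessions.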



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
