[ 
https://issues.apache.org/jira/browse/RATIS-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated RATIS-2403:
-------------------------------
    Description: 
While benchmarking linearizable follower read, I observed that as more 
requests go to the followers instead of the leader, the write throughput 
improves dramatically, by around 2-3x compared to leader-only write and read 
(most likely due to less leader resource contention). However, the read 
throughput becomes worse than leader-only write and read (in some cases below 
0.2x). Even with optimizations such as RATIS-2392, RATIS-2382, 
[https://github.com/apache/ratis/pull/1334], and RATIS-2379, the read 
throughput remains worse than leader-only write (the optimizations even 
improve write performance instead of read performance). I suspect that because 
write throughput increases, the read index advances at a faster rate, which 
causes follower linearizable reads to wait longer.
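
To make the suspicion above concrete, here is a rough back-of-the-envelope model in Java. It is only a sketch under simplifying assumptions, not Ratis code: the class and method names are hypothetical, the follower is assumed to apply entries at a constant rate, and it is assumed to trail the leader by a fixed delay. Under those assumptions the gap a follower read has to wait out grows in proportion to the write rate.

```java
// Hypothetical toy model, not part of Ratis. It only illustrates why a faster
// advancing read index can make follower linearizable reads wait longer.
final class ReadIndexWaitModel {

    /**
     * Seconds a follower read waits until the follower's applied index
     * reaches the read index, assuming a constant apply rate.
     */
    static double waitSeconds(long readIndex, long appliedIndex, double applyRatePerSec) {
        long gap = Math.max(0L, readIndex - appliedIndex);
        return gap / applyRatePerSec;
    }

    /**
     * Entries the follower is behind in steady state, assuming it trails the
     * leader by a fixed delay (replication plus apply latency).
     */
    static long steadyStateGap(double writeRatePerSec, double lagSeconds) {
        return Math.round(writeRatePerSec * lagSeconds);
    }

    public static void main(String[] args) {
        double applyRatePerSec = 50_000.0; // assumed follower apply rate
        double lagSeconds = 0.002;         // assumed follower delay behind the leader
        for (double writeRate : new double[] {5_000.0, 20_000.0, 40_000.0}) {
            long gap = steadyStateGap(writeRate, lagSeconds);
            double waitMs = 1000.0 * waitSeconds(gap, 0L, applyRatePerSec);
            System.out.printf("writes/s=%.0f gap=%d entries wait=%.3f ms%n",
                    writeRate, gap, waitMs);
        }
    }
}
```

The numbers are made up; the point is only the trend that a higher write rate means a larger gap per read, which matches the benchmark observation that read throughput drops as write throughput rises.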

The target is to improve read throughput to 1.5x - 2x that of leader-only 
write and read. Currently, a pure-read workload (no writes) improves read 
throughput by up to 1.7x.

Currently my ideas are:
 * Sacrificing writes for reads: can we limit the write QPS so that read QPS 
can increase?
 ** From the benchmark, the read throughput only improves when write throughput 
is lower.
 ** We can try a backpressure mechanism so that writes do not advance so 
quickly that read throughput suffers.
 *** Follower gap mechanisms (RATIS-1411) are one option, but they might cause 
the leader to stall if a follower is down for a while (e.g. restarted), which 
violates the majority availability guarantee. It is also hard to know which 
threshold value is optimal for different workloads.
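
The concern with a gap-based throttle can be sketched as follows. This is an illustration only and not the RATIS-1411 implementation: the class, the method, and the `maxGap` parameter are hypothetical, and the check is assumed to consider every follower rather than a majority.

```java
import java.util.Map;

// Hypothetical sketch of a follower-gap backpressure check, not Ratis code.
final class FollowerGapThrottle {

    /**
     * Returns true if the leader should hold back new writes because at least
     * one follower trails the leader's last log index by more than maxGap.
     */
    static boolean shouldThrottle(long leaderLastIndex,
                                  Map<String, Long> followerMatchIndex,
                                  long maxGap) {
        for (long matchIndex : followerMatchIndex.values()) {
            if (leaderLastIndex - matchIndex > maxGap) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long maxGap = 50L; // assumed threshold; the right value is workload dependent

        // Both followers are close to the leader: writes proceed.
        Map<String, Long> healthy = Map.of("f1", 990L, "f2", 995L);
        System.out.println(shouldThrottle(1000L, healthy, maxGap));

        // f2 was restarted and is stuck far behind: the leader holds writes even
        // though the leader plus f1 still form a majority.
        Map<String, Long> oneDown = Map.of("f1", 990L, "f2", 400L);
        System.out.println(shouldThrottle(1000L, oneDown, maxGap));
    }
}
```

In the second case a single lagging follower is enough to block writes, which is the stall described above. The sketch also shows why a single fixed `maxGap` is hard to tune across workloads.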

Raising this ticket for ideas. [~szetszwo] [~tanxinyu] 

  was:
While benchmarking linearizable follower read, I observed that as more 
requests go to the followers instead of the leader, the write throughput 
improves dramatically, by around 2-3x compared to leader-only write and read 
(most likely due to less leader resource contention). However, the read 
throughput becomes worse than leader-only write and read (in some cases below 
0.2x). Even with optimizations such as RATIS-2392, RATIS-2382, 
[https://github.com/apache/ratis/pull/1334], and RATIS-2379, the read 
throughput remains worse than leader-only write (the optimizations even 
improve write performance instead of read performance). I suspect that because 
write throughput increases, the read index advances at a faster rate, which 
causes follower linearizable reads to wait longer.

The target is to improve read throughput to 1.5x - 2x that of leader-only 
write and read. 

Currently my ideas are:
 * Sacrificing writes for reads: can we limit the write QPS so that read QPS 
can increase?
 ** From the benchmark, the read throughput only improves when write throughput 
is lower.
 ** We can try a backpressure mechanism so that writes do not advance so 
quickly that read throughput suffers.
 *** Follower gap mechanisms (RATIS-1411) are one option, but they might cause 
the leader to stall if a follower is down for a while (e.g. restarted), which 
violates the majority availability guarantee. It is also hard to know which 
threshold value is optimal for different workloads.

Raising this ticket for ideas. [~szetszwo] [~tanxinyu] 


> Improve linearizable follower read throughput instead of writes
> ---------------------------------------------------------------
>
>                 Key: RATIS-2403
>                 URL: https://issues.apache.org/jira/browse/RATIS-2403
>             Project: Ratis
>          Issue Type: Improvement
>            Reporter: Ivan Andika
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
