Re: [PR] feat(duplication): make the task code for incremental loading from private logs configurable [incubator-pegasus]

via GitHub Fri, 07 Mar 2025 04:16:38 -0800


ninsmiracle commented on PR #2184:
URL: 
https://github.com/apache/incubator-pegasus/pull/2184#issuecomment-2705682711


   > ### Add some information about dup sending delay
   > 
   > We ran several controlled experiments on the test cluster with
   > `duplicate_log_batch_bytes` set to 0, 4096, and 8192. The results show
   > clearly that a larger `duplicate_log_batch_bytes` improves the cluster's
   > duplication (dup) consumption capacity. In the table below, with
   > `duplicate_log_batch_bytes` set to 8192 the cluster can still consume
   > writes at 40k write QPS, whereas with `duplicate_log_batch_bytes` set to 0
   > it can no longer keep up at 20k write QPS (the plog backlog keeps growing).
   > However, as long as dup can keep up with the incoming writes, the larger
   > the `duplicate_log_batch_bytes`, the longer it takes to duplicate a piece
   > of data from the master cluster to the slave cluster.
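
The throughput/latency trade-off described above can be illustrated with a toy model. This is only a sketch under assumptions, not Pegasus's actual implementation: it assumes mutations are accumulated until the batch reaches the configured byte threshold and are then sent together, so a threshold of 0 sends every mutation on its own. The function name `batch_by_bytes` and the final flush of a partial batch are hypothetical.

```python
def batch_by_bytes(mutation_sizes, batch_bytes):
    """Group mutation sizes (in bytes) into batches.

    Assumption: a batch is flushed as soon as its accumulated size
    reaches batch_bytes, so batch_bytes == 0 flushes every mutation alone.
    """
    batches, current, current_bytes = [], [], 0
    for size in mutation_sizes:
        current.append(size)
        current_bytes += size
        if current_bytes >= batch_bytes:
            batches.append(current)
            current, current_bytes = [], 0
    if current:
        # Assumption: a trailing partial batch is eventually sent as well.
        batches.append(current)
    return batches


if __name__ == "__main__":
    sizes = [1000] * 20  # twenty hypothetical 1000-byte mutations
    for threshold in (0, 4096, 8192):
        batches = batch_by_bytes(sizes, threshold)
        print(threshold, "->", len(batches), "sends")
```

Under this model a larger threshold means fewer sends for the same writes, which would explain the higher consumption capacity, while the first mutation in each batch has to wait for the batch to fill, which would explain the longer dup delay.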
   > 
   > 
   > I should also explain the 4th and 5th columns of the following table.
   > When the delay between the master and slave clusters is very small, the
   > delay shown by monitoring is inaccurate because of the counter reporting
   > granularity. So we wrote a program that writes and reads the
   > corresponding keys on both sides to measure the precise delay. However,
   > when the delay between the master and slave clusters is very large,
   > probing each shard this way takes too long and the delay is sometimes
   > difficult to calculate. Therefore, in the large-delay scenarios we mainly
   > rely on the monitoring data to compare the experimental results.
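
The measurement program itself is not shown in this comment. A minimal sketch of the idea (write a key on the master side, poll the slave side until it appears, then summarize the samples as percentiles) could look like the following. Everything here is an assumption for illustration: `FakeCluster` is an in-memory stand-in with hypothetical `set`/`get` methods, not a Pegasus client API.

```python
import threading
import time


class FakeCluster:
    """In-memory stand-in for a cluster client (hypothetical interface)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)


def measure_dup_delay(master, slave, key, value,
                      poll_interval=0.001, timeout=5.0):
    """Write to master, poll slave; return delay in seconds or None on timeout."""
    start = time.monotonic()
    master.set(key, value)
    deadline = start + timeout
    while time.monotonic() < deadline:
        if slave.get(key) == value:
            return time.monotonic() - start
        time.sleep(poll_interval)
    return None


def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer in 1..100."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil
    return ordered[rank - 1]
```

The timeout reflects the limitation mentioned above: when the delay is large, each probe blocks for a long time, so the result becomes hard to obtain and monitoring data is the more practical source.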
   > 
   > | Write QPS | plog maximum backlog | `duplicate_log_batch_bytes` | Master/slave dup delay, monitoring (avg) | Master/slave dup delay, program test |
   > |---|---|---|---|---|
   > | 0 | 3 | 0 | | p95 105ms / p99 108ms |
   > | 0 | 3 | 4096 | | p95 106ms / p99 108ms |
   > | 0 | 3 | 8192 | | p95 127ms / p99 150ms |
   > | 8k | 13k | 0 | 120ms | p95 106ms / p99 137ms |
   > | 8k | 17k | 4096 | 3.7s | p95 119ms / p99 1673ms |
   > | 8k | 17.2k | 8192 | 6s | p95 138ms / p99 20s |
   > | 20k | Continues to increase | 0 | Continues to increase | Difficult to observe |
   > | 20k | 75k | 4096 | 25s | Difficult to observe |
   > | 20k | 70k | 8192 | 25s | Difficult to observe |
   > | 30k | 120k | 8192 | 26s | Difficult to observe |
   > | 40k | 24k | 8192 | 28s | Difficult to observe |
   > | 45k | Continues to increase | 8192 | Continues to increase | Difficult to observe |
   > ==================================================
   > 
   > Here is the effect of adjusting this parameter on one of our online
   > clusters:
   > 
   > | Cluster name | `duplicate_log_batch_bytes` = 4096 | `duplicate_log_batch_bytes` = 0 |
   > |---|---|---|
   > | c3srv-online | p95 1008ms / p99 1327ms | p95 100ms / p99 108ms |
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

