thetumbled commented on code in PR #4258:
URL: https://github.com/apache/bookkeeper/pull/4258#discussion_r1582090916
##########
site3/website/src/pages/bps/BP-66-support-throttling-for-zookeeper-read-of-rereplication.md:
##########
@@ -0,0 +1,26 @@
+# BP-66: support throttling for zookeeper read of rereplication
+
+### Motivation
+
+Each time the cluster trigger the re-replication, all replicators will read
data from zookeeper. This can cause a great pressure on Zookeeper. We need to
support throttling for zookeeper read of re-replication.
+
+For example, in a Pulsar cluster we enable auto-recovery on every bookie. With 400 bookies in the cluster, there are 400 replicators.
+Each bookie holds about 3000 ledgers, roughly 1/3 of which are small ledgers of less than 0.1 MB, i.e. about 1000 small ledgers per bookie.
+If we decommission one bookie, the ZooKeeper read latency increases to minutes.
+
+
+### Configuration
+Add the following configuration:
Review Comment:
- ZooKeeper can't scale out. As the Pulsar & BookKeeper clusters scale out, the number of replicators inevitably grows with them. Once the number of replicators reaches tens or hundreds, ZooKeeper read latency soars to an unacceptable level.
- For convenience of operation and maintenance, we always set `autoRecoveryDaemonEnable=true` on every bookie. We do not adopt the other two options (see the configuration sketch below):
  - Deploying a separate cluster for AutoRecovery increases the complexity of the whole system.
  - Making a small subset of bookies in the BookKeeper cluster act as replicators and setting a high value of `replicationRateByBytes` is dangerous, because the throughput of normal clients would be impacted by the replication throughput.
- In short, there is currently no way to relieve the pressure on ZooKeeper during replication.
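
For context, here is a minimal sketch of the per-bookie setup described above, assuming the standard `org.apache.bookkeeper.conf.ServerConfiguration` API. The class name and the exact configuration keys/setters are illustrative; they may differ slightly between BookKeeper versions, so treat this as a sketch rather than the proposal's implementation.

```java
import org.apache.bookkeeper.conf.ServerConfiguration;

public class AutoRecoverySetupSketch {
    public static void main(String[] args) {
        ServerConfiguration conf = new ServerConfiguration();

        // Option we use: run the AutoRecovery daemon inside every bookie process
        // (the `autoRecoveryDaemonEnable=true` setting mentioned above), so no
        // separate replicator cluster has to be operated.
        conf.setAutoRecoveryDaemonEnabled(true);

        // Rejected alternative: dedicate a few bookies as replicators and raise
        // `replicationRateByBytes` sharply; the generic property setter is used
        // here because the exact setter name is version-dependent (assumption).
        // conf.setProperty("replicationRateByBytes", 100 * 1024 * 1024);

        System.out.println("autoRecoveryDaemonEnabled = "
                + conf.isAutoRecoveryDaemonEnabled());
    }
}
```

With every bookie acting as a replicator, the number of concurrent ZooKeeper readers grows with the cluster size, which is exactly why a throttle on re-replication reads is needed.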