benjumanji opened a new issue, #4553: URL: https://github.com/apache/bookkeeper/issues/4553
I have the following config (shortened for brevity) on Pulsar 4.0.1:

```
bookkeeperClientRegionawarePolicyEnabled=true
reppRegionsToWrite=euw1-az3;euw1-az1;euw1-az2
reppMinimumRegionsForDurability=2
```

I have at least three bookies. If I try the policy (e3, w3, a2), then the exception here is thrown: https://github.com/apache/bookkeeper/blob/0748423e3228f7cf61d2e1f2ab11e354ed84c0df/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/RegionAwareEnsemblePlacementPolicy.java#L317

<img width="1210" alt="Screenshot 2025-01-30 at 21 01 17" src="https://github.com/user-attachments/assets/001a603c-32ce-4d1f-aba9-fea20dd17032" />

I can see that `2 <= 3 - 3/2` evaluates to true (with integer division, `3/2` is 1), so the check fires, but I am failing to see _why_ this is a bad configuration. The comment above the check reads:

```
// We must survive the failure of numRegions - effectiveMinRegionsForDurability. When these
// regions have failed we would spread the replicas over the remaining
// effectiveMinRegionsForDurability regions; we have to make sure that the ack quorum is large
// enough such that there is a configuration for spreading the replicas across
// effectiveMinRegionsForDurability - 1 regions
```

OK, so I have 3 regions, and I want 2 for durability. I can therefore only tolerate 1 region failing. If that region fails, I have two regions left, and I require two acks. I have two bookies, they can both ack, so what's the problem? Why is 4/4/3 good and 3/3/2 bad? If the argument is that the initial placement might be 2 in one region and 1 in another, why doesn't this apply to 4/4/3 (3 in one region and 1 in another)?

If we plug 3/3/2 into the comment, then we need to survive 3 - 2 = 1 failures, and we need to make sure acks cover 2 - 1 = 1 regions? Why do 3 acks and 4 writers fulfil this when 2 acks and 3 writers do not?

I guess what's eating me is that I don't want the extra tail latency, or to pay for the extra disks. I just want 3 replicas, and to survive a region out.
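
To make the arithmetic concrete, here is a minimal sketch of the check. It assumes the rejection condition is `ackQuorum <= writeQuorum - writeQuorum / minRegionsForDurability` with Java integer division, which is inferred from the `2 <= 3 - 3/2` expression above. The class and method names are hypothetical and this is not BookKeeper's actual code.

```java
// Illustrative sketch only; names are hypothetical, not BookKeeper's API.
class QuorumCheckSketch {

    // Assumed rejection condition, inferred from the `2 <= 3 - 3/2` expression.
    // Java int division truncates, so 3 / 2 == 1.
    static boolean isRejected(int writeQuorum, int ackQuorum, int minRegionsForDurability) {
        int threshold = writeQuorum - (writeQuorum / minRegionsForDurability);
        return ackQuorum <= threshold;
    }

    // Formats one configuration together with the outcome of the assumed check.
    static String describe(int writeQuorum, int ackQuorum, int minRegionsForDurability) {
        int threshold = writeQuorum - (writeQuorum / minRegionsForDurability);
        String outcome = isRejected(writeQuorum, ackQuorum, minRegionsForDurability)
                ? "rejected" : "accepted";
        return String.format("w=%d a=%d minRegions=%d: %d <= %d -> %s",
                writeQuorum, ackQuorum, minRegionsForDurability, ackQuorum, threshold, outcome);
    }

    public static void main(String[] args) {
        System.out.println(describe(3, 2, 2)); // the 3/3/2 policy from this issue
        System.out.println(describe(4, 3, 2)); // the 4/4/3 policy
        System.out.println(describe(3, 2, 1)); // 3/3/2 with a single region for durability
    }
}
```

Under that assumed condition, 3/3/2 with two regions for durability is rejected (`2 <= 2`), while 4/4/3 with the same setting is accepted (`3 <= 2` is false).
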
There doesn't seem to be a configuration possible for this.

OK, let's take the following (from the [docs](https://pulsar.apache.org/docs/4.0.x/administration-isolation-bookie/#region-aware-placement-policy)):

> For example, the BookKeeper cluster has 4 regions, and each region has several racks with their bookie instances, as shown in the following diagram. If a topic is configured with EnsembleSize=3, WriteQuorum=3, and AckQuorum=2, the BookKeeper client chooses three different regions, such as Region A, Region C and Region D. For each region, it chooses one bookie on a single rack, such as Bookie5 on Rack2, Bookie17 on Rack6, and Bookie21 on Rack8.

The only value of min regions for durability under which the expression evaluates to false for 3/3/2 is 1, which is a data-loss-ready config. So the docs are recommending either a guaranteed failure or a configuration that is impossible according to the repp validation code.

_Originally posted by @benjumanji in https://github.com/apache/pulsar/discussions/23913_
