asafm commented on code in PR #18878:
URL: https://github.com/apache/pulsar/pull/18878#discussion_r1047158588
##########
site2/docs/concepts-messaging.md:
##########
@@ -599,10 +599,157 @@ In the diagram below, **Consumer A**, **Consumer B** and
**Consumer C** are all
#### Key_Shared
-In the *Key_Shared* type, multiple consumers can attach to the same
subscription. Messages are delivered in distribution across consumers and
messages with the same key or same ordering key are delivered to only one
consumer. No matter how many times the message is re-delivered, it is delivered
to the same consumer. When a consumer connects or disconnects, it causes the
served consumer to change some message keys.
+In the *Key_Shared* type, multiple consumers can attach to the same
subscription. Messages are delivered in distribution across consumers and
messages with the same key or same ordering key are delivered to only one
consumer. No matter how many times the message is re-delivered, it is delivered
to the same consumer.

+There are three types of mapping algorithms dictating how to select a consumer
for a given message key (or ordering key): Sticky, Auto-split Hash Range, and
Auto-split Consistent Hashing. The steps for all algorithms are:
+1. The message key (or ordering key) is passed to a hash function (e.g.,
Murmur3 32-bit), yielding a 32-bit integer hash.
+2. That hash number is fed to the algorithm to select a consumer from the
existing connected consumers.
+
+```
+                   +---------------+                              +-----------+
+Message Key -----> / Hash Function / ----- hash (32-bit) -------> / Algorithm / ----> Consumer
+                   +---------------+                              +-----------+
+```
+
+
+When a new consumer connects and is added to the list of connected consumers,
the algorithm re-adjusts the mapping so that some keys currently mapped to
existing consumers are mapped to the newly added consumer. When a consumer
disconnects and is removed from the list of connected consumers, the keys
mapped to it are mapped to other consumers. The sections below explain, for
each algorithm, how a consumer is selected given the message hash and how the
mapping is adjusted when a new consumer connects or an existing consumer
disconnects.
+
+##### Auto-split Hash Range
+
+The algorithm assumes a range of numbers from 0 to 2^16 (65,536). Each
consumer is mapped to a single region in this range, so that the regions
together cover the entire range and no regions overlap. A consumer is selected
for a given key by taking the message hash modulo the range size (65,536). The
resulting index (0 <= i < 65,536) is contained within a single region, and the
consumer mapped to that region is the one selected.
+
+Example:
+
+Suppose we have 4 consumers (C1, C2, C3 and C4), then:
+
+```
+ 0                 16,384            32,768            49,152            65,536
+ |------- C3 ------|------- C2 ------|------- C1 ------|------- C4 ------|
+```
+
+Given a message key `Order-3459134`, its hash would be
`murmur32("Order-3459134") = 3112179635`, and its index in the range would be
`3112179635 mod 65536 = 6067`. That index is contained within region
`[0, 16384)`, thus consumer C3 will be mapped to this message key.
+
+When a new consumer is connected, the largest region is chosen and split in
half: the lower half is mapped to the newly added consumer and the upper half
stays mapped to the consumer that owned the region. Here is how it looks from
1 to 4 consumers:
+
+```
+C1 connected:
+|---------------------------------- C1 ---------------------------------|
+
+C2 connected:
+|--------------- C2 ----------------|---------------- C1 ---------------|
+
+C3 connected:
+|------- C3 ------|------- C2 ------|---------------- C1 ---------------|
+
+C4 connected:
+|------- C3 ------|------- C2 ------|------- C1 ------|------- C4 ------|
+```
+
+When a consumer is disconnected, its region is merged into the region on its
right. Examples:
+
+C4 is disconnected:
Review Comment:
<img width="1050" alt="image"
src="https://user-images.githubusercontent.com/989425/207334474-4e0ded80-61f3-4d9c-b9ce-28a4aaa1e6c0.png">
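
For reference, here is a minimal Python sketch of the selection and split rules as the quoted prose states them. It is a toy model, not Pulsar's implementation: the class name `AutoSplitHashRange`, its methods, and the leftmost-wins tie-break for equally sized regions are assumptions made for illustration. The hash is passed in as an integer rather than computed, so no Murmur3 library is needed. Disconnects are left out because the quoted text is cut off before its merge examples.

```python
RANGE_SIZE = 2 ** 16  # 65,536, the range size used in the quoted text


class AutoSplitHashRange:
    """Toy model of the Auto-split Hash Range mapping (hypothetical names)."""

    def __init__(self):
        # Regions as (start, end, consumer), sorted by start; end is exclusive.
        self.regions = []

    def add_consumer(self, name):
        if not self.regions:
            # The first consumer owns the whole range.
            self.regions.append((0, RANGE_SIZE, name))
            return
        # Pick the largest region; ties go to the leftmost one (an assumption).
        idx = max(
            range(len(self.regions)),
            key=lambda i: (self.regions[i][1] - self.regions[i][0], -i),
        )
        start, end, owner = self.regions[idx]
        mid = (start + end) // 2
        # Lower half goes to the new consumer, upper half stays with the owner.
        self.regions[idx:idx + 1] = [(start, mid, name), (mid, end, owner)]

    def select(self, key_hash):
        # Reduce the 32-bit hash to an index in [0, RANGE_SIZE).
        index = key_hash % RANGE_SIZE
        for start, end, consumer in self.regions:
            if start <= index < end:
                return consumer
        raise LookupError("no consumers connected")


selector = AutoSplitHashRange()
for consumer in ["C1", "C2", "C3", "C4"]:
    selector.add_consumer(consumer)

# Hash value taken from the example in the diff; its index is 6067.
print(selector.select(3112179635))
```

Applying the "lower half to the new consumer" rule literally, the C4 step in this sketch places C4 in `[32768, 49152)` and leaves C1 in `[49152, 65536)`. The index 6067 falls in `[0, 16384)`, which C3 owns.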