This is an automated email from the ASF dual-hosted git repository.

rxl pushed a commit to branch branch-2.6
in repository https://gitbox.apache.org/repos/asf/pulsar.git
commit 14710a992750a945532b0d3bf40a0aa0de769a79
Author: HuanliMeng <48120384+huanli-m...@users.noreply.github.com>
AuthorDate: Wed Jun 24 09:14:15 2020 +0800

    chunking for PIP-37 support large message size (#7334)

    ### Motivation
    Create docs for PIP-37 to support large size of messages in pulsar.

    ### Modifications
    Add the chunking section in Pulsar Messaging doc.
    Add two images about how to handle chunked messages with one or more producers

    (cherry picked from commit 3708c9395039e6af0a36b7c87f51a9115f0028f9)
---
 site2/docs/assets/chunking-01.png | Bin 0 -> 11881 bytes
 site2/docs/assets/chunking-02.png | Bin 0 -> 30135 bytes
 site2/docs/concepts-messaging.md  |  26 ++++++++++++++++++++++++++
 3 files changed, 26 insertions(+)

diff --git a/site2/docs/assets/chunking-01.png b/site2/docs/assets/chunking-01.png
new file mode 100644
index 0000000..f83b747
Binary files /dev/null and b/site2/docs/assets/chunking-01.png differ
diff --git a/site2/docs/assets/chunking-02.png b/site2/docs/assets/chunking-02.png
new file mode 100644
index 0000000..eb47a86
Binary files /dev/null and b/site2/docs/assets/chunking-02.png differ
diff --git a/site2/docs/concepts-messaging.md b/site2/docs/concepts-messaging.md
index 5805971..095fe86 100644
--- a/site2/docs/concepts-messaging.md
+++ b/site2/docs/concepts-messaging.md
@@ -59,6 +59,32 @@ To avoid redelivering acknowledged messages in a batch to the consumer, Pulsar i
 By default, batch index acknowledgement is disabled (`batchIndexAcknowledgeEnable=false`). You can enable batch index acknowledgement by setting the `batchIndexAcknowledgeEnable` parameter to `true` at the broker side. Enabling batch index acknowledgement may bring more memory overheads. So, perform this operation with caution.
 
+### Chunking
+
+#### Note
+
+> - Batching and chunking cannot be enabled simultaneously. To enable chunking, you must disable batching in advance.
+> - Chunking is only supported for persistent topics.
+> - Chunking is only supported for the exclusive and failover subscription modes.
+
+When chunking is enabled (`chunkingEnabled=true`) and a message is larger than the maximum allowed publish-payload size, the producer splits the original message into chunked messages and publishes them, with chunked metadata, to the broker separately and in order. The broker stores the chunked messages in the managed-ledger in the same way as ordinary messages. The only difference is that the consumer needs to buffer the chunked messages and combine them into the [...]
+
+The consumer consumes the chunked messages and buffers them until it has received all the chunks of a message. The consumer then stitches the chunks together and places the resulting message into the receiver queue, from which the client can consume it. Once the consumer consumes the entire large message and acknowledges it, the consumer internally sends acknowledgements for all the chunked messages associated with that large message. You can set the `maxPendingChuckedMessage` par [...]
+
+The broker does not require any changes to support chunking for non-shared subscriptions. The broker only uses the `chuckedMessageRate` to record the chunked message rate on the topic.
+
+#### Handle chunked messages with one producer and one ordered consumer
+
+As shown in the following figure, a topic can have one producer that publishes large message payloads in chunked messages along with regular non-chunked messages. The producer publishes message M1 in three chunks: M1-C1, M1-C2 and M1-C3. The broker stores all three chunked messages in the managed-ledger and dispatches them to the ordered (exclusive/failover) consumer in the same order. The consumer buffers all the chunked messages in memory until it receives all the chunked messages, combines them [...]
+
+![](assets/chunking-01.png)
+
+#### Handle chunked messages with multiple producers and one ordered consumer
+
+When multiple producers publish chunked messages into a single topic, the broker stores all the chunked messages coming from the different producers in the same managed-ledger. As shown below, Producer 1 publishes message M1 in three chunks: M1-C1, M1-C2 and M1-C3. Producer 2 publishes message M2 in three chunks: M2-C1, M2-C2 and M2-C3. All chunked messages of a given message are still in order but might not be consecutive in the managed-ledger. This brings some memory pressure [...]
+
+![](assets/chunking-02.png)
+
 ## Consumers
 
 A consumer is a process that attaches to a topic via a subscription and then receives messages.
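
The split-and-reassemble flow described in the added Chunking section can be illustrated with a minimal, self-contained Java sketch. It is not the Pulsar client implementation and does not use the Pulsar API: the class `ChunkingSketch`, the `Chunk` fields, and the `Reassembler` buffer are hypothetical names chosen for illustration only. It assumes, as the doc states, that the chunks of one message arrive in order but may be interleaved with chunks from another producer.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical, not Pulsar APIs.
class ChunkingSketch {

    /** One chunk plus the metadata a consumer would need to reassemble the message. */
    static final class Chunk {
        final String messageId;  // shared by all chunks of one original message
        final int chunkId;       // 0-based position within the message
        final int totalChunks;
        final byte[] payload;

        Chunk(String messageId, int chunkId, int totalChunks, byte[] payload) {
            this.messageId = messageId;
            this.chunkId = chunkId;
            this.totalChunks = totalChunks;
            this.payload = payload;
        }
    }

    /** Producer side: split a payload into ordered chunks of at most maxChunkSize bytes. */
    static List<Chunk> split(String messageId, byte[] payload, int maxChunkSize) {
        if (maxChunkSize <= 0) {
            throw new IllegalArgumentException("maxChunkSize must be positive");
        }
        int total = Math.max(1, (payload.length + maxChunkSize - 1) / maxChunkSize);
        List<Chunk> chunks = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            int from = i * maxChunkSize;
            int to = Math.min(payload.length, from + maxChunkSize);
            chunks.add(new Chunk(messageId, i, total, Arrays.copyOfRange(payload, from, to)));
        }
        return chunks;
    }

    /** Consumer side: buffer chunks per message until the last one arrives. */
    static final class Reassembler {
        private final Map<String, List<byte[]>> pending = new HashMap<>();

        /** Returns the full payload once all chunks are buffered, otherwise null. */
        byte[] accept(Chunk c) {
            List<byte[]> parts = pending.computeIfAbsent(c.messageId, k -> new ArrayList<>());
            if (c.chunkId != parts.size()) {
                // Out-of-order chunk: this sketch simply drops the partial message.
                pending.remove(c.messageId);
                return null;
            }
            parts.add(c.payload);
            if (parts.size() < c.totalChunks) {
                return null;
            }
            pending.remove(c.messageId);
            int length = 0;
            for (byte[] p : parts) {
                length += p.length;
            }
            byte[] full = new byte[length];
            int offset = 0;
            for (byte[] p : parts) {
                System.arraycopy(p, 0, full, offset, p.length);
                offset += p.length;
            }
            return full;
        }

        /** Number of messages that are still partially buffered. */
        int pendingMessages() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        List<Chunk> m1 = split("M1", "hello chunked world".getBytes(StandardCharsets.UTF_8), 8);
        List<Chunk> m2 = split("M2", "second message!".getBytes(StandardCharsets.UTF_8), 5);
        Reassembler consumer = new Reassembler();
        // Interleave the two producers' chunks, as in the multi-producer figure.
        for (int i = 0; i < 3; i++) {
            for (Chunk c : Arrays.asList(m1.get(i), m2.get(i))) {
                byte[] full = consumer.accept(c);
                if (full != null) {
                    System.out.println(c.messageId + ": " + new String(full, StandardCharsets.UTF_8));
                }
            }
        }
    }
}
```

Keying the buffer by message identifier is what lets interleaved chunks from several producers be reassembled independently. It also shows where the memory pressure mentioned above comes from: every incomplete message stays in the buffer until its last chunk arrives.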