PIP: https://github.com/apache/pulsar/issues/18510

Problem Statement

When a partition of a partitioned topic is unavailable for producing, the
Pulsar client currently keeps trying to produce messages to that
partition, even when it does not need to. In certain cases the client
could simply pick another partition and produce there instead.

Partition Unavailable

A partition can become unavailable for many reasons. The most prominent
one is that the partition is moving from one broker to another: until
every actor agrees on which broker owns the partition, the partition is
unavailable for producing. The actors are the producers, the old broker,
and the new broker.

Client Behavior

This is the typical produce code.

producer.sendAsync(payLoad.getBytes(StandardCharsets.UTF_8));

When sendAsync is called, the message is enqueued in a queue (the pending
message queue) and a future is returned.

The future is completed only after the message has been picked from the
queue and sent to the broker asynchronously, and the ack has been
received, again asynchronously. The maximum size of the pending message
queue is controlled by the producer config maxPendingMessages.

When the pending message queue is full, the application starts getting
publish failures. The pending message queue provides a cushion against
unavailable partitions, but only up to that limit.
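
The queueing behavior described above can be illustrated with a small
stand-alone model. This is only a sketch, not the Pulsar client's actual
implementation: the class PendingQueueSketch and its methods are
hypothetical names, the payload is ignored, and a "broker ack" is
simulated by calling ackOldest by hand.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of a producer's pending message queue.
// It only mirrors the behavior described in the text above.
final class PendingQueueSketch {
    private final int maxPendingMessages;
    private final Queue<CompletableFuture<String>> pending = new ArrayDeque<>();

    PendingQueueSketch(int maxPendingMessages) {
        this.maxPendingMessages = maxPendingMessages;
    }

    // Enqueues the message and returns its future immediately.
    // When the queue is already full, the returned future is failed right away.
    synchronized CompletableFuture<String> sendAsync(byte[] payload) {
        CompletableFuture<String> future = new CompletableFuture<>();
        if (pending.size() >= maxPendingMessages) {
            future.completeExceptionally(
                    new IllegalStateException("pending message queue is full"));
            return future;
        }
        pending.add(future);
        return future;
    }

    // Simulates the broker ack for the oldest pending message.
    // Returns false when nothing is pending.
    synchronized boolean ackOldest(String messageId) {
        CompletableFuture<String> future = pending.poll();
        if (future == null) {
            return false;
        }
        future.complete(messageId);
        return true;
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

With a limit of 2, a third sendAsync fails immediately while the first
two futures stay incomplete until ackOldest is called. While a partition
is unavailable no acks arrive, so the queue fills up and every further
send fails.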

When another partition can be picked

   1. When the message is not keyed, that is, the message is not ordered
      based on a key.
   2. When the routing mode is round-robin, that is, a message can be
      produced to any of the partitions. If a partition is unavailable,
      the client can try another partition, picked with the same
      round-robin algorithm.
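
The two conditions above can be sketched as a routing function. This is
an illustration under stated assumptions, not the proposed
implementation: AvailabilityAwareRouterSketch and choosePartition are
hypothetical names, the set of unavailable partitions is assumed to be
supplied by the caller, and String.hashCode stands in for whatever key
hashing the client really uses.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin router that skips unavailable partitions
// for messages without a key.
final class AvailabilityAwareRouterSketch {
    private final int numPartitions;
    private final AtomicInteger counter = new AtomicInteger();

    AvailabilityAwareRouterSketch(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    // Returns the chosen partition index, or -1 when the message has no key
    // and every partition is unavailable.
    int choosePartition(String key, Set<Integer> unavailable) {
        if (key != null) {
            // Keyed messages must stay on the key's partition to preserve
            // ordering, even if that partition is currently unavailable.
            // Assumption: String.hashCode is a placeholder hash.
            return Math.floorMod(key.hashCode(), numPartitions);
        }
        // Non-keyed: advance the same round-robin counter until an available
        // partition is found, trying each partition at most once.
        for (int attempt = 0; attempt < numPartitions; attempt++) {
            int candidate = Math.floorMod(counter.getAndIncrement(), numPartitions);
            if (!unavailable.contains(candidate)) {
                return candidate;
            }
        }
        return -1;
    }
}
```

With three partitions and partition 1 unavailable, successive non-keyed
calls alternate between partitions 0 and 2. A keyed message always maps
to the same partition, so it gains nothing from the fallback and still
depends on the pending message queue.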
