Dear Pulsar Community,

I would like to start a discussion about improving how the message
backlog size is estimated.

In the current implementation, the backlog size is estimated as the
size of all entries between the mark delete position and the last
confirmed position. The backlog message count covers the same range
but subtracts the individually acknowledged messages. This
inconsistency between the two values could confuse users.

For instance, suppose a topic contains 3,000 messages and a
subscription has acknowledged all of them except messages 1:0, 1:998,
and 3:999. When users retrieve the stats of the subscription, they
will find that `msgBacklog` is 3, while `backlogSize` is
3000 * entry size.

    |1:0|...|1:998|...|3:999|

When it comes to the value of `backlogSize`, there seem to be two
different opinions:
1. The backlog size should be consistent with the message backlog, and
it should not include the messages that have been individually
acknowledged.
2. Only the messages before the mark delete position can be deleted,
so we should calculate the backlog size from the mark delete position,
and individual acknowledgments should not affect the calculation of
the backlog size.
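
To make the two options concrete, here is a minimal sketch of the
example above. It is a toy model and does not use any Pulsar classes:
the class and method names are hypothetical, and it assumes three
ledgers of 1,000 entries each and a uniform entry size of 1024 bytes.

```java
import java.util.Arrays;

/** Toy model of the example in this mail; not Pulsar code. */
public class BacklogSketch {

    /** Index of the first unacknowledged entry, i.e. the entry right after the mark delete position. */
    static int firstUnacked(boolean[] acked) {
        int i = 0;
        while (i < acked.length && acked[i]) {
            i++;
        }
        return i;
    }

    /** Message backlog: unacknowledged entries after the mark delete position. */
    static long msgBacklog(boolean[] acked) {
        long count = 0;
        for (int i = firstUnacked(acked); i < acked.length; i++) {
            if (!acked[i]) {
                count++;
            }
        }
        return count;
    }

    /** Option 2: every entry after the mark delete position counts, acknowledged or not. */
    static long sizeFromMarkDelete(boolean[] acked, long entrySize) {
        return (long) (acked.length - firstUnacked(acked)) * entrySize;
    }

    /** Option 1: individually acknowledged entries are excluded from the size. */
    static long sizeExcludingIndividualAcks(boolean[] acked, long entrySize) {
        return msgBacklog(acked) * entrySize;
    }

    /** 3,000 entries flattened into one array; assumes ledgers 1..3 hold 1,000 entries each. */
    static boolean[] exampleFromThisMail() {
        boolean[] acked = new boolean[3000];
        Arrays.fill(acked, true);
        acked[0] = false;     // 1:0
        acked[998] = false;   // 1:998
        acked[2999] = false;  // 3:999
        return acked;
    }

    public static void main(String[] args) {
        boolean[] acked = exampleFromThisMail();
        long entrySize = 1024; // assumed uniform entry size in bytes
        System.out.println("msgBacklog           = " + msgBacklog(acked));
        System.out.println("backlogSize option 1 = " + sizeExcludingIndividualAcks(acked, entrySize));
        System.out.println("backlogSize option 2 = " + sizeFromMarkDelete(acked, entrySize));
    }
}
```

Under these assumptions the sketch reports a message backlog of 3, a
backlog size of 3,072 bytes for option 1, and 3,072,000 bytes for
option 2, which is the gap users currently see in the stats.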

I'm interested in hearing how others view this issue. I look forward
to your response.

Best Regards,
Xiangying
