[
https://issues.apache.org/jira/browse/PULSAR-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299027#comment-17299027
]
Piyush Mishra commented on PULSAR-10:
-------------------------------------
Hi,
I am interested in working on this issue. I already have good experience using
Kafka in production systems, so Pulsar looks like a good project for me to
contribute to.
> Improve the message backlogs for the topic
> ------------------------------------------
>
> Key: PULSAR-10
> URL: https://issues.apache.org/jira/browse/PULSAR-10
> Project: Pulsar
> Issue Type: Improvement
> Reporter: Penghui Li
> Priority: Major
> Labels: Pulsar, gsoc, gsoc2021, mentor
>
> In Pulsar, the client usually sends several messages in a batch. On the
> broker side, the broker receives a batch and writes the batch message to the
> storage layer.
> The message backlog tracks how many messages still need to be handled for a
> subscription. Unfortunately, the current backlog is based on batches, not
> messages. This confuses users: they have pushed 1000 messages to the topic,
> but when they check the backlog on the subscription side, it returns a value
> lower than 1000, such as 100 batches. The message-based backlog is not
> available because it is too expensive to calculate the number of messages in
> each batch.
>
> PIP-70
> [https://github.com/apache/pulsar/wiki/PIP-70%3A-Introduce-lightweight-raw-Message-metadata]
> introduced broker-level entry metadata that can support a message index for a
> topic (or a message offset of a topic). This makes it possible to calculate
> the number of messages between one message index and another. So we can
> leverage PIP-70 to improve the message backlog implementation so that it can
> report the message-based backlog.
>
> For the Exclusive or Failover subscription, this is easy to implement by
> calculating the messages between the mark-delete position and the LAC
> position. But for the Shared and Key_Shared subscriptions, individual
> acknowledgments bring some complexity. We can cache the individual
> acknowledgment count in broker memory, so the message backlog for the Shared
> and Key_Shared subscriptions is calculated as
> `backlogOfTheMarkdeletePosition` - `IndividualAckCount`
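
The calculation described in the issue can be sketched with a small Python
model. This is only an illustration under stated assumptions, not Pulsar broker
code: the `Entry` class, its `last_index` field, and all function names are
hypothetical. It assumes each stored entry (batch) carries the index of its
last message, in the spirit of the PIP-70 entry metadata, and that
`individual_ack_count` counts messages acknowledged individually beyond the
mark-delete position.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One stored batch (hypothetical model of a ledger entry)."""
    num_messages: int  # messages contained in this batch
    last_index: int    # assumed broker-assigned index of the batch's last message


def append_batches(batch_sizes):
    """Build entries, assigning a running message index to each batch."""
    entries = []
    next_index = 0
    for size in batch_sizes:
        next_index += size
        entries.append(Entry(num_messages=size, last_index=next_index - 1))
    return entries


def backlog_of_mark_delete_position(entries, mark_delete_pos):
    """Messages between the mark-delete position and the last entry (LAC).

    mark_delete_pos is the position of the last fully acknowledged entry,
    or -1 if nothing has been acknowledged yet.
    """
    if not entries:
        return 0
    lac_index = entries[-1].last_index
    md_index = entries[mark_delete_pos].last_index if mark_delete_pos >= 0 else -1
    return lac_index - md_index


def shared_backlog(entries, mark_delete_pos, individual_ack_count):
    """backlogOfTheMarkdeletePosition - IndividualAckCount."""
    return backlog_of_mark_delete_position(entries, mark_delete_pos) - individual_ack_count


# 1000 messages published as 100 batches of 10 messages each.
entries = append_batches([10] * 100)
batch_backlog = len(entries)                                     # batch-based view
message_backlog = backlog_of_mark_delete_position(entries, -1)   # message-based view
```

With these inputs the batch-based view counts 100 entries, while the
index-based view counts 1000 messages. Only two index lookups are needed, so
the batches themselves never have to be opened and counted.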
--
This message was sent by Atlassian Jira
(v8.3.4#803005)