Hi all,
Currently we have the following 5 implementations of MessageId:
- MessageIdImpl: (ledger id, entry id, partition index)
- BatchMessageIdImpl: adds (batch index, batch size, acker), where
acker is a wrapper of a BitSet.
- ChunkMessageIdImpl: adds another MessageIdImpl that represents
the first MessageIdImpl of a BitSet.
- MultiMessageIdImpl: adds a map that maps the topic name to the
MessageId.
- TopicMessageIdImpl: adds the topic name and the partition name
These implementations are such a mess. For example, when users get a
MessageId from `Producer#send`:
```java
var id = producer.send("msg");
```
There is no getter to get some specific fields like ledger id. You can
only see a representation from `toString` method and got some output
like "0:0:-1:0". Maybe this design is to hidden some details, but if
users don't know the details like ledger id and entry id, how could
you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s
implementation changed? Should it be treated as a breaking change?
The original definition of the underlying MessageIdData is much more
clear:
```proto
message MessageIdData {
required uint64 ledgerId = 1;
required uint64 entryId = 2;
optional int32 partition = 3 [default = -1];
optional int32 batch_index = 4 [default = -1];
repeated int64 ack_set = 5;
optional int32 batch_size = 6;
// For the chunk message id, we need to specify the first chunk message id.
optional MessageIdData first_chunk_message_id = 7;
}
```
IMO, MessageId should be a wrapper of MessageIdData. It's more natural
to have an interface like:
```java
interface MessageId {
long ledgerId();
long entryId();
Optional<Integer> partition();
Optional<Integer> batchIndex();
// ...
```
Additionally, there are many places that use only the triple of
(ledger id, entry id, batch index) as the key to represent the position.
Currently, they are done by adding a conversion from
BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive to
write something like:
```java
class MessageIdPosition implements Comparable<MessageIdPosition> {
private final MessageId messageId;
// TODO: compare only the triple (ledger, entry, batch)
```
Therefore, I'm going to write a proposal to redesign the MessageId
interface only by adding some getters. Regarding the 5 existing
implementations, I think we can drop them because they are a part
of `pulsar-client`, not `pulsar-client-api`.
Please feel free to share your points.
Thanks,
Yunze