Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Heesung Sohn
> In this case, the consumer only can receive m1. Regarding this comment, can you explain how the consumer only receives m1? Here, m1's and m2's uuid and msgId will be different (if we suffix with a chunkingSessionId), although the sequence id is the same. > If we throw an exception when users use

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Xiangying Meng
Hi Heesung, Maybe we only need to maintain the last chunk ID in a map. Map map1. And we already have a map maintaining the last sequence ID. Map map2. If we do not throw an exception when users use the same sequence to send the message. For any incoming msg, m : chunk ID = -1; If m is a chunk me

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Heesung Sohn
However, what if the producer JVM gets restarted after the broker persists the m1 (but before updating their sequence id in their persistence storage), and the producer is trying to resend the same msg (so m2) with the same sequence id after restarting? On Fri, Aug 25, 2023 at 8:22 PM Xiangying

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Heesung Sohn
Don't we still need the broker dedup logic for the above case? Then, probably brokers need to track the following. Map // additionally track Map ChunkingContext{ uuid, numChunks, lastChunkId } The chunked msg dedup logic might be like: For any incoming chunked msg, m : If m.currentSeqid < Last
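
A minimal Java sketch of the kind of broker-side tracking described above, assuming a per-producer map of the last completed sequence ID plus a per-producer `ChunkingContext{ uuid, numChunks, lastChunkId }`. The preview is cut off before the full rule is stated, so the acceptance rules below are an illustrative guess rather than the logic proposed in the thread, and `ChunkDedupSketch` and `accept` are hypothetical names, not Pulsar broker APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the Pulsar broker's actual deduplication code.
final class ChunkDedupSketch {
    // Fields follow the ChunkingContext{ uuid, numChunks, lastChunkId } shape from the thread.
    static final class ChunkingContext {
        final String uuid;
        final int numChunks;
        int lastChunkId = -1; // no chunk accepted yet

        ChunkingContext(String uuid, int numChunks) {
            this.uuid = uuid;
            this.numChunks = numChunks;
        }
    }

    // Last sequence ID of a fully persisted message, per producer.
    private final Map<String, Long> lastSequenceIds = new HashMap<>();
    // In-flight chunked message, per producer.
    private final Map<String, ChunkingContext> chunkingContexts = new HashMap<>();

    /** Returns true if the chunk should be persisted, false if it is treated as a duplicate. */
    boolean accept(String producerName, long sequenceId, String uuid, int chunkId, int numChunks) {
        long lastSequenceId = lastSequenceIds.getOrDefault(producerName, -1L);
        if (sequenceId <= lastSequenceId) {
            return false; // the whole message with this sequence ID was already persisted
        }
        ChunkingContext ctx = chunkingContexts.get(producerName);
        if (ctx == null || !ctx.uuid.equals(uuid)) {
            if (chunkId != 0) {
                return false; // a new chunked message must start at chunk 0
            }
            ctx = new ChunkingContext(uuid, numChunks);
            chunkingContexts.put(producerName, ctx);
        }
        if (chunkId != ctx.lastChunkId + 1) {
            return false; // duplicated or out-of-order chunk
        }
        ctx.lastChunkId = chunkId;
        if (chunkId == ctx.numChunks - 1) {
            // Last chunk: the message is complete, so advance the sequence ID.
            lastSequenceIds.put(producerName, sequenceId);
            chunkingContexts.remove(producerName);
        }
        return true;
    }
}
```

Under these assumed rules, a resend of a completed message with the same sequence ID is rejected as a whole, while a resend that starts over with a new uuid before completion opens a fresh context.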

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Xiangying Meng
Hi Heesung, In this case, the consumer only can receive m1. But it has the same content as the previous case: What should we do if the user sends messages with the sequence ID that was used previously? I am afraid to introduce the incompatibility in this case, so I only added a warning log in th

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Heesung Sohn
Actually, can we think about this case too? What happens if the cx sends the same chunked msg with the same seq id when dedup is enabled? // user send a chunked msg, m1 s1, c0 s1, c1 s1, c2 // complete // user resend the duplicate msg, m2 s1, c0 s1, c1 s1, c2 //complete Do consumers receive m1

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Xiangying Meng
Hi Heesung, >I think this means, for the PIP, the broker side's chunk deduplication. >I think brokers probably need to track map to dedup What is the significance of doing this? My understanding is that if the client crashes and restarts after sending half of a chunk message and then it resends t

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Heesung Sohn
I think this means, for the PIP, the broker side's chunk deduplication. I think brokers probably need to track map to dedup chunks on the broker side. On Fri, Aug 25, 2023 at 6:16 PM Xiangying Meng wrote: > Hi Heesung > > It is a good point. > Assume the producer application jvm restarts in t

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Xiangying Meng
Hi Heesung It is a good point. Assume the producer application jvm restarts in the middle of chunking and resends the message chunks from the beginning with the previous sequence id. For the previous version, it should be: Producer send: 1. SequenceID: 0, ChunkID: 0 2. SequenceID: 0, ChunkID: 1

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Heesung Sohn
Hi, I meant: what if the producer application JVM restarts in the middle of chunking and resends the message chunks from the beginning with the previous sequence id? On Fri, Aug 25, 2023 at 5:15 PM Xiangying Meng wrote: > Hi Heesung > > It is a good idea to cover this incompatibility case if t

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Xiangying Meng
Hi Heesung It is a good idea to cover this incompatibility case if the producer splits the chunk message again when retrying. But in fact, the producer only resends the chunks that are assembled to `OpSendMsg` instead of splitting the chunk message again. So, there is no incompatibility issue of
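
The distinction drawn here (resend the already-built chunks instead of splitting the message again) can be illustrated with a small Java sketch. It assumes a producer that splits a payload once and keeps the resulting chunks in a pending queue until they are acknowledged. `PendingOp` is a hypothetical stand-in for `OpSendMsg`, and none of these names are taken from the Pulsar client.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; PendingOp only stands in for the client's OpSendMsg.
final class ResendSketch {
    static final class PendingOp {
        final String uuid;
        final long sequenceId;
        final int chunkId;
        final int numChunks;
        final String payload;

        PendingOp(String uuid, long sequenceId, int chunkId, int numChunks, String payload) {
            this.uuid = uuid;
            this.sequenceId = sequenceId;
            this.chunkId = chunkId;
            this.numChunks = numChunks;
            this.payload = payload;
        }
    }

    private final Deque<PendingOp> pending = new ArrayDeque<>();

    // Split once, at send time; chunk boundaries are fixed from here on.
    void send(String uuid, long sequenceId, String payload, int chunkSize) {
        int numChunks = (payload.length() + chunkSize - 1) / chunkSize;
        for (int i = 0; i < numChunks; i++) {
            int end = Math.min(payload.length(), (i + 1) * chunkSize);
            pending.addLast(new PendingOp(uuid, sequenceId, i, numChunks,
                    payload.substring(i * chunkSize, end)));
        }
    }

    // The broker acknowledged the oldest pending chunk.
    void ackFirst() {
        pending.pollFirst();
    }

    // On reconnect, resend the same pending ops; nothing is split again.
    List<PendingOp> resendPending() {
        return new ArrayList<>(pending);
    }
}
```

Because the retry path reuses the same objects, the uuid, chunk IDs, and chunk boundaries of a resent chunk match the first attempt in this model. A restarted producer process has no such queue, which is the separate case raised elsewhere in the thread.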

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Heesung Sohn
>> I think brokers can track the last chunkMaxMessageSize for each producer. > Using different chunkMaxMessageSize is just one of the aspects. In PIP-132 [0], we have included the message metadata size when checking maxMessageSize. The message metadata can be changed after splitting the chunks. We

[DISCUSS] Release DotPulsar 3.0.0

2023-08-25 Thread David Jensen
Dear Apache PMC and Committers Daniel Blankensteiner (blankensteiner) and I would like to announce we are soon ready to release DotPulsar 3.0.0. The release contains breaking changes, therefore we bump to a new major version. Changelog ### Added - Added partitioned topic support for the Consume

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Xiangying Meng
Hi Zike, >How would this happen to get two duplicated and consecutive ChunkID-1 >messages? The producer should guarantee to retry the whole chunked >messages instead of some parts of the chunks. If the producer guarantees to retry the whole chunked messages instead of some parts of the chunks, Wh

Re: [VOTE] PIP 296: Introduce the `getLastMessageIds` API to Reader

2023-08-25 Thread Yubiao Feng
+1 (non-binding) Thanks Yubiao Feng On Fri, Aug 25, 2023 at 2:53 PM Xiangying Meng wrote: > Hi Pulsar Community, > > This is the vote thread for PIP 296: > https://github.com/apache/pulsar/pull/21052 > > This PIP will help to improve the flexibility of Reader usage. > > Thanks, > Xiangying >

Re: [VOTE] PIP 296: Introduce the `getLastMessageIds` API to Reader

2023-08-25 Thread Zike Yang
+1 (non-binding) Thanks, Zike Yang On Fri, Aug 25, 2023 at 2:52 PM Xiangying Meng wrote: > > Hi Pulsar Community, > > This is the vote thread for PIP 296: > https://github.com/apache/pulsar/pull/21052 > > This PIP will help to improve the flexibility of Reader usage. > > Thanks, > Xiangying

Re: [DISCUSS]PIP-295: Fixing Chunk Message Duplication Issue

2023-08-25 Thread Zike Yang
Hi Xiangying > The rewind operation is seen in the test log. That seems weird. Not sure if this rewind is related to the chunk consuming. > 1. SequenceID: 0, ChunkID: 0 2. SequenceID: 0, ChunkID: 1 3. SequenceID: 0, ChunkID: 1 4. SequenceID: 0, ChunkID: 2 Such four chunks cannot be processed cor
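
To make the four-chunk sequence quoted above concrete, here is a simplified Java model of a consumer-side assembler that requires each chunk ID to be exactly one more than the previous one and discards the partial message otherwise. `StrictChunkAssembler` is a hypothetical name and the discard rule is an assumption for illustration, not a copy of the Pulsar consumer code.

```java
// Hypothetical sketch of strict in-order chunk assembly for one chunked message.
final class StrictChunkAssembler {
    private final int numChunks;
    private final StringBuilder buffer = new StringBuilder();
    private int lastChunkId = -1;
    private boolean discarded = true; // nothing to extend until a chunk 0 arrives

    StrictChunkAssembler(int numChunks) {
        this.numChunks = numChunks;
    }

    /** Returns the assembled payload when the last chunk completes it, otherwise null. */
    String onChunk(int chunkId, String payload) {
        if (chunkId == 0) {
            // Chunk 0 always starts a fresh assembly.
            buffer.setLength(0);
            lastChunkId = -1;
            discarded = false;
        }
        if (discarded || chunkId != lastChunkId + 1) {
            // Unexpected chunk ID: drop everything collected so far.
            buffer.setLength(0);
            discarded = true;
            return null;
        }
        buffer.append(payload);
        lastChunkId = chunkId;
        return chunkId == numChunks - 1 ? buffer.toString() : null;
    }
}
```

Feeding chunk IDs 0, 1, 1, 2 into this model never yields a message: the second chunk 1 breaks the expected order, the context is dropped, and chunk 2 has nothing to extend. The sequence 0, 1, 2 assembles normally.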