First, I agree with Yubiao that we can avoid calling the `isDuplicate`
method once this option is enabled.

Then, I'm wondering in which cases users would want to disable this
option. What is the disadvantage of disabling it? I think we can
just record the latest position (ledger id, entry id, batch index) of
the message received if the subscription type is Exclusive or
Failover.
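
To make the idea concrete, here is a rough sketch of such a filter. It is
only an illustration under my own assumptions: the class and method names
are hypothetical and are not part of the Pulsar client API, it assumes
positions arrive in order on an Exclusive or Failover subscription, and it
is not thread-safe.

```java
// Hypothetical sketch; names are illustrative, not the Pulsar client API.
final class LastReceivedPositionFilter {
    // Position of the latest accepted message; -1 means nothing recorded yet.
    private long lastLedgerId = -1;
    private long lastEntryId = -1;
    private int lastBatchIndex = -1;

    /**
     * Returns true and records the position if it is strictly after the
     * last recorded one; returns false for a position already seen.
     */
    boolean accept(long ledgerId, long entryId, int batchIndex) {
        if (compareToLast(ledgerId, entryId, batchIndex) <= 0) {
            return false; // duplicate or older message: filter it out
        }
        lastLedgerId = ledgerId;
        lastEntryId = entryId;
        lastBatchIndex = batchIndex;
        return true;
    }

    /** Forgets the recorded position, e.g. before a full redelivery. */
    void reset() {
        lastLedgerId = -1;
        lastEntryId = -1;
        lastBatchIndex = -1;
    }

    // Orders positions by ledger id, then entry id, then batch index.
    private int compareToLast(long ledgerId, long entryId, int batchIndex) {
        int c = Long.compare(ledgerId, lastLedgerId);
        if (c != 0) {
            return c;
        }
        c = Long.compare(entryId, lastEntryId);
        if (c != 0) {
            return c;
        }
        return Integer.compare(batchIndex, lastBatchIndex);
    }
}
```

In this sketch, a message would be handed to the application only when
`accept` returns true, and `reset` would be called when all unacknowledged
messages are redelivered.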

Is there any breaking change if we just apply this filter without
adding a configuration option?

Thanks,
Yunze

On Tue, Mar 21, 2023 at 2:26 PM 丛搏 <congbobo...@gmail.com> wrote:
>
> Hi, Michael
>
> Michael Marshall <mmarsh...@apache.org> wrote on Tue, Mar 21, 2023 at 13:03:
> >
> > This is a great problem to improve.
> >
> > What if we instead expand the CommandSubscribe [0] protocol message
> > with a new field to represent the client's desired read position? This
> > way, the client can tell the second broker where to start sending
> > messages, and there is no need to send the messages twice.
> >
> > I like the protocol expansion because it saves on unnecessary network
> > transfer in several places and because it will be more straightforward
> > for clients in other languages to implement.
> >
> > What do you think?
> If we add the new field to CommandSubscribe, we have to ensure
> synchronization between consumer reconnection and user calls to the
> receive and redeliverUnack methods. That will affect the performance
> of receive. Exposing synchronization on hot paths is not a good idea.
> Although the messages are delivered twice, I don't think it
> will cause too much performance loss.
>
> This filtering must be rigorous, and there must not be any race conditions,
> because it involves transactions. I want it to be simple and efficient,
> and I don't want it to become complicated and difficult to maintain.
>
> Of course, if the failover and exclusive consumers are changed to pull mode,
> I believe that changing the protocol would be a very good idea. But at present,
> there is obviously no sufficient reason to do so.
>
> Thanks,
> Bo
>
> >
> > Thanks,
> > Michael
> >
> > [0] 
> > https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-common/src/main/proto/PulsarApi.proto#L339-L400
> >
> >
> > On Mon, Mar 20, 2023 at 10:56 AM Xiangying Meng <xiangy...@apache.org> 
> > wrote:
> > >
> > > Hi Congbo,
> > > I think this is a great idea.
> > > This is more efficient in filtering duplicate messages for a single
> > > consumer.
> > > And maybe more details about implementation should be shown in the 
> > > proposal.
> > >
> > > Best regards,
> > > Xiangying
> > >
> > > On Mon, Mar 20, 2023 at 10:53 PM Yubiao Feng
> > > <yubiao.f...@streamnative.io.invalid> wrote:
> > >
> > > > Hi Bo
> > > >
> > > > I think this is a good way to filter messages that the client has 
> > > > received.
> > > >
> > > > And I have two questions:
> > > >
> > > > > 1. This is more powerful than the original way
> > > > > (`acknowledgmentsGroupingTracker.isDuplicate(msgId)`) to filter out
> > > > > duplicated messages.
> > > > >  Is it possible to turn off the original de-duplication logic to improve
> > > > > performance after enabling this new feature?
> > > >
> > > > > 2. There seems to be a typo in the proposal
> > > >
> > > > > > ## Only support Consumer#redeliverUnacknowledgedMessages()
> > > > > > If we redeliver individual messages, they will be filtered. Because we
> > > > > > can't clear the record latest message in the consumer when redelivering
> > > > > > individual messages. It will make this config unclear, and if every
> > > > > > redeliver method changes, it will bring a lot of redundant code, which
> > > > > > is difficult to maintain. If there is a need in the future, just
> > > > > > support it.
> > > >
> > > > > I suppose you meant to say "not support `redeliverUnacknowledgedMessages`",
> > > > > right?
> > > >
> > > >
> > > > Thanks
> > > > Yubiao Feng
> > > >
> > > > On Mon, Mar 20, 2023 at 10:21 PM 丛搏 <congbobo...@gmail.com> wrote:
> > > >
> > > > > Hi, pulsar community:
> > > > >
> > > > > I started a PIP about `Client consumer filter received messages`.
> > > > >
> > > > > PIP: https://github.com/apache/pulsar/issues/19864
> > > > >
> > > > > Thanks,
> > > > > Bo
> > > > >
> > > >
