Hello Lari, replies inline.

On Fri, Nov 3, 2023 at 11:13 PM Lari Hotari <lhot...@apache.org> wrote:

> Hi Girish,
>
> Thanks for the questions. I'll reply to them.
>
> > does this sharing of the same tcp/ip connection happen across partitions
> > as well (assuming both the partitions of the topic are on the same
> > broker)? i.e. producer 127.0.0.1 for partition
> > `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> > partition `persistent://tenant/ns/topic0-partition1` share the same
> > tcp/ip connection assuming both are on broker-0 ?
>
> The Pulsar Java client would be sharing the same TCP/IP connection to a
> single broker when using the default setting of connectionsPerBroker = 1.
> It could be using a different connection if connectionsPerBroker > 1.
>
Thanks for clarifying this.
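
For others following the thread, this is the knob in question in the Java
client (a minimal sketch; the service URL and the value 4 are placeholders I
picked for illustration):

    import org.apache.pulsar.client.api.PulsarClient;
    import org.apache.pulsar.client.api.PulsarClientException;

    public class ConnectionsPerBrokerExample {
        public static void main(String[] args) throws PulsarClientException {
            // With the default connectionsPerBroker=1, all producers and
            // consumers talking to the same broker multiplex over a single
            // TCP connection.
            PulsarClient client = PulsarClient.builder()
                    .serviceUrl("pulsar://localhost:6650") // placeholder URL
                    .connectionsPerBroker(4) // spread load over 4 connections
                    .build();
            client.close();
        }
    }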

> Could you please elaborate more on these details? Here are some questions:
> 1. What do you mean that it is too strict?
>     - Should the rate limiting allow bursting over the limit for some time?
>

That's one of the major use cases, yes.


> 2. What type of data loss are you experiencing?
>

Messages produced by the producers that eventually time out due to rate
limiting: once a send times out, those messages are effectively lost.
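
Concretely, from the producer side it looks roughly like this (sketch only;
the topic name is a placeholder, and 30s is the Java client's default send
timeout):

    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.TimeUnit;
    import org.apache.pulsar.client.api.Producer;
    import org.apache.pulsar.client.api.PulsarClient;

    public class SendTimeoutLossExample {
        public static void main(String[] args) throws Exception {
            PulsarClient client = PulsarClient.builder()
                    .serviceUrl("pulsar://localhost:6650") // placeholder URL
                    .build();
            Producer<byte[]> producer = client.newProducer()
                    .topic("persistent://tenant/ns/topic0") // placeholder topic
                    .sendTimeout(30, TimeUnit.SECONDS)      // the client default
                    .create();
            producer.sendAsync("payload".getBytes(StandardCharsets.UTF_8))
                    .exceptionally(ex -> {
                        // If the broker throttles the connection for longer
                        // than sendTimeout, this completes with a
                        // TimeoutException and the message is gone unless the
                        // application retries it itself.
                        System.err.println("send failed: " + ex);
                        return null;
                    });
            producer.close();
            client.close();
        }
    }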


> 3. What is the root cause of the data loss?
>    - Do you mean that the system performance degrades and data loss is due
> to not being able to produce from client to the broker quickly enough and
> data loss happens because messages cannot be forwarded from the client to
> the broker?
>

No. The system performance decreases in the case of poller-based rate
limiters. With the precise one, throttling is purely the broker pausing the
Netty channel's auto-read property. If the producer stays above the
configured throughput for longer than the send timeout, it starts observing
timeouts, and the timed-out messages are essentially lost.
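
For readers not familiar with that mechanism, the broker-side throttling
boils down to something like this (a simplified sketch, not the actual
Pulsar code):

    import io.netty.channel.Channel;

    public final class ChannelThrottle {
        private ChannelThrottle() {}

        // When the rate limit is hit, the broker stops reading from the
        // socket. The TCP receive window then fills up and backpressures the
        // client, along with every other producer multiplexed on this
        // connection.
        static void pauseReads(Channel channel) {
            channel.config().setAutoRead(false);
        }

        // When the next rate-limit period starts, reads resume and the
        // buffered produce requests are drained.
        static void resumeReads(Channel channel) {
            channel.config().setAutoRead(true);
        }
    }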

> As mentioned in my previous email, there have been discussions about
> improving producer flow control. One of the solution ideas that was
> discussed in a Pulsar community meeting in January was to add explicit flow
> control to producers, somewhat similar to how there are "permits" as the
> flow control for consumers. The permits would be based on byte size
> (instead of number of messages). With explicit flow control in the
> protocol, the rate limiting will also be effective and deterministic and
> the issues that Tao Jiuming was explaining could also be resolved. It also
> would solve the producer/consumer multiplexing on a single TCP/IP
> connection when flow control and rate limiting isn't based on the TCP/IP
> level (and toggling the Netty channel's auto read property).
>
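
Before responding, and just to make sure we're picturing the same idea: my
understanding of the byte-based permits proposal is roughly the following
(the names are hypothetical, this is not an existing Pulsar API):

    import java.util.concurrent.Semaphore;

    public class ProducerBytePermits {
        private final Semaphore permits;

        public ProducerBytePermits(int maxOutstandingBytes) {
            this.permits = new Semaphore(maxOutstandingBytes);
        }

        // Producer side: block until the broker has granted enough byte
        // permits for this message.
        public void acquireForSend(int messageSizeBytes) throws InterruptedException {
            permits.acquire(messageSizeBytes);
        }

        // Broker side: grant permits back as messages are persisted, at a
        // pace that matches the configured rate limit.
        public void grant(int bytes) {
            permits.release(bytes);
        }
    }
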
I think the core implementation of how the broker fails fast at the time of
rate limiting (whether by pausing the Netty channel or via a new
permits-based model) does not change the actual issue I am targeting.
Multiplexing has some impact on it, but again only a limited one, and the
client can easily work around it by increasing connectionsPerBroker. Even
assuming both of these are somehow "fixed", the fact remains that an
absolutely strict rate limiter will lead to the data loss described above
for bursts going above the limit, while a poller-based rate limiter doesn't
really rate limit anything, since it lets all produce traffic through in the
first interval of the next second.
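
To illustrate the kind of bursting support I'm after, here is a token-bucket
style sketch (not Pulsar code; the numbers echo the 100MBps/150MBps example
from my earlier email). The bucket capacity is the burst budget and the
refill rate is the sustained limit; an absolutely strict limiter is the
degenerate case where capacity equals the per-second rate:

    public class TokenBucketLimiter {
        private final long capacityBytes;    // burst budget, e.g. 150 MB
        private final long refillPerSecond;  // sustained rate, e.g. 100 MB/s
        private double tokens;
        private long lastRefillNanos = System.nanoTime();

        public TokenBucketLimiter(long capacityBytes, long refillPerSecond) {
            this.capacityBytes = capacityBytes;
            this.refillPerSecond = refillPerSecond;
            this.tokens = capacityBytes;
        }

        public synchronized boolean tryAcquire(long bytes) {
            long now = System.nanoTime();
            // Refill proportionally to elapsed time, capped at the burst budget.
            tokens = Math.min(capacityBytes,
                    tokens + (now - lastRefillNanos) / 1e9 * refillPerSecond);
            lastRefillNanos = now;
            if (tokens >= bytes) {
                tokens -= bytes; // within the sustained rate or burst budget
                return true;
            }
            return false;        // here the broker would pause channel reads
        }
    }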


> Let's continue discussion, since I think that this is an important
> improvement area. Together we could find a good solution that works for
> multiple use cases and addresses existing challenges in producer flow
> control and rate limiting.
>
> -Lari
>
> On 2023/11/03 11:16:37 Girish Sharma wrote:
> > Hello Lari,
> > Thanks for bringing this to my attention. I went through the links, but
> > does this sharing of the same tcp/ip connection happen across partitions
> > as well (assuming both the partitions of the topic are on the same
> > broker)? i.e. producer 127.0.0.1 for partition
> > `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> > partition `persistent://tenant/ns/topic0-partition1` share the same
> > tcp/ip connection assuming both are on broker-0 ?
> >
> > In general, the major use case behind this PIP for me and my organization
> > is about supporting produce spikes. We do not want to allocate absolute
> > maximum throughput for a topic which would not even be utilized 99.99% of
> > the time. Thus, for a topic that stays constantly at 100MBps and goes to
> > 150MBps only once in a blue moon, it's unwise to allocate 150MBps worth of
> > resources 100% of the time. The poller based rate limiter is also not good
> > here as it would allow overuse of hardware without a check, degrading the
> > system.
> >
> > @Asif, I have been sick these last 10 days, but will be updating the PIP
> > with the discussed changes early next week.
> >
> > Regards
> >
> > On Fri, Nov 3, 2023 at 3:25 PM Lari Hotari <lhot...@apache.org> wrote:
> >
> > > Hi Girish,
> > >
> > > In order to address your problem described in the PIP document [1], it
> > > might be necessary to make improvements in how rate limiters apply
> > > backpressure in Pulsar.
> > >
> > > Pulsar uses mainly TCP/IP connection level controls for achieving
> > > backpressure. The challenge is that Pulsar can share a single TCP/IP
> > > connection across multiple producers and consumers. Because of this,
> > > there could be multiple producers and consumers and rate limiters
> > > operating on the same connection on the broker, and they will make
> > > conflicting decisions, which results in undesired behavior.
> > >
> > > Regarding the shared TCP/IP connection backpressure issue, Apache Flink
> > > had a somewhat similar problem before Flink 1.5. It is described in the
> > > "inflicting backpressure" section of this blog post from 2019:
> > > https://flink.apache.org/2019/06/05/flink-network-stack.html#inflicting-backpressure-1
> > > Flink solved the issue of multiplexing multiple streams of data on a
> > > single TCP/IP connection in Flink 1.5 by introducing its own flow
> > > control mechanism.
> > >
> > > The backpressure and rate limiting challenges have been discussed a few
> > > times in Pulsar community meetings over the past years. There was also a
> > > generic backpressure thread on the dev mailing list [2] in Sep 2022.
> > > However, we haven't really documented Pulsar's backpressure design, how
> > > rate limiters are part of the overall solution, and how we could
> > > improve. I think it might be time to do so since there's a requirement
> > > to improve rate limiting. I guess that's the main motivation also for
> > > PIP-310.
> > >
> > > -Lari
> > >
> > > 1 - https://github.com/apache/pulsar/pull/21399/files
> > > 2 - https://lists.apache.org/thread/03w6x9zsgx11mqcp5m4k4n27cyqmp271
> > >
> > > On 2023/10/19 12:51:14 Girish Sharma wrote:
> > > > Hi,
> > > > Currently, there are only two kinds of publish rate limiters: polling
> > > > based and precise. Users have an option to use either one of them in
> > > > the topic publish rate limiter, but the resource group rate limiter
> > > > only uses the polling one.
> > > >
> > > > There are challenges with both rate limiters, and with the fact that
> > > > we can't use the precise rate limiter at the resource group level.
> > > >
> > > > Thus, in order to support custom rate limiters, I've created PIP-310.
> > > >
> > > > This is the discussion thread. Please go through the PIP and provide
> > > > your inputs.
> > > >
> > > > Link - https://github.com/apache/pulsar/pull/21399
> > > >
> > > > Regards
> > > > --
> > > > Girish Sharma
> > > >
> > >
> >
> >
> > --
> > Girish Sharma
> >
>


-- 
Girish Sharma
