Hi Girish,

Thanks for the questions. I'll reply to them inline below.

> does this sharing of the same tcp/ip connection happen across partitions as
> well (assuming both the partitions of the topic are on the same broker)?
> i.e. producer 127.0.0.1 for partition
> `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> partition `persistent://tenant/ns/topic0-partition1` share the same tcp/ip
> connection assuming both are on broker-0 ?

Yes. The Pulsar Java client shares a single TCP/IP connection to a given broker 
when using the default setting of connectionsPerBroker = 1, regardless of which 
partitions the producers are attached to. Producers may end up on different 
connections to the same broker when connectionsPerBroker > 1.
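
For illustration, here is a minimal sketch of how the connection pool size is 
configured on the Java client. The service URL and topic names are placeholders 
based on your example:

    import org.apache.pulsar.client.api.Producer;
    import org.apache.pulsar.client.api.PulsarClient;

    public class ConnectionSharingExample {
        public static void main(String[] args) throws Exception {
            // With the default connectionsPerBroker = 1, both producers below
            // end up on the same TCP/IP connection to broker-0, assuming
            // broker-0 owns both partitions.
            PulsarClient client = PulsarClient.builder()
                    .serviceUrl("pulsar://localhost:6650") // placeholder URL
                    .connectionsPerBroker(1)               // the default value
                    .build();

            Producer<byte[]> p0 = client.newProducer()
                    .topic("persistent://tenant/ns/topic0-partition-0")
                    .create();
            Producer<byte[]> p1 = client.newProducer()
                    .topic("persistent://tenant/ns/topic0-partition-1")
                    .create();

            // Setting connectionsPerBroker > 1 allows the client to spread
            // these producers over multiple connections to the same broker.
            p0.close();
            p1.close();
            client.close();
        }
    }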

> In general, the major use case behind this PIP for me and my organization
> is about supporting produce spikes. We do not want to allocate absolute
> maximum throughput for a topic which would not even be utilized 99.99% of
> the time. Thus, for a topic that stays constantly at 100MBps and goes to
> 150MBps only once in a blue moon, it's unwise to allocate 150MBps worth of
> resources 100% of the time. The poller based rate limiter is also not good
> here as it would allow over use of hardware without a check, degrading the
> system.

I'm trying to understand your use case better. 
In the PIP-310 document it says:
> Precise rate limiter fixes the above two issues, but introduces another 
> challenge - the rate limiting is too strict.
> * This leads potential for data loss in case there are sudden spikes in 
> produce and client side produce queue breaches.
> * The produce latencies increase exponentially in case produce breaches the 
> set throughput even for small windows.

Could you please elaborate more on these details? Here are some questions:
1. What do you mean by the rate limiting being too strict?
    - Should the rate limiting allow bursting over the limit for some time?
2. What type of data loss are you experiencing? 
3. What is the root cause of the data loss?
   - Do you mean that the system performance degrades, and data loss happens 
because messages cannot be forwarded from the client to the broker quickly 
enough?

Once there's a common understanding of the problem, it's easier to design the 
solution together in the Pulsar community. One possibility is that PIP-310 
already solves your problem. Another possibility is that we need to improve 
producer flow control. I currently feel that such an improvement is needed, but I might be wrong.

As mentioned in my previous email, there have been discussions about improving 
producer flow control. One of the solution ideas discussed in a Pulsar 
community meeting in January was to add explicit flow control to producers, 
somewhat similar to how "permits" act as the flow control mechanism for 
consumers. The permits would be based on byte size (instead of number of 
messages). With explicit flow control in the protocol, rate limiting would also 
become effective and deterministic, and the issues that Tao Jiuming was 
explaining could be resolved as well. It would also solve the problem of 
producer/consumer multiplexing on a single TCP/IP connection, since flow 
control and rate limiting would no longer rely on the TCP/IP level (toggling 
the Netty channel's auto-read property).
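
To make the idea concrete, here is a hypothetical sketch of byte-based producer 
permits. This is not an existing Pulsar API or protocol command, just an 
illustration of the concept:

    import java.util.concurrent.Semaphore;

    // Hypothetical illustration only: sketches the "permits in bytes" idea for
    // producer flow control; nothing like this exists in the Pulsar protocol today.
    public class ByteBasedProducerPermits {

        // Permits are tracked in bytes rather than in number of messages.
        private final Semaphore permitBytes = new Semaphore(0);

        // Would be called when the broker grants additional permit bytes.
        public void onPermitsGranted(int grantedBytes) {
            permitBytes.release(grantedBytes);
        }

        // Would be called before sending a message; blocks until enough permit
        // bytes are available, so the effective send rate is bounded by the
        // broker's grants instead of by toggling TCP-level auto-read.
        public void acquireForMessage(int messageSizeBytes) throws InterruptedException {
            permitBytes.acquire(messageSizeBytes);
        }
    }

With such a mechanism, a per-topic or per-resource-group rate limiter on the 
broker could control how fast permits are granted, without having to pause the 
shared connection for unrelated producers.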

Let's continue the discussion, since I think that this is an important improvement 
area. Together we could find a good solution that works for multiple use cases 
and addresses existing challenges in producer flow control and rate limiting. 

-Lari

On 2023/11/03 11:16:37 Girish Sharma wrote:
> Hello Lari,
> Thanks for bringing this to my attention. I went through the links, but
> does this sharing of the same tcp/ip connection happen across partitions as
> well (assuming both the partitions of the topic are on the same broker)?
> i.e. producer 127.0.0.1 for partition
> `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> partition `persistent://tenant/ns/topic0-partition1` share the same tcp/ip
> connection assuming both are on broker-0 ?
> 
> In general, the major use case behind this PIP for me and my organization
> is about supporting produce spikes. We do not want to allocate absolute
> maximum throughput for a topic which would not even be utilized 99.99% of
> the time. Thus, for a topic that stays constantly at 100MBps and goes to
> 150MBps only once in a blue moon, it's unwise to allocate 150MBps worth of
> resources 100% of the time. The poller based rate limiter is also not good
> here as it would allow over use of hardware without a check, degrading the
> system.
> 
> @Asif, I have been sick these last 10 days, but will be updating the PIP
> with the discussed changes early next week.
> 
> Regards
> 
> On Fri, Nov 3, 2023 at 3:25 PM Lari Hotari <lhot...@apache.org> wrote:
> 
> > Hi Girish,
> >
> > In order to address your problem described in the PIP document [1], it
> > might be necessary to make improvements in how rate limiters apply
> > backpressure in Pulsar.
> >
> > Pulsar uses mainly TCP/IP connection level controls for achieving
> > backpressure. The challenge is that Pulsar can share a single TCP/IP
> > connection across multiple producers and consumers. Because of this, there
> > could be multiple producers and consumers and rate limiters operating on
> > the same connection on the broker, and they will do conflicting decisions,
> > which results in undesired behavior.
> >
> > Regarding the shared TCP/IP connection backpressure issue, Apache Flink
> > had a somewhat similar problem before Flink 1.5. It is described in the
> > "inflicting backpressure" section of this blog post from 2019:
> >
> > https://flink.apache.org/2019/06/05/flink-network-stack.html#inflicting-backpressure-1
> > Flink solved the issue of multiplexing multiple streams of data on a
> > single TCP/IP connection in Flink 1.5 by introducing its own flow control
> > mechanism.
> >
> > The backpressure and rate limiting challenges have been discussed a few
> > times in Pulsar community meetings over the past years. There was also a
> > generic backpressure thread on the dev mailing list [2] in Sep 2022.
> > However, we haven't really documented Pulsar's backpressure design and how
> > rate limiters are part of the overall solution and how we could improve.
> > I think it might be time to do so since there's a requirement to improve
> > rate limiting. I guess that's the main motivation also for PIP-310.
> >
> > -Lari
> >
> > 1 - https://github.com/apache/pulsar/pull/21399/files
> > 2 - https://lists.apache.org/thread/03w6x9zsgx11mqcp5m4k4n27cyqmp271
> >
> > On 2023/10/19 12:51:14 Girish Sharma wrote:
> > > Hi,
> > > Currently, there are only 2 kinds of publish rate limiters - polling
> > based
> > > and precise. Users have an option to use either one of them in the topic
> > > publish rate limiter, but the resource group rate limiter only uses
> > polling
> > > one.
> > >
> > > There are challenges with both the rate limiters and the fact that we
> > can't
> > > use precise rate limiter in the resource group level.
> > >
> > > Thus, in order to support custom rate limiters, I've created the PIP-310
> > >
> > > This is the discussion thread. Please go through the PIP and provide your
> > > inputs.
> > >
> > > Link - https://github.com/apache/pulsar/pull/21399
> > >
> > > Regards
> > > --
> > > Girish Sharma
> > >
> >
> 
> 
> -- 
> Girish Sharma
> 
