Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Girish Sharma
Hello Lari, replies inline.


On Fri, Nov 3, 2023 at 11:13 PM Lari Hotari wrote:

> Hi Girish,
>
> Thanks for the questions. I'll reply to them
>
> > does this sharing of the same tcp/ip connection happen across partitions as
> > well (assuming both the partitions of the topic are on the same broker)?
> > i.e. producer 127.0.0.1 for partition
> > `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> > partition `persistent://tenant/ns/topic0-partition1` share the same tcp/ip
> > connection assuming both are on broker-0 ?
>
> The Pulsar Java client would be sharing the same TCP/IP connection to a
> single broker when using the default setting of connectionsPerBroker = 1.
> It could be using a different connection if connectionsPerBroker > 1.
>

Thanks for clarifying this.

> Could you please elaborate more on these details? Here are some questions:
> 1. What do you mean that it is too strict?
> - Should the rate limiting allow bursting over the limit for some time?
>

That's one of the major use cases, yes.


> 2. What type of data loss are you experiencing?
>

Messages produced by the producers which eventually get timed out due to
rate limiting.


> 3. What is the root cause of the data loss?
>    - Do you mean that system performance degrades and data is lost because
>      messages cannot be forwarded from the client to the broker quickly
>      enough?
>

No, system performance degrades in the case of the poller-based rate
limiter. In the precise one, it's purely the broker pausing the Netty
channel's auto read property. If the producer stays above the set
throughput for longer than the send timeout, it starts observing
timeouts, and the timed-out messages are essentially lost.
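
To make the timeout condition concrete, here is a rough back-of-the-envelope
sketch. It assumes constant rates, FIFO draining at exactly the limit, and an
unbounded client-side queue; none of these assumptions come from this thread,
and the class name, method name, and the 30 second timeout used below are
illustrative only.

```java
/** Back-of-the-envelope model; names and assumptions are illustrative, not Pulsar APIs. */
final class SpikeTimeoutSketch {

    /**
     * Returns how many seconds a spike can last before a newly queued message
     * would wait longer than the timeout, assuming FIFO draining at limitRate.
     * A message queued t seconds into the spike waits t * (produce - limit) / limit.
     */
    static double secondsUntilTimeouts(double produceRate, double limitRate, double timeoutSeconds) {
        if (produceRate <= limitRate) {
            return Double.POSITIVE_INFINITY; // no backlog builds up
        }
        return timeoutSeconds * limitRate / (produceRate - limitRate);
    }
}
```

With the 100MBps limit and 150MBps spike mentioned in this thread and an
assumed 30 second timeout, `secondsUntilTimeouts(150, 100, 30)` evaluates to
60 seconds under this simplified model.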

> As mentioned in my previous email, there have been discussions about
> improving producer flow control. One of the solution ideas that was
> discussed in a Pulsar community meeting in January was to add explicit flow
> control to producers, somewhat similar to how there are "permits" as the
> flow control for consumers. The permits would be based on byte size
> (instead of number of messages). With explicit flow control in the
> protocol, the rate limiting will also be effective and deterministic and
> the issues that Tao Jiuming was explaining could also be resolved. It would
> also solve the producer/consumer multiplexing problem on a single TCP/IP
> connection when flow control and rate limiting aren't based on the TCP/IP
> level (and on toggling the Netty channel's auto read property).
>

I think the core implementation of how the broker fails fast at the time
of rate limiting (whether by pausing the Netty channel or through a new
permits-based model) does not change the actual issue I am targeting.
Multiplexing has some impact on it, but only a limited one, and the client
can easily work around it by increasing the connections per broker. Even
assuming both of these are somehow "fixed", the fact remains that an
absolutely strict rate limiter will lead to the above-mentioned data loss
for bursts going above the limit, and that a poller-based rate limiter
doesn't really rate limit anything, as it allows all produce in the first
interval of the next second.
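
The contrast above can be sketched roughly as follows. This is a simplified
model, not Pulsar's actual rate limiter code: `PolledLimiter` stands in for a
limiter that only checks usage on a periodic tick, and `TokenBucket` is one
common way to allow a bounded burst above the sustained rate. The token bucket
is my illustration and is not prescribed by this thread or by PIP-310; all
class and method names are hypothetical.

```java
/** Simplified models for illustration; these are not Pulsar classes. */
final class RateLimiterSketch {

    /** Poller-style limiter: usage is only checked on a periodic tick. */
    static final class PolledLimiter {
        private final long limitPerTick;
        private long usedSinceTick;
        private boolean throttled;

        PolledLimiter(long limitPerTick) {
            this.limitPerTick = limitPerTick;
        }

        /** Admits everything unless the previous tick flagged the limiter as throttled. */
        boolean tryAcquire(long bytes) {
            if (throttled) {
                return false;
            }
            usedSinceTick += bytes;
            return true;
        }

        /** Periodic check: throttle the next interval if this one went over the limit. */
        void tick() {
            throttled = usedSinceTick > limitPerTick;
            usedSinceTick = 0;
        }
    }

    /** Token bucket: the sustained rate is capped, but saved tokens allow a bounded burst. */
    static final class TokenBucket {
        private final long ratePerSecond;
        private final long capacity;
        private long tokens;

        TokenBucket(long ratePerSecond, long capacity) {
            this.ratePerSecond = ratePerSecond;
            this.capacity = capacity;
            this.tokens = capacity; // start full
        }

        /** Adds tokens for the elapsed time, never exceeding the burst capacity. */
        void refill(long elapsedMillis) {
            tokens = Math.min(capacity, tokens + ratePerSecond * elapsedMillis / 1000);
        }

        /** Admits the request only if enough tokens are available. */
        boolean tryAcquire(long bytes) {
            if (bytes > tokens) {
                return false;
            }
            tokens -= bytes;
            return true;
        }
    }
}
```

In this model the polled limiter admits any amount of traffic until its next
tick, while the token bucket admits at most `capacity` in a burst and then
falls back to `ratePerSecond`.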


> Let's continue discussion, since I think that this is an important
> improvement area. Together we could find a good solution that works for
> multiple use cases and addresses existing challenges in producer flow
> control and rate limiting.
>
> -Lari
>
> On 2023/11/03 11:16:37 Girish Sharma wrote:
> > Hello Lari,
> > Thanks for bringing this to my attention. I went through the links, but
> > does this sharing of the same tcp/ip connection happen across partitions as
> > well (assuming both the partitions of the topic are on the same broker)?
> > i.e. producer 127.0.0.1 for partition
> > `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> > partition `persistent://tenant/ns/topic0-partition1` share the same tcp/ip
> > connection assuming both are on broker-0 ?
> >
> > In general, the major use case behind this PIP for me and my organization
> > is about supporting produce spikes. We do not want to allocate absolute
> > maximum throughput for a topic which would not even be utilized 99.99% of
> > the time. Thus, for a topic that stays constantly at 100MBps and goes to
> > 150MBps only once in a blue moon, it's unwise to allocate 150MBps worth of
> > resources 100% of the time. The poller based rate limiter is also not good
> > here as it would allow over use of hardware without a check, degrading the
> > system.
> >
> > @Asif, I have been sick these last 10 days, but will be updating the PIP
> > with the discussed changes early next week.
> >
> > Regards
> >
> > On Fri, Nov 3, 2023 at 3:25 PM Lari Hotari  wrote:
> >
> > > Hi Girish,
> > >
> > > In order to address your problem described in the PIP document [1], it
> > > 

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Lari Hotari
Hi Girish,

Thanks for the questions. I'll reply to them

> does this sharing of the same tcp/ip connection happen across partitions as
> well (assuming both the partitions of the topic are on the same broker)?
> i.e. producer 127.0.0.1 for partition
> `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> partition `persistent://tenant/ns/topic0-partition1` share the same tcp/ip
> connection assuming both are on broker-0 ?

The Pulsar Java client would be sharing the same TCP/IP connection to a single 
broker when using the default setting of connectionsPerBroker = 1. It could be 
using a different connection if connectionsPerBroker > 1.

> In general, the major use case behind this PIP for me and my organization
> is about supporting produce spikes. We do not want to allocate absolute
> maximum throughput for a topic which would not even be utilized 99.99% of
> the time. Thus, for a topic that stays constantly at 100MBps and goes to
> 150MBps only once in a blue moon, it's unwise to allocate 150MBps worth of
> resources 100% of the time. The poller based rate limiter is also not good
> here as it would allow over use of hardware without a check, degrading the
> system.

I'm trying to understand your use case better. 
In the PIP-310 document it says:
> Precise rate limiter fixes the above two issues, but introduces another
> challenge - the rate limiting is too strict.
> * This leads potential for data loss in case there are sudden spikes in
>   produce and client side produce queue breaches.
> * The produce latencies increase exponentially in case produce breaches the
>   set throughput even for small windows.

Could you please elaborate more on these details? Here are some questions:
1. What do you mean that it is too strict?
   - Should the rate limiting allow bursting over the limit for some time?
2. What type of data loss are you experiencing?
3. What is the root cause of the data loss?
   - Do you mean that system performance degrades and data is lost because
     messages cannot be forwarded from the client to the broker quickly
     enough?

Once there's a common understanding of the problem, it's easier to design the 
solution together in the Pulsar community. One possibility is that PIP-310 
already solves your problem. Another possibility is that we need to improve 
producer flow control. I currently feel that it is needed, but I might be wrong.

As mentioned in my previous email, there have been discussions about improving
producer flow control. One of the solution ideas that was discussed in a Pulsar
community meeting in January was to add explicit flow control to producers,
somewhat similar to how there are "permits" as the flow control for consumers.
The permits would be based on byte size (instead of number of messages). With
explicit flow control in the protocol, the rate limiting will also be effective
and deterministic and the issues that Tao Jiuming was explaining could also be
resolved. It would also solve the producer/consumer multiplexing problem on a
single TCP/IP connection when flow control and rate limiting aren't based on
the TCP/IP level (and on toggling the Netty channel's auto read property).
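
As a rough illustration of the byte-based permits idea, here is a minimal
sketch of the producer-side accounting. It is purely hypothetical: the idea is
only a proposal discussed in the community, and the class name, method names,
and the notion of a "grant" are assumptions made for this example, not part of
the Pulsar protocol or client API.

```java
/** Hypothetical sketch of byte-based producer permits; not an existing Pulsar API. */
final class ProducerPermitsSketch {
    private long availableBytes;

    /** Called when the broker grants more byte permits to this producer. */
    void grant(long bytes) {
        availableBytes += bytes;
    }

    /**
     * Called before sending a message. Returns false when the producer has too
     * few permits, in which case it would hold the message locally instead of
     * writing it to the shared TCP/IP connection.
     */
    boolean trySend(int messageSizeBytes) {
        if (messageSizeBytes > availableBytes) {
            return false;
        }
        availableBytes -= messageSizeBytes;
        return true;
    }

    /** Remaining byte permits. */
    long availableBytes() {
        return availableBytes;
    }
}
```

Because each producer would track its own permits, throttling one producer
would not require pausing reads on a connection shared with other producers
and consumers.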

Let's continue discussion, since I think that this is an important improvement 
area. Together we could find a good solution that works for multiple use cases 
and addresses existing challenges in producer flow control and rate limiting. 

-Lari

On 2023/11/03 11:16:37 Girish Sharma wrote:
> Hello Lari,
> Thanks for bringing this to my attention. I went through the links, but
> does this sharing of the same tcp/ip connection happen across partitions as
> well (assuming both the partitions of the topic are on the same broker)?
> i.e. producer 127.0.0.1 for partition
> `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> partition `persistent://tenant/ns/topic0-partition1` share the same tcp/ip
> connection assuming both are on broker-0 ?
> 
> In general, the major use case behind this PIP for me and my organization
> is about supporting produce spikes. We do not want to allocate absolute
> maximum throughput for a topic which would not even be utilized 99.99% of
> the time. Thus, for a topic that stays constantly at 100MBps and goes to
> 150MBps only once in a blue moon, it's unwise to allocate 150MBps worth of
> resources 100% of the time. The poller based rate limiter is also not good
> here as it would allow over use of hardware without a check, degrading the
> system.
> 
> @Asif, I have been sick these last 10 days, but will be updating the PIP
> with the discussed changes early next week.
> 
> Regards
> 
> On Fri, Nov 3, 2023 at 3:25 PM Lari Hotari  wrote:
> 
> > Hi Girish,
> >
> > In order to address your problem described in the PIP document [1], it
> > might be necessary to make improvements in how rate limiters apply
> > backpressure in Pulsar.
> >
> > 

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Lari Hotari
Hi Tao, 

It seems the link you were referring to is missing from your message. Are you
referring to the thread "[discuss] Support fail-fast strategy when broker
rate-limited" [1]?

-Lari

1 - https://lists.apache.org/thread/tp2f1ghomj2kw5ltgz8w8k5s58gs88qz


On 2023/11/03 12:11:31 太上玄元道君 wrote:
> Hi Girish,
> 
> There is also a discussion thread[1] about rate-limiting.
> 
> I think there are some conflicts between certain kinds of rate limiters and
> backpressure.
> 
> Take the fail-fast strategy as an example:
> Brokers have to reply to clients after receiving and decoding the message,
> but the broker also has a back-pressure mechanism. The broker cannot read
> messages because auto read is disabled on the channel (`disableAutoRead`).
> 
> So the rate-limiters have to adapt to back-pressure.
> 
> Thanks,
> Tao Jiuming
> 
> On Oct 19, 2023, at 20:51, Girish Sharma wrote:
> 
> Hi,
> Currently, there are only 2 kinds of publish rate limiters - polling based
> and precise. Users have an option to use either one of them in the topic
> publish rate limiter, but the resource group rate limiter only uses the
> polling one.
> 
> There are challenges with both the rate limiters and the fact that we can't
> use the precise rate limiter at the resource group level.
> 
> Thus, in order to support custom rate limiters, I've created the PIP-310
> 
> This is the discussion thread. Please go through the PIP and provide your
> inputs.
> 
> Link - https://github.com/apache/pulsar/pull/21399
> 
> Regards
> -- 
> Girish Sharma
> 


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Girish Sharma
Hello Tao,
As I understand it, there is a fine balance between rate limiting,
backpressure, and not keeping clients waiting. Different use cases may need
different approaches to rate limiting; thus, making the rate limiter
customizable is my first step towards making Pulsar more customizable as
per need.

Regards

On Fri, Nov 3, 2023 at 5:42 PM 太上玄元道君 wrote:

> Hi Girish,
>
> There is also a discussion thread[1] about rate-limiting.
>
> I think there are some conflicts between certain kinds of rate limiters and
> backpressure.
>
> Take the fail-fast strategy as an example:
> Brokers have to reply to clients after receiving and decoding the message,
> but the broker also has a back-pressure mechanism. The broker cannot read
> messages because auto read is disabled on the channel (`disableAutoRead`).
>
> So the rate-limiters have to adapt to back-pressure.
>
> Thanks,
> Tao Jiuming
>
> On Oct 19, 2023, at 20:51, Girish Sharma wrote:
>
> Hi,
> Currently, there are only 2 kinds of publish rate limiters - polling based
> and precise. Users have an option to use either one of them in the topic
> publish rate limiter, but the resource group rate limiter only uses the
> polling one.
>
> There are challenges with both the rate limiters and the fact that we can't
> use the precise rate limiter at the resource group level.
>
> Thus, in order to support custom rate limiters, I've created the PIP-310
>
> This is the discussion thread. Please go through the PIP and provide your
> inputs.
>
> Link - https://github.com/apache/pulsar/pull/21399
>
> Regards
> --
> Girish Sharma
>


-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread 太上玄元道君
Hi Girish,

There is also a discussion thread[1] about rate-limiting.

I think there are some conflicts between certain kinds of rate limiters and
backpressure.

Take the fail-fast strategy as an example:
Brokers have to reply to clients after receiving and decoding the message,
but the broker also has a back-pressure mechanism. The broker cannot read
messages because auto read is disabled on the channel (`disableAutoRead`).

So the rate-limiters have to adapt to back-pressure.

Thanks,
Tao Jiuming

On Oct 19, 2023, at 20:51, Girish Sharma wrote:

Hi,
Currently, there are only 2 kinds of publish rate limiters - polling based
and precise. Users have an option to use either one of them in the topic
publish rate limiter, but the resource group rate limiter only uses the
polling one.

There are challenges with both the rate limiters and the fact that we can't
use the precise rate limiter at the resource group level.

Thus, in order to support custom rate limiters, I've created the PIP-310

This is the discussion thread. Please go through the PIP and provide your
inputs.

Link - https://github.com/apache/pulsar/pull/21399

Regards
-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Girish Sharma
Hello Lari,
Thanks for bringing this to my attention. I went through the links, but
does this sharing of the same tcp/ip connection happen across partitions as
well (assuming both the partitions of the topic are on the same broker)?
i.e. producer 127.0.0.1 for partition
`persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
partition `persistent://tenant/ns/topic0-partition1` share the same tcp/ip
connection assuming both are on broker-0 ?

In general, the major use case behind this PIP for me and my organization
is about supporting produce spikes. We do not want to allocate absolute
maximum throughput for a topic which would not even be utilized 99.99% of
the time. Thus, for a topic that stays constantly at 100MBps and goes to
150MBps only once in a blue moon, it's unwise to allocate 150MBps worth of
resources 100% of the time. The poller-based rate limiter is also not a good
fit here, as it would allow unchecked overuse of hardware, degrading the
system.

@Asif, I have been sick these last 10 days, but will be updating the PIP
with the discussed changes early next week.

Regards

On Fri, Nov 3, 2023 at 3:25 PM Lari Hotari wrote:

> Hi Girish,
>
> In order to address your problem described in the PIP document [1], it
> might be necessary to make improvements in how rate limiters apply
> backpressure in Pulsar.
>
> Pulsar uses mainly TCP/IP connection level controls for achieving
> backpressure. The challenge is that Pulsar can share a single TCP/IP
> connection across multiple producers and consumers. Because of this, there
> could be multiple producers and consumers and rate limiters operating on
> the same connection on the broker, and they will make conflicting decisions,
> which results in undesired behavior.
>
> Regarding the shared TCP/IP connection backpressure issue, Apache Flink
> had a somewhat similar problem before Flink 1.5. It is described in the
> "inflicting backpressure" section of this blog post from 2019:
>
> https://flink.apache.org/2019/06/05/flink-network-stack.html#inflicting-backpressure-1
> Flink solved the issue of multiplexing multiple streams of data on a
> single TCP/IP connection in Flink 1.5 by introducing its own flow control
> mechanism.
>
> The backpressure and rate limiting challenges have been discussed a few
> times in Pulsar community meetings over the past years. There was also a
> generic backpressure thread on the dev mailing list [2] in Sep 2022.
> However, we haven't really documented Pulsar's backpressure design and how
> rate limiters are part of the overall solution and how we could improve.
> I think it might be time to do so since there's a requirement to improve
> rate limiting. I guess that's the main motivation also for PIP-310.
>
> -Lari
>
> 1 - https://github.com/apache/pulsar/pull/21399/files
> 2 - https://lists.apache.org/thread/03w6x9zsgx11mqcp5m4k4n27cyqmp271
>
> On 2023/10/19 12:51:14 Girish Sharma wrote:
> > Hi,
> > Currently, there are only 2 kinds of publish rate limiters - polling based
> > and precise. Users have an option to use either one of them in the topic
> > publish rate limiter, but the resource group rate limiter only uses the
> > polling one.
> >
> > There are challenges with both the rate limiters and the fact that we can't
> > use the precise rate limiter at the resource group level.
> >
> > Thus, in order to support custom rate limiters, I've created the PIP-310
> >
> > This is the discussion thread. Please go through the PIP and provide your
> > inputs.
> >
> > Link - https://github.com/apache/pulsar/pull/21399
> >
> > Regards
> > --
> > Girish Sharma
> >
>


-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Lari Hotari
Hi Girish,

In order to address your problem described in the PIP document [1], it might be 
necessary to make improvements in how rate limiters apply backpressure in 
Pulsar.

Pulsar mainly uses TCP/IP connection level controls for achieving backpressure.
The challenge is that Pulsar can share a single TCP/IP connection across
multiple producers and consumers. Because of this, there could be multiple
producers, consumers, and rate limiters operating on the same connection on
the broker, and they will make conflicting decisions, which results in
undesired behavior.
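
A toy model of the conflict described above, offered as a sketch rather than
as Pulsar's actual code: `Connection.autoRead` stands in for the auto read
flag of the Netty channel behind one shared TCP/IP connection, and
`TopicLimiter` is a hypothetical per-topic limiter that throttles by toggling
that flag. All names are assumptions made for this example.

```java
/** Toy model of limiters sharing one connection; names are hypothetical. */
final class SharedConnectionSketch {

    /** Stand-in for a channel's auto read flag; there is one flag per connection. */
    static final class Connection {
        boolean autoRead = true;
    }

    /** Stand-in for a per-topic limiter that throttles by toggling the shared flag. */
    static final class TopicLimiter {
        private final Connection connection;

        TopicLimiter(Connection connection) {
            this.connection = connection;
        }

        /** The topic went over its limit: stop reading from the whole connection. */
        void onLimitExceeded() {
            connection.autoRead = false;
        }

        /** The topic is back under its limit: resume reading from the whole connection. */
        void onLimitReset() {
            connection.autoRead = true;
        }
    }
}
```

If limiters A and B share one `Connection`, a call to `A.onLimitExceeded()`
also pauses traffic for B's topic, and a later `B.onLimitReset()` resumes
reads even though A is still over its limit.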

Regarding the shared TCP/IP connection backpressure issue, Apache Flink had a 
somewhat similar problem before Flink 1.5. It is described in the "inflicting 
backpressure" section of this blog post from 2019:
https://flink.apache.org/2019/06/05/flink-network-stack.html#inflicting-backpressure-1
Flink solved the issue of multiplexing multiple streams of data on a single
TCP/IP connection in Flink 1.5 by introducing its own flow control mechanism.

The backpressure and rate limiting challenges have been discussed a few times 
in Pulsar community meetings over the past years. There was also a generic 
backpressure thread on the dev mailing list [2] in Sep 2022. 
However, we haven't really documented Pulsar's backpressure design and how rate 
limiters are part of the overall solution and how we could improve. 
I think it might be time to do so since there's a requirement to improve rate 
limiting. I guess that's the main motivation also for PIP-310.

-Lari

1 - https://github.com/apache/pulsar/pull/21399/files
2 - https://lists.apache.org/thread/03w6x9zsgx11mqcp5m4k4n27cyqmp271

On 2023/10/19 12:51:14 Girish Sharma wrote:
> Hi,
> Currently, there are only 2 kinds of publish rate limiters - polling based
> and precise. Users have an option to use either one of them in the topic
> publish rate limiter, but the resource group rate limiter only uses the
> polling one.
> 
> There are challenges with both the rate limiters and the fact that we can't
> use the precise rate limiter at the resource group level.
> 
> Thus, in order to support custom rate limiters, I've created the PIP-310
> 
> This is the discussion thread. Please go through the PIP and provide your
> inputs.
> 
> Link - https://github.com/apache/pulsar/pull/21399
> 
> Regards
> -- 
> Girish Sharma
>