Re: [VOTE] Pulsar Release 2.11.3 Candidate 1

2023-11-22 Thread guo jiwei
+1 (binding)

- Checked the signatures
- Built from source
- Ran standalone and checked produce and consume
- Verified Cassandra connector
- Verified stateful functions

Regards
Jiwei Guo (Tboy)


On Tue, Nov 21, 2023 at 9:47 PM Baodi Shi  wrote:

>  Patch:
>
> Docker images:
> docker pull wudixiaobaozi/pulsar-all:2.11.3
> docker pull wudixiaobaozi/pulsar:2.11.3
>
> Thanks,
> Baodi Shi
>
>
> On Nov 21, 2023 at 21:23:41, Baodi Shi  wrote:
>
> > This is the first release candidate for Apache Pulsar, version 2.11.3.
> >
> > It fixes the following issues:
> >
> >
> > https://github.com/apache/pulsar/pulls?q=is%3Apr+label%3Arelease%2F2.11.3+is%3Aclosed
> >
> > *** Please download, test and vote on this release. This vote will stay
> > open for at least 72 hours ***
> >
> > Note that we are voting upon the source (tag), binaries are provided for
> > convenience.
> >
> > Source and binary files:
> > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.11.3-candidate-1/
> >
> > SHA-512 checksums:
> >
> > 1104ce10ee55f99f162f71487922d9883201516754936feab07a631b25b8f76bc2443735b4bdff17d821d62ca20f605d386ce9ca2e0450ce8d2555ca07fd8dd  ./apache-pulsar-2.11.3-bin.tar.gz
> >
> > bdf2579d718d25def297538def0c237974c856f63aea00db30e61b10683eec29a52a354b61daa6eda5ffe5bdfda78e6a83d473f8d7f44104fc5d715ffb1892fc  ./apache-pulsar-2.11.3-src.tar.gz
> >
> >
> > Maven staging repo:
> > https://repository.apache.org/content/repositories/orgapachepulsar-1251
> >
> > The tag to be voted upon:
> > v2.11.3-candidate-1 (aa7082efcafb58b1fc4b7bb1bc68c6e22f7bc2d3)
> > https://github.com/apache/pulsar/releases/tag/v2.11.3-candidate-1
> >
> > Pulsar’s KEYS file containing the PGP keys used to sign the release:
> > https://dist.apache.org/repos/dist/dev/pulsar/KEYS
> >
> > Docker images:
> > docker pull wudixiaobaozi/pulsar-all:2.11.3
> > docker pull wudixiaobaozi/pulsar
> >
> > Please download the source package, and follow the README to build
> > and run the Pulsar standalone service.
> >
> > Thanks,
> > Baodi Shi
> >
>


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-22 Thread Lari Hotari
I have written a long blog post that contains the context, a summary of
my viewpoint on PIP-310, and a proposal for how to proceed:
https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html

Let's discuss this tomorrow in the Pulsar community meeting [1]. Let's
coordinate on the Pulsar Slack #dev channel if there are issues joining
the meeting.
See you tomorrow!

-Lari

1 - https://github.com/apache/pulsar/wiki/Community-Meetings

On Mon, 20 Nov 2023 at 20:48, Lari Hotari  wrote:
>
> Hi Girish,
>
> Replies are inline, and after that there are some updates about my
> preparation for the community meeting on Thursday (there's
> https://github.com/lhotari/async-tokenbucket with a PoC of a
> low-level, high-performance token bucket implementation).
>
> On Sat, 11 Nov 2023 at 17:25, Girish Sharma  wrote:
> > Actually, the capacity is meant to simulate that particular rate limit. If
> > we have 2 buckets anyway, the one managing the fixed rate limit part
> > shouldn't generally have a capacity of more than the fixed rate, right?
>
> There are multiple ways to model and understand a dual token bucket
> implementation. I view the two buckets as separate buckets that are
> combined with an AND rule: if either bucket is empty, there is a need to
> pause and wait for new tokens.
> Since we aren't working with code yet, these comments could be out of context.
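>
> As a rough sketch of that AND rule (just an illustration in plain Java, not
> the async-tokenbucket PoC, and not thread-safe; all names are made up):
>
> class TokenBucket {
>     private final long capacity;              // max tokens the bucket can hold
>     private final long refillTokensPerSecond; // steady refill rate
>     private long tokens;
>     private long lastRefillNanos;
>
>     TokenBucket(long capacity, long refillTokensPerSecond) {
>         this.capacity = capacity;
>         this.refillTokensPerSecond = refillTokensPerSecond;
>         this.tokens = capacity;
>         this.lastRefillNanos = System.nanoTime();
>     }
>
>     void refill() {
>         long now = System.nanoTime();
>         long newTokens = (now - lastRefillNanos) * refillTokensPerSecond / 1_000_000_000L;
>         if (newTokens > 0) {
>             tokens = Math.min(capacity, tokens + newTokens);
>             lastRefillNanos = now;
>         }
>     }
>
>     boolean hasTokens(long permits) {
>         return tokens >= permits;
>     }
>
>     void consume(long permits) {
>         tokens -= permits;
>     }
> }
>
> class DualTokenBucket {
>     // The two buckets are combined with an AND rule: a produce consumes from
>     // both, and if either one is empty the caller pauses until tokens return.
>     private final TokenBucket first;
>     private final TokenBucket second;
>
>     DualTokenBucket(TokenBucket first, TokenBucket second) {
>         this.first = first;
>         this.second = second;
>     }
>
>     boolean tryConsume(long permits) {
>         first.refill();
>         second.refill();
>         if (first.hasTokens(permits) && second.hasTokens(permits)) {
>             first.consume(permits);
>             second.consume(permits);
>             return true;
>         }
>         return false;
>     }
> }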
>
> > I think it can be done, especially with the approach you mentioned of
> > holding off filling the second bucket for 10 minutes, but it does become
> > quite complicated in terms of managing the flow of the tokens: while we
> > only fill the second bucket once every 10 minutes, after the 10th minute
> > it needs to be filled continuously for a while (the duration we want to
> > support the bursting for), and the capacity of this second bucket is also
> > governed by, and exactly matches, the burst value.
>
> There might not be a need for this complexity of the "filling bucket"
> in the first place. It was more of a demonstration that it's possible
> to implement the desired behavior of limited bursting by tweaking the
> basic token bucket algorithm slightly.
> I'd rather avoid this additional complexity.
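>
> For what it's worth, with a plain dual bucket like the sketch above, limited
> bursting can already be approximated purely by how the two buckets are sized,
> without any special filling schedule (the numbers here are made up for
> illustration, not a recommendation):
>
> // First bucket: refills at the fixed rate (1000 msg/s); its capacity is the
> // stored credit that makes bursting possible at all. Draining it at a net
> // 1000 msg/s means a 2000 msg/s burst lasts roughly 60 seconds.
> TokenBucket fixedRateBucket = new TokenBucket(60_000, 1_000);
> // Second bucket: refills at the allowed peak rate (2000 msg/s) with a small
> // capacity, so the instantaneous rate stays close to the peak limit.
> TokenBucket peakRateBucket = new TokenBucket(2_000, 2_000);
> DualTokenBucket limiter = new DualTokenBucket(fixedRateBucket, peakRateBucket);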
>
> > Agreed that it is much higher than a single topic's max throughput, but
> > the context of my example had multiple topics on the same
> > broker/bookie ensemble bursting together at the same time because they had
> > been saving up tokens in the bucket.
>
> Yes, that makes sense.
>
> > > always be a need to overprovision resources. You usually don't want to
> > > go beyond 60% or 70% utilization on disk, CPU or network resources so
> > > that queues in the system don't start to grow and impact
> > > latencies. In Pulsar/Bookkeeper, the storage solution has very
> > > effective load balancing, especially for writing. In Bookkeeper each
> > > ledger (the segment) of a topic selects the "ensemble" and the "write
> > > quorum", the set of bookies to write to, when the ledger is opened.
> > > The bookkeeper client could also change the ensemble in the middle of
> > > a ledger due to some event like a bookie becoming read-only or
> > >
> >
> > While it does do that on complete failure of a bookie, a bookie disk, or a
> > broker going down, degradations aren't handled as well. So if all topics
> > on a bookie are bursting because they had accumulated tokens, then all it
> > will lead to is a breach of the write latency SLA, because at some point
> > the disks/CPU/network etc. will start choking (even after considering the
> > 70% utilization, i.e. a 30% buffer).
>
> Yes.
>
> > That's only in the case of the default rate limiter, where tryAcquire
> > isn't even implemented, since the default rate limiter checks for a breach
> > only at a fixed interval rather than before every produce call. But in the
> > case of the precise rate limiter, the response of `tryAcquire` is respected.
>
> This is one of many reasons why I think it's better to improve the
> maintainability of the current solution and remove the unnecessary choice
> between the "precise" rate limiter and the default one.
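>
> To make the distinction concrete, a hedged sketch (this is not Pulsar's
> actual PublishRateLimiter interface, just the shape of the "precise"
> behavior, reusing the dual bucket sketched earlier):
>
> class PrecisePublishLimiter {
>     private final DualTokenBucket bucket; // from the earlier sketch
>
>     PrecisePublishLimiter(DualTokenBucket bucket) {
>         this.bucket = bucket;
>     }
>
>     // Consulted before every publish; when this returns false the broker
>     // would pause reading from the producer's connection until tokens
>     // become available, instead of only resetting a counter on a schedule.
>     boolean tryAcquire(int numMessages) {
>         return bucket.tryConsume(numMessages);
>     }
> }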
>
> > True, and actually, since Pulsar auto-distributes topics based on load
> > shedding parameters, we can focus on a single broker or a single bookie
> > ensemble and assume that the approach works as we scale it out. Of course,
> > this means putting reasonable CPU/network/partition/throughput limits in
> > place at each broker level, and Pulsar provides ways to do that
> > automatically.
>
> We do have plans to improve Pulsar so that things simply work properly
> under heavy load. Optimally, things would work without the need to
> rigorously tune and tweak the system. These improvements go beyond rate
> limiting.
>
> > While I have shared the core requirements over these threads (fixed rate +
> > burst multiplier for up to X duration every Y minutes), we are finalizing
> > the