[ANNOUNCE] Apache Pulsar Client C++ 3.4.2 released

2023-12-15 Thread Yunze Xu
The Apache Pulsar team is proud to announce Apache Pulsar Client C++
version 3.4.2.

Pulsar is a highly scalable, low latency messaging platform running on
commodity hardware. It provides simple pub-sub semantics over topics,
guaranteed at-least-once delivery of messages, automatic cursor management for
subscribers, and cross-datacenter replication.

For Pulsar release details and downloads, visit:
https://pulsar.apache.org/download/#pulsar-c-client

Release Notes are at:
https://pulsar.apache.org/release-notes/client-cpp

API documents are at:
https://pulsar.apache.org/api/cpp/3.4.x/index.html

We would like to thank the contributors that made the release possible.

Regards,

The Pulsar Team


Re: [VOTE] Pulsar Client C++ Release 3.4.2 Candidate 1

2023-12-15 Thread tison
Hi Yunze,

You may cast your own vote as the RM (release manager). It's a bit strange
for the RM not to verify the candidate, though it's OK :D

Best,
tison.

On Fri, Dec 15, 2023 at 17:57, Yunze Xu wrote:
>
> Close this vote by 3 binding +1s
> - Tison
> - Jiwei
> - Penghui
>
> Thanks,
> Yunze
>
> On Thu, Dec 14, 2023 at 7:51 PM PengHui Li  wrote:
> >
> > +1 (binding)
> >
> > - Checked the signature
> > - Tested the producer and consumer
> >
> > Regards,
> > Penghui
> >
> > On Wed, Dec 13, 2023 at 2:28 PM guo jiwei  wrote:
> >
> > > +1 (binding)
> > >
> > > - Verified the signature and checksum
> > > - Built from the source
> > > - Tested SampleConsumer and SampleProducer
> > >
> > > Regards
> > > Jiwei Guo (Tboy)
> > >
> > >
> > > On Tue, Dec 12, 2023 at 4:30 PM tison  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > * Download URL valid
> > > > * Checksum and signature match
> > > > * Can build from source
> > > > * LICENSE and NOTICE present
> > > >
> > > > nit: Years in NOTICE can be updated.
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > > On Wed, Dec 6, 2023 at 16:00, Yunze Xu wrote:
> > > > >
> > > > > This is the first release candidate for Apache Pulsar Client C++,
> > > > > version 3.4.2.
> > > > >
> > > > > It fixes the following issues:
> > > > > https://github.com/apache/pulsar-client-cpp/pulls?q=is%3Apr+is%3Aclosed+label%3Arelease%2F3.4.2
> > > > >
> > > > > *** Please download, test and vote on this release. This vote will
> > > > > stay open for at least 72 hours ***
> > > > >
> > > > > Note that we are voting upon the source (tag); binaries are provided
> > > > > for convenience.
> > > > >
> > > > > Source and binary files:
> > > > > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-cpp/pulsar-client-cpp-3.4.2-candidate-1/
> > > > >
> > > > > SHA-512 checksums:
> > > > > d64a07c4f78071ae0607f1afac4ab0db15f9dc25cb1f2ceae7152e262b65e660719f1520f93933da6615691ea0de2f25a6fb2806369126c4a777c0a075af0f5e
> > > > >  apache-pulsar-client-cpp-3.4.2.tar.gz
> > > > >
> > > > > The tag to be voted upon:
> > > > > v3.4.2-candidate-1 (1cb1bf8ba1ca1033b4a36d35514f22fcf150973a)
> > > > > https://github.com/apache/pulsar-client-cpp/releases/tag/v3.4.2-candidate-1
> > > > >
> > > > > Pulsar's KEYS file containing the PGP keys used to sign the release:
> > > > > https://downloads.apache.org/pulsar/KEYS
> > > > >
> > > > > Please download the source package, and follow
> > > > > https://github.com/apache/pulsar-client-cpp/wiki/Verify-the-candidate-release-in-your-local-env
> > > > > to compile and test.
> > > > >
> > > > > Note: If you're going to run the unit tests locally, please make sure
> > > > > the proxy is disabled.
> > > >
> > >


Re: [DISCUSS] PIP-324: Alpine Docker images

2023-12-15 Thread YuWei Sung
Another alternative is Red Hat's ubi-minimal. It is glibc-based and has a
longer support period.
Alpine releases are supported for around 2 years.

https://gist.github.com/yuweisung/9a40b7af71cdf2dfbb4f7c52825acf35

Yu Wei Sung
Sr. Technologist OCTO
streamnative.io

On Fri, Dec 15, 2023 at 7:38 AM Christophe Bornet 
wrote:

> On Wed, Dec 13, 2023 at 18:03, Matteo Merli wrote:
> >
> > --
> > Matteo Merli
> > 
> >
> >
> > On Wed, Dec 13, 2023 at 8:20 AM Christophe Bornet <bornet.ch...@gmail.com>
> > wrote:
> >
> > > Thanks Matteo for bringing up this subject.
> > >
> > > I share the concerns of Lari regarding the move from glibc to musl in
> > > terms of security, performance, compatibility with the JVM. Extensive
> > > performance tests will have to be done.
> > >
> >
> > Alpine is the *most* used base image across the board, thousands of
> > projects are using it with Java.
> >
> > Barring the fact that, yes, extensive performance/stress/compatibility
> > tests will be performed, can you share any specific security, performance
> > or JVM compatibility issue?
> >
> > All the native libraries we are using, from Netty and RocksDB to the
> > BookKeeper performance tricks, already provide musl versions or are
> > compatible.
> >
>
> That's good to know.
> For Netty, I think netty-transport-native-epoll is only built against glibc
> (https://netty.io/wiki/native-transports.html#using-the-linux-native-transport).
> Is there a workaround?
> Other than that, there is the DNS caching issue Lari mentioned.
>
> >
> >
> > > Also, last time I tried to use Alpine with a Python project, it was a
> > > nightmare as the support was very poor (libs not working with musl, no
> > > wheels for musl). So we decided to move back to glibc. It was some
> > > years ago so maybe the situation is better now, but it's worth checking
> > > before imposing it on Python Function developers.
> > >
> >
> > Pulsar Python client (which is included in the image) already has pre-built
> > binaries for Alpine that we publish both for x86-64 as well as for arm64.
> > Same goes for all the dependencies.
> >
> My concern is for user Pulsar Functions. For instance, NumPy got musl wheels
> on PyPI only very recently. The first release to have wheels for musl
> came out in June this year (https://pypi.org/project/numpy/1.25.0/#files).
> Compiling Python libraries can be a very tedious process when there
> are no existing wheels for an environment.
> Other popular libraries may not have their wheels yet. Or maybe they
> do, I don't know. I just want to point out this aspect for consideration.
> >
> > > Maybe a debian-slim image could be considered as a thinner image than
> > > ubuntu, even if not as thin as Alpine?
> > >
> >
> > There are literally 100 CVEs open on the current debian-slim base image
> > (just the base with nothing else installed). Including HIGH and CRITICAL
> >
> > debian:11-slim (debian 11.8)
> > Total: 100 (UNKNOWN: 0, LOW: 72, MEDIUM: 15, HIGH: 11, CRITICAL: 2)
> >
> > Full list:
> > https://gist.github.com/merlimat/65407426e1d1b1be5afdff62555470c2
>
> Let's forget debian-slim!
>
>
> Note that I'm not at all against using Alpine. I just want to point
> out some difficulties I had in past experiences.
> As I said, this was some years ago and maybe now Alpine can be
> considered the go-to for Linux Docker images.
>
> > >
> > > Regards
> > >
> > > Christophe
> > >
> > > On Wed, Dec 13, 2023 at 14:16, Lari Hotari wrote:
> > > >
> > > > +1
> > > >
> > > > Before switching to Alpine completely, it would be worth running
> > > > extensive system tests in production-like environments.
> > > >
> > > > Alpine comes with musl, which makes the JVM behave slightly
> > > > differently.
> > > >
> > > > One of the common DNS issues with Alpine was fixed in May 2023 with
> > > > the Alpine 3.18 release. Alpine finally got full DNS protocol support
> > > > that impacts usage when there are DNS responses larger than 512
> > > > bytes [1].
> > > >
> > > > Alpine 3.18 comes with musl 1.2.4 with TCP fallback in the DNS
> > > > resolver. The official Kubernetes docs also contain the
> > > > recommendation [2] to upgrade Alpine to 3.18+ (newest is currently
> > > > 3.19) on Kubernetes. I'm no longer concerned about possible DNS
> > > > resolution issues with Alpine.
> > > >
> > > > However, one remaining concern related to DNS is the lack of local
> > > > DNS caching in Alpine. In Pulsar, most of the DNS resolution happens
> > > > with Netty's DNS resolver that has caching. I'm not sure what the
> > > > broader impact could be when switching to Alpine that doesn't have
> > > > DNS caching at the OS level. In Kubernetes environments, most DNS
> > > > lookups go through a lot of search domains and it puts a lot of load
> > > > on the DNS server unless clients do caching. It is possible to have
> > > > a local caching DNS server in Alpine [1], but that doesn't seem to
> > > > be very 

Re: [DISCUSS] PIP-324: Alpine Docker images

2023-12-15 Thread Christophe Bornet
On Wed, Dec 13, 2023 at 18:03, Matteo Merli wrote:
>
> --
> Matteo Merli
> 
>
>
> On Wed, Dec 13, 2023 at 8:20 AM Christophe Bornet 
> wrote:
>
> > Thanks Matteo for bringing up this subject.
> >
> > I share the concerns of Lari regarding the move from glibc to musl in
> > terms of security, performance, compatibility with the JVM. Extensive
> > performance tests will have to be done.
> >
>
> Alpine is the *most* used base image across the board, thousands of
> projects are using it with Java.
>
> Barring the fact that, yes, extensive performance/stress/compatibility
> tests will be performed, can you share any specific security, performance
> or JVM compatibility issue?
>
> All the native libraries we are using, from Netty and RocksDB to the
> BookKeeper performance tricks, already provide musl versions or are
> compatible.
>

That's good to know.
For Netty, I think netty-transport-native-epoll is only built against glibc
(https://netty.io/wiki/native-transports.html#using-the-linux-native-transport).
Is there a workaround?
Other than that, there is the DNS caching issue Lari mentioned.

>
>
> > Also, last time I tried to use Alpine with a Python project, it was a
> > nightmare as the support was very poor (libs not working with musl, no
> > wheels for musl). So we decided to move back to glibc. It was some
> > years ago so maybe the situation is better now, but it's worth checking
> > before imposing it on Python Function developers.
> >
>
> Pulsar Python client (which is included in the image) already has pre-built
> binaries for Alpine that we publish both for x86-64 as well as for arm64.
> Same goes for all the dependencies.
>
My concern is for user Pulsar Functions. For instance, NumPy got musl wheels
on PyPI only very recently. The first release to have wheels for musl
came out in June this year (https://pypi.org/project/numpy/1.25.0/#files).
Compiling Python libraries can be a very tedious process when there
are no existing wheels for an environment.
Other popular libraries may not have their wheels yet. Or maybe they
do, I don't know. I just want to point out this aspect for consideration.
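
One way to check whether a given library already ships musl wheels is to look at the platform tags in its wheel file names: musl builds carry a `musllinux_<major>_<minor>_<arch>` tag (defined in PEP 656), while glibc builds carry `manylinux` tags. The following is only a minimal sketch of that check; the function names are made up for this example, and the list of file names is assumed to be obtained separately (for instance from a project's download page).

```python
def wheel_platform_tags(filename):
    """Return the platform tags of a wheel file name.

    The platform tag is the last dash-separated field before ".whl";
    several tags may be joined with "." (compressed tag set).
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    platform_field = filename[: -len(".whl")].split("-")[-1]
    return platform_field.split(".")


def has_musl_wheel(filenames, arch="x86_64"):
    """True if any wheel in `filenames` targets musllinux on `arch`."""
    for name in filenames:
        for tag in wheel_platform_tags(name):
            if tag.startswith("musllinux_") and tag.endswith("_" + arch):
                return True
    return False
```

Calling `has_musl_wheel(names)` on the file list of each dependency of a Pulsar Function would give a quick first indication of which packages would need to be compiled from source on Alpine.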
>
> > Maybe a debian-slim image could be considered as a thinner image than
> > ubuntu, even if not as thin as Alpine?
> >
>
> There are literally 100 CVEs open on the current debian-slim base image
> (just the base with nothing else installed). Including HIGH and CRITICAL
>
> debian:11-slim (debian 11.8)
> Total: 100 (UNKNOWN: 0, LOW: 72, MEDIUM: 15, HIGH: 11, CRITICAL: 2)
>
> Full list: https://gist.github.com/merlimat/65407426e1d1b1be5afdff62555470c2

Let's forget debian-slim!


Note that I'm not at all against using Alpine. I just want to point
out some difficulties I had in past experiences.
As I said, this was some years ago and maybe now Alpine can be
considered the go-to for Linux Docker images.

> >
> > Regards
> >
> > Christophe
> >
> > On Wed, Dec 13, 2023 at 14:16, Lari Hotari wrote:
> > >
> > > +1
> > >
> > > Before switching to Alpine completely, it would be worth running
> > > extensive system tests in production-like environments.
> > >
> > > Alpine comes with musl, which makes the JVM behave slightly differently.
> > >
> > > One of the common DNS issues with Alpine was fixed in May 2023 with the
> > > Alpine 3.18 release. Alpine finally got full DNS protocol support that
> > > impacts usage when there are DNS responses larger than 512 bytes [1].
> > >
> > > Alpine 3.18 comes with musl 1.2.4 with TCP fallback in the DNS resolver.
> > > The official Kubernetes docs also contain the recommendation [2] to upgrade
> > > Alpine to 3.18+ (newest is currently 3.19) on Kubernetes. I'm no longer
> > > concerned about possible DNS resolution issues with Alpine.
> > >
> > > However, one remaining concern related to DNS is the lack of local DNS
> > > caching in Alpine. In Pulsar, most of the DNS resolution happens with
> > > Netty's DNS resolver that has caching. I'm not sure what the broader impact
> > > could be when switching to Alpine that doesn't have DNS caching at the OS
> > > level. In Kubernetes environments, most DNS lookups go through a lot of
> > > search domains and it puts a lot of load on the DNS server unless clients
> > > do caching. It is possible to have a local caching DNS server in Alpine
> > > [1], but that doesn't seem to be very convenient.
> > >
> > > The third area where there are differences in musl is in malloc. It's
> > > hard to know beforehand how the different malloc algorithm impacts the
> > > actual resident memory (RSS) usage. Different malloc algorithms handle
> > > memory fragmentation in different ways and there are behavioral
> > > differences. System testing could help verify the actual impact.
> > >
> > > -Lari
> > >
> > > 1 - https://bell-sw.com/blog/how-to-deal-with-alpine-dns-issues/#mcetoc_1gtd8v3lt2b
> > > 2 - https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues
> > >
> > > On 2023/12/12 18:58:49 Matteo Merli wrote:
> > > > Hello,
> > > >
> > > > I've created a new 

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-12-15 Thread Girish Sharma
Closing this discussion thread and the PIP. Apart from the discussion
present in this thread, I presented the detailed requirements in a dev meeting
on 23rd November and the conclusion was that we will actually go ahead and
implement the requirements in Pulsar itself.
There was a prerequisite of refactoring the rate limiter codebase, which is
already covered by Lari in PIP-322.

I will be creating a new parent PIP soon about the high-level requirements.

Thank you to everyone who participated in the thread and in the discussion at
the dev meeting on the 23rd.

Regards

On Thu, Nov 23, 2023 at 8:26 PM Girish Sharma 
wrote:

> I've captured our requirements in detail in this document -
> https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc/edit
> Added it to the agenda document as well. Will join the meeting and discuss.
>
> Regards
>
> On Wed, Nov 22, 2023 at 10:49 PM Lari Hotari  wrote:
>
>> I have written a long blog post that contains the context, the summary
>> of my viewpoint about PIP-310 and the proposal for proceeding:
>>
>> https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html
>>
>> Let's discuss this tomorrow in the Pulsar community meeting [1]. Let's
>> coordinate on Pulsar Slack's #dev channel if there are issues in joining
>> the meeting.
>> See you tomorrow!
>>
>> -Lari
>>
>> 1 - https://github.com/apache/pulsar/wiki/Community-Meetings
>>
>> On Mon, 20 Nov 2023 at 20:48, Lari Hotari  wrote:
>> >
>> > Hi Girish,
>> >
>> > replies inline and after that there are some updates about my
>> > preparation for the community meeting on Thursday. (there's
>> > https://github.com/lhotari/async-tokenbucket with a PoC for a
>> > low-level high performance token bucket implementation)
>> >
>> > On Sat, 11 Nov 2023 at 17:25, Girish Sharma wrote:
>> > > Actually, the capacity is meant to simulate that particular rate
>> > > limit. If we have 2 buckets anyway, the one managing the fixed rate
>> > > limit part shouldn't generally have a capacity more than the fixed
>> > > rate, right?
>> >
>> > There are multiple ways to model and understand a dual token bucket
>> > implementation.
>> > I view the 2 buckets in a dual token bucket implementation as separate
>> > buckets. They are like an AND rule, so if either bucket is empty,
>> > there will be a need to pause to wait for new tokens.
>> > Since we aren't working with code yet, these comments could be out of
>> > context.
>> >
>> > > I think it can be done, especially with that one thing you mentioned
>> > > about holding off filling the second bucket for 10 minutes, but it
>> > > does become quite complicated in terms of managing the flow of the
>> > > tokens, because while we only fill the second bucket once every 10
>> > > minutes, after the 10th minute, it needs to be filled continuously
>> > > for a while (the duration we want to support the bursting for), and
>> > > the capacity of this second bucket also is governed by and exactly
>> > > matches the burst value.
>> >
>> > There might not be a need for this complexity of the "filling bucket"
>> > in the first place. It was more of a demonstration that it's possible
>> > to implement the desired behavior of limited bursting by tweaking the
>> > basic token bucket algorithm slightly.
>> > I'd rather avoid this additional complexity.
>> >
>> > > Agreed that it is much higher than a single topic's max throughput,
>> > > but the context of my example had multiple topics lying on the same
>> > > broker/bookie ensemble bursting together at the same time because
>> > > they had been saving up on tokens in the bucket.
>> >
>> > Yes, that makes sense.
>> >
>> > > > always be a need to overprovision resources. You usually don't want to
>> > > > go beyond 60% or 70% utilization on disk, CPU or network resources
>> > > > so that queues in the system don't start to grow and impact
>> > > > latencies. In Pulsar/BookKeeper, the storage solution has very
>> > > > effective load balancing, especially for writing. In BookKeeper each
>> > > > ledger (the segment) of a topic selects the "ensemble" and the
>> > > > "write quorum", the set of bookies to write to, when the ledger is
>> > > > opened. The BookKeeper client could also change the ensemble in the
>> > > > middle of a ledger due to some event like a bookie becoming
>> > > > read-only or
>> > > >
>> > >
>> > > While it does do that on complete failure of a bookie or a bookie disk,
>> > > or a broker going down, degradations aren't handled this well. So if
>> > > all topics in a bookie are bursting due to the fact that they had
>> > > accumulated tokens, then all it will lead to is a breach of the write
>> > > latency SLA because at some point the disks/CPU/network etc. will
>> > > start choking (even after considering the 70% utilization, i.e. the
>> > > 30% buffer).
>> >
>> > Yes.
>> >
>> > > That's only in the case of the default rate limiter where the tryAcquire
>> > > isn't even implemented.. since the default rate limiter 
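
The AND rule for a dual token bucket described in the quoted discussion above can be sketched roughly as follows. This is only an illustration in Python, not the PIP-322 code or the async-tokenbucket PoC; the class names, the explicit `now` argument and the rates in the usage note are assumptions made for this sketch.

```python
class TokenBucket:
    """Plain token bucket; the caller passes the current time explicitly."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity  # start full
        self.last = now

    def refill(self, now):
        # Add tokens for the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now


class DualTokenBucket:
    """Two independent buckets combined with an AND rule."""

    def __init__(self, first, second):
        self.first = first
        self.second = second

    def try_acquire(self, permits, now):
        self.first.refill(now)
        self.second.refill(now)
        # Both buckets must hold enough tokens; otherwise nothing is taken.
        if self.first.tokens >= permits and self.second.tokens >= permits:
            self.first.tokens -= permits
            self.second.tokens -= permits
            return True
        return False  # the caller has to pause and wait for new tokens
```

For example, with `DualTokenBucket(TokenBucket(10, 10), TokenBucket(1, 15))` a request is rejected as soon as either bucket runs dry, even if the other one still has tokens left.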

Re: [VOTE] PIP-325: Add command to abort transaction

2023-12-15 Thread Girish Sharma
Hello Ruihong,
You actually replied to the discussion thread itself.
Moreover, you should wait for the discussion thread to have some actual
discussion before starting the voting thread.

https://github.com/apache/pulsar/blob/master/pip/README.md

Regards

On Fri, Dec 15, 2023 at 6:37 PM ruihongzhou 
wrote:

> Hi community,
>
> This thread is to start a vote for PIP-325: Add command to abort
> transaction.
>
> PIP: https://github.com/apache/pulsar/pull/21731
>
> Related PR: https://github.com/apache/pulsar/pull/21630
>
> Discussion thread:
> https://lists.apache.org/thread/p559tsphr7kvbh2qqw8vsow0ylytonnz
>
> Ruihong



-- 
Girish Sharma


Re: [VOTE] Pulsar Client C++ Release 3.4.2 Candidate 1

2023-12-15 Thread Yunze Xu
Close this vote by 3 binding +1s
- Tison
- Jiwei
- Penghui

Thanks,
Yunze

On Thu, Dec 14, 2023 at 7:51 PM PengHui Li  wrote:
>
> +1 (binding)
>
> - Checked the signature
> - Tested the producer and consumer
>
> Regards,
> Penghui
>
> On Wed, Dec 13, 2023 at 2:28 PM guo jiwei  wrote:
>
> > +1 (binding)
> >
> > - Verified the signature and checksum
> > - Built from the source
> > - Tested SampleConsumer and SampleProducer
> >
> > Regards
> > Jiwei Guo (Tboy)
> >
> >
> > On Tue, Dec 12, 2023 at 4:30 PM tison  wrote:
> >
> > > +1 (binding)
> > >
> > > * Download URL valid
> > > * Checksum and signature match
> > > * Can build from source
> > > * LICENSE and NOTICE present
> > >
> > > nit: Years in NOTICE can be updated.
> > >
> > > Best,
> > > tison.
> > >
> > > On Wed, Dec 6, 2023 at 16:00, Yunze Xu wrote:
> > > >
> > > > This is the first release candidate for Apache Pulsar Client C++,
> > > > version 3.4.2.
> > > >
> > > > It fixes the following issues:
> > > > https://github.com/apache/pulsar-client-cpp/pulls?q=is%3Apr+is%3Aclosed+label%3Arelease%2F3.4.2
> > > >
> > > > *** Please download, test and vote on this release. This vote will stay
> > > > open for at least 72 hours ***
> > > >
> > > > Note that we are voting upon the source (tag); binaries are provided
> > > > for convenience.
> > > >
> > > > Source and binary files:
> > > > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-cpp/pulsar-client-cpp-3.4.2-candidate-1/
> > > >
> > > > SHA-512 checksums:
> > > > d64a07c4f78071ae0607f1afac4ab0db15f9dc25cb1f2ceae7152e262b65e660719f1520f93933da6615691ea0de2f25a6fb2806369126c4a777c0a075af0f5e
> > > >  apache-pulsar-client-cpp-3.4.2.tar.gz
> > > >
> > > > The tag to be voted upon:
> > > > v3.4.2-candidate-1 (1cb1bf8ba1ca1033b4a36d35514f22fcf150973a)
> > > > https://github.com/apache/pulsar-client-cpp/releases/tag/v3.4.2-candidate-1
> > > >
> > > > Pulsar's KEYS file containing the PGP keys used to sign the release:
> > > > https://downloads.apache.org/pulsar/KEYS
> > > >
> > > > Please download the source package, and follow
> > > > https://github.com/apache/pulsar-client-cpp/wiki/Verify-the-candidate-release-in-your-local-env
> > > > to compile and test.
> > > >
> > > > Note: If you're going to run the unit tests locally, please make sure
> > > > the proxy is disabled.
> > >
> >
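
As a rough illustration of the checksum step mentioned by the voters, the following sketch computes the SHA-512 digest of a downloaded archive and compares it with a published value. The function names and the chunk size are illustrative assumptions and not part of the Pulsar release tooling; the wiki page linked in the vote email remains the reference for the actual verification procedure, which also covers the signature check.

```python
import hashlib
import hmac


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare the computed digest with the published one (case-insensitive)."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha512_of_file(path), expected)
```

A voter could call `matches_published_checksum("apache-pulsar-client-cpp-3.4.2.tar.gz", checksum)` with the checksum string copied from the vote email.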