[GitHub] [pulsar-dotpulsar] blankensteiner commented on issue #105: Support - Custom authentication

2022-06-07 Thread GitBox


blankensteiner commented on issue #105:
URL: 
https://github.com/apache/pulsar-dotpulsar/issues/105#issuecomment-1149485733

   Is this not covered by IAuthentication?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] PIP-175: Extend time based release process

2022-06-07 Thread PengHui Li
I'm not sure I fully understand the LTS release and feature release.

> The LTS releases will be identified by being a `.0` version. For example:
> * `3.0` -> LTS
> * `3.1` -> regular release
> * `3.2` -> regular release
> * `4.0` -> LTS

In this example, can we only introduce new features in 3.1 and 3.2,
with 3.1.x and 3.2.x being the patch releases based on those feature
releases?
We can have one patch release per month.

3.0.x is the LTS release, which will be supported for at most three years.
After we have the 4.0 LTS release, we will still support the 3.0.x LTS for
at least 18 months.
And the 4.0 LTS will have all the new features from 3.x.

> This can be translated into:
>   * We support the last 2 LTS releases and the last 2 feature releases
>   * Security patches are provided for the past 3 LTS releases and 2
> feature releases

Does this mean we can introduce new features in 3.x even if we have 4.x?
And how many patch releases will we support for the feature releases,
such as 3.1.x, 3.2.x, 3.3.x, 4.1.x, 4.2.x?

I think "2 feature releases" means 3.x and 4.x here?

Thanks,
Penghui

On Wed, Jun 8, 2022 at 12:41 PM Michael Marshall 
wrote:

> Thanks for putting together this PIP to continue this discussion,
> Matteo. This is an important one.
>
> I'll need time to think over your points before I respond, but I want
> to address two of them right away.
>
> > Actually, I was wrong. PIP-47 says the last 4 releases so 2.7 would be
> > included.
> > The point though still remains in that there's nowhere in the website
> > where a user could check until when the 2.7 release is going to be
> > supported.
>
> We actually do have this documented on the website. I added this page
> in February:
>
> https://pulsar.apache.org/docs/next/security-policy-and-supported-versions#supported-versions.
>
> > we'd be releasing 2 LTSs super close between each other
> > and we'd have to support 1 release more for the time being.
>
> I agree with this reasoning. If 2.10 is LTS, and I think it should be,
> 2.11 shouldn't be LTS.
>
> Thanks,
> Michael
>
> On Tue, Jun 7, 2022 at 6:39 PM Matteo Merli 
> wrote:
> >
> > > > There is a high cost to maintain a lot of old releases, backport bug
> > > > fixes, and security patches. In general, we actively support the last
> > > > 3 minor releases while continuing to develop the next release. E.g.,
> > > > 2.8, 2.9, and 2.10, while 2.11 is under development.
> > >
> > > Is 2.7 EOL? If so then we need to announce it explicitly.
> >
> > Actually, I was wrong. PIP-47 says the last 4 releases so 2.7 would be
> > included.
> > The point though still remains in that there's nowhere in the website
> > where a user could check until when the 2.7 release is going to be
> > supported.
> >
> > > > We need to ensure that we have a date set in stone to deliver the
> > > > release to users.
> > >
> > > I would like the new plan to address the delays in cherry picking
> > > changes. These must never wait until a release is being made. We must
> > > keep these up to date. If someone marks a PR for an older release then
> > > they are volunteering to do the cherry pick within a few days. We need
> > > to be prepared for a 0-day security release.
> >
> > I agree that this is a problem, though I'd prefer to keep it in a
> > separate proposal, specifically targeted at the process for patch
> > releases, to avoid putting too many things into a single discussion.
> >
> > > > The major version bump will not carry any special meaning in terms of
> > > > "big features" included in the release or breaking API changes.
> > > > Instead, it would simply signal the type of the release.
> > >
> > > From our existing release what is LTS?
> >
> > Good point, as we discussed earlier, 2.10 should be marked as LTS for
> > being the last Java 8 release. I'll update the text to reflect this.
> >
> > > Does this mean that you are proposing the current Master as
> > > release/3.0 or will it remain 2.11?
> >
> >  I was actually not thinking of changing the denomination of 2.11. On
> > one hand, it could make sense for being the first Java 17 release, but
> > on the other, we'd be releasing 2 LTSs super close between each other
> > and we'd have to support 1 release more for the time being.
> >
> > I'd like to hear more opinions here :)
> >
> > > > The support model will be:
> > > >
> > > > * LTS
> > > >   * Released every 18 months
> > > >   * Support for 24 months
> > > >   * Security patches for 36 months
> > > > * Feature releases
> > > >   * Released every 3 months
> > > >   * Support for 6 months
> > > >   * Security patches for 6 months
> > >
> > > Are those times since the initial release? It would be helpful to have
> > > a swim lane diagram.
> >
> > Yes, from the initial release (eg: 3.0.0) and yes we would have a
> > clear diagram on the website.
> >
> > > > This can be translated into:
> > > >   * We support the last 2 LTS releases and the last 2 feature
> > > > releases
> > > >   * Security patches are provided for the past 3 LTS releases and 2
> > > > feature 

Re: [DISCUSS] PIP-175: Extend time based release process

2022-06-07 Thread Michael Marshall
Thanks for putting together this PIP to continue this discussion,
Matteo. This is an important one.

I'll need time to think over your points before I respond, but I want
to address two of them right away.

> Actually, I was wrong. PIP-47 says the last 4 releases so 2.7 would be 
> included.
> The point though still remains in that there's nowhere in the website
> where a user could check until when the 2.7 release is going to be
> supported.

We actually do have this documented on the website. I added this page
in February:
https://pulsar.apache.org/docs/next/security-policy-and-supported-versions#supported-versions.

> we'd be releasing 2 LTSs super close between each other
> and we'd have to support 1 release more for the time being.

I agree with this reasoning. If 2.10 is LTS, and I think it should be,
2.11 shouldn't be LTS.

Thanks,
Michael

On Tue, Jun 7, 2022 at 6:39 PM Matteo Merli  wrote:
>
> > > There is a high cost to maintain a lot of old releases, backport bug
> > > fixes, and security patches. In general, we actively support the last
> > > 3 minor releases while continuing to develop the next release. E.g.,
> > > 2.8, 2.9, and 2.10, while 2.11 is under development.
> >
> > Is 2.7 EOL? If so then we need to announce it explicitly.
>
> Actually, I was wrong. PIP-47 says the last 4 releases so 2.7 would be 
> included.
> The point though still remains in that there's nowhere in the website
> where a user could check until when the 2.7 release is going to be
> supported.
>
> > > We need to ensure that we have a date set in stone to deliver the
> > > release to users.
> >
> > I would like the new plan to address the delays in cherry picking changes. 
> > These must never wait until a release is being made. We must keep these up 
> > to date. If someone marks a PR for an older release then they are 
> > volunteering to do the cherry pick within a few days. We need to be 
> > prepared for a 0-day security release.
>
> I agree that this is a problem, though I'd prefer to keep it in a
> separate proposal, specifically targeted at the process for patch
> releases, to avoid putting too many things into a single discussion.
>
> > > The major version bump will not carry any special meaning in terms of
> > > "big features" included in the release or breaking API changes.
> > > Instead, it would simply signal the type of the release.
> >
> > From our existing release what is LTS?
>
> Good point, as we discussed earlier, 2.10 should be marked as LTS for
> being the last Java 8 release. I'll update the text to reflect this.
>
> > Does this mean that you are proposing the current Master as release/3.0 or 
> > will it remain 2.11?
>
>  I was actually not thinking of changing the denomination of 2.11. On
> one hand, it could make sense for being the first Java 17 release, but
> on the other, we'd be releasing 2 LTSs super close between each other
> and we'd have to support 1 release more for the time being.
>
> I'd like to hear more opinions here :)
>
> > > The support model will be:
> > >
> > > * LTS
> > >   * Released every 18 months
> > >   * Support for 24 months
> > >   * Security patches for 36 months
> > > * Feature releases
> > >   * Released every 3 months
> > >   * Support for 6 months
> > >   * Security patches for 6 months
> >
> > Are those times since the initial release? It would be helpful to have a 
> > swim lane diagram.
>
> Yes, from the initial release (eg: 3.0.0) and yes we would have a
> clear diagram on the website.
>
> > > This can be translated into:
> > >   * We support the last 2 LTS releases and the last 2 feature releases
> > >   * Security patches are provided for the past 3 LTS releases and 2
> > > feature releases
> >
> > Please note that in the event of a security release that PMC members will 
> > generally need to do these in secret.
>
> No changes about that. This is only to set the user expectation for
> how long they can expect the security patches.
>
> It doesn't change a comma on the PMC process of discussing such
> releases, nor it would prevent doing additional security releases
> outside of the "guaranteed" window.
>
> > What is the plan for bug fix / security releases on say 3.0?
>
> Since 3.0 would be LTS, based on the above-proposed table: 2y for bug
> fixes - 3y for security patches


Re: [ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread Huanli Meng
Congrats!

BR//Huanli

> On Jun 8, 2022, at 10:25 AM, Yu  wrote:
> 
> Hi Dezhi,  kudos to you! Well deserved!
> 
> On Wed, Jun 8, 2022 at 9:52 AM Li Li  wrote:
> 
>> Congratulations Dezhi!
>> 
>> Thanks,
>> Li Li
>> 
>>> On Jun 7, 2022, at 9:13 PM, PengHui Li  wrote:
>>> 
>>> Congratulations Dezhi!
>>> 
>>> Penghui
>>> On Jun 7, 2022, 17:22 +0800, Enrico Olivelli wrote:
>>>> Congratulations!!
>>>>
>>>> Enrico
>>>>
>>>> On Tue, Jun 7, 2022 at 10:57, Zike Yang wrote:
>>>>>
>>>>> Congratulations!
>>>>>
>>>>> Best Regards,
>>>>> Zike Yang
>>>>>
>>>>> On Tue, Jun 7, 2022 at 3:52 PM ZhangJian He wrote:
>>>>>
>>>>>> Congratulations!
>>>>>>
>>>>>> Thanks
>>>>>> ZhangJian He
>>>>>>
>>>>>> On Tue, Jun 7, 2022 at 15:46, Haiting Jiang wrote:
>>>>>>
>>>>>>> Congrats!
>>>>>>>
>>>>>>> BR,
>>>>>>> Haiting
>>>>>>>
>>>>>>> On 2022/06/07 06:46:00 Hang Chen wrote:
>>>>>>>> The Project Management Committee (PMC) for Apache Pulsar has invited
>>>>>>>> Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
>>>>>>>> we are pleased to announce that he has accepted.
>>>>>>>>
>>>>>>>> Dezhi Liu (with Github id liudezhi2098) contributed many improvements
>>>>>>>> and bug fixes to Pulsar.
>>>>>>>>
>>>>>>>> Being a committer enables easier contribution to the project since
>>>>>>>> there is no need to go via the patch submission process. This should
>>>>>>>> enable better productivity.
>>>>>>>>
>>>>>>>> Welcome and Congratulations, Dezhi Liu!
>>>>>>>>
>>>>>>>> Please join us in congratulating and welcoming Dezhi Liu onboard!
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Hang Chen on behalf of the Pulsar PMC



Re: [VOTE] PIP-166: Function add MANUAL delivery semantics

2022-06-07 Thread PengHui Li
+1

Penghui
On Jun 8, 2022, 09:32 +0800, Rui Fu , wrote:
> +1
>
> Best,
>
> Rui Fu
> On Jun 8, 2022, 04:51 +0800, Neng Lu wrote:
> > Hi All,
> >
> > +1 (non-binding)
> >
> > On Tue, Jun 7, 2022 at 5:42 AM Enrico Olivelli  wrote:
> >
> > > I have left one last minute comment, can you please take a look ? then
> > > I will post my +1
> > >
> > > thanks
> > > Enrico
> > >
> >
> >
> > --
> > Best Regards,
> > Neng


[DISCUSS] PIP-172: Introduce the HEALTH_CHECK command in the binary protocol

2022-06-07 Thread zhaocong
Hello Pulsar Community,


Here is a PIP to introduce the HEALTH_CHECK command in the binary protocol.
I look forward to your feedback.


PIP: https://github.com/apache/pulsar/issues/15859


Thanks,

Cong Zhao


[GitHub] [pulsar-dotpulsar] RobertIndie opened a new issue, #105: Support - Custom authentication

2022-06-07 Thread GitBox


RobertIndie opened a new issue, #105:
URL: https://github.com/apache/pulsar-dotpulsar/issues/105

   The Pulsar Java client supports custom authentication: 
https://pulsar.apache.org/api/client/org/apache/pulsar/client/api/AuthenticationFactory.html#create-java.lang.String-java.lang.String-
   
   We can add support for this.





Re: [ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread Yu
Hi Dezhi,  kudos to you! Well deserved!

On Wed, Jun 8, 2022 at 9:52 AM Li Li  wrote:

> Congratulations Dezhi!
>
> Thanks,
> Li Li
>
> > On Jun 7, 2022, at 9:13 PM, PengHui Li  wrote:
> >
> > Congratulations Dezhi!
> >
> > Penghui
> > On Jun 7, 2022, 17:22 +0800, Enrico Olivelli wrote:
> >> Congratulations!!
> >>
> >> Enrico
> >>
> >> On Tue, Jun 7, 2022 at 10:57, Zike Yang wrote:
> >>>
> >>> Congratulations!
> >>>
> >>> Best Regards,
> >>> Zike Yang
> >>>
> >>> On Tue, Jun 7, 2022 at 3:52 PM ZhangJian He wrote:
> >>>
> >>>> Congratulations!
> >>>>
> >>>> Thanks
> >>>> ZhangJian He
> >>>>
> >>>> On Tue, Jun 7, 2022 at 15:46, Haiting Jiang wrote:
> >>>>
> >>>>> Congrats!
> >>>>>
> >>>>> BR,
> >>>>> Haiting
> >>>>>
> >>>>> On 2022/06/07 06:46:00 Hang Chen wrote:
> >>>>>> The Project Management Committee (PMC) for Apache Pulsar has invited
> >>>>>> Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
> >>>>>> we are pleased to announce that he has accepted.
> >>>>>>
> >>>>>> Dezhi Liu (with Github id liudezhi2098) contributed many improvements
> >>>>>> and bug fixes to Pulsar.
> >>>>>>
> >>>>>> Being a committer enables easier contribution to the project since
> >>>>>> there is no need to go via the patch submission process. This should
> >>>>>> enable better productivity.
> >>>>>>
> >>>>>> Welcome and Congratulations, Dezhi Liu!
> >>>>>>
> >>>>>> Please join us in congratulating and welcoming Dezhi Liu onboard!
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Hang Chen on behalf of the Pulsar PMC


Re: [ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread Li Li
Congratulations Dezhi!

Thanks,
Li Li

> On Jun 7, 2022, at 9:13 PM, PengHui Li  wrote:
> 
> Congratulations Dezhi!
> 
> Penghui
> On Jun 7, 2022, 17:22 +0800, Enrico Olivelli , wrote:
>> Congratulations !!
>> 
>> Enrico
>> 
>> On Tue, Jun 7, 2022 at 10:57, Zike Yang wrote:
>>> 
>>> Congratulations!
>>> 
>>> Best Regards,
>>> Zike Yang
>>> 
>>> On Tue, Jun 7, 2022 at 3:52 PM ZhangJian He wrote:
>>> 
>>>> Congratulations!
>>>> 
>>>> Thanks
>>>> ZhangJian He
>>>> 
>>>> On Tue, Jun 7, 2022 at 15:46, Haiting Jiang wrote:
>>>> 
>>>>> Congrats!
>>>>> 
>>>>> BR,
>>>>> Haiting
>>>>> 
>>>>> On 2022/06/07 06:46:00 Hang Chen wrote:
>>>>>> The Project Management Committee (PMC) for Apache Pulsar has invited
>>>>>> Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
>>>>>> we are pleased to announce that he has accepted.
>>>>>> 
>>>>>> Dezhi Liu (with Github id liudezhi2098) contributed many improvements
>>>>>> and bug fixes to Pulsar.
>>>>>> 
>>>>>> Being a committer enables easier contribution to the project since
>>>>>> there is no need to go via the patch submission process. This should
>>>>>> enable better productivity.
>>>>>> 
>>>>>> Welcome and Congratulations, Dezhi Liu!
>>>>>> 
>>>>>> Please join us in congratulating and welcoming Dezhi Liu onboard!
>>>>>> 
>>>>>> Best Regards,
>>>>>> Hang Chen on behalf of the Pulsar PMC



Re: [VOTE] PIP-166: Function add MANUAL delivery semantics

2022-06-07 Thread Rui Fu
+1

Best,

Rui Fu
On Jun 8, 2022, 04:51 +0800, Neng Lu wrote:
> Hi All,
>
> +1 (non-binding)
>
> On Tue, Jun 7, 2022 at 5:42 AM Enrico Olivelli  wrote:
>
> > I have left one last minute comment, can you please take a look ? then
> > I will post my +1
> >
> > thanks
> > Enrico
> >
>
>
> --
> Best Regards,
> Neng


Re: [DISCUSS] PIP-174: Provide new implementation for broker dispatch cache

2022-06-07 Thread Hang Chen
+1 Great idea!

Thanks,
Hang

On Wed, Jun 8, 2022 at 03:32, Lari Hotari wrote:
>
> This is a very useful proposal. LGTM
>
> -Lari
>
> On Tue, Jun 7, 2022 at 3:48 AM Matteo Merli  wrote:
>
> > https://github.com/apache/pulsar/issues/15954
> >
> > WIP can be seen at: https://github.com/apache/pulsar/pull/15955
> >
> > ---
> >
> >
> > ## Motivation
> >
> > The current implementation of the read cache in the Pulsar broker has
> > largely
> > remained unchanged for a long time, except for a few minor tweaks.
> >
> > While the implementation is stable and reasonably efficient for
> > typical workloads,
> > the overhead required for managing the cache evictions in a broker
> > that is running
> > many topics can be pretty high in terms of extra CPU utilization and on
> > the JVM
> > garbage collection to track an increased number of medium-lived objects.
> >
> > The goal is to provide an alternative implementation that can adapt better
> > to
> > a wider variety of operating conditions.
> >
> > ### Current implementation details
> >
> > The broker cache is implemented as part of the `ManagedLedger` component,
> > which sits in the Pulsar broker and provides a higher level of
> > abstraction on top of BookKeeper.
> >
> > Each topic (and managed-ledger) has its own private cache space. This
> > cache is implemented
> > as a `ConcurrentSkipList` sorted map that maps `(ledgerId, entryId) ->
> > payload`. The payload
> > is a `ByteBuf` reference that can either be a slice of a `ByteBuf` that we
> > got
> > when reading from a socket, or it can be a copied buffer.
> >
> > Each topic cache is allowed to use the full broker max cache size before an
> > eviction is triggered. The total cache size is effectively a resource
> > shared across all
> > the topics, where a topic can use a more prominent portion of it if it
> > "asks for more".
> >
> > When the eviction happens, we need to do an expensive ranking of all
> > the caches in the broker
> > and do an eviction in a proportional way to the currently used space
> > for each of them.
> >
> > The bigger problem is represented by the `ConcurrentSkipList` and the
> > `ByteBuf` objects
> > that need to be tracked. The skip list is essentially like a "tree"
> > structure and needs to
> > maintain Java objects for each entry in the cache. We also need to
> > potentially have
> > a huge number of ByteBuf objects.
> >
> > A cache workload is typically the worst-case scenario for each garbage
> > collector implementation because it involves creating objects, storing
> > them for some amount of time and then throwing them away. During that
> > time, the GC would have already tenured these objects and copied them
> > into an "old generation" space, and sometime later, a costly compaction
> > of that memory would have to be performed.
> >
> > To mitigate the effect of the cache workload on the GC, we're being
> > very aggressive in
> > purging the cache by triggering time-based eviction. By putting a max
> > TTL on the elements in
> > the cache, we can avoid keeping the objects around for too long to be
> > a problem for the GC.
> >
> > The reverse side of this is that we're artificially reducing the cache
> > capacity to a very
> > short time frame, reducing the cache usefulness.
> >
> > The other problem is the CPU cost involved in doing these frequent
> > evictions, which can
> > be very high when there are 10s of thousands of topics in a broker.
> >
> >
> > ## Proposed changes
> >
> > Instead of dealing with individual caches for each topic, let's adopt
> > a model where
> > there is a single cache space for the broker.
> >
> > This cache is broken into N segments which act as a circular buffer.
> > Whenever a segment
> > is full, we start writing into the next one, and when we reach the
> > last one, we will
> > restart recycling the first segment.
> >
> > Each segment is composed of a buffer, an offset, and a hashmap which maps
> > `(ledgerId, entryId) -> offset`.
> >
> > This model has been working very well for the BookKeeper `ReadCache`:
> >
> > https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/ReadCache.java
> >
> > There are two main advantages to this approach:
> >
> >  1. Entries are copied into the cache buffer (in direct memory), and
> > we don't need to keep any
> > long-lived Java objects around
> >  2. The eviction becomes a completely trivial operation, buffers are
> > just rotated and
> > overwritten. We don't need to do any per-topic task or keep track
> > of utilization.
> >
> > ### API changes
> >
> > No user-facing API changes are required.
> >
> > ### New configuration options
> >
> > The existing cache implementation will not be removed at this point. Users
> > will
> > be able to configure the old implementation in `broker.conf`.
> >
> > This option will be helpful in case of performance regressions would be
> > seen for
> > some use cases with the new cache 
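
A minimal Python sketch of the segmented circular cache described in the
quoted proposal, for illustration only. The class name `SegmentedCache`, its
method names, and the choice to store `(offset, length)` rather than a bare
offset in each segment's map are assumptions made here; this is not the code
from the linked PR or from the BookKeeper `ReadCache`.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[int, int]  # (ledgerId, entryId)


class SegmentedCache:
    """Hypothetical ring of fixed-size segments shared by all topics."""

    def __init__(self, segment_count: int, segment_size: int) -> None:
        self.segment_size = segment_size
        # Each segment is a buffer, a write offset, and a map key -> location.
        self.buffers: List[bytearray] = [
            bytearray(segment_size) for _ in range(segment_count)
        ]
        self.offsets: List[int] = [0] * segment_count
        self.indexes: List[Dict[Key, Tuple[int, int]]] = [
            {} for _ in range(segment_count)
        ]
        self.current = 0

    def put(self, ledger_id: int, entry_id: int, payload: bytes) -> bool:
        size = len(payload)
        if size > self.segment_size:
            return False  # entry cannot fit in any segment; skip caching
        if self.offsets[self.current] + size > self.segment_size:
            # Current segment is full: move to the next one and recycle it.
            # Eviction is just this reset; no per-topic bookkeeping is needed.
            self.current = (self.current + 1) % len(self.buffers)
            self.offsets[self.current] = 0
            self.indexes[self.current].clear()
        seg = self.current
        start = self.offsets[seg]
        # Copy the payload into the segment buffer.
        self.buffers[seg][start:start + size] = payload
        self.indexes[seg][(ledger_id, entry_id)] = (start, size)
        self.offsets[seg] = start + size
        return True

    def get(self, ledger_id: int, entry_id: int) -> Optional[bytes]:
        for index, buf in zip(self.indexes, self.buffers):
            location = index.get((ledger_id, entry_id))
            if location is not None:
                start, size = location
                return bytes(buf[start:start + size])
        return None
```

With two 8-byte segments, writing five 4-byte entries wraps around and
overwrites the first segment, so the two oldest entries are no longer found.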

Re: [DISCUSS] PIP-175: Extend time based release process

2022-06-07 Thread Matteo Merli
> > There is a high cost to maintain a lot of old releases, backport bug
> > fixes, and security patches. In general, we actively support the last
> > 3 minor releases while continuing to develop the next release. E.g.,
> > 2.8, 2.9, and 2.10, while 2.11 is under development.
>
> Is 2.7 EOL? If so then we need to announce it explicitly.

Actually, I was wrong. PIP-47 says the last 4 releases so 2.7 would be included.
The point though still remains in that there's nowhere in the website
where a user could check until when the 2.7 release is going to be
supported.

> > We need to ensure that we have a date set in stone to deliver the
> > release to users.
>
> I would like the new plan to address the delays in cherry picking changes. 
> These must never wait until a release is being made. We must keep these up to 
> date. If someone marks a PR for an older release then they are volunteering 
> to do the cherry pick within a few days. We need to be prepared for a 0-day 
> security release.

I agree that this is a problem, though I'd prefer to keep it in a
separate proposal, specifically targeted at the process for patch
releases, to avoid putting too many things into a single discussion.

> > The major version bump will not carry any special meaning in terms of
> > "big features" included in the release or breaking API changes.
> > Instead, it would simply signal the type of the release.
>
> From our existing release what is LTS?

Good point, as we discussed earlier, 2.10 should be marked as LTS for
being the last Java 8 release. I'll update the text to reflect this.

> Does this mean that you are proposing the current Master as release/3.0 or 
> will it remain 2.11?

 I was actually not thinking of changing the denomination of 2.11. On
one hand, it could make sense for being the first Java 17 release, but
on the other, we'd be releasing 2 LTSs super close between each other
and we'd have to support 1 release more for the time being.

I'd like to hear more opinions here :)

> > The support model will be:
> >
> > * LTS
> >   * Released every 18 months
> >   * Support for 24 months
> >   * Security patches for 36 months
> > * Feature releases
> >   * Released every 3 months
> >   * Support for 6 months
> >   * Security patches for 6 months
>
> Are those times since the initial release? It would be helpful to have a swim 
> lane diagram.

Yes, from the initial release (eg: 3.0.0) and yes we would have a
clear diagram on the website.

> > This can be translated into:
> >   * We support the last 2 LTS releases and the last 2 feature releases
> >   * Security patches are provided for the past 3 LTS releases and 2
> > feature releases
>
> Please note that in the event of a security release that PMC members will 
> generally need to do these in secret.

No changes about that. This is only to set the user expectation for
how long they can expect the security patches.

It doesn't change a comma on the PMC process of discussing such
releases, nor would it prevent doing additional security releases
outside of the "guaranteed" window.

> What is the plan for bug fix / security releases on say 3.0?

Since 3.0 would be LTS, based on the above-proposed table: 2y for bug
fixes - 3y for security patches
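
The proposed support table can be turned into dates with a small Python
sketch. The durations and the rule that a `.0` version is an LTS come from
the proposal as quoted in this thread; the function names and the example
release dates are hypothetical, and the day of month is clamped to 28 only
to keep the month arithmetic simple.

```python
from datetime import date
from typing import Tuple

# (bug-fix support months, security patch months), per the proposed table.
SUPPORT_MONTHS = {"LTS": (24, 36), "feature": (6, 6)}


def add_months(start: date, months: int) -> date:
    """Return the date `months` later, clamping the day to 28."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(index, 12)
    return date(year, month_index + 1, min(start.day, 28))


def release_type(version: str) -> str:
    """A `.0` minor version is an LTS; anything else is a feature release."""
    minor = version.split(".")[1]
    return "LTS" if minor == "0" else "feature"


def support_window(version: str, released: date) -> Tuple[str, date, date]:
    """Return (type, end of bug-fix support, end of security patches)."""
    kind = release_type(version)
    bugfix_months, security_months = SUPPORT_MONTHS[kind]
    return (
        kind,
        add_months(released, bugfix_months),
        add_months(released, security_months),
    )
```

For a hypothetical 3.0.0 released on 2023-05-01, this gives bug fixes until
2025-05-01 and security patches until 2026-05-01, counted from the initial
release as confirmed above.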


Re: [DISCUSS] PIP-175: Extend time based release process

2022-06-07 Thread Dave Fisher
Hi -

Interesting, some responses inline.

> On Jun 7, 2022, at 3:25 PM, Matteo Merli  wrote:
> 
> https://github.com/apache/pulsar/issues/15966
> 
> 
> 
> ## Motivation
> 
> In PIP-47 
> (https://github.com/apache/pulsar/wiki/PIP-47:-Time-Based-Release-Plan),
> we have adopted a time-based release plan. This was the first attempt
> at establishing a new principle on how releases should b
> 
> The main two benefits of this approach have been:
> 
> 1. Clarity for users and developers on when to expect a release
> 2. Breaking a hard relationship between feature and release: a
> particular feature will be included in the release if it is completed
> in time. Otherwise, it will be bubbled up to the next release.
> 
> The motivation for the current proposal is to extend the existing
> process to address the issues that we have seen and that were left out
> of the scope of PIP-47.
> 
> ## Summary of existing issues in the process
> 
> ### Short maintenance cycles for releases
> 
> Since we're doing a 3 months release cycle, we are ending with 4
> releases done per year, even though it's more close to 3 releases.
> 
> There is a high cost to maintain a lot of old releases, backport bug
> fixes, and security patches. In general, we actively support the last
> 3 minor releases while continuing to develop the next release. E.g.,
> 2.8, 2.9, and 2.10, while 2.11 is under development.

Is 2.7 EOL? If so then we need to announce it explicitly.

> 
> The result is that a user adopting a particular release is forced to
> upgrade in a < 1-year timeframe to keep up to date and use a supported
> release. This timeframe is too short for many users as it imposes a
> lot of forced upgrades, for which they are not prepared in terms of
> available time and required effort.
> 
> ### Live Upgrade/Downgrade compatibility path
> 
> In Pulsar, we guarantee that users have a way to do live upgrades and
> downgrades with zero downtime.
> 
> This is very powerful because it gives them the freedom to upgrade to
> a new release with the assurance of being able to roll back to the
> previous release in case any functional or performance regressions are
> encountered.
> 
> Today, this compatibility is guaranteed across minor versions. Eg: I
> can do  `2.7 -> 2.8 -> 2.7` as a live upgrade.
> 
> What is not guaranteed is to "skip" releases. E.g.: `2.7 -> 2.9` might
> work or not, but it's not guaranteed. In that case an intermediated
> upgrade would be required: `2.7 -> 2.8 -> 2.9`.
> 
> The reasons for which the "skip" upgrade might not work are multiple:
>  1. Incompatible upgrade of some dependency (e.g., ZooKeeper) that
> might not be compatible with an older version.
>  2. Adoption of a new metadata format or data format on disk.
> Every time we introduce a new incompatible format change (outside
> of a regular Protobuf field addition), we do it in a 2 steps way:
>  - In a new release, we introduce the new feature/format,
> disabled by default. The new release can read both old and new
> formats, though it keeps writing the old format by default.
>  - In a subsequent release, we change the default to the new format
> 
> Note that this consideration is separate from the compatibility
> between clients and brokers, where we ***never*** break compatibility.
> The oldest available Pulsar client can still talk with the newest
> Pulsar broker, and vice versa, a new client, will be perfectly fine
> with an older broker (except the new features won't be working).
> 
> ### Releases getting delayed
> 
> Another problem we have been experiencing is that release cycles have
> been stretching considerably. Part of this has been because we have
> been reaching the end of the release window, preparing a candidate,
> and then taking a long time to flush out all issues found at the last
> minute in the new release.
> 
> We need to ensure that we have a date set in stone to deliver the
> release to users.

I would like the new plan to address the delays in cherry picking changes. 
These must never wait until a release is being made. We must keep these up to 
date. If someone marks a PR for an older release then they are volunteering to 
do the cherry pick within a few days. We need to be prepared for a 0-day 
security release.


> 
> ## Proposal
> 
> The proposal to address the above issues is composed of 2 parts.
> 
> ### 1. Establish Long Term Support releases
> 
> We need to provide a way for users to quickly understand the expected
> lifecycle timeline of a given release and for that timeline to be long
> enough not to be a constant update mandate.
> 
> At the same time, we need to ensure that we maintainers are not
> spending all the time just maintaining a huge list of old releases.
> 
> For that, we can use the established concept of "Long Term Releases" or LTS.
> 
> We will perform LTS releases at a fixed cadence every 18 months, and
> we will keep doing regular feature releases every 3 months as we're
> currently doing.
> 
> 

[DISCUSS] PIP-175: Extend time based release process

2022-06-07 Thread Matteo Merli
https://github.com/apache/pulsar/issues/15966



## Motivation

In PIP-47 
(https://github.com/apache/pulsar/wiki/PIP-47:-Time-Based-Release-Plan),
we have adopted a time-based release plan. This was the first attempt
at establishing a new principle on how releases should b

The main two benefits of this approach have been:

 1. Clarity for users and developers on when to expect a release
 2. Breaking a hard relationship between feature and release: a
particular feature will be included in the release if it is completed
in time. Otherwise, it will be bubbled up to the next release.

The motivation for the current proposal is to extend the existing
process to address the issues that we have seen and that were left out
of the scope of PIP-47.

## Summary of existing issues in the process

### Short maintenance cycles for releases

Since we follow a 3-month release cycle, we nominally end up with 4
releases per year, though in practice it is closer to 3 releases.

There is a high cost to maintain a lot of old releases, backport bug
fixes, and security patches. In general, we actively support the last
3 minor releases while continuing to develop the next release. E.g.,
2.8, 2.9, and 2.10, while 2.11 is under development.

The result is that a user adopting a particular release is forced to
upgrade in a < 1-year timeframe to keep up to date and use a supported
release. This timeframe is too short for many users as it imposes a
lot of forced upgrades, for which they are not prepared in terms of
available time and required effort.

### Live Upgrade/Downgrade compatibility path

In Pulsar, we guarantee that users have a way to do live upgrades and
downgrades with zero downtime.

This is very powerful because it gives them the freedom to upgrade to
a new release with the assurance of being able to roll back to the
previous release in case any functional or performance regressions are
encountered.

Today, this compatibility is guaranteed between adjacent minor versions.
E.g., I can do `2.7 -> 2.8 -> 2.7` as a live upgrade and rollback.

What is not guaranteed is "skipping" releases. E.g., `2.7 -> 2.9` might
work or not, but it's not guaranteed. In that case, an intermediate
upgrade would be required: `2.7 -> 2.8 -> 2.9`.
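
To make the current rule concrete, here is a minimal Python sketch that checks whether a single hop is covered by the guarantee and expands a "skip" upgrade into the intermediate steps. The function names are hypothetical, and it assumes versions are plain `major.minor` strings within the same major line; it is an illustration, not a Pulsar tool.

```python
def parse_version(version: str) -> tuple[int, int]:
    """Split a 'major.minor' string such as '2.10' into integers."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_guaranteed_hop(src: str, dst: str) -> bool:
    """A hop is covered only between adjacent minor versions (assumed same major)."""
    (src_major, src_minor) = parse_version(src)
    (dst_major, dst_minor) = parse_version(dst)
    return src_major == dst_major and abs(src_minor - dst_minor) <= 1


def stepwise_path(src: str, dst: str) -> list[str]:
    """Expand src -> dst into one-minor-at-a-time hops (upgrade or downgrade)."""
    (src_major, src_minor) = parse_version(src)
    (dst_major, dst_minor) = parse_version(dst)
    if src_major != dst_major:
        raise ValueError("this sketch only handles hops within one major line")
    step = 1 if dst_minor >= src_minor else -1
    return [f"{src_major}.{m}" for m in range(src_minor, dst_minor + step, step)]


if __name__ == "__main__":
    print(is_guaranteed_hop("2.7", "2.9"))
    print(" -> ".join(stepwise_path("2.7", "2.9")))
```

Every consecutive pair in the returned path is a hop that the current policy covers, which is why the intermediate upgrade works where the direct one may not.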

The "skip" upgrade might not work for multiple reasons:
  1. Incompatible upgrade of some dependency (e.g., ZooKeeper) that
     might not be compatible with an older version.
  2. Adoption of a new metadata format or data format on disk.
     Every time we introduce a new incompatible format change (outside
     of a regular Protobuf field addition), we do it in two steps:
       - In a new release, we introduce the new feature/format,
         disabled by default. The new release can read both old and new
         formats, though it keeps writing the old format by default.
       - In a subsequent release, we change the default to the new format.
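
The two-step rollout can be modelled in a few lines of Python. This is a sketch under stated assumptions: the `Release` type, the format labels `v1`/`v2`, and the release names `N`, `N+1`, `N+2` are all hypothetical and only illustrate why adjacent rollbacks are safe while a skipped one is not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    name: str
    readable_formats: frozenset  # formats this release is able to read
    default_write_format: str    # format this release writes unless reconfigured


# Hypothetical releases: "v1" is the old format, "v2" the new one.
BEFORE = Release("N", frozenset({"v1"}), "v1")
STEP_1 = Release("N+1", frozenset({"v1", "v2"}), "v1")  # reads both, still writes old
STEP_2 = Release("N+2", frozenset({"v1", "v2"}), "v2")  # default flipped to new


def can_roll_back(newer: Release, older: Release) -> bool:
    """Rollback is safe if the older release can read what the newer one wrote by default."""
    return newer.default_write_format in older.readable_formats


if __name__ == "__main__":
    for newer, older in [(STEP_1, BEFORE), (STEP_2, STEP_1), (STEP_2, BEFORE)]:
        print(f"{newer.name} -> {older.name}: {can_roll_back(newer, older)}")
```

Only the pair that skips the intermediate release fails the check, because data written in the new default format is unreadable by the release that predates it.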

Note that this consideration is separate from the compatibility
between clients and brokers, where we ***never*** break compatibility.
The oldest available Pulsar client can still talk with the newest
Pulsar broker and, vice versa, a new client will work perfectly fine
with an older broker (except that new features won't be available).

### Releases getting delayed

Another problem we have been experiencing is that release cycles have
been stretching considerably. Part of this has been because we have
been reaching the end of the release window, preparing a candidate,
and then taking a long time to flush out all issues found at the last
minute in the new release.

We need to ensure that we have a date set in stone to deliver the
release to users.

## Proposal

The proposal to address the above issues is composed of 2 parts.

### 1. Establish Long Term Support releases

We need to provide a way for users to quickly understand the expected
lifecycle timeline of a given release and for that timeline to be long
enough not to be a constant update mandate.

At the same time, we need to ensure that we maintainers are not
spending all the time just maintaining a huge list of old releases.

For that, we can use the established concept of "Long Term Support" (LTS) releases.

We will perform LTS releases at a fixed cadence every 18 months, and
we will keep doing regular feature releases every 3 months as we're
currently doing.

The LTS releases will be identified by being a `.0` version. For example:
 * `3.0` -> LTS
 * `3.1` -> regular release
 * `3.2` -> regular release
 * `4.0` -> LTS

The major version bump will not carry any special meaning in terms of
"big features" included in the release or breaking API changes.
Instead, it would simply signal the type of the release.
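
As a small illustration of the naming rule, the following Python sketch classifies a version string. The helper name is hypothetical, and it assumes that patch releases such as `3.0.2` belong to the same line as `3.0`.

```python
def release_type(version: str) -> str:
    """Classify a version by the proposed rule: a `.0` minor marks an LTS line."""
    minor = int(version.split(".")[1])
    return "LTS" if minor == 0 else "regular release"


if __name__ == "__main__":
    for v in ["3.0", "3.1", "3.2", "4.0"]:
        print(f"{v} -> {release_type(v)}")
```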

#### Compatibility between releases

Live upgrade/downgrade will be guaranteed between one LTS release
and the next one.

For example:

 * `3.0 -> 4.0 -> 3.0` : OK
 * `3.2 -> 4.0 -> 3.2` : OK
 * `3.2 -> 4.4 -> 3.2` : OK
 * `3.2 -> 5.0` : Not OK
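
One way to read the four examples above is that a live upgrade or downgrade is covered when the major (LTS) line changes by at most one. That rule is an inference from the examples, not something the proposal states explicitly, and the function names below are hypothetical. A minimal Python sketch under that assumption:

```python
def major_of(version: str) -> int:
    """Return the major component of a 'major.minor' version string."""
    return int(version.split(".")[0])


def live_hop_guaranteed(src: str, dst: str) -> bool:
    """Assumed rule: a hop is covered if the major line moves by at most one."""
    return abs(major_of(dst) - major_of(src)) <= 1


if __name__ == "__main__":
    for src, dst in [("3.0", "4.0"), ("3.2", "4.0"), ("3.2", "4.4"), ("3.2", "5.0")]:
        verdict = "OK" if live_hop_guaranteed(src, dst) else "Not OK"
        print(f"{src} -> {dst}: {verdict}")
```

The sketch reproduces the four verdicts listed above; how hops within a single major line are treated is not covered by the examples, so the result for those cases is only an assumption.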

#### Release support expectation

We will publish clear guidelines on the 

Re: [VOTE] PIP-166: Function add MANUAL delivery semantics

2022-06-07 Thread Neng Lu
Hi All,

+1 (non-binding)

On Tue, Jun 7, 2022 at 5:42 AM Enrico Olivelli  wrote:

> I have left one last minute comment, can you please take a look ? then
> I will post my +1
>
> thanks
> Enrico
>


-- 
Best Regards,
Neng


Re: [DISCUSS] PIP-174: Provide new implementation for broker dispatch cache

2022-06-07 Thread Lari Hotari
This is a very useful proposal. LGTM

-Lari

On Tue, Jun 7, 2022 at 3:48 AM Matteo Merli  wrote:

> https://github.com/apache/pulsar/issues/15954
>
> WIP can be seen at: https://github.com/apache/pulsar/pull/15955
>
> ---
>
>
> ## Motivation
>
> The current implementation of the read cache in the Pulsar broker has
> largely
> remained unchanged for a long time, except for a few minor tweaks.
>
> While the implementation is stable and reasonably efficient for
> typical workloads,
> the overhead required for managing the cache evictions in a broker
> that is running
> many topics can be pretty high in terms of extra CPU utilization and on
> the JVM
> garbage collection to track an increased number of medium-lived objects.
>
> The goal is to provide an alternative implementation that can adapt better
> to
> a wider variety of operating conditions.
>
> ### Current implementation details
>
> The broker cache is implemented as part of the `ManagedLedger` component,
> which sits in the Pulsar broker and provides a higher level of
> abstraction on top of BookKeeper.
>
> Each topic (and managed-ledger) has its own private cache space. This
> cache is implemented
> as a `ConcurrentSkipList` sorted map that maps `(ledgerId, entryId) ->
> payload`. The payload
> is a `ByteBuf` reference that can either be a slice of a `ByteBuf` that we
> got
> when reading from a socket, or it can be a copied buffer.
>
> Each topic cache is allowed to use the full broker max cache size before an
> eviction is triggered. The total cache size is effectively a resource
> shared across all
> the topics, where a topic can use a more prominent portion of it if it
> "asks for more".
>
> When the eviction happens, we need to do an expensive ranking of all
> the caches in the broker
> and do an eviction in a proportional way to the currently used space
> for each of them.
>
> The bigger problem is represented by the `ConcurrentSkipList` and the
> `ByteBuf` objects
> that need to be tracked. The skip list is essentially like a "tree"
> structure and needs to
> maintain Java objects for each entry in the cache. We also need to
> potentially have
> a huge number of ByteBuf objects.
>
> A cache workload is typically the worst-case scenario for each garbage
> collector implementation because it involves creating objects, storing
> them for some amount of
> time and then throwing them away. During that time, the GC would have
> already tenured these objects and copied them into an "old generation"
> space, and sometime later, a costly compaction of that memory would
> have to be performed.
>
> To mitigate the effect of the cache workload on the GC, we're being
> very aggressive in
> purging the cache by triggering time-based eviction. By putting a max
> TTL on the elements in
> the cache, we can avoid keeping the objects around for too long to be
> a problem for the GC.
>
> The reverse side of this is that we're artificially reducing the cache
> capacity to a very
> short time frame, reducing the cache usefulness.
>
> The other problem is the CPU cost involved in doing these frequent
> evictions, which can
> be very high when there are 10s of thousands of topics in a broker.
>
>
> ## Proposed changes
>
> Instead of dealing with individual caches for each topic, let's adopt
> a model where
> there is a single cache space for the broker.
>
> This cache is broken into N segments which act as a circular buffer.
> Whenever a segment
> is full, we start writing into the next one, and when we reach the
> last one, we will
> restart recycling the first segment.
>
> Each segment is composed of a buffer, an offset, and a hashmap which maps
> `(ledgerId, entryId) -> offset`.
>
> This model has been working very well for the BookKeeper `ReadCache`:
>
> https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/ReadCache.java
>
> There are two main advantages to this approach:
>
>  1. Entries are copied into the cache buffer (in direct memory), and
> we don't need to keep any
> long-lived Java objects around
>  2. The eviction becomes a completely trivial operation, buffers are
> just rotated and
> overwritten. We don't need to do any per-topic task or keep track
> of utilization.
>
> ### API changes
>
> No user-facing API changes are required.
>
> ### New configuration options
>
> The existing cache implementation will not be removed at this point. Users
> will
> be able to configure the old implementation in `broker.conf`.
>
> This option will be helpful in case performance regressions are seen
> in some use cases with the new cache implementation.
>
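
For readers following the thread, the segmented design quoted above can be sketched in Python as follows. This is a simplified illustration, not the code from the WIP pull request or from BookKeeper's `ReadCache`: the class and method names are hypothetical, the index stores `(offset, length)` rather than only the offset, a `bytearray` stands in for direct memory, and there is no locking.

```python
class SegmentedCache:
    """Ring of fixed-size segments; eviction is just recycling the oldest segment."""

    def __init__(self, segment_count: int, segment_size: int) -> None:
        self._segment_size = segment_size
        self._buffers = [bytearray(segment_size) for _ in range(segment_count)]
        self._offsets = [0] * segment_count                  # next free byte per segment
        self._indexes = [{} for _ in range(segment_count)]   # (ledgerId, entryId) -> (offset, length)
        self._current = 0

    def put(self, ledger_id: int, entry_id: int, payload: bytes) -> bool:
        size = len(payload)
        if size > self._segment_size:
            return False  # entry cannot fit in any segment, so it is not cached
        if self._offsets[self._current] + size > self._segment_size:
            # Current segment is full: rotate to the next one and recycle it.
            self._current = (self._current + 1) % len(self._buffers)
            self._offsets[self._current] = 0
            self._indexes[self._current].clear()
        segment = self._current
        start = self._offsets[segment]
        self._buffers[segment][start:start + size] = payload  # copy into the segment buffer
        self._indexes[segment][(ledger_id, entry_id)] = (start, size)
        self._offsets[segment] = start + size
        return True

    def get(self, ledger_id: int, entry_id: int):
        key = (ledger_id, entry_id)
        for index, buffer in zip(self._indexes, self._buffers):
            location = index.get(key)
            if location is not None:
                start, size = location
                return bytes(buffer[start:start + size])
        return None  # cache miss


if __name__ == "__main__":
    cache = SegmentedCache(segment_count=2, segment_size=8)
    for entry_id, data in enumerate([b"aaaa", b"bbbb", b"cccc", b"dddd", b"eeee"]):
        cache.put(1, entry_id, data)
    print(cache.get(1, 0), cache.get(1, 2), cache.get(1, 4))
```

Note that eviction needs no per-topic ranking here: recycling a segment resets one offset and clears one map, which matches the second advantage listed in the proposal.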


Re: [DISCUSS] Implementation for HTTP endpoint producer/consumer

2022-06-07 Thread Enrico Olivelli
I would make a separate project and release it as a .nar. It can run as a
Broker Protocol Handler or a Proxy Extension.

Then if the project gets traction we can add it to Pulsar core repo.

Enrico

On Tue, Jun 7, 2022, 17:05 Dave Fisher wrote:

> If this is a REST endpoint then call it REST. It is very likely that users
> will want to use HTTPS to use it. Calling it HTTP is a misnomer.
>
> All The Best,
> Dave
>
> Sent from my iPhone
>
> > On Jun 7, 2022, at 7:25 AM, Zhengxin Cai  wrote:
> >
> > Thanks for bringing this up.
> > I think building a separate HTTP server to serve REST produce/consume
> > requests might be a good idea, like FunctionWorkerService, users can
> choose
> > to run with broker for simplicity or run as a separate component if user
> > wants isolation and scale independently.
> > I think we just missed this option when building V1; I think it's worth
> > considering.
> >
> > On Mon, Jun 6, 2022 at 21:33, mattison chao wrote:
> >
> >> Hi, Pulsar Community,
> >>
> >> We have the PIP-64 that introduces HTTP Rest API for producing/consuming
> >> messages(
> >>
> >>
> https://github.com/apache/pulsar/wiki/PIP-64%3A-Introduce-REST-endpoints-for-producing%2C-consuming-and-reading-messages
> >> ). But this proposal does not define the implementation.
> >>
> >> However, we already have producer HTTP API at the broker side. But,
> there
> >> are some problems, so refactored in this patch:
> >> https://github.com/apache/pulsar/pull/15876.
> >>
> >> Then we add HTTP consumer in this patch:
> >> https://github.com/apache/pulsar/pull/15942.
> >>
> >> But, currently have some ideas that do not reach a consensus. Like @Lari
> >> Hotaril mentioned at pull request
> >> https://github.com/apache/pulsar/pull/15942.
> >>
> >> It might not be a good idea to add the implementation to the main Pulsar
> >> Admin API at all.
> >>
> >> HTTP consuming would be better to handle in a separate component. PIP-64
> >> doesn't determine that this should be part of Pulsar Admin API and we
> >> should revisit this decision. I think it's a bad idea to add HTTP
> consuming
> >> to Pulsar Admin API and brokers.
> >>
> >> I want to discuss whether we should implement the HTTP endpoint in the
> >> broker or separate it at another component(like pulsar-WebSocket).
> >>
> >> Best,
> >>
> >> Mattison
> >>
>
>


Re: [DISCUSS] PIP-174: Provide new implementation for broker dispatch cache

2022-06-07 Thread Matteo Merli
On Tue, Jun 7, 2022 at 6:37 AM Enrico Olivelli  wrote:
> Great idea.
> I wonder which kind of metrics we could have. To see how each
> tenant/namespace is using the cache

It can be done, at the expense of some CPU cost, for example by adding
a tag for the topic and computing the sizes at the time of eviction.

Right now we do have that information but we're not using/exposing it
in any way.

In general, the cache usage would be exactly proportional to the
bytes/s incoming rate across all the topics.


Re: [DISCUSS] Implementation for HTTP endpoint producer/consumer

2022-06-07 Thread Dave Fisher
If this is a REST endpoint then call it REST. It is very likely that users will 
want to use HTTPS to use it. Calling it HTTP is a misnomer.

All The Best,
Dave

Sent from my iPhone

> On Jun 7, 2022, at 7:25 AM, Zhengxin Cai  wrote:
> 
> Thanks for bringing this up.
> I think building a separate HTTP server to serve REST produce/consume
> requests might be a good idea, like FunctionWorkerService, users can choose
> to run with broker for simplicity or run as a separate component if user
> wants isolation and scale independently.
> I think we just missed this option when building V1; I think it's worth
> considering.
> 
> On Mon, Jun 6, 2022 at 21:33, mattison chao wrote:
> 
>> Hi, Pulsar Community,
>> 
>> We have the PIP-64 that introduces HTTP Rest API for producing/consuming
>> messages(
>> 
>> https://github.com/apache/pulsar/wiki/PIP-64%3A-Introduce-REST-endpoints-for-producing%2C-consuming-and-reading-messages
>> ). But this proposal does not define the implementation.
>> 
>> However, we already have producer HTTP API at the broker side. But, there
>> are some problems, so refactored in this patch:
>> https://github.com/apache/pulsar/pull/15876.
>> 
>> Then we add HTTP consumer in this patch:
>> https://github.com/apache/pulsar/pull/15942.
>> 
>> But, currently have some ideas that do not reach a consensus. Like @Lari
>> Hotaril mentioned at pull request
>> https://github.com/apache/pulsar/pull/15942.
>> 
>> It might not be a good idea to add the implementation to the main Pulsar
>> Admin API at all.
>> 
>> HTTP consuming would be better to handle in a separate component. PIP-64
>> doesn't determine that this should be part of Pulsar Admin API and we
>> should revisit this decision. I think it's a bad idea to add HTTP consuming
>> to Pulsar Admin API and brokers.
>> 
>> I want to discuss whether we should implement the HTTP endpoint in the
>> broker or separate it at another component(like pulsar-WebSocket).
>> 
>> Best,
>> 
>> Mattison
>> 



Re: [DISCUSS] Implementation for HTTP endpoint producer/consumer

2022-06-07 Thread Zhengxin Cai
Thanks for bringing this up.
I think building a separate HTTP server to serve REST produce/consume
requests might be a good idea. As with FunctionWorkerService, users could
choose to run it with the broker for simplicity, or run it as a separate
component if they want isolation and independent scaling.
I think we just missed this option when building V1; I think it's worth
considering.

On Mon, Jun 6, 2022 at 21:33, mattison chao wrote:

> Hi, Pulsar Community,
>
> We have the PIP-64 that introduces HTTP Rest API for producing/consuming
> messages(
>
> https://github.com/apache/pulsar/wiki/PIP-64%3A-Introduce-REST-endpoints-for-producing%2C-consuming-and-reading-messages
> ). But this proposal does not define the implementation.
>
> However, we already have producer HTTP API at the broker side. But, there
> are some problems, so refactored in this patch:
> https://github.com/apache/pulsar/pull/15876.
>
> Then we add HTTP consumer in this patch:
> https://github.com/apache/pulsar/pull/15942.
>
> But, currently have some ideas that do not reach a consensus. Like @Lari
> Hotaril mentioned at pull request
> https://github.com/apache/pulsar/pull/15942.
>
> It might not be a good idea to add the implementation to the main Pulsar
> Admin API at all.
>
> HTTP consuming would be better to handle in a separate component. PIP-64
> doesn't determine that this should be part of Pulsar Admin API and we
> should revisit this decision. I think it's a bad idea to add HTTP consuming
> to Pulsar Admin API and brokers.
>
> I want to discuss whether we should implement the HTTP endpoint in the
> broker or separate it at another component(like pulsar-WebSocket).
>
> Best,
>
> Mattison
>


New proposal for chunk messages with shared subscriptions

2022-06-07 Thread Yunze Xu
Hi folks,

Recently I've been working on the implementation of PIP-37, see
https://github.com/apache/pulsar/wiki/PIP-37%3A-Large-message-size-handling-in-Pulsar#usecase-3-multiple-producers-with-shared-consumers
 

As we can see, https://github.com/apache/pulsar/pull/4400 only
implements chunking messages with non-shared subscriptions. When I
followed the **Option 2** section, I found it works but there are many
details that need to be taken care of.

For example,
- Should we add a marker type to indicate the chunk marker?
- Normally, markers such as Transaction markers are not visible to
  the client, but we need to send the chunk marker to the client.
- What's the format of the chunk marker?
- Which compatibility problems would be brought by this design?

I think we need a new proposal to explain it in detail, and I'm
working on that, as well as the demo.

Feel free to ping me if you have any concerns.

Thanks,
Yunze






Re: [DISCUSS] PIP-174: Provide new implementation for broker dispatch cache

2022-06-07 Thread Enrico Olivelli
Great idea.
I wonder what kind of metrics we could have, to see how each
tenant/namespace is using the cache.
Enrico

On Tue, Jun 7, 2022 at 15:12 PengHui Li wrote:
>
> +1
>
> Penghui
> On Jun 7, 2022, 08:48 +0800, Matteo Merli wrote:
> > https://github.com/apache/pulsar/issues/15954
> >
> > WIP can be seen at: https://github.com/apache/pulsar/pull/15955
> >
> > ---
> >
> >
> > ## Motivation
> >
> > The current implementation of the read cache in the Pulsar broker has 
> > largely
> > remained unchanged for a long time, except for a few minor tweaks.
> >
> > While the implementation is stable and reasonably efficient for
> > typical workloads,
> > the overhead required for managing the cache evictions in a broker
> > that is running
> > many topics can be pretty high in terms of extra CPU utilization and on the 
> > JVM
> > garbage collection to track an increased number of medium-lived objects.
> >
> > The goal is to provide an alternative implementation that can adapt better 
> > to
> > a wider variety of operating conditions.
> >
> > ### Current implementation details
> >
> > The broker cache is implemented as part of the `ManagedLedger` component,
> > which sits in the Pulsar broker and provides a higher level of
> > abstraction on top of BookKeeper.
> >
> > Each topic (and managed-ledger) has its own private cache space. This
> > cache is implemented
> > as a `ConcurrentSkipList` sorted map that maps `(ledgerId, entryId) ->
> > payload`. The payload
> > is a `ByteBuf` reference that can either be a slice of a `ByteBuf` that we 
> > got
> > when reading from a socket, or it can be a copied buffer.
> >
> > Each topic cache is allowed to use the full broker max cache size before an
> > eviction is triggered. The total cache size is effectively a resource
> > shared across all
> > the topics, where a topic can use a more prominent portion of it if it
> > "asks for more".
> >
> > When the eviction happens, we need to do an expensive ranking of all
> > the caches in the broker
> > and do an eviction in a proportional way to the currently used space
> > for each of them.
> >
> > The bigger problem is represented by the `ConcurrentSkipList` and the
> > `ByteBuf` objects
> > that need to be tracked. The skip list is essentially like a "tree"
> > structure and needs to
> > maintain Java objects for each entry in the cache. We also need to
> > potentially have
> > a huge number of ByteBuf objects.
> >
> > A cache workload is typically the worst-case scenario for each garbage
> > collector implementation because it involves creating objects, storing
> > them for some amount of
> > time and then throwing them away. During that time, the GC would have
> > already tenured these objects and copied them into an "old generation"
> > space, and sometime later, a costly compaction of that memory would
> > have to be performed.
> >
> > To mitigate the effect of the cache workload on the GC, we're being
> > very aggressive in
> > purging the cache by triggering time-based eviction. By putting a max
> > TTL on the elements in
> > the cache, we can avoid keeping the objects around for too long to be
> > a problem for the GC.
> >
> > The reverse side of this is that we're artificially reducing the cache
> > capacity to a very
> > short time frame, reducing the cache usefulness.
> >
> > The other problem is the CPU cost involved in doing these frequent
> > evictions, which can
> > be very high when there are 10s of thousands of topics in a broker.
> >
> >
> > ## Proposed changes
> >
> > Instead of dealing with individual caches for each topic, let's adopt
> > a model where
> > there is a single cache space for the broker.
> >
> > This cache is broken into N segments which act as a circular buffer.
> > Whenever a segment
> > is full, we start writing into the next one, and when we reach the
> > last one, we will
> > restart recycling the first segment.
> >
> > Each segment is composed of a buffer, an offset, and a hashmap which maps
> > `(ledgerId, entryId) -> offset`.
> >
> > This model has been working very well for the BookKeeper `ReadCache`:
> > https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/ReadCache.java
> >
> > There are two main advantages to this approach:
> >
> > 1. Entries are copied into the cache buffer (in direct memory), and
> > we don't need to keep any
> > long-lived Java objects around
> > 2. The eviction becomes a completely trivial operation, buffers are
> > just rotated and
> > overwritten. We don't need to do any per-topic task or keep track
> > of utilization.
> >
> > ### API changes
> >
> > No user-facing API changes are required.
> >
> > ### New configuration options
> >
> > The existing cache implementation will not be removed at this point. Users 
> > will
> > be able to configure the old implementation in `broker.conf`.
> >
> > This option will be helpful in case of performance regressions 

Re: [DISCUSS] Apache Pulsar 2.9.3 release

2022-06-07 Thread mattison chao
Thanks for your update. I will continue with the 2.9.3 release.


Best,
Mattison

On Sat, 4 Jun 2022 at 04:04, Dave Fisher  wrote:

>
> > On Jun 2, 2022, at 11:55 PM, mattison chao 
> wrote:
> >
> > Hi Dave Fisher,
> >
> >> There are some PRs that are coming in that must be included.
> >
> > How's the progress on these PRs?
>
> They are merged.
>
> Regards,
> Dave
>
>
> >
> > Best,
> > Mattison
> >
> > On Wed, 25 May 2022 at 21:33, Just do it 
> > wrote:
> >
> >> +1
> >> Thanks,
> >> Dezhi
> >>
> >>
> >>
> >>
> >>
> >> -- Original --
> >> From: Hang Chen
> >> Date: Wed, May 25, 2022 9:10 AM
> >> To: dev
> >> Subject: Re: [DISCUSS] Apache Pulsar 2.9.3 release
> >>
> >>
> >>
> >> +1
> >>
> >> Thanks,
> >> Hang
> >>
> >> Dave Fisher wrote:
> >>  There are some PRs that are coming in that must be included.
> >> 
> >>  Thanks,
> >>  Dave
> >> 
> >> 
> >>   On May 23, 2022, at 4:29 AM, PengHui Li  
> >> wrote:
> >>  
> >>   +1
> >>  
> >>   Thanks
> >>   Penghui
> >>  
> >>   On Mon, May 23, 2022 at 3:31 PM mattison chao <
> >> mattisonc...@apache.org
> >>   wrote:
> >>  
> >>   Hello, Pulsar community:
> >>  
> >>   I'd like to propose to release Apache Pulsar 2.9.3
> >>  
> >>   Currently, we have 192 commits [0] and there are many
> >> transaction
> >>   fixes, security fixes.
> >>  
> >>   And there are 22 open PRs [1], I will follow them to make
> >> sure that
> >>   the important fixes could be contained in 2.9.3
> >>  
> >>   If you have any important fixes or any questions,
> >>   please reply to this email, we will evaluate whether to
> >>   include it in 2.9.3
> >>  
> >>   [0]
> >>   https://github.com/apache/pulsar/pulls?q=is%3Amerged+is%3Apr+label%3Arelease%2F2.9.3+
> >>   [1]
> >>   https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+label%3Arelease%2F2.9.3+
> >> 
> >>   Best Regards
> >>   Mattison
> >>  
> >> 
>
>


Re: [ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread PengHui Li
Congratulations Dezhi!

Penghui
On Jun 7, 2022, 17:22 +0800, Enrico Olivelli wrote:
> Congratulations !!
>
> Enrico
>
> On Tue, Jun 7, 2022 at 10:57 Zike Yang wrote:
> >
> > Congratulations!
> >
> > Best Regards,
> > Zike Yang
> >
> > On Tue, Jun 7, 2022 at 3:52 PM ZhangJian He  wrote:
> >
> > > Congratulations!
> > >
> > > Thanks
> > > ZhangJian He
> > >
> > > On Tue, Jun 7, 2022 at 15:46, Haiting Jiang wrote:
> > >
> > > > Congrats!
> > > >
> > > > BR,
> > > > Haiting
> > > >
> > > > On 2022/06/07 06:46:00 Hang Chen wrote:
> > > > > The Project Management Committee (PMC) for Apache Pulsar has invited
> > > > > Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
> > > > > we are pleased to announce that he has accepted.
> > > > >
> > > > > Dezhi Liu (with Github id liudezhi2098) contributed many improvements
> > > > > and bug fixes to Pulsar.
> > > > >
> > > > > Being a committer enables easier contribution to the project since
> > > > > there is no need to go via the patch submission process. This should
> > > > > enable better productivity.
> > > > >
> > > > > Welcome and Congratulations, Dezhi Liu!
> > > > >
> > > > > Please join us in congratulating and welcoming Dezhi Liu onboard!
> > > > >
> > > > > Best Regards,
> > > > > Hang Chen on behalf of the Pulsar PMC
> > > > >
> > > >
> > >


Re: [DISCUSS] PIP-174: Provide new implementation for broker dispatch cache

2022-06-07 Thread PengHui Li
+1

Penghui
On Jun 7, 2022, 08:48 +0800, Matteo Merli wrote:
> https://github.com/apache/pulsar/issues/15954
>
> WIP can be seen at: https://github.com/apache/pulsar/pull/15955
>
> ---
>
>
> ## Motivation
>
> The current implementation of the read cache in the Pulsar broker has largely
> remained unchanged for a long time, except for a few minor tweaks.
>
> While the implementation is stable and reasonably efficient for
> typical workloads,
> the overhead required for managing the cache evictions in a broker
> that is running
> many topics can be pretty high in terms of extra CPU utilization and on the 
> JVM
> garbage collection to track an increased number of medium-lived objects.
>
> The goal is to provide an alternative implementation that can adapt better to
> a wider variety of operating conditions.
>
> ### Current implementation details
>
> The broker cache is implemented as part of the `ManagedLedger` component,
> which sits in the Pulsar broker and provides a higher level of
> abstraction on top of BookKeeper.
>
> Each topic (and managed-ledger) has its own private cache space. This
> cache is implemented
> as a `ConcurrentSkipList` sorted map that maps `(ledgerId, entryId) ->
> payload`. The payload
> is a `ByteBuf` reference that can either be a slice of a `ByteBuf` that we got
> when reading from a socket, or it can be a copied buffer.
>
> Each topic cache is allowed to use the full broker max cache size before an
> eviction is triggered. The total cache size is effectively a resource
> shared across all
> the topics, where a topic can use a more prominent portion of it if it
> "asks for more".
>
> When the eviction happens, we need to do an expensive ranking of all
> the caches in the broker
> and do an eviction in a proportional way to the currently used space
> for each of them.
>
> The bigger problem is represented by the `ConcurrentSkipList` and the
> `ByteBuf` objects
> that need to be tracked. The skip list is essentially like a "tree"
> structure and needs to
> maintain Java objects for each entry in the cache. We also need to
> potentially have
> a huge number of ByteBuf objects.
>
> A cache workload is typically the worst-case scenario for each garbage
> collector implementation because it involves creating objects, storing
> them for some amount of
> time and then throwing them away. During that time, the GC would have
> already tenured these objects and copied them into an "old generation"
> space, and sometime later, a costly compaction of that memory would
> have to be performed.
>
> To mitigate the effect of the cache workload on the GC, we're being
> very aggressive in
> purging the cache by triggering time-based eviction. By putting a max
> TTL on the elements in
> the cache, we can avoid keeping the objects around for too long to be
> a problem for the GC.
>
> The reverse side of this is that we're artificially reducing the cache
> capacity to a very
> short time frame, reducing the cache usefulness.
>
> The other problem is the CPU cost involved in doing these frequent
> evictions, which can
> be very high when there are 10s of thousands of topics in a broker.
>
>
> ## Proposed changes
>
> Instead of dealing with individual caches for each topic, let's adopt
> a model where
> there is a single cache space for the broker.
>
> This cache is broken into N segments which act as a circular buffer.
> Whenever a segment
> is full, we start writing into the next one, and when we reach the
> last one, we will
> restart recycling the first segment.
>
> Each segment is composed of a buffer, an offset, and a hashmap which maps
> `(ledgerId, entryId) -> offset`.
>
> This model has been working very well for the BookKeeper `ReadCache`:
> https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/ReadCache.java
>
> There are two main advantages to this approach:
>
> 1. Entries are copied into the cache buffer (in direct memory), and
> we don't need to keep any
> long-lived Java objects around
> 2. The eviction becomes a completely trivial operation, buffers are
> just rotated and
> overwritten. We don't need to do any per-topic task or keep track
> of utilization.
>
> ### API changes
>
> No user-facing API changes are required.
>
> ### New configuration options
>
> The existing cache implementation will not be removed at this point. Users 
> will
> be able to configure the old implementation in `broker.conf`.
>
> This option will be helpful in case performance regressions are seen
> in some use cases with the new cache implementation.


Re: [VOTE] PIP-166: Function add MANUAL delivery semantics

2022-06-07 Thread Enrico Olivelli
I have left one last-minute comment; can you please take a look? Then
I will post my +1.

thanks
Enrico


Re: [VOTE] PIP-166: Function add MANUAL delivery semantics

2022-06-07 Thread Asaf Mesika
+1

On Mon, Jun 6, 2022 at 4:04 AM Baodi Shi 
wrote:

> Hi Pulsar Community,
>
>
> I would like to start a VOTE on "Function add MANUAL delivery semantics"
> (PIP-166).
>
>
> The proposal can be read at https://github.com/apache/pulsar/issues/15560
>
> and the discussion thread is available at
>
> https://lists.apache.org/thread/4f2w1mqvhhs3mvccbcg2sk19b60xwkgf
>
>
> Voting will stay open for at least 48h.
>
>
> Thanks,
>
> Baodi Shi
>


Re: [ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread Enrico Olivelli
Congratulations !!

Enrico

On Tue, Jun 7, 2022 at 10:57 Zike Yang wrote:
>
> Congratulations!
>
> Best Regards,
> Zike Yang
>
> On Tue, Jun 7, 2022 at 3:52 PM ZhangJian He  wrote:
>
> > Congratulations!
> >
> > Thanks
> > ZhangJian He
> >
> > On Tue, Jun 7, 2022 at 15:46, Haiting Jiang wrote:
> >
> > > Congrats!
> > >
> > > BR,
> > > Haiting
> > >
> > > On 2022/06/07 06:46:00 Hang Chen wrote:
> > > > The Project Management Committee (PMC) for Apache Pulsar has invited
> > > > Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
> > > > we are pleased to announce that he has accepted.
> > > >
> > > > Dezhi Liu (with Github id liudezhi2098) contributed many improvements
> > > > and bug fixes to Pulsar.
> > > >
> > > > Being a committer enables easier contribution to the project since
> > > > there is no need to go via the patch submission process. This should
> > > > enable better productivity.
> > > >
> > > > Welcome and Congratulations, Dezhi Liu!
> > > >
> > > > Please join us in congratulating and welcoming Dezhi Liu onboard!
> > > >
> > > > Best Regards,
> > > > Hang Chen on behalf of the Pulsar PMC
> > > >
> > >
> >


Re: [ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread Zike Yang
Congratulations!

Best Regards,
Zike Yang

On Tue, Jun 7, 2022 at 3:52 PM ZhangJian He  wrote:

> Congratulations!
>
> Thanks
> ZhangJian He
>
> On Tue, Jun 7, 2022 at 15:46, Haiting Jiang wrote:
>
> > Congrats!
> >
> > BR,
> > Haiting
> >
> > On 2022/06/07 06:46:00 Hang Chen wrote:
> > > The Project Management Committee (PMC) for Apache Pulsar has invited
> > > Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
> > > we are pleased to announce that he has accepted.
> > >
> > > Dezhi Liu (with Github id liudezhi2098) contributed many improvements
> > > and bug fixes to Pulsar.
> > >
> > > Being a committer enables easier contribution to the project since
> > > there is no need to go via the patch submission process. This should
> > > enable better productivity.
> > >
> > > Welcome and Congratulations, Dezhi Liu!
> > >
> > > > Please join us in congratulating and welcoming Dezhi Liu on board!
> > >
> > > Best Regards,
> > > Hang Chen on behalf of the Pulsar PMC
> > >
> >
>


Re: [ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread ZhangJian He
Congratulations!

Thanks
ZhangJian He

On Tue, Jun 7, 2022 at 3:46 PM Haiting Jiang  wrote:

> Congrats!
>
> BR,
> Haiting
>
> On 2022/06/07 06:46:00 Hang Chen wrote:
> > The Project Management Committee (PMC) for Apache Pulsar has invited
> > Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
> > we are pleased to announce that he has accepted.
> >
> > > Dezhi Liu (with GitHub id liudezhi2098) contributed many improvements
> > and bug fixes to Pulsar.
> >
> > Being a committer enables easier contribution to the project since
> > there is no need to go via the patch submission process. This should
> > enable better productivity.
> >
> > Welcome and Congratulations, Dezhi Liu!
> >
> > > Please join us in congratulating and welcoming Dezhi Liu on board!
> >
> > Best Regards,
> > Hang Chen on behalf of the Pulsar PMC
> >
>


Re: [ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread Haiting Jiang
Congrats!

BR,
Haiting

On 2022/06/07 06:46:00 Hang Chen wrote:
> The Project Management Committee (PMC) for Apache Pulsar has invited
> Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
> we are pleased to announce that he has accepted.
> 
> Dezhi Liu (with GitHub id liudezhi2098) contributed many improvements
> and bug fixes to Pulsar.
> 
> Being a committer enables easier contribution to the project since
> there is no need to go via the patch submission process. This should
> enable better productivity.
> 
> Welcome and Congratulations, Dezhi Liu!
> 
> Please join us in congratulating and welcoming Dezhi Liu on board!
> 
> Best Regards,
> Hang Chen on behalf of the Pulsar PMC
> 


[ANNOUNCE] New Committer: Dezhi Liu

2022-06-07 Thread Hang Chen
The Project Management Committee (PMC) for Apache Pulsar has invited
Dezhi Liu (https://github.com/liudezhi2098) to become a committer and
we are pleased to announce that he has accepted.

Dezhi Liu (with GitHub id liudezhi2098) contributed many improvements
and bug fixes to Pulsar.

Being a committer enables easier contribution to the project since
there is no need to go via the patch submission process. This should
enable better productivity.

Welcome and Congratulations, Dezhi Liu!

Please join us in congratulating and welcoming Dezhi Liu on board!

Best Regards,
Hang Chen on behalf of the Pulsar PMC