Re: [VOTE] Pulsar Client C++ Release 3.2.0 Candidate 3

2023-05-10 Thread Yuto Furuta
+1(non-binding)

Verified:
  - Verified checksum and signatures
  - Built from source code on macOS
  - Verified produce and consume

From: Baodi Shi
Sent: May 9, 2023 10:47
To: dev@pulsar.apache.org
Subject: Re: [VOTE] Pulsar Client C++ Release 3.2.0 Candidate 3

+1(non-binding)

Verify:

   - SHA512 for source code.
   - Compilation on MacOS M1(13.2.1)
   - Run SampleProducer and SampleConsumer


Thanks,
Baodi Shi


On May 7, 2023 at 10:53:32, Yunze Xu  wrote:

> This is the third release candidate for Apache Pulsar Client C++, version
> 3.2.0.
>
> It fixes the following issues:
> https://github.com/apache/pulsar-client-cpp/milestone/3?closed=1
>
> *** Please download, test and vote on this release. This vote will stay
> open
> for at least 72 hours ***
>
> Note that we are voting upon the source (tag), binaries are provided for
> convenience.
>
> Source and binary files:
>
> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-cpp/pulsar-client-cpp-3.2.0-candidate-3/
>
> SHA-512 checksums:
>
> 4422088c9d16e91caf90f6991a0ca0e3a5f50328c3503acf90641b100fe72b1fef7bec782cc693b947d53841d61470a56175a29d27fb609b937a6f79486b
> apache-pulsar-client-cpp-3.2.0.tar.gz
>
> The tag to be voted upon:
> v3.2.0-candidate-3 (1dad87bb3b804d2aa8542ac48e4c35228ac2f1bf)
> https://github.com/apache/pulsar-client-cpp/releases/tag/v3.2.0-candidate-3
>
> Pulsar's KEYS file containing PGP keys you use to sign the release:
> https://downloads.apache.org/pulsar/KEYS
>
> Please download the source package, and follow the README to compile and
> test.
>
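
The SHA-512 check that the voters above report can be sketched roughly as follows. This is a minimal illustration, not the project's release tooling: the helper names `sha512_of_file` and `matches_published_checksum` are made up here, and the sketch covers only the checksum step (signature verification against the KEYS file would still need a PGP tool).

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare the local digest with a published one, ignoring case and whitespace."""
    return sha512_of_file(path) == published_hex.strip().lower()
```

A voter would call `matches_published_checksum` with the downloaded `apache-pulsar-client-cpp-3.2.0.tar.gz` and the full checksum string from the release directory.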


Re: [DISCUSS] Improve Pulsar Function Source Primitive Schema Mapping

2023-05-10 Thread Neng Lu
Hi All,

Here's the PR for this proposed change:
https://github.com/apache/pulsar/pull/20294
If you have time, please take a look.

On Fri, May 5, 2023 at 6:08 AM Rui Fu  wrote:

> Hi Neng,
>
> Thanks for bringing this issue up. Using JSON as the default schema and
> wrapping other primitive types in it is counterintuitive, so +1 to making
> [2] align with [1] so that both Pulsar Source and Pulsar Sink correctly
> support the other primitive types.
>
> And as per the code [3], if the topic already exists, it will try to use
> the existing schema instead of the schema type returned by [2]. So the
> changes will only affect newly deployed instances.
>
> [3]
> https://github.com/apache/pulsar/blob/branch-3.0/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/source/TopicSchema.java#L102-L122
>
> Best,
>
> Rui Fu
> On Apr 28, 2023 at 13:36 +0800, Pengcheng Jiang wrote:
> > Hello Neng,
> >
> > IMO, we should update code [2] to follow the doc. Existing functions that
> > are already running won't touch code [2]; on a new run, functions will
> > fail to start, which will remind users to update their functions.
> >
> > Regards,
> > Pengcheng Jiang
> >
> > On Fri, Apr 28, 2023 at 06:59, Neng Lu wrote:
> >
> > > Hi All,
> > >
> > > Based on [1], Pulsar has various primitive schema types and a very
> > > clear mapping from Java classes to primitive schema types.
> > >
> > > But in code [2], the Pulsar Function Source only handles the primitive
> > > schema mapping for the byte and String Java classes, and defaults all
> > > other primitive types to the JSON schema. Also, for byte class types,
> > > the NONE schema is used instead of the BYTES schema.
> > >
> > > These differences confuse users trying Pulsar Functions for the first
> > > time, and mean that Pulsar Functions do not follow the official Pulsar
> > > Schema documentation.
> > >
> > > Ideally, we should change the code [2] to make it follow [1]. But such
> > > changes may break behavior for existing users who adapted their code
> > > to run Pulsar Functions.
> > >
> > > I would like to hear your thoughts on this and see how we should
> > > proceed.
> > >
> > > Thank you! Regards
> > >
> > > [1] https://pulsar.apache.org/docs/2.11.x/schema-understand/#primitive-type
> > > [2] https://github.com/apache/pulsar/blob/master/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/source/TopicSchema.java#L124
> > >
>
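
To make the mismatch discussed in this thread concrete, here is a small Python model of the behavior as described in the emails. It is an illustration only, not the code in TopicSchema.java: the function names are hypothetical, the table of documented primitives is a subset recalled from the schema documentation [1] and should be checked against it, and the JSON fallback for non-primitive classes in `proposed_mapping` is an assumption.

```python
# Subset of the documented Java class -> primitive schema mapping
# (assumed from the docs referenced as [1]; verify before relying on it).
DOCUMENTED_PRIMITIVES = {
    "java.lang.Boolean": "BOOLEAN",
    "java.lang.Byte": "INT8",
    "java.lang.Short": "INT16",
    "java.lang.Integer": "INT32",
    "java.lang.Long": "INT64",
    "java.lang.Float": "FLOAT",
    "java.lang.Double": "DOUBLE",
    "byte[]": "BYTES",
    "java.lang.String": "STRING",
}


def current_mapping(java_class):
    """Behavior described in the thread: only byte[] and String are special-cased."""
    if java_class == "byte[]":
        return "NONE"
    if java_class == "java.lang.String":
        return "STRING"
    return "JSON"


def proposed_mapping(java_class):
    """Follow the documented mapping; falling back to JSON is an assumption."""
    return DOCUMENTED_PRIMITIVES.get(java_class, "JSON")


def effective_schema(existing_topic_schema, java_class):
    """Per Rui Fu's note: an existing topic schema wins over the computed one."""
    if existing_topic_schema is not None:
        return existing_topic_schema
    return proposed_mapping(java_class)


def changed_classes():
    """Classes whose schema type would differ after the proposed change."""
    return sorted(
        c for c in DOCUMENTED_PRIMITIVES
        if current_mapping(c) != proposed_mapping(c)
    )
```

In this model, `changed_classes()` lists every documented primitive except `java.lang.String`, which matches the concern that the change could affect existing users on new deployments.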


Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Asaf Mesika
The documentation has severe issues with diagrams in general, today.
There is no standard way yet to do it. We have all kinds of ways to do
diagrams, resulting in an inconsistent look for the documentation.
I think it deserves its own discussion/PIP/issue.

Regardless, I think it's part of a PIP to add documentation to describe the
feature.


On Wed, May 10, 2023 at 3:58 PM Dave Fisher  wrote:

>
>
> Sent from my iPhone
>
> > On May 10, 2023, at 12:01 AM, Asaf Mesika  wrote:
> >
> > On Tue, May 9, 2023 at 8:03 PM Dave Fisher  wrote:
> >
> >>
> >>
> >>>> On May 9, 2023, at 5:47 AM, Asaf Mesika wrote:
> >>>
> >>>> On Tue, May 9, 2023 at 5:18 AM Hang Chen wrote:
> >>>
> >>>> Thanks for driving this discussion.
> >>>>
> >>>> I agree to change the proposal discussion from issue and dev mail list
> >>>> to PR. It will be easier to review and comment, especially for large
> >>>> proposals.
> >>>> I have two questions about this change.
> >>>> - Some proposals contain images, and putting those images into Pulsar
> >>>> main repo will make the git db large. What's more, some images can be
> >>>> up to several MBs
> >>>>
> >>>
> >>> That's a great point, and we must address it in the PIP.
> >>> How about we say that you only use:
> >>> 1. Mermaid - it's a tiny language to create
> >>> drawings? GitHub supports this language on code highlight and renders
> >>> it correctly.
> >>
> >> Does Docusaurus support Mermaid? The design documents for a PIP should
> >> be available for easy inclusion in pulsar-site.
> >>
> >
> > Do we have plans to have a section dedicated to displays PIPs on the
> > website ?
>
> We should make sure it is easy to convert a PIP into user documentation of
> what is finally merged.
>
> Best,
> Dave
> >
> >
> >>
> >>> 2. Use SVG files which will be located in a folder named after the pip
> >>> issue number. SVG are vector graphics saved as text. For diagrams,
> >>> they should be ok in size, and compress well.
> >>>
> >>> I think Mermaid should be enough for all drawings needed for
> >>> illustration of design document purposes. WDYT?
> >>
> >> I think that any reasonable format should be OK, but easily editable
> >> versions should be preferred. All modern tools ought to be able to
> >> export SVG and all modern browsers render them.
> >>
> >> Best,
> >> Dave
> >>
> >>>
> >>>
> >>>
> >>>> - After merging one proposal, if we want to update the content, do we
> >>>> need to discuss it in the dev mail list or just push one PR to update
> >>>> it?
> >>>>
> >>>
> >>> Does it happen often?
> >>> I guess if the change is not big, it's ok just to do PR.
> >>>
> >>> I can clarify that as well, if it is agreed upon.
> >>>
> >>>
> >>>>
> >>>> Thanks,
> >>>> Hang
> >>>>
> >>>>
> >>>> On Mon, May 8, 2023 at 18:06, PengHui Li wrote:
> >>>>>
> >>>>> Thanks for driving the improvements in proposal managing and
> >>>>> reviewing.
> >>>>> The proposal looks good to me. I have only one question about the dir
> >>>>> name for the pips.
> >>>>>
> >>>>> For now, we have
> >>>>> https://github.com/apache/pulsar/tree/master/wiki/proposals
> >>>>> Is it better to use the existing one? Or change the existing one to
> >>>>> "pip".
> >>>>> I mean, we'd better don't use two dirs for proposals.
> >>>>>
> >>>>> Thanks,
> >>>>> Penghui
> >>>>>
> >>>>> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika wrote:
> >>>>>
> >>>>>> Ping, in case it was lost in the barrage of mails
> >>>>>>
> >>>>>> On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I've summarized all comments from
> >>>>>>> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
> >>>>>>> into a PIP.
> >>>>>>>
> >>>>>>> PIP: https://github.com/apache/pulsar/issues/20207
> >>>>>>>
> >>>>>>> I'm leaving this discussion open for 2-3 days to make sure I
> >>>>>>> haven't missed a comment, and proceed to vote, since we had most of
> >>>>>>> the discussion already in the link provided above.
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>>
> >>>>>>> Asaf
> >>>>>>>
> >>>>>>
>
>


Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Dave Fisher
Sent from my iPhone

On May 10, 2023, at 5:58 AM, Dave Fisher wrote:

> Sent from my iPhone
>
> > On May 10, 2023, at 12:01 AM, Asaf Mesika wrote:
> >
> > On Tue, May 9, 2023 at 8:03 PM Dave Fisher wrote:
> >
> >> Does Docusaurus support Mermaid? The design documents for a PIP should be
> >> available for easy inclusion in pulsar-site.
> >
> > Do we have plans to have a section dedicated to displays PIPs on the
> > website ?
>
> We should make sure it is easy to convert a PIP into user documentation of
> what is finally merged.

I went through some documentation. While Docusaurus is not listed, it should
be possible to find a way to use Mermaid in pulsar-site. It will be work
someone will need to do.

Usage (mermaid.js.org)

Best,
Dave

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Dave Fisher



Sent from my iPhone

> On May 10, 2023, at 12:01 AM, Asaf Mesika  wrote:
> 
> On Tue, May 9, 2023 at 8:03 PM Dave Fisher  wrote:
> 
>> 
>> 
>>>> On May 9, 2023, at 5:47 AM, Asaf Mesika wrote:
>>>
>>>> On Tue, May 9, 2023 at 5:18 AM Hang Chen wrote:
>>>
>>>> Thanks for driving this discussion.
>>>>
>>>> I agree to change the proposal discussion from issue and dev mail list
>>>> to PR. It will be easier to review and comment, especially for large
>>>> proposals.
>>>> I have two questions about this change.
>>>> - Some proposals contain images, and putting those images into Pulsar
>>>> main repo will make the git db large. What's more, some images can be
>>>> up to several MBs
>>>>
>>> 
>>> That's a great point, and we must address it in the PIP.
>>> How about we say that you only use:
>>> 1. Mermaid  - it's a tiny language to create
>>> drawings? GitHub supports this language on code highlight and renders it
>>> correctly.
>> 
>> Does Docusaurus support Mermaid? The design documents for a PIP should be
>> available for easy inclusion in pulsar-site.
>> 
> 
> Do we have plans to have a section dedicated to displays PIPs on the
> website ?

We should make sure it is easy to convert a PIP into user documentation of what 
is finally merged.

Best,
Dave
> 
> 
>> 
>>> 2. Use SVG files which will be located in a folder named after the pip
>>> issue number. SVG are vector graphics saved as text. For diagrams,
>>> they should be ok in size, and compress well.
>>> 
>>> I think Mermaid should be enough for all drawings needed for illustration
>>> of design document purposes. WDYT?
>> 
>> I think that any reasonable format should be OK, but easily editable
>> versions should be preferred. All modern tools ought to be able to export
>> SVG and all modern browsers render them.
>> 
>> Best,
>> Dave
>> 
>>> 
>>> 
>>> 
>>>> - After merging one proposal, if we want to update the content, do we
>>>> need to discuss it in the dev mail list or just push one PR to update
>>>> it?
>>>>
>>> 
>>> Does it happen often?
>>> I guess if the change is not big, it's ok just to do PR.
>>> 
>>> I can clarify that as well, if it is agreed upon.
>>> 
>>> 
>>>>
>>>> Thanks,
>>>> Hang
>>>>
>>>>
>>>> On Mon, May 8, 2023 at 18:06, PengHui Li wrote:
>>>>>
>>>>> Thanks for driving the improvements in proposal managing and reviewing.
>>>>> The proposal looks good to me. I have only one question about the dir
>>>>> name for the pips.
>>>>>
>>>>> For now, we have
>>>>> https://github.com/apache/pulsar/tree/master/wiki/proposals
>>>>> Is it better to use the existing one? Or change the existing one to
>>>>> "pip".
>>>>> I mean, we'd better don't use two dirs for proposals.
>>>>>
>>>>> Thanks,
>>>>> Penghui
>>>>>
>>>>> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika wrote:
>>>>>
>>>>>> Ping, in case it was lost in the barrage of mails
>>>>>>
>>>>>> On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I've summarized all comments from
>>>>>>> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
>>>>>>> into a PIP.
>>>>>>>
>>>>>>> PIP: https://github.com/apache/pulsar/issues/20207
>>>>>>>
>>>>>>> I'm leaving this discussion open for 2-3 days to make sure I haven't
>>>>>>> missed a comment, and proceed to vote, since we had most of the
>>>>>>> discussion already in the link provided above.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Asaf
>>>>>>>
>>>>>>



[VOTE] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Asaf Mesika
Hi,

I'm starting the vote process for PIP-265.

Link: https://github.com/apache/pulsar/issues/20207

Thanks!

Asaf


Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Asaf Mesika
I've also updated the doc regarding images, adding:

### Handling images
Since documents will now reside as files in git, we need to avoid large
image files.
Hence, we'll require authors to create images using the
[mermaidJS](https://mermaid.js.org/#/) diagram language, which GitHub
supports rendering. It covers 99% of the cases. For the remaining 1%, they
can use SVG, a small-file-size format, and make sure the file is 1k-5k in
size.

Since I haven't seen any other blocker for this PIP discussed and this issue
has been in discussion for almost 2 months, I'll open the vote for the PIP.

Thanks!

Asaf

On Wed, May 10, 2023 at 10:00 AM Asaf Mesika  wrote:

>
>
> On Tue, May 9, 2023 at 8:03 PM Dave Fisher  wrote:
>
>>
>>
>> > On May 9, 2023, at 5:47 AM, Asaf Mesika  wrote:
>> >
>> > On Tue, May 9, 2023 at 5:18 AM Hang Chen  wrote:
>> >
>> >> Thanks for driving this discussion.
>> >>
>> >> I agree to change the proposal discussion from issue and dev mail list
>> >> to PR. It will be easier to review and comment, especially for large
>> >> proposals.
>> >> I have two questions about this change.
>> >> - Some proposals contain images, and putting those images into Pulsar
>> >> main repo will make the git db large. What's more, some images can be
>> >> up to several MBs
>> >>
>> >
>> > That's a great point, and we must address it in the PIP.
>> > How about we say that you only use:
>> > 1. Mermaid  - it's a tiny language to create
>> > drawings? GitHub supports this language on code highlight and renders it
>> > correctly.
>>
>> Does Docusaurus support Mermaid? The design documents for a PIP should be
>> available for easy inclusion in pulsar-site.
>>
>
> Do we have plans to have a section dedicated to displays PIPs on the
> website ?
>
>
>>
>> > 2. Use SVG files which will be located in a folder named after the pip
>> > issue number. SVG are vector graphics saved as text. For diagrams,
>> > they should be ok in size, and compress well.
>> >
>> > I think Mermaid should be enough for all drawings needed for
>> > illustration of design document purposes. WDYT?
>>
>> I think that any reasonable format should be OK, but easily editable
>> versions should be preferred. All modern tools ought to be able to export
>> SVG and all modern browsers render them.
>>
>> Best,
>> Dave
>>
>> >
>> >
>> >
>> >> - After merging one proposal, if we want to update the content, do we
>> >> need to discuss it in the dev mail list or just push one PR to update
>> >> it?
>> >>
>> >
>> > Does it happen often?
>> > I guess if the change is not big, it's ok just to do PR.
>> >
>> > I can clarify that as well, if it is agreed upon.
>> >
>> >
>> >>
>> >> Thanks,
>> >> Hang
>> >>
>> >>
>> >> On Mon, May 8, 2023 at 18:06, PengHui Li wrote:
>> >>>
>> >>> Thanks for driving the improvements in proposal managing and
>> >>> reviewing.
>> >>> The proposal looks good to me. I have only one question about the dir
>> >>> name for the pips.
>> >>>
>> >>> For now, we have
>> >>> https://github.com/apache/pulsar/tree/master/wiki/proposals
>> >>> Is it better to use the existing one? Or change the existing one to
>> >>> "pip".
>> >>> I mean, we'd better don't use two dirs for proposals.
>> >>>
>> >>> Thanks,
>> >>> Penghui
>> >>>
>> >>> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika wrote:
>> >>>
>> >>>> Ping, in case it was lost in the barrage of mails
>> >>>>
>> >>>> On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika wrote:
>> >>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> I've summarized all comments from
>> >>>>> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
>> >>>>> into a PIP.
>> >>>>>
>> >>>>> PIP: https://github.com/apache/pulsar/issues/20207
>> >>>>>
>> >>>>> I'm leaving this discussion open for 2-3 days to make sure I haven't
>> >>>>> missed a comment, and proceed to vote, since we had most of the
>> >>>>> discussion already in the link provided above.
>> >>>>>
>> >>>>> Thanks!
>> >>>>>
>> >>>>> Asaf
>> >>>>>
>> >>>>
>> >>
>>
>>


[VOTE] PIP-251 Enhancing Transaction Buffer Stats and Introducing TransactionBufferInternalStats API

2023-05-10 Thread Xiangying Meng
Hello Pulsar community,

This thread is to start a vote for PIP-251: Enhancing Transaction
Buffer Stats and Introducing TransactionBufferInternalStats API.

Discussion thread:
https://lists.apache.org/thread/jsh2rod208xg28mojxwrod84p5zt1nrw
Issue:
https://github.com/apache/pulsar/issues/20291

Voting will be open for at least 48 hours.

Thanks!
Xiangying


Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-10 Thread Asaf Mesika
On Tue, May 9, 2023 at 11:29 PM Dave Fisher  wrote:

>
>
> > On May 8, 2023, at 2:49 AM, Asaf Mesika  wrote:
> >
> > Your feedback made me realize I need to add a "TL;DR" section, which I
> > just added.
> >
> > I'm quoting it here. It gives a brief summary of the proposal, which
> > requires up to 5 min of read time, helping you get a high-level picture
> > before you dive into the background/motivation/solution.
> >
> > --
> > TL;DR
> >
> > Working with Metrics today as a user or a developer is hard and has many
> > severe issues.
> >
> > From the user perspective:
> >
> >   - One of Pulsar's strongest features is "cheap" topics, so you can
> >   easily have 10k - 100k topics per broker. Once you do that, you quickly
> >   learn that the amount of metrics you export via "/metrics" (the
> >   Prometheus-style endpoint) becomes really big. The cost to store them
> >   becomes too high, and queries time out, or even the "/metrics" endpoint
> >   itself times out.
> >   The only option Pulsar gives you today is all-or-nothing filtering and
> >   very crude aggregation. You switch metrics from topic aggregation level
> >   to namespace aggregation level. You can also turn off producer- and
> >   consumer-level metrics. You end up doing it all, leaving you "blind",
> >   looking at the metrics from a namespace level, which is too high-level.
> >   You end up conjuring all kinds of scripts on top of the topic stats
> >   endpoint to glue together some aggregated metrics view for the topics
> >   you need.
> >   - Summaries (a metric type giving you quantiles like p95), which are
> >   used in Pulsar, can't be aggregated across topics / brokers due to
> >   their inherent design.
> >   - Plugin authors spend too much time defining and exposing metrics to
> >   Pulsar, since the only interface Pulsar offers is writing your metrics
> >   yourself as UTF-8 bytes in Prometheus Text Format to a byte stream
> >   interface given to you.
> >   - Pulsar histograms are exported in a way that is not conformant with
> >   Prometheus, which means you can't get the p95 quantile on such
> >   histograms, making them very hard to use in day-to-day life.
>
> What version of DataSketches is used to produce the histogram? Is it still
> an old Yahoo one, or are we using an updated one from Apache DataSketches?
>
> Seems like this is a single PR/small PIP for 3.1?


Histograms are a list of buckets, each of which is a counter.
A Summary is a collection of values collected over a time window, from which
the quantiles of those values (p95, p50) are calculated at the end of the
window; those quantiles are what Pulsar exports.

Pulsar histograms do not use DataSketches. They are just counters.
They do not adhere to the Prometheus format because:
a. The counter is expected to be cumulative, but Pulsar resets each bucket
counter to 0 every 1 min.
b. The bucket upper bound is expected to be written as an attribute "le",
but today it is encoded in the name of the metric itself.

This is a breaking change, hence hard to make in any small release.
This is why it's part of this PIP: so many things will break, and all of
them will break in a separate layer (OTel metrics), hence not breaking
anyone without their consent.
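
As a rough illustration of points (a) and (b) above, the following Python sketch folds per-interval bucket counts (the kind that get reset) into never-reset counters and renders them with an "le" label in the Prometheus text style. It is not Pulsar or OTel code: the class name `CumulativeHistogram` and any metric name passed to it are hypothetical, and the `_sum` series that a full Prometheus histogram also carries is left out for brevity.

```python
import math


class CumulativeHistogram:
    """Keeps per-bucket counts that only ever grow, unlike interval-reset buckets."""

    def __init__(self, upper_bounds):
        # The last bucket is the catch-all +Inf bucket.
        self.bounds = sorted(upper_bounds) + [math.inf]
        self.counts = [0] * len(self.bounds)

    def add_interval(self, interval_counts):
        """Fold in one interval's per-bucket counts without ever resetting."""
        if len(interval_counts) != len(self.bounds):
            raise ValueError("expected one count per bucket, including +Inf")
        for i, count in enumerate(interval_counts):
            self.counts[i] += count

    def prometheus_lines(self, name):
        """Render buckets so that each `le` bucket includes all smaller buckets."""
        lines = []
        running = 0
        for bound, count in zip(self.bounds, self.counts):
            running += count
            le = "+Inf" if math.isinf(bound) else format(bound, "g")
            lines.append(f'{name}_bucket{{le="{le}"}} {running}')
        lines.append(f"{name}_count {running}")
        return lines
```

With buckets exported this way, a Prometheus query can apply `rate` and `histogram_quantile` to them, which is what the interval-reset, name-encoded buckets prevent.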



>
>
> >   - Too many metrics are rates which also delta-reset at every interval
> >   you configure in Pulsar and on restart, instead of relying on cumulative
> >   (ever-growing) counters and letting Prometheus use its rate function.
> >   - and many more issues
> >
> > From the developer perspective:
> >
> >   - There are 4 different ways to define and record metrics in Pulsar:
> >   Pulsar's own metrics library, the Prometheus Java Client, the BookKeeper
> >   metrics library, and plain native Java SDK objects (AtomicLong, ...).
> >   It's very confusing for the developer and creates inconsistencies for
> >   the end user (e.g. Summary is different in each).
> >   - Patching your metrics into the "/metrics" Prometheus endpoint is
> >   confusing, cumbersome, and error-prone.
> >   - many more
> >
> > This proposal offers several key changes to solve that:
> >
> >   - Cardinality (supporting 10k-100k topics per broker) is solved by
> >   introducing a new aggregation level for metrics called Topic Metric
> >   Group. Using configuration, you specify for each topic its group (using
> >   wildcard/regex). This allows you to "zoom" out to a more detailed
> >   granularity level like groups instead of namespaces, and you control
> >   how many groups you'll have, hence solving the cardinality issue without
> >   sacrificing level of detail too much.
> >   - Fine-grained, dynamic filtering mechanism. You'll have rule-based
> >   dynamic configuration, allowing you to specify per namespace/topic/group
> >   which metrics you'd like to keep/drop. Rules allow you to set the
> >   default to have a small amount of metrics at group and namespace level
> >   only and drop the rest. When needed, you can add an override rule to
> >   "open" up a certain group to have more metrics 
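
The Topic Metric Group idea quoted above (assigning each topic to a group by wildcard/regex and aggregating at group level) could look roughly like the sketch below. Everything in it is hypothetical: the rule list, the group names, the topic names, and the first-match-wins policy are assumptions made for illustration, not the configuration format proposed in PIP-264.

```python
import re
from collections import Counter

# Hypothetical rules: (topic pattern, group name); the first matching rule wins.
GROUP_RULES = [
    (re.compile(r"persistent://tenant-a/orders/.*"), "orders"),
    (re.compile(r"persistent://tenant-a/.*-audit"), "audit"),
]
DEFAULT_GROUP = "other"


def group_for(topic):
    """Return the group of a topic, or the default group if no rule matches."""
    for pattern, group in GROUP_RULES:
        if pattern.fullmatch(topic):
            return group
    return DEFAULT_GROUP


def aggregate_by_group(per_topic_values):
    """Sum a per-topic counter into one value per group."""
    totals = Counter()
    for topic, value in per_topic_values.items():
        totals[group_for(topic)] += value
    return dict(totals)
```

The number of exported series then scales with the number of groups rather than the number of topics, which is the cardinality reduction the proposal describes.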

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-10 Thread Asaf Mesika
On Tue, May 9, 2023 at 9:33 PM Yunze Xu wrote:

> > Basically you have a full fledged metrics library objects: Meter, Gauge,
> Histogram, Counter.
>
> It sounds good, but not so attractive. Currently KoP implements its
> own metrics library objects. So after that, we need to leverage the
> similar classes from OTel.
>
Yes, well that's the cost associated with the benefit, not the benefit
itself :)
You can see we have many goals to solve in this PIP, among them some
serious pain people suffer today as users:
* Inability to observe 10k topics per broker and above (one of the key
advantages Pulsar has is many topics).
* Very expensive metrics. 100 UTS per topic. That's expensive, even for 1k
topics per broker.

So the PIP goal is to solve those (and many more).
The cost is that we need to make some heavy breaking changes, among them
that Pulsar plugin authors like KoP (you) will need to spend time migrating
their code. You are correct.

The attractive part is solving the pains of the user I described.
For existing plugin authors, OTel is not attractive, yes.
For future plugin authors, IMO, it is very attractive, since it removes a
lot of the work they need to do for something so basic in today's world as
metrics.

Is the cost worth it? This is what I'm trying to figure out through multiple
people's feedback.



> I want to talk a little more beyond that. IIUC, this proposal wants to
> replace the current metrics systems with OTel. But for most
> developers and maintainers, the most important thing that they care
> about might be how many changes it could bring. For example, currently
> the Grafana dashboards have been widely used. How many changes could
> it bring? Do users need to learn completely different dashboards? I
> asked this question before but it's not answered. Then I found the
> "Breaking changes" section. So many breaking changes are usually not
> acceptable.
>

The dashboards will not change in the way they look or in their
semantics. Each panel remains.
The changes are internal to each panel, which means the queries will change
since the metric names will slightly change.

Users who used the dashboard as is will import the new dashboard.
If someone created a custom dashboard, yes, they will have to invest some
time to upgrade it.
I think it's 1 hour, tops, to make the fixes.

I will edit the PIP to clarify that.

Dashboards are mostly a user issue, so why do you think it's related to
developers and maintainers?

Regarding so many breaking changes not being acceptable - I'm new to this
community, hence I raise that here.
Do you find the amount of breaking changes not worth the huge benefit to
the users of Pulsar?
Do you have any suggestion to obtain the same benefit with smaller breaking
changes?

Please bear in mind, all changes are happening in a separate layer of OTel,
*co-existing* together with the current metric system layer.
I'm not breaking anything until you make the switch.



> I see you listed a lot of problems for the current design. I think
> each of them needs a PIP or at least a PR to resolve if a breaking
> change could be made. Why not solve them one by one in Pulsar?
>
That's precisely what I wrote in the PIP:
* It's a master PIP.
* Many sections will turn into sub-PIPs.

Meaning, each problem I mentioned would be solved one by one (in Pulsar, of
course).

The reason for introducing this PIP (the master PIP) is to make sure we
first have agreement from the community of developers and users before
we go and spend such a huge amount of work. The 2nd reason is to ensure
all the sub-PIPs will align and nothing will surprise us, so we don't find
out after 1 year of work that we have stumbled into a wall which we can't
pass. The master PIP gives you that guarantee.




> Thanks,
> Yunze
>
> On Mon, May 8, 2023 at 12:53 AM Asaf Mesika  wrote:
> >
> > On Sun, May 7, 2023 at 4:23 PM Yunze Xu 
> > wrote:
> >
> > > I'm excited to learn much more about metrics when I started reading
> > > this proposal. But I became more and more frustrated when I found
> > > there is still too much content left even if I've already spent much
> > > time reading this proposal. I'm wondering how much time did you expect
> > > reviewers to read through this proposal? I just recalled the
> > > discussion you started before [1]. Did you expect each PMC member that
> > > gives his/her +1 to read only parts of this proposal?
> > >
> >
> > I estimated around 2 hours needed for a reviewer.
> > I hate it being so long, but I simply couldn't find a way to downsize it
> > more. Furthermore, I consulted with my colleagues including Matteo, but
> > we couldn't see a way to scope it down.
> > Why? Because once you begin this journey, you need to know how it's going
> > to end.
> > What I ended up doing is writing all the crucial details for review in
> > the High Level Design section.
> > It's still a big, hefty section, but I don't think I can step out or let
> > anyone else change Pulsar so 

Re: [DISCUSS] Add checklist for PMC binding vote of PIP

2023-05-10 Thread Asaf Mesika
Hi Yunze,

Thanks for the feedback.

I re-read your comments 3 times and I can't seem to be able to understand
your key points in the matter of the checklist, so I have some
clarification questions:

1. You said you reviewed PIP-261, remembered the checklist proposal, but
couldn't add it. Can you explain why?
2. Why would the author of a PIP give you a checklist for their vote? Can
you please expand on that?
I completely agree that if the author of a PIP needed to add a checklist,
it would be a burden; hence I don't see the reason for it and didn't
suggest it.
3. You say you want the process of PIP to be more friendly to contributors.
 a) Can you please explain which changes you propose to make it more
friendly?
 b) The checklist is for the voters (mainly PMC members), not the PIP
authors. Why would adding the checklist create any burden for the PIP
author and make the PIP process unfriendly?

4. In the 2nd paragraph, if I try to summarize, you say it's hard to avoid
changes between the implementation of the PIP and the PIP itself. Also,
it's hard to review a PIP implementation since it's divided into many PRs.
Can you please explain the connection between this and a checklist for
voters on a PIP?

5. You said a checklist won't solve the key difficulties you described for
a huge PIP.
 You are correct. It won't. It's not the goal of the checklist to solve
those at all.

 My main goal in the checklist is to make sure that a person, having
basic Pulsar user knowledge, can read the PIP and fully understand it.

Do you think the checklist doesn't serve that goal?

I think for huge PIPs it's even more important that the PIP will be
coherent for the reader and supply all background knowledge.

6. I agree with you that an implementation can diverge from the design, but
that's a completely different problem we need to solve, unrelated to the
checklist goal. Let's open a separate discussion to brainstorm on it.

7. "A complicated proposal could not be understood by many reviewers. If
the author left the community, it could be a hard job to maintain it."

  This is exactly what I want to avoid.
  When you vote +1, you must make sure most people reading it can
understand it.  If they can't, let's help the author make it so. It must be
the minimum bar for any PIP.
  The checklist is to remind you of that.
  If the design is easily understandable, you have just made the
implementation 10x easier to follow and maintain when the authors leave the
project.





On Tue, May 9, 2023 at 9:39 PM Yunze Xu 
wrote:

> I cannot agree more with Dave's comments.
>
> I just reviewed PIP-261 and PIP-264 yesterday. When I gave +1 to
> PIP-261, I recalled this thread so I'm wondering if I can add a
> checklist. Eventually, I did not do that. IMO, it's the author's
> responsibility to give a checklist for authors to choose for his/her
> proposal. However, it burdens the new contributors to the community.
> PIPs should be more friendly to new contributors. That's also my
> perspective to Rajan's concern: we should still require a PIP for
> changes of metrics or configurations, but the process should be more
> friendly to new contributors.
>
> When I reviewed PIP-264, I recalled PIP-45 and PIP-192 as well, though
> PIP-264 is much larger than either of them. Coincidentally, I was developing
> KoP for 2.8.0 (not released) when PIP-45 was in progress. It was really
> annoying to see the interfaces changed again and again in the master
> branch. The partner developers maintain their own version of Pulsar
> based on 2.6.x. It's also annoying for them to cherry-pick PRs from
> the master branch. PIP-192 is also a huge proposal. There are so many
> PRs for a proposal. From what I know, it seems the design was slightly
> changed when implementing it.
>
> Adding a checklist cannot solve the key issues for a huge proposal:
> - The design when being voted could be different from PRs
> - The changes could not be easily realized and eventually it was ignored
> - A complicated proposal could not be understood by many reviewers. If
> the author left the community, it could be a hard job to maintain it.
>
> > Being overly dependent on rules is not a replacement for open discussion.
>
> +1. I also hear voices to make some rules for cherry-picking PRs
> during the release process. But it's still necessary to start a
> discussion even if we have any rule.
>
> Thanks,
> Yunze
>
> On Mon, May 8, 2023 at 1:58 AM Dave Fisher  wrote:
> >
> > You asked. Here it is.
> >
> > 1. You brushed aside Enrico’s concerns with that comment. It was not
> > subtle.
> >
> > 2. I think the project should pay more attention to Rajan’s concerns
> > about new contributors being either ignored or told they need a PIP for
> > what seems to them a trivial change. We lose contributors. We need to
> > handle that more gently by helping them figure how to better make their PR.
> >
> > 3. For minor PIPs this is too much. Minor PIPs should be easy.
> >
> > 4. For master PIPs like your OTel nothing here 

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Asaf Mesika
On Tue, May 9, 2023 at 8:03 PM Dave Fisher  wrote:

>
>
> > On May 9, 2023, at 5:47 AM, Asaf Mesika  wrote:
> >
> > On Tue, May 9, 2023 at 5:18 AM Hang Chen  wrote:
> >
> >> Thanks for driving this discussion.
> >>
> >> I agree to change the proposal discussion from issue and dev mail list
> >> to PR. It will be easier to review and comment, especially for large
> >> proposals.
> >> I have two questions about this change.
> >> - Some proposals contain images, and putting those images into Pulsar
> >> main repo will make the git db large. What's more, some images can be
> >> up to several MBs
> >>
> >
> > That's a great point, and we must address it in the PIP.
> > How about we say that you only use:
> > 1. Mermaid - a tiny language for creating drawings. GitHub supports this
> > language in code blocks and renders it correctly.
>
> Does Docusaurus support Mermaid? The design documents for a PIP should be
> available for easy inclusion in pulsar-site.
>

Do we have plans to have a section dedicated to displaying PIPs on the
website?


>
> > 2. Use SVG files which will be located in a folder named after the pip
> > issue number. SVG are vector graphics saved as text. For diagrams,
> > they should be ok in size, and compress well.
> >
> > I think Mermaid should be enough for all drawings needed for illustration
> > of design document purposes. WDYT?
>
> I think that any reasonable format should be OK, but easily editable
> versions should be preferred. All modern tools ought to be able to export
> SVG and all modern browsers render them.
>
> Best,
> Dave
>
> >
> >
> >
> >> - After merging one proposal, if we want to update the content, do we
> >> need to discuss it in the dev mail list or just push one PR to update
> >> it?
> >>
> >
> > Does it happen often?
> > I guess if the change is not big, it's ok just to do PR.
> >
> > I can clarify that as well, if it is agreed upon.
> >
> >
> >>
> >> Thanks,
> >> Hang
> >>
> >>
> >> On Mon, May 8, 2023 at 18:06, PengHui Li wrote:
> >>>
> >>> Thanks for driving the improvements in proposal managing and reviewing.
> >>> The proposal looks good to me. I have only one question about the dir
> >>> name for the pips.
> >>>
> >>> For now, we have
> >>> https://github.com/apache/pulsar/tree/master/wiki/proposals
> >>> Is it better to use the existing one? Or change the existing one to
> >>> "pip"? I mean, we'd better not use two dirs for proposals.
> >>>
> >>> Thanks,
> >>> Penghui
> >>>
> >>> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika 
> >> wrote:
> >>>
> >>>> Ping, in case it was lost in the barrage of mails
> >>>>
> >>>> On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I've summarized all comments from
> >>>>> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
> >>>>> into a PIP.
> >>>>>
> >>>>> PIP: https://github.com/apache/pulsar/issues/20207
> >>>>>
> >>>>> I'm leaving this discussion open for 2-3 days to make sure I haven't
> >>>>> missed a comment, and proceed to vote, since we had most of the
> >>>>> discussion already in the link provided above.
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>> Asaf
> >>>>>
> >>>>
> >>
>
>