from:"Asaf Mesika"

Re: Preparing for Pulsar 4.0: cleaning up the Managed Ledger interfaces

2024-06-12 Thread Asaf Mesika

At a high level, I see this work as a very much needed cleanup.

On Tue, 11 Jun 2024 at 20:29 Lari Hotari  wrote:

> Hi all,
>
> We have the next LTS release, Pulsar 4.0, scheduled for October. The
> current master branch will most likely become the 4.0 release branch in
> September.
>
> Since 4.0 will be the next LTS release, we aim to complete larger
> changes in the codebase before the release. I'm particularly interested
> in cleaning up the Managed Ledger interface.
>
> The cleanup of the Managed Ledger interface involves changing the Pulsar
> core to depend on the interface, not on the implementation details.
> Currently, this is not the case. A lot of Pulsar core code specifically
> depends on implementation classes such as ManagedLedgerImpl,
> ManagedCursorImpl and PositionImpl.
>
> In addition to decoupling the core of Pulsar from the implementation
> details of Managed Ledger, I'm also planning to rename the base package
> of the Pulsar Managed Ledger module from org.apache.bookkeeper.mledger
> to org.apache.pulsar.mledger. There are historical reasons why the
> package prefix is org.apache.bookkeeper. I suppose there was a plan a
> long time ago to move the module as part of the Bookkeeper project. That
> goal no longer exists. The consequence of the
> org.apache.bookkeeper.mledger package is that Managed Ledger log
> messages contain the org.apache.bookkeeper.mledger prefix, which often
> confuses Pulsar users and developers about the component's origin. This
> will finally be addressed during this work.
>
> I plan to make changes gradually in the master branch without breaking
> external Pulsar end-user interfaces. Therefore, these changes can be
> considered as internal cleanup and refactoring, which don't necessarily
> require a PIP decision before proceeding.
>
> I do plan to create a PIP document later to collect some of the design
> decisions once they are made. However, at this moment, there are no
> significant up-front design decisions that need to be made by the
> community. In Apache projects, decisions are made on the mailing list
> and this thread is one form of making decisions. Unless there are
> objections to what is presented here, I will proceed with the work
> as planned. Individual work items will be reviewed in PRs as usual.
>
> One detail to solve and document is how Pulsar code maintenance will be
> handled once the master branch has diverged significantly from the
> maintenance branches. It seems that Pulsar 4.0 will be the first branch
> where there will be significant differences between the previous
> maintenance branches. This challenge will be addressed in the PIP
> document when it becomes timely. It will also require documenting
> changes to contribution guidelines so that contributors know how to
> contribute to the maintenance branches. Currently we primarily target
> the master branch in PRs.
>
> Decoupling Pulsar core from Managed Ledger implementations will open up
> ways to add new Managed Ledger implementations in the future and make it
> pluggable. This will help Apache Pulsar stay relevant in the future. I
> believe other Pulsar contributors share and support this goal too.
>
> Please review the first PR for cleaning up Managed Ledger interfaces:
> "Replace dependencies on PositionImpl with Position interface"
> https://github.com/apache/pulsar/pull/22891
>
> Since this is a prerequisite for the next work items in Managed Ledger
> interface cleanup, I hope that we can complete the review of this PR asap.
> We don't have much time to waste before the 4.0 release freeze in
> September.
>
> Looking forward to your feedback and contributions in this area,
>
> -Lari
>
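
The decoupling described above (core code depending on the Position interface rather than on PositionImpl) can be illustrated with a minimal sketch. The types below are simplified, hypothetical stand-ins written for this example; they are not the real Managed Ledger classes, and only the getLedgerId/getEntryId accessors are assumed.

```java
import java.util.Comparator;
import java.util.List;

final class PositionDecouplingSketch {

    /** Simplified stand-in for a position interface; not the real Pulsar type. */
    interface Position {
        long getLedgerId();
        long getEntryId();
    }

    /** Hypothetical implementation class; the core-style code below never names it. */
    record SimplePosition(long ledgerId, long entryId) implements Position {
        @Override public long getLedgerId() { return ledgerId; }
        @Override public long getEntryId() { return entryId; }
    }

    // Orders positions by ledger id first, then by entry id within the ledger.
    private static final Comparator<Position> BY_LEDGER_THEN_ENTRY =
            Comparator.comparingLong(Position::getLedgerId)
                      .thenComparingLong(Position::getEntryId);

    /** Core-style code: written against the interface only, so any implementation works. */
    static Position latest(List<? extends Position> positions) {
        return positions.stream().max(BY_LEDGER_THEN_ENTRY).orElseThrow();
    }

    public static void main(String[] args) {
        Position p = latest(List.of(new SimplePosition(3, 7), new SimplePosition(4, 0)));
        System.out.println(p.getLedgerId() + ":" + p.getEntryId());
    }
}
```

Because latest() only sees the interface, swapping SimplePosition for another implementation would not require touching the caller, which is the property the cleanup is after.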

Re: [DISCUSS] PIP-184: Cluster migration or Blue-Green cluster deployment support in Pulsar

2024-04-03 Thread Asaf Mesika

Hi Rajan,

Thanks a lot for contributing to this feature. I think it is super helpful.

Can you add documentation for it? I tried searching for it and couldn't
find it in the docs.

Once you do, I can add it to the list of features Pulsar has on this new
page we have: https://pulsar.apache.org/features/


On Thu, Jul 14, 2022 at 8:13 PM Rajan Dhabalia  wrote:

> On Thu, Jul 14, 2022 at 9:59 AM Asaf Mesika  wrote:
>
> > >
> > > No, as it's mentioned in PIP: this API will terminate the topic so, it
> > will
> > > not allow any new write and producers on that topic, and then it flags
> > the
> > > topic as migrated and finally it sends migrated-topic response to
> > > producers/caught-up consumers(with 0 backlog) for further redirection.
> >
> > So, this method will complete once "termination" of topic is complete:
> wait
> > for all producers to disconnect, wait for consumers to finish backlog and
> > then all disconnect?
> >
> >> Producer can be immediately disconnected and allowed to redirect to
> green cluster as soon as topic is terminated and marked as part of
> termination. (considering the topic doesn't have geo-replication enabled.
> For geo-replicated topics , you can read "Replicator and message ordering
> handling" section in PIP).
>
> >
> > This flag is to persist the state of managed-ledger that it's considered
> > > for migration so, if a broker crashes then it knows about the
> > > managed-ledger state.
> > >
> >
> So perhaps the flag means "migrationInProgress"  or "migrationStarted"?
> >
> >> It's just a state to mark that broker has considered it for migration
> similar to the Terminated state. we don't need extra state for completion
> because the broker is going to delete the topic once all subscribers reach
> the end of the topic.
>
> >
> > On Wed, Jul 13, 2022 at 7:23 PM Rajan Dhabalia 
> > wrote:
> >
> > > On Wed, Jul 13, 2022 at 1:55 AM Asaf Mesika 
> > wrote:
> > >
> > > > Few questions
> > > >
> > > > "CompletableFuture asyncMigrate();"
> > > > Does this method only change the status of the managed ledger?
> > > >
> > > No, as it's mentioned in PIP: this API will terminate the topic so, it
> > will
> > > not allow any new write and producers on that topic, and then it flags
> > the
> > > topic as migrated and finally it sends migrated-topic response to
> > > producers/caught-up consumers(with 0 backlog) for further redirection.
> > >
> > > >
> > > > "message ManagedLedgerInfo {
> > > >
> > > >// Flag to check if topic is terminated and migrated to different
> > > > cluster
> > > >optional bool migrated = 4;
> > > >
> > > > }"
> > > >
> > > > This flag then is only changed to true when it has finished
> migration:
> > > i.e.
> > > > no new messages were written, all existing consumers finished reading
> > all
> > > > messages and disconnected and the topic can now be deleted?
> > > >
> > > This flag is to persist the state of managed-ledger that it's
> considered
> > > for migration so, if a broker crashes then it knows about the
> > > managed-ledger state.
> > >
> > >
> > > > "Broker sends topic migration message to client so, producer/consumer
> > at
> > > > client side can handle redirection accordingly"
> > > >
> > > > For producers, the message will be sent the moment the status of the
> > > topic
> > > > has changed, so all messages from there on will be written to the new
> > > > cluster?
> > > >
> > > Yes.
> > >
> > >
> > > > For consumers, the message will be sent when there are no more
> messages
> > > to
> > > > read?
> > > >
> > > Yes.
> > >
> > > >
> > > >
> > > >
> > > > On Tue, Jul 12, 2022 at 8:23 PM Rajan Dhabalia  >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > We have created PIP-184 which helps users to perform cluster
> > migration
> > > > with
> > > > > Apache Pulsar. Cluster migration or Blue-Green cluster deployment
> is
> > > one
> > > > of
> > > > > the proven solutions to migrate live traffic from one cluster to
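
The migration behaviour discussed in this thread (terminate the topic, persist a migrated flag, redirect producers immediately and consumers once their backlog is zero) can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical and the flag is kept in memory here, whereas the thread describes it as persisted in ManagedLedgerInfo.

```java
final class TopicMigrationSketch {
    private boolean terminated; // no new writes or producers once set
    private boolean migrated;   // in the thread, this flag is persisted so a restarted broker knows the state

    /** Terminates the topic and marks it as migrated, in that order. */
    void migrate() {
        terminated = true;
        migrated = true;
    }

    boolean acceptsNewMessages() {
        return !terminated;
    }

    /** Producers are redirected as soon as the topic is marked migrated. */
    boolean shouldRedirectProducer() {
        return migrated;
    }

    /** Consumers are redirected only after they have caught up (zero backlog). */
    boolean shouldRedirectConsumer(long backlog) {
        return migrated && backlog == 0;
    }

    public static void main(String[] args) {
        TopicMigrationSketch topic = new TopicMigrationSketch();
        topic.migrate();
        System.out.println("redirect producer: " + topic.shouldRedirectProducer());
        System.out.println("redirect consumer with backlog 5: " + topic.shouldRedirectConsumer(5));
    }
}
```

The sketch ignores geo-replicated topics, which the thread says are handled separately in the PIP.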

Re: [DISCUSS] PIP-180: Shadow Topic, an alternative way to support readonly topic ownership.

2024-04-03 Thread Asaf Mesika

Hi Haiting,

I've noticed Shadow Topic is *not* covered in the documentation.

Can you please add documentation for it?

It's such a great feature, it's almost a waste not having it documented.

Once it is, I can add it to the proud list of features Pulsar has on this
lovely page we recently launched: https://pulsar.apache.org/features/


On Fri, Jul 29, 2022 at 12:29 PM Haiting Jiang 
wrote:

> > I think that there are still some references to shadow_message_id but
> > IIUC we don't need to add that field anymore because we are going to
> > use
>
>
> > optional MessageIdData message_id = 9;
>
>
> > is this correct ?
>
>
> Yes, no more `shadow_message_id`. Sorry I missed these references, already
> updated.
>
> Thanks,
> Haiting
>
> On Fri, Jul 29, 2022 at 5:15 PM Enrico Olivelli 
> wrote:
>
> > On Fri, Jul 29, 2022 at 09:05, Haiting Jiang wrote:
> > >
> > > Hi Enrico,
> > >
> > > Any further suggestion on this PIP?
> > > If not, I would like to raise a  revote on this in a few days.
> >
> > now the PIP LGTM
> >
> > I think that there are still some references to shadow_message_id but
> > IIUC we don't need to add that field anymore because we are going to
> > use
> >
> > optional MessageIdData message_id = 9;
> >
> > is this correct ?
> >
> > Enrico
> >
> > >
> > > Thanks,
> > > Haiting
> > >
> > > On 2022/07/07 11:30:59 Haiting Jiang wrote:
> > > > Hi Enrico,
> > > >
> > > > Thanks for your feedback.
> > > >
> > > > On 2022/07/05 08:03:43 Enrico Olivelli wrote:
> > > > > I have a couple of additional questions.
> > > > >
> > > > > 1. Security
> > > > > What about security permissions about the shadow topic ?
> > > > > We are reading from another topic.
> > > > > I think we must clarify the decisions in the PIP
> > > >
> > > > As shadow topic is usually in another namespace, it would have its
> own
> > > > independent permission settings, and we can configure different
> > permissions
> > > > for source topic and shadow topic. So there would be no guarantee
> that
> > you are
> > > > allowed to consume shadow topic if you have permission to consume
> > source
> > > > topic.
> > > >
> > > > On the other hand, we use topic policy to store shadow topic settings,
> > > > so a new policy permission item needs to be added as
> > > > PolicyName.SHADOW_TOPIC, and the user must have PolicyOperation.WRITE
> > > > on this policy to create/delete shadow topics.
> > > >
> > > > >
> > > > > 2. Truncation and deletion
> > > > > What happens when you truncate or delete the source topic ?
> > > > > please add a paragraph on the PIP
> > > > >
> > > >
> > > > 1. Truncation, from command `bin/pulsar-admin topics truncate
> > > > source-topic`.
> > > > For source topic truncation, nothing changes. It still moves all
> > > > cursors to the end of the topic and deletes all inactive ledgers.
> > > > As the shadow topic will watch `ManagedLedgerInfo` in the metadata
> > > > store, once it knows ledgers are deleted, all cursors will skip all
> > > > deleted ledgers.
> > > >
> > > > 2. Deletion, from command `bin/pulsar-admin topics delete
> > > > source-topic`.
> > > > Like geo-replication, topic deletion is forbidden if the topic has
> > > > shadow replicators; users have to delete shadow topics first. Here is
> > > > the new admin API for managing shadow topics with source topic in
> > > > `org.apache.pulsar.client.admin.Topics`:
> > > > ```
> > > > void createShadowTopic(String sourceTopicName, String shadowTopicName);
> > > > void deleteShadowTopic(String sourceTopicName, String shadowTopicName);
> > > > List<String> getShadowTopics(String sourceTopicName);
> > > >
> > > > //And their async version methods.
> > > > ```
> > > > And this requires new REST interfaces in admin server, where
> > > > ```
> > > > PATH = "/{tenant}/{namespace}/{topic}/shadowTopics";
> > > > METHOD = POST/DELETE/GET;
> > > > ```
> > > >
> > > > > 3. Offloaders
> > > > > We are talking about BK metadata, how do Shadow Topics work with
> > > > > Offloaded ledgers ?
> > > > > Please clarify in the PIP
> > > >
> > > > Offloading a ledger is a kind of write operation on the topic's
> > > > metadata, so a shadow topic can't offload ledgers to other long-term
> > > > storage. However, for ledgers that are already offloaded by the source
> > > > topic, it's expected to support reading from offloaded ledgers in the
> > > > shadow topic, just like reading from the source topic.
> > > >
> > > > The implementation depends on the shadow topic watching
> > > > `ManagedLedgerInfo` in the metadata store. If LedgerInfo.offloadContext
> > > > is updated by the source topic offloader, the shadow topic gets the
> > > > full information needed to get a readHandle from ledgerOffload. And of
> > > > course, the precondition is that the shadow topic must have the same
> > > > offload driver settings.
> > > >
> > > > >
> > > > > 4. Changes in the number of partitions
> > > > > the PIP says that the number of partitions must match the source
> > topic.
> > > > > Are we preventing

Re: [VOTE] PIP-342: Support OpenTelemetry metrics in Pulsar client

2024-03-14 Thread Asaf Mesika

+1 (non-binding)

On Thu, Mar 14, 2024 at 8:29 PM Apurva Telang 
wrote:

> +1 (non-binding)
>
> On Thu, Mar 14, 2024 at 2:12 AM mattison chao 
> wrote:
>
> > +1 (binding)
> >
> > Best,
> > Mattison
> > On Mar 14, 2024 at 15:55 +0800, Lari Hotari , wrote:
> > > +1 (binding)
> > >
> > > -Lari
> > >
> > > On Thu, 14 Mar 2024 at 03:45, Matteo Merli 
> > wrote:
> > > >
> > > > PIP: https://github.com/apache/pulsar/pull/22178
> > > >
> > > > WIP PR: https://github.com/apache/pulsar/pull/22179
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Matteo Merli
> > > > 
> >
>
>
> --
> Best regards,
> Apurva Telang.
>

Re: (Apache committer criteria) [ANNOUNCE] New Committer: Asaf Mesika

2024-03-10 Thread Asaf Mesika

>
> I have selected these examples from the most recent discussions and they
> may not be the best examples to fully illustrate the point.

You should respect people's time by taking the time to craft your replies
in this discussion.
From my experience, I've been contributing a lot in recent years to
OpenTelemetry, and sometimes it takes me a few good hours, even a day to
write a good reply: I research the proposed idea, research the discussions
or issues sent to me and only then craft the reply.
I can tell you from my experience working on OpenTelemetry, that the way
you describe is practically how Open Source works - and OpenTelemetry is
the 2nd biggest project in CNCF!
* I wrote a proposal and it was ignored.
* I joined the weekly community meeting. Pitched my idea and then I had to
ping the people replying to me in that Zoom, a good few times on Slack, to
get a reply. After many days and sometimes weeks waiting between replies, I
scheduled a Zoom call with two of them to get their ideas and mine in sync.
* I revised my proposal - and what do you think happened? I had to ping
them AGAIN, and yes, wait sometimes a week for a reply.

In reality, you have to constantly push your proposal, idea, and PR forward.
Today it's so much easier for me in OpenTelemetry. After so many
contributions ranging from specification changes, proposals, documentation
changes and code PRs, the feedback is much more prompt. You know why?
Personal relationship - you build it over time.

If I had acted your way, I would never have gotten anything into OpenTelemetry.
Regarding the criteria for accepting a committer. Your response clearly
shows you were not part of the community these past 2 years, otherwise you
would notice the contribution I've made to Apache Pulsar. The PMC members
have noticed hence the invitation to join as committer. Feel free to browse
the mailing list archives to see it and GitHub issues/PRs/Discussions and
Slack.

I can tell you personally that one of my goals when I started reviewing
PIPs in Pulsar roughly 2 years ago, was to "raise the bar" of the quality
of Pulsar. For me, it starts with exceptional design documents. I stated it
multiple times in the mailing list, and it came to manifest itself also in
the new PIP template I proposed and is now used: One should be able to
grasp the PIP from start to bottom without any prior background knowledge.
One of Pulsar's biggest points for improvement is the
documentation: It's a big encyclopedia with a lot of information missing.
The same goes for the architecture of the code: much of it is not documented. As the
person who writes the design already takes the time to research the code
quite well before writing the design, it makes sense to write it in a
concise manner in the PIP. The good PIPs have that. Since the introduction
of the template, and reviewing many PIPs and also writing a few as "show by
example" it finally reached that state in Pulsar.

Those high quality PIPs in my opinion will lead Pulsar to be of higher
quality.

By the way: can you share a personal open source contribution experience
which was vastly different from what you or I described? I personally only
experienced that or worse :)

Cheers,

Asaf

On Fri, Mar 8, 2024 at 12:55 AM Kalwit S  wrote:

> >> If you check the votes and the participation in discussion of PIPs it is
> *always* involving contributions of >3 different companies.
> >> Well, I think you've chosen *very* bad examples. In none of these cases
> the discussion
> I have selected these examples from the most recent discussions and they
> may not be the best examples to fully illustrate the point. However, there
> are several things to note with most of these examples that are still open
> PIPs and do not have closure after a couple of months. Also, you do not see
> non provider company reviews and VOTE in the majority of the discussion,
> which leads me to believe that you cannot move forward without relying on
> the provider’s mercy (same as Confluent).
>
>
> >> > BTW, if your team is really tired of managing a stable Pulsar,
> StreamNative
> > can help you :)
> >> And so can DataStax ;-)
> I will take this as a conclusion to run a stable Pulsar release.
>
>
> On Thu, Mar 7, 2024 at 5:09 AM Dave Fisher  wrote:
>
> >
> >
> > > On Mar 7, 2024, at 4:20 AM, Neng Lu  wrote:
> > >
> > > BTW, if your team is really tired of managing a stable Pulsar,
> > StreamNative
> > > can help you :)
> >
> > And so can DataStax ;-)
>

Re: [DISCUSS] Clarify the relation between supported Pulsar versions and versioned docs

2024-03-10 Thread Asaf Mesika

I personally agree it is frustrating to update in multiple places.
It's not time consuming, just annoying.
Maybe we can update only the supported LTS version and onwards?

On Mar 6 2024, at 1:56 pm, Kiryl Valkovich  wrote:
> Idea: don't require updating versioned docs from contributors.
> Making a small documentation fix in a single place is easy, but if we ask 
> contributors to fix it in 5-10 places, it may prevent the initiative at all.
>
> It could increase the amount of contributions to the documentation.
> I'm not sure how to better organize this process. Who should do this job - 
> the PR reviewer or someone else like a technical writer?
>
> Best,
> Kiryl
>
> > On Mar 5, 2024, at 12:15 AM, Kiryl Valkovich  wrote:
> >
> > The release policy page states that Pulsar has two supported versions on 
> > the current date.
> > The documentation site provides four versions to choose from in the 
> > dropdown list. If some of them aren't actively supported, should they also 
> > be updated?
> >
> > GitHub issue with more details and screenshots: 
> > https://github.com/apache/pulsar/issues/22177
> >
> >
> > Best,
> > Kiryl
>

Re: [ANNOUNCE] New Committer: Kiryl Valkovich

2024-03-05 Thread Asaf Mesika

Congrats! You are definitely a great addition to Pulsar committers!

On Thu, Feb 29, 2024 at 4:58 AM Zixuan Liu  wrote:

> Congrats!
>
> On Tue, Feb 27, 2024 at 14:14, Lari Hotari wrote:
>
> > Congrats, Kiryl!
> >
> > -Lari
> >
> > On Tue, 27 Feb 2024 at 06:53, tison  wrote:
> > >
> > > The Apache Pulsar Project Management Committee (PMC) has invited
> > > Kiryl Valkovich https://github.com/visortelle to become a committer,
> > and we
> > > are pleased to announce that he has accepted.
> > >
> > > Welcome and Congratulations, Kiryl Valkovich!
> > >
> > > Please join us in congratulating and welcoming Kiryl onboard!
> > >
> > > Best Regards,
> > >
> > > tison
> > > on behalf of the Pulsar PMC
> >
>

PIP-264: Implementation status update

2024-02-18 Thread Asaf Mesika

Hi,

PIP-264 (approved Sep 2023) fixes many pitfalls of the current metric
system inside Pulsar. We partly do so by adding another option of adding
metrics: OpenTelemetry Java SDK, and then redefining the existing metrics
using OTel.

The first part was adding the OpenTelemetry Java SDK into Pulsar in this PR.
This was an implementation of (sub) PIP-320 and was contributed by Dragos
Misca.

The first metrics to be migrated are the lookup metrics, as contributed in
this PR by Dragos Misca.
Take a look - feedback is appreciated.

Thanks,

Asaf

Re: [DISCUSS] PIP-331: WASM Function API

2024-02-18 Thread Asaf Mesika

Hi ZiCheng,

Brilliant suggestion!

I replied in the PR on the sections which I couldn't understand.

On Tue, Jan 30, 2024 at 1:18 PM dragon-zhang 
wrote:

> Hi Pulsar Community,
>
> I want to add a new feature that supports run WASM bytecode to the
> pulsar-functions module.
>
> Please see the PIP: https://github.com/apache/pulsar/pull/21992
>
> Thanks,
> ZiCheng Zhang

Re: Ability to decrease partition count in pulsar

2024-02-18 Thread Asaf Mesika

Hey Girish,

First, let me say that I *love* this proposal and, in general, those types of
proposals.
This is what moves Pulsar towards being an even more next-generation
messaging system.

I read and have a few questions and brainstorming ideas popping into my
mind:

1. The current design basically says: Let’s have a read-only toggle (flag)
for each partition. When I decrease the partitions from, say, 2 to 1, then
if the partitions were “billing-0” and “billing-1”, now “billing-1” will be
marked read-only, and eventually, the client will only produce messages to
“billing-0”. After 1 hour, I can scale it back to 2 partitions, and now the
“billing-1” will be toggled back to read-only=false.

* I know you stated that ordered consumption is out of scope. The thing I
fear here is that even for shared subscriptions, in which order doesn’t
matter, it still feels a bit weird that when you consume from the
beginning, you can suddenly consume messages that are 1 hour apart from
each other, one after another. Something like:

P0  | t1 | t3 | t7 | t10| t11| t13| t17|
    +----+----+----+----+----+----+----+
P1  | t2 | t4 | t6 | t9 | t12| t14| t16|
    +----+----+----+----+----+----+----+
P2  |    |    | t5 | t8 |    |    | t15|
    |    |    |    |    |    |    |    |
    +----+----+----+----+----+----+----+
                        ^         ^
                        RO        URO

t5 - you scaled to 3 partitions.
“RO” is when you change from 3 partitions to 2
“URO” is when you change back to 3 partitions.

When you consume this partitioned topic from the beginning, you will
consume t15 mixed with t6 and t7, which can be hours apart.

I understand this can happen today if you only add a partition and read
from the beginning.

2. If we keep ordered consumption out of scope, how do we keep the users
from doing “wrong” things, like using failover type subscriptions on
partitioned topics that have decreased their partitions? Topic and its
partition count is a detached “entity” from its consumption type.

I’m curious if you thought of implementing it following the pattern we have
today for BK. When an ensemble changes, it simply adds the new ensemble to
a list of ensembles, so you follow a chain of servers when you read from a
ledger. You read from (b1,b2,b3) and then switch to (b1, b3, b5).

What if a partitioned topic is exactly that? It is a chain of lists. Each
list contains the topics (partitions).
Something like:
(billing-0-100, billing-1-101), (billing-0-102, billing-1-103,
billing-2-104), (billing-0-105, billing-1-106)

It’s only a direction - just wondering if something like that has been
considered.
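
To make the "chain of lists" direction a bit more concrete, here is a minimal sketch under assumptions: every resize appends a new list of partitions, producers write only to the newest list, and consumers read the lists in order. The class, its methods, and the naming scheme topic-index-id are hypothetical and only mirror the billing example above.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionChainSketch {
    private final String topic;
    private final List<List<String>> chain = new ArrayList<>();
    private int nextId; // hypothetical unique suffix, as in billing-0-100

    PartitionChainSketch(String topic, int firstId) {
        this.topic = topic;
        this.nextId = firstId;
    }

    /** Appends a new list of partitions; older lists stay readable but are no longer written to. */
    void resize(int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("partitionCount must be at least 1");
        }
        List<String> partitions = new ArrayList<>();
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(topic + "-" + i + "-" + nextId++);
        }
        chain.add(List.copyOf(partitions));
    }

    /** Producers only ever see the newest list. */
    List<String> writablePartitions() {
        if (chain.isEmpty()) {
            throw new IllegalStateException("no partitions created yet");
        }
        return chain.get(chain.size() - 1);
    }

    /** Consumers walk the chain from the oldest list to the newest. */
    List<List<String>> readOrder() {
        return List.copyOf(chain);
    }

    public static void main(String[] args) {
        PartitionChainSketch billing = new PartitionChainSketch("billing", 100);
        billing.resize(2);
        billing.resize(3);
        billing.resize(2);
        System.out.println(billing.readOrder());
    }
}
```

Whether consumers must fully drain one list before moving to the next (which is what would preserve ordering) is left open here, as it is in the question above.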

On Fri, Jan 19, 2024 at 8:28 AM Girish Sharma 
wrote:

> Hello everyone,
>
> As a true cloud native platform, which supports scale up and scale down, I
> feel like there is a need to be able to reduce partition count in pulsar to
> truly achieve a scale down after events like sales (akin to black friday,
> etc) or huge temporary publish burst due to backfill.
>
> I looked through the archives (up to 2021) and did not find any prior
> discussion on the same topic.
>
> I have given this an initial thought to figure out what it would need to
> support such a feature with the lowest footprint possible. I am attaching the
> document explaining the need, requirements and initial high level details
> [0]. What I would like is to understand if the community also finds this
> feature helpful and whether the approach described in the document has some
> fatal flaw. Summarizing the approach here as well:
>
>    - Introduce an ability to convert a normal topic object into a read-only
>      topic via admin api and an additional partitioned-topic metadata
>      property (just like shadow source, etc)
>    - Add logic to block produce but allow new consumers and dispatch call
>      based on this flag
>    - Add logic in GC to clean out read only topics when all of their
>      ledgers expire (TTL/retention)
>
> Goal is that there is no data movement involved and no impact on existing
> partitions during this scale down.
>
> Looking forward to the discussion.
>
> [0]
>
> https://docs.google.com/document/d/1sbGQSwDihQftIRsxAXg5Zm4uxKQ0kRk9HadKYRFTswI/edit?usp=sharing
>
> Regards
> --
> Girish Sharma
>

Re: [VOTE] PIP-330: getMessagesById gets all messages

2024-01-17 Thread Asaf Mesika

+1 (non-binding)

On Tue, Jan 16, 2024 at 4:43 AM Dezhi Liu  wrote:

> +1 (non-binding)
>
> Thanks,
> Dezhi Liu
>
> On 2024/01/15 09:33:48 Zixuan Liu wrote:
> > Hi Pulsar Community,
> >
> > Voting for PIP-330: getMessagesById gets all messages
> >
> > PIP: https://github.com/apache/pulsar/pull/21873
> > Discussion thread:
> > https://lists.apache.org/thread/vqyh3mvtvovd383sd8zxnlzsspdr863z
> >
> > Thanks,
> > Zixuan
> >
>

Re: [VOTE] PIP-323: Complete Backlog Quota Telemetry

2023-12-20 Thread Asaf Mesika

The vote is now closed.
The PIP is approved with 4 binding +1 votes, by Yubaio, Mattison, Penghui
and Guo.


On Mon, Dec 18, 2023 at 4:55 PM mattison chao 
wrote:

> +1(binding)
>
> Best,
> Mattison
>
> > On Dec 13, 2023, at 16:22, Asaf Mesika  wrote:
> >
> > Hi,
> >
> > I'm starting the vote for PIP-323, since it has been reviewed by several
> > people and all comments have been resolved.
> >
> > Reminder:
> >
> > PIP-323 is introduced to fill the gap of backlog quota telemetry. It
> > allows the user to know when a time-based backlog quota is about to be
> > exceeded, and how many times it has been exceeded. It also adds a backlog
> > quota check duration metric, allowing the user to configure the interval
> > for that check based on data. Once this is implemented, the user can
> > finally create an alert to know when a certain topic is about to exceed
> > its defined backlog quota limit, and use the topic stats Admin API to
> > grab the subscription name causing it.
> >
> > PIP link: https://github.com/apache/pulsar/pull/21709
> >
> >
> > Thanks,
> >
> > Asaf
>
>

Re: [VOTE] PIP-320: OpenTelemetry Scaffolding

2023-12-13 Thread Asaf Mesika

The PIP has been approved by the vote.

Summary:
5 binding +1 votes
1 non-binding +1 vote

Thanks!

Asaf

On Mon, Dec 11, 2023 at 9:19 PM Matteo Merli  wrote:

> +1
> --
> Matteo Merli
> 
>
>
> On Mon, Dec 11, 2023 at 10:13 AM Apurva Telang 
> wrote:
>
> > +1 (non-binding)
> >
> > On Mon, Dec 11, 2023 at 4:50 AM Lari Hotari  wrote:
> >
> > > +1 (binding)
> > >
> > > -Lari
> > >
> > > On 2023/12/11 07:33:53 Asaf Mesika wrote:
> > > > Hi,
> > > >
> > > > I'm starting the vote for PIP-320 (1st sub-PIP of PIP-264)
> > > >
> > > > Link: https://github.com/apache/pulsar/pull/21635
> > > >
> > > > Reminder:
> > > > This PIP's goal is to introduce OpenTelemetry into Apache Pulsar. When
> > > this
> > > > PIP is implemented, we will be able to start converting (not
> replacing)
> > > > existing metrics into OpenTelemetry.
> > > >
> > > >
> > > > Thank you!
> > > >
> > > > Asaf Mesika
> > > >
> > >
> >
> >
> > --
> > Best regards,
> > Apurva Telang.
> >
>

[VOTE] PIP-323: Complete Backlog Quota Telemetry

2023-12-13 Thread Asaf Mesika

Hi,

I'm starting the vote for PIP-323, since it has been reviewed by several
people and all comments have been resolved.

Reminder:

PIP-323 is introduced to fill the gap of backlog quota telemetry. It allows
the user to know when a time-based backlog quota is about to be exceeded, and
how many times it has been exceeded. It also adds a backlog quota check
duration metric, allowing the user to configure the interval for that check
based on data. Once this is implemented, the user can finally create an alert
to know when a certain topic is about to exceed its defined backlog quota
limit, and use the topic stats Admin API to grab the subscription
name causing it.
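
As a rough illustration of the two signals described above ("about to exceed" and "how many times it exceeded"), here is a minimal sketch. It is not the PIP's implementation: the class, the 0-to-1 warning threshold, and the choice to count transitions into the exceeded state (rather than every check that finds the quota exceeded) are assumptions made for this example.

```java
import java.time.Duration;

final class BacklogQuotaTelemetrySketch {
    private final Duration timeLimit; // time-based backlog quota limit
    private long exceededCount;       // how many times the quota went from OK to exceeded
    private boolean currentlyExceeded;

    BacklogQuotaTelemetrySketch(Duration timeLimit) {
        this.timeLimit = timeLimit;
    }

    /** Called on every quota check with the age of the oldest unacknowledged message. */
    void recordCheck(Duration oldestBacklogAge) {
        boolean exceeded = oldestBacklogAge.compareTo(timeLimit) > 0;
        if (exceeded && !currentlyExceeded) {
            exceededCount++; // count only the transition into the exceeded state
        }
        currentlyExceeded = exceeded;
    }

    /** True when the backlog age has reached the given fraction of the limit (e.g. 0.8). */
    boolean aboutToExceed(Duration oldestBacklogAge, double warnFraction) {
        return oldestBacklogAge.toMillis() >= timeLimit.toMillis() * warnFraction;
    }

    long exceededCount() {
        return exceededCount;
    }

    public static void main(String[] args) {
        BacklogQuotaTelemetrySketch quota = new BacklogQuotaTelemetrySketch(Duration.ofMinutes(10));
        quota.recordCheck(Duration.ofMinutes(11));
        System.out.println("exceeded count: " + quota.exceededCount());
    }
}
```

An alert rule would then fire on aboutToExceed, while exceededCount gives the history needed to see how often the limit was actually crossed.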

PIP link: https://github.com/apache/pulsar/pull/21709


Thanks,

Asaf

[DISCUSS] PIP-323: Complete Backlog Quota Telemetry

2023-12-11 Thread Asaf Mesika

Hi,

PIP-323 is introduced to fill the gap of backlog quota telemetry. It allows
the user to know when a time-based backlog quota is about to be exceeded, and
how many times it has been exceeded. It also adds a backlog quota check
duration metric, allowing the user to configure the interval for that check
based on data. Once this is implemented, the user can finally create an alert
to know when a certain topic is about to exceed its defined backlog quota
limit, and use the topic stats Admin API to grab the subscription
name causing it.

PIP link: https://github.com/apache/pulsar/pull/21709

This picks up PIP-248 <https://github.com/apache/pulsar/issues/19601>, which
was left stale and unapproved, and improves it a bit. I will implement it once
approved.


Thanks,

Asaf Mesika

[VOTE] PIP-320: OpenTelemetry Scaffolding

2023-12-10 Thread Asaf Mesika

Hi,

I'm starting the vote for PIP-320 (1st sub-PIP of PIP-264)

Link: https://github.com/apache/pulsar/pull/21635

Reminder:
This PIP's goal is to introduce OpenTelemetry into Apache Pulsar. When this
PIP is implemented, we will be able to start converting (not replacing)
existing metrics into OpenTelemetry.


Thank you!

Asaf Mesika

Re: [DISCUSS] Introducing Apache Pulsar Office Hour Meetings

2023-11-29 Thread Asaf Mesika

+1

On Wed, Nov 29, 2023 at 5:03 AM mattison chao 
wrote:

> Hi, folks.
>
> I suggest we can run the office hour meeting every two weeks.
>
> - APAC  Wednesday 01:00 (UTC)
> - EMEA  Wednesday 09:00  (UTC)
>
> Best,
> Mattison
>
>
> > On Nov 27, 2023, at 07:41, mattison chao  wrote:
> >
> > Hi, guys.
> >
> > It’s better to discuss the meeting time. Could you please help share
> your thoughts?
> >
> > Best,
> > Mattison
> >
> >> On Nov 21, 2023, at 19:39, mattison chao 
> wrote:
> >>
> >> Hi, Enrico
> >>
> >>> Hello,
> >>> this sounds like a good idea.
> >>> It is not clear to me who is going to run these meetings (also when
> and how)
> >>
> >> Who?
> >>
> >> IMO, our PMC chair can lead this meeting. And all of the committers can
> >> volunteer to run this meeting.
> >>
> >> When?
> >>
> >> We need to talk about a suitable time range. It’s better to cover
> different time zones. EMEA, APAC, etc.
> >>
> >> How?
> >>
> >> I think this office hour is more like Q&A, which can help users adopt
> >> Pulsar quickly. But we still need to talk about it.
> >>
> >>
> >> Best,
> >> Mattison
> >>
> >>> On Nov 20, 2023, at 23:39, Enrico Olivelli 
> wrote:
> >>>
> >>> Hello,
> >>> this sounds like a good idea.
> >>> It is not clear to me who is going to run these meetings (also when
> and how)
> >>>
> >>> Can you please share some more context ? Maybe I missed something
> >>> (both here or on private@)
> >>>
> >>> Thanks
> >>> Enrico
> >>>
> >>> On Mon, Nov 20, 2023 at 16:36, PengHui Li wrote:
> >>>>
> >>>> Hi Mattison,
> >>>>
> >>>> It's better to share the information on the Pulsar Slack channel and
> >>>> user mailing list since the meeting is focused on users.
> >>>>
> >>>> Thanks,
> >>>> Penghui
> >>>>
> >>>> On Mon, Nov 20, 2023 at 11:18 PM Asaf Mesika 
> wrote:
> >>>>
> >>>>> I think it's a wonderful idea.
> >>>>>
> >>>>> Since engagement in the community meetings is quite low (1-3
> >>>>> people at most), hopefully we can engage more users through those
> >>>>> open office hours.
> >>>>>
> >>>>>
> >>>>> On Fri, Nov 17, 2023 at 5:11 AM mattison chao <
> mattisonc...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Dear Apache Pulsar Community,
> >>>>>>
> >>>>>> I hope you are doing well. As our community continues to grow, we
> want to
> >>>>>> ensure that everyone has the opportunity to actively participate and
> >>>>>> benefit from the collective knowledge and experience within the
> Apache
> >>>>>> Pulsar ecosystem. To achieve this, I discussed with some
> contributors,
> >>>>> and
> >>>>>> we want to introduce a new initiative: Apache Pulsar Office Hour
> >>>>> Meetings.
> >>>>>>
> >>>>>> What are Apache Pulsar Office Hour Meetings?
> >>>>>>
> >>>>>> Apache Pulsar Office Hour Meetings are informal gatherings where
> both
> >>>>>> developers and users can come together to discuss various aspects of
> >>>>> Apache
> >>>>>> Pulsar. Unlike our traditional community meetings, which tend to be
> more
> >>>>>> developer-focused, these office hours are designed to cater to the
> >>>>> broader
> >>>>>> community, including users, enthusiasts, and those who are just
> starting
> >>>>>> their journey with Apache Pulsar.
> >>>>>>
> >>>>>> Why Office Hours?
> >>>>>>
> >>>>>> We've noticed that our community meetings have primarily attracted
> >>>>>> developers, and we want to ensure that we address the needs and
> questions
> >>>>>> of our diverse user base. The office hour format allows for a more
> open
> >>>>> and
> >>>>>> user-friendly discussion, allowing users to share their
> experiences, ask
> >>>>>> questions, and learn from one another.
> >>>>>>
> >>>>>> What to Expect?
> >>>>>>
> >>>>>> During these office hours, we encourage participants to bring their
> >>>>>> questions, share use cases, and discuss any challenges they might
> face
> >>>>> with
> >>>>>> Apache Pulsar. We aim to foster a collaborative environment where
> >>>>> seasoned
> >>>>>> users and newcomers feel comfortable engaging with the community. We
> >>>>>> believe that this format will help us bridge the gap between
> developers
> >>>>> and
> >>>>>> users, creating a more inclusive and vibrant community.
> >>>>>>
> >>>>>> When?
> >>>>>>
> >>>>>> We plan to integrate Apache Pulsar Office Hour Meetings into our
> existing
> >>>>>> community meeting schedule. This way, we can maximise participation
> and
> >>>>>> ensure that everyone has the opportunity to join the conversation.
> The
> >>>>>> office hours will parallel the community meeting, allowing both
> groups to
> >>>>>> benefit from shared insights.
> >>>>>>
> >>>>>> Please leave your valuable comments and suggestions.
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> Sincerely,
> >>>>>> Mattison
> >>>>>>
> >>>>>
> >>
> >
>
>

Re: [DISCUSS] PIP-320: OpenTelemetry Scaffolding

2023-11-29 Thread Asaf Mesika

On Wed, Nov 29, 2023 at 12:18 AM Enrico Olivelli 
wrote:

> Asaf,
>
>
>
> On Tue, Nov 28, 2023, 19:14 Asaf Mesika wrote:
>
> > Hi,
> >
> > This is the first sub-PIP for parent PIP-264
> > <https://github.com/apache/pulsar/pull/21080> ("Enhanced OTel-based
> metric
> > system").
> >
> > This PIP's goal is to introduce OpenTelemetry into Apache Pulsar. When
> this
> > PIP is implemented, we will be able to start converting (not replacing)
> > existing metrics into OpenTelemetry.
> >
>
> I support the proposal.
> The document explains that OTel is experimental, not GA, and that it is
> disabled by default.

Just to clarify: The sub-title in the PIP referring to that is "Why OTel
in Pulsar will be marked experimental and not GA".
Using OTel in Pulsar is experimental, not OTel itself, which is of course
stable and GA already.

>
> My understanding is that in case it is disabled the impact on the runtime
> is negligible, is this correct?
>

I added the following paragraph to the PIP to better explain.

With OTel disabled, the user remains with the existing metrics system.
OTel in a disabled state operates in a no-op mode. This means instruments
do get built, but the instrument builders return the same instance of a
no-op instrument, which does nothing in its value-recording methods (e.g.
`add(number)`, `record(number)`). The no-op `MeterProvider` has no
registered `MetricReader`, hence no metric collection will be made. The
memory impact is almost 0, and the same goes for CPU impact.
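
A minimal sketch of the no-op pattern described above, written with
hypothetical stand-in types (`Meter`, `Counter`, `NoopMeter` and
`RecordingMeter` are illustrative names, not the OpenTelemetry API). It
assumes only what the paragraph states: the disabled meter hands back one
shared instrument whose recording method does nothing.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: these types mimic the shape of a metrics API and are
// NOT the OpenTelemetry classes themselves.
public final class NoopInstrumentSketch {

    interface Counter {
        void add(long value);
    }

    interface Meter {
        Counter counter(String name);
    }

    /** Disabled mode: every request returns the same do-nothing counter. */
    static final class NoopMeter implements Meter {
        private static final Counter NOOP = value -> { };

        @Override
        public Counter counter(String name) {
            return NOOP; // no per-instrument allocation, no state
        }
    }

    /** Enabled mode, for contrast: recorded values are accumulated. */
    static final class RecordingMeter implements Meter {
        final AtomicLong total = new AtomicLong();

        @Override
        public Counter counter(String name) {
            return total::addAndGet;
        }
    }

    public static void main(String[] args) {
        Meter disabled = new NoopMeter();
        // Both lookups yield the identical shared instance.
        System.out.println(disabled.counter("a") == disabled.counter("b"));
        disabled.counter("a").add(42); // recorded nowhere
    }
}
```

Call sites look the same in both modes, which is why instrumented code does
not need to check whether metrics are enabled.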

>
> Enrico
>
> >
> > Link: https://github.com/apache/pulsar/pull/21635
> >
> > Thanks,
> >
> > Asaf
> >
>

[DISCUSS] PIP-320: OpenTelemetry Scaffolding

2023-11-28 Thread Asaf Mesika

Hi,

This is the first sub-PIP for parent PIP-264 ("Enhanced OTel-based metric
system").

This PIP's goal is to introduce OpenTelemetry into Apache Pulsar. When this
PIP is implemented, we will be able to start converting (not replacing)
existing metrics into OpenTelemetry.

Link: https://github.com/apache/pulsar/pull/21635

Thanks,

Asaf

Re: [DISCUSS] Introducing Apache Pulsar Office Hour Meetings

2023-11-20 Thread Asaf Mesika

I believe Mattison wanted some feedback/suggestions before proceeding to
the announcement to the users via Slack and the website.


On Mon, Nov 20, 2023 at 5:42 PM Julien Jakubowski
 wrote:

> Thank you for this. This will be a huge benefit to the user community!
> Could you please share this on the community Slack and the Pulsar website?
> For the Pulsar website, I can submit a PR.
>
>
> On Fri, Nov 17, 2023 at 4:11 AM mattison chao 
> wrote:
>
> > Dear Apache Pulsar Community,
> >
> > I hope you are doing well. As our community continues to grow, we want to
> > ensure that everyone has the opportunity to actively participate and
> > benefit from the collective knowledge and experience within the Apache
> > Pulsar ecosystem. To achieve this, I discussed with some contributors,
> and
> > we want to introduce a new initiative: Apache Pulsar Office Hour
> Meetings.
> >
> > What are Apache Pulsar Office Hour Meetings?
> >
> > Apache Pulsar Office Hour Meetings are informal gatherings where both
> > developers and users can come together to discuss various aspects of
> Apache
> > Pulsar. Unlike our traditional community meetings, which tend to be more
> > developer-focused, these office hours are designed to cater to the
> broader
> > community, including users, enthusiasts, and those who are just starting
> > their journey with Apache Pulsar.
> >
> > Why Office Hours?
> >
> > We've noticed that our community meetings have primarily attracted
> > developers, and we want to ensure that we address the needs and questions
> > of our diverse user base. The office hour format allows for a more open
> and
> > user-friendly discussion, allowing users to share their experiences, ask
> > questions, and learn from one another.
> >
> > What to Expect?
> >
> > During these office hours, we encourage participants to bring their
> > questions, share use cases, and discuss any challenges they might face
> with
> > Apache Pulsar. We aim to foster a collaborative environment where
> seasoned
> > users and newcomers feel comfortable engaging with the community. We
> > believe that this format will help us bridge the gap between developers
> and
> > users, creating a more inclusive and vibrant community.
> >
> > When?
> >
> > We plan to integrate Apache Pulsar Office Hour Meetings into our existing
> > community meeting schedule. This way, we can maximise participation and
> > ensure that everyone has the opportunity to join the conversation. The
> > office hours will parallel the community meeting, allowing both groups to
> > benefit from shared insights.
> >
> > Please leave your valuable comments and suggestions.
> >
> > Thanks!
> >
> > Sincerely,
> > Mattison
> >
>

Re: [DISCUSS] Introducing Apache Pulsar Office Hour Meetings

2023-11-20 Thread Asaf Mesika

I think it's a wonderful idea.

Since engagement in the community meetings is quite low (1-3
people at most), hopefully we can engage more users through those open
office hours.


On Fri, Nov 17, 2023 at 5:11 AM mattison chao 
wrote:

> Dear Apache Pulsar Community,
>
> I hope you are doing well. As our community continues to grow, we want to
> ensure that everyone has the opportunity to actively participate and
> benefit from the collective knowledge and experience within the Apache
> Pulsar ecosystem. To achieve this, I discussed with some contributors, and
> we want to introduce a new initiative: Apache Pulsar Office Hour Meetings.
>
> What are Apache Pulsar Office Hour Meetings?
>
> Apache Pulsar Office Hour Meetings are informal gatherings where both
> developers and users can come together to discuss various aspects of Apache
> Pulsar. Unlike our traditional community meetings, which tend to be more
> developer-focused, these office hours are designed to cater to the broader
> community, including users, enthusiasts, and those who are just starting
> their journey with Apache Pulsar.
>
> Why Office Hours?
>
> We've noticed that our community meetings have primarily attracted
> developers, and we want to ensure that we address the needs and questions
> of our diverse user base. The office hour format allows for a more open and
> user-friendly discussion, allowing users to share their experiences, ask
> questions, and learn from one another.
>
> What to Expect?
>
> During these office hours, we encourage participants to bring their
> questions, share use cases, and discuss any challenges they might face with
> Apache Pulsar. We aim to foster a collaborative environment where seasoned
> users and newcomers feel comfortable engaging with the community. We
> believe that this format will help us bridge the gap between developers and
> users, creating a more inclusive and vibrant community.
>
> When?
>
> We plan to integrate Apache Pulsar Office Hour Meetings into our existing
> community meeting schedule. This way, we can maximise participation and
> ensure that everyone has the opportunity to join the conversation. The
> office hours will parallel the community meeting, allowing both groups to
> benefit from shared insights.
>
> Please leave your valuable comments and suggestions.
>
> Thanks!
>
> Sincerely,
> Mattison
>

Re: [DISCUSS] Replace stale bot with ping-pong workflow

2023-11-09 Thread Asaf Mesika

Submitted a PR to disable it: https://github.com/apache/pulsar/pull/21549

On Tue, Nov 7, 2023 at 3:58 PM Asaf Mesika  wrote:

> Tison, let's start, as you suggested, by disabling it.
>
>
> On Tue, May 16, 2023 at 5:13 AM Yunze Xu  wrote:
>
>> +1 to me
>>
>> Thanks,
>> Yunze
>>
>> On Sun, May 14, 2023 at 9:28 PM Dave Fisher 
>> wrote:
>> >
>> > Hi -
>> >
>> > I have not looked at all your links but I think this is a great idea.
>> This will help everyone pay attention better.
>> >
>> > Best,
>> > Dave
>> >
>> > Sent from my iPhone
>> >
>> > > On May 14, 2023, at 12:33 AM, tison  wrote:
>> > >
>> > > Of course, changing the workflow cannot magically increase the
>> bandwidth to
>> > > handle stale issues. That is what the triage guide wants to encourage
>> > > committers to practice. But such a move can reduce the frustrating
>> > > experience and explicitly express who is responsible for taking the
>> next
>> > > action to nudge the conversation.
>> > >
>> > > Best,
>> > > tison.
>> > >
>> > >
>> > > On Sun, May 14, 2023 at 15:28, tison wrote:
>> > >
>> > >> Hi devs,
>> > >>
>> > >> Recently, I have handled a large number of stale issues and noticed
>> that
>> > >> periodically notifying users that "the issue is stale" without any
>> human
>> > >> reaction can be a frustrating experience, e.g., ISSUE-13925[1].
>> > >>
>> > >> Learning from the INFRA JIRA project experience, I propose we
>> replace the
>> > >> stale bot with a ping-pong workflow. That is -
>> > >>
>> > >> ping - Labeling waiting-for-reviewer on issue created and commented
>> by
>> > >> non-committers
>> > >> pong - Labeling waiting-for-user on issue responded by committers
>> > >>
>> > >> Here is a demo implementation[2] you can refer to and you can try the
>> > >> workflow in my fork[3].
>> > >>
>> > >> Previous references -
>> > >>
>> > >> * The triage guide[4]
>> > >> * [DISCUSS] Does stale bot make value for you?[5]
>> > >> * [COMMITTER ATTENTION] You can close stale issues as not planned [6]
>> > >>
>> > >> Looking forward to your feedback :D
>> > >>
>> > >> Best,
>> > >> tison.
>> > >>
>> > >> [1] https://github.com/apache/pulsar/issues/13925
>> > >> [2] https://github.com/apache/pulsar/pull/20319
>> > >> [3] https://github.com/tisonkun/pulsar
>> > >> [4] https://pulsar.apache.org/contribute/develop-triage
>> > >> [5] https://lists.apache.org/thread/tv774jqohdpx8x0dymsskrd90xwwfvgp
>> > >> [6] https://lists.apache.org/thread/x2c7xod8y0wvh14nsb6bknf0dq3r9gls
>> > >>
>> > >>
>> >
>>
>
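
The ping-pong rule in the quoted proposal reduces to a single decision on
who acted last. A minimal sketch, assuming a hypothetical `labelFor` helper
and a caller-supplied set of committer logins; only the two label names come
from the proposal, and the demo implementation in the linked PR may work
differently.

```java
import java.util.Set;

// Hypothetical helper for illustration; not the workflow from the linked PR.
public final class PingPongLabelSketch {

    static final String WAITING_FOR_REVIEWER = "waiting-for-reviewer";
    static final String WAITING_FOR_USER = "waiting-for-user";

    /**
     * Picks the label after an issue is created or commented on.
     * ping: a non-committer acted, so a reviewer owes the next step.
     * pong: a committer responded, so the user owes the next step.
     */
    static String labelFor(String actor, Set<String> committers) {
        return committers.contains(actor) ? WAITING_FOR_USER : WAITING_FOR_REVIEWER;
    }

    public static void main(String[] args) {
        Set<String> committers = Set.of("some-committer");
        System.out.println(labelFor("some-user", committers));
        System.out.println(labelFor("some-committer", committers));
    }
}
```

The label always names who is expected to act next, which is the property
the proposal contrasts with periodic "issue is stale" notifications.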

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-07 Thread Asaf Mesika

I just want to add one thing to the mix here.

You can see by the number of plugin interfaces Pulsar has that somebody "left
the door open" for too long.
You can agree with me that the number of those interfaces is not normal for
any open source software. I know HBase, for example, or Kafka - I have never
seen so many in them.

You can also see the lack of attention to code quality and high-level
overview in the poor implementation of the current rate limiter.

The feeling is: I just need this tiny little thing and I don't have time -
so over time Pulsar got into this unmaintainable mess of public APIs and
some parts are simply unreadable - such as the rate limiters. I *still*
don't understand how rate limiting works in Pulsar, even after I read the
background and browsed quickly through the code.

I can see the people on this thread are highly talented - let's use this to
make Pulsar better, both from a bird's-eye view and your own
personal requirement.


On Tue, Nov 7, 2023 at 3:26 PM Girish Sharma 
wrote:

> Hello Lari, replies inline.
>
> I will also be going through some textbook rate limiters (the one you
> shared, plus others) and propose the one that at least suits our needs in
> the next reply.
>
> On Tue, Nov 7, 2023 at 2:49 PM Lari Hotari  wrote:
>
>>
>> It is bi-weekly on Thursdays. The meeting calendar, zoom link and
>> meeting notes can be found at
>> https://github.com/apache/pulsar/wiki/Community-Meetings .
>>
>>
> Would it make sense for me to join this time given that you are skipping
> it?
>
>
>>
>> ok. btw. "metrics" doesn't necessarily mean providing the rate limiter
>> metrics via Prometheus. There might be other ways to provide this
>> information for components that could react to this.
>> For example, it could be a system topic where these rate limiters emit
>> events.
>>
>>
> Are there any system topics other than `tenant/namespace/__change_events`?
> While it's an improvement over querying metrics, it would still mean one
> consumer per namespace and would form a cyclic dependency - for example, in
> case a broker is degrading due to mis-use of bursting, it might lead to
> delays in the consumption of the event from the __change_events topic.
>
>> I agree. I just brought up this example to ensure that your
>> expectation about bursting isn't about controlling the rate limits
>> based on situational information, such as end-to-end latency
>> information.
>> Such a feature could be useful, but it does complicate things.
>> However, I think it's good to keep this on the radar since this might
>> be needed to solve some advanced use cases.
>>
>>
> I still envision auto-scaling to be admin API driven rather than produce
> throughput driven. That way, it remains deterministic in nature. But it
> probably doesn't make sense to even talk about it until (partition)
> scale-down is possible.
>
>
>>
>> >- A producer(s) is producing at a near constant rate into a topic,
>> with
>> >equal distribution among partitions. Due to a hiccup in their
>> downstream
>> >component, the produce rate goes to 0 for a few seconds, and thus, to
>> >compensate, in the next few seconds, the produce rate tries to
>> double up.
>>
>> Could you also elaborate on details such as what is the current
>> behavior of Pulsar rate limiting / throttling solution and what would
>> be the desired behavior?
>> Just guessing that you mean that the desired behavior would be to
>> allow the produce rate to double up for some time (configurable)?
>> Compared to what rate is it doubled?
>> Please explain in detail what the current and desired behaviors would
>> be so that it's easier to understand the gap.
>>
>
> In all of the 3 cases that I listed, the current behavior, with precise
> rate limiting enabled, is to pause the netty channel in case the throughput
> breaches the set limits. This eventually leads to timeout at the client
> side in case the burst is significantly greater than the configured timeout
> on the producer side.
>
> The desired behavior in all three situations is to have a multiplier based
> bursting capability for a fixed duration. For example, it could be that a
> pulsar topic would be able to support 1.5x of the set quota for a burst
> duration of up to 5 minutes. There also needs to be a cooldown period in
> such a case that it would only accept one such burst every X minutes, say
> every 1 hour.
>
>
>>
>> >- In a visitor based produce rate (where produce rate goes up in the
>> day
>> >and goes down in the night, think in terms of popular website hourly
>> view
>> >counts pattern) , there are cases when, due to certain
>> external/internal
>> >triggers, the views - and thus - the produce rate spikes for a few
>> minutes.
>>
>> Again, please explain the current behavior and desired behavior.
>> Explicit example values of number of messages, bandwidth, etc. would
>> also be helpful details.
>>
>
> Adding to what I wrote above, think of this pattern like the following:
> the produce rate
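
The desired behavior described in the quoted message (a multiplier on the
set quota for a bounded burst duration, with at most one burst per cooldown
period) can be sketched as follows. `BurstQuotaSketch` and its parameters
are hypothetical names for illustration; this is not Pulsar's rate limiter
or the PIP-310 interface, and it assumes the cooldown is measured from the
start of the previous burst.

```java
import java.time.Duration;

// Hypothetical sketch of multiplier-based bursting; not Pulsar code.
public final class BurstQuotaSketch {
    private final double quota;            // configured rate, e.g. msgs/s
    private final double burstMultiplier;  // e.g. 1.5
    private final long maxBurstMillis;     // e.g. 5 minutes
    private final long cooldownMillis;     // e.g. 1 hour, from burst start

    private boolean burstSeen;
    private long lastBurstStartMillis;

    public BurstQuotaSketch(double quota, double burstMultiplier,
                            Duration maxBurst, Duration cooldown) {
        this.quota = quota;
        this.burstMultiplier = burstMultiplier;
        this.maxBurstMillis = maxBurst.toMillis();
        this.cooldownMillis = cooldown.toMillis();
    }

    /** Returns the limit to enforce at nowMillis for the requested rate. */
    public synchronized double effectiveLimit(long nowMillis, double requestedRate) {
        if (requestedRate <= quota) {
            return quota; // within quota, no burst needed
        }
        long sinceBurstStart = nowMillis - lastBurstStartMillis;
        if (burstSeen && sinceBurstStart < maxBurstMillis) {
            return quota * burstMultiplier; // still inside the burst window
        }
        if (!burstSeen || sinceBurstStart >= cooldownMillis) {
            burstSeen = true; // cooldown elapsed (or first burst): start one
            lastBurstStartMillis = nowMillis;
            return quota * burstMultiplier;
        }
        return quota; // burst used up and cooldown still running
    }
}
```

With a quota of 100, a multiplier of 1.5, a 5 minute burst and a 1 hour
cooldown, a producer asking for 140 is allowed up to 150 during the burst
window and is held to 100 from then until the hour has passed.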

Re: [DISCUSS] Replace stale bot with ping-pong workflow

2023-11-07 Thread Asaf Mesika

Tison, let's start, as you suggested, by disabling it.


On Tue, May 16, 2023 at 5:13 AM Yunze Xu  wrote:

> +1 to me
>
> Thanks,
> Yunze
>
> On Sun, May 14, 2023 at 9:28 PM Dave Fisher  wrote:
> >
> > Hi -
> >
> > I have not looked at all your links but I think this is a great idea.
> This will help everyone pay attention better.
> >
> > Best,
> > Dave
> >
> > Sent from my iPhone
> >
> > > On May 14, 2023, at 12:33 AM, tison  wrote:
> > >
> > > Of course, changing the workflow cannot magically increase the
> bandwidth to
> > > handle stale issues. That is what the triage guide wants to encourage
> > > committers to practice. But such a move can reduce the frustrating
> > > experience and explicitly express who is responsible for taking the
> next
> > > action to nudge the conversation.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > On Sun, May 14, 2023 at 15:28, tison wrote:
> > >
> > >> Hi devs,
> > >>
> > >> Recently, I have handled a large number of stale issues and noticed
> that
> > >> periodically notifying users that "the issue is stale" without any
> human
> > >> reaction can be a frustrating experience, e.g., ISSUE-13925[1].
> > >>
> > >> Learning from the INFRA JIRA project experience, I propose we replace
> the
> > >> stale bot with a ping-pong workflow. That is -
> > >>
> > >> ping - Labeling waiting-for-reviewer on issue created and commented by
> > >> non-committers
> > >> pong - Labeling waiting-for-user on issue responded by committers
> > >>
> > >> Here is a demo implementation[2] you can refer to and you can try the
> > >> workflow in my fork[3].
> > >>
> > >> Previous references -
> > >>
> > >> * The triage guide[4]
> > >> * [DISCUSS] Does stale bot make value for you?[5]
> > >> * [COMMITTER ATTENTION] You can close stale issues as not planned [6]
> > >>
> > >> Looking forward to your feedback :D
> > >>
> > >> Best,
> > >> tison.
> > >>
> > >> [1] https://github.com/apache/pulsar/issues/13925
> > >> [2] https://github.com/apache/pulsar/pull/20319
> > >> [3] https://github.com/tisonkun/pulsar
> > >> [4] https://pulsar.apache.org/contribute/develop-triage
> > >> [5] https://lists.apache.org/thread/tv774jqohdpx8x0dymsskrd90xwwfvgp
> > >> [6] https://lists.apache.org/thread/x2c7xod8y0wvh14nsb6bknf0dq3r9gls
> > >>
> > >>
> >
>

Re: Reporting and tooling to detect thread leaks in Pulsar tests

2023-10-30 Thread Asaf Mesika

Ok, got it. While scrolling down on the main page with the cursor on the
graph, nothing happens. When you move the mouse away from the graph and
scroll, you then see those annotations.
Very nice additions and good to generally know about them.


On Mon, Oct 30, 2023 at 10:53 AM Lari Hotari  wrote:

> Thanks for the review. I merged the PR and triggered a manual build
> https://github.com/apache/pulsar/actions/runs/6690374946 to get the latest
> report of leaked threads.
>
> -Lari
>
> On Mon, 30 Oct 2023, 9.15 Enrico Olivelli,  wrote:
>
> > On Mon, Oct 30, 2023, 06:39 Lari Hotari wrote:
> >
> > > Hi Asaf,
> > >
> > > Yes, the visibility aspect is already solved by using warnings in the
> > > summary view. Please check the example
> > > https://github.com/apache/pulsar/actions/runs/6680066364?pr=21450 .
> > >
> > > Job summaries could also be used, but they have less visibility in the
> > > summary view, as you can see from the example. Job summaries are placed
> > > on the summary page after errors/warnings and build artifacts and when
> > > there are more than a few summaries, each job summary will need to be
> > > explicitly expanded by clicking "Load Summary" to view the content.
> That
> > > makes their visibility lower than warnings.
> > >
> > > Since this is a change in the build and isn't really intrusive, I think
> > we
> > > could get it merged and revisit it based on the experiences we get from
> > the
> > > use of it. I have been iterating on the solution while fixing a lot of
> > the
> > > test resource leaks in the last few weeks. Without support for
> detecting
> > > the resource leaks, it's really hard to keep the test suite clean.
> > >
> >
> >
> >
> > > Looking forward to more reviews on
> > > https://github.com/apache/pulsar/pull/21450 . :)
> >
> >
> >
> >
> > Looks great
> >
> > Thanks
> > Enrico
> >
> > >
> > >
> > > -Lari
> > >
> > > On 2023/10/29 18:34:28 Asaf Mesika wrote:
> > > > Lari, I know there is a way to add a Job summary, so we can write it
> > > > there - do you think this can increase visibility?
> > > >
> > > > On Sun, Oct 29, 2023 at 4:53 AM Lari Hotari 
> > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I have submitted a PR (https://github.com/apache/pulsar/pull/21450
> )
> > > which
> > > > > includes changes to add reporting and tooling to detect thread
> leaks
> > in
> > > > > Pulsar tests.
> > > > >
> > > > > It should be ensured in each test that resources created by the
> test
> > > are
> > > > > properly cleaned up. Failing to do so can lead to memory leaks and,
> > in
> > > some
> > > > > instances, unnecessary CPU consumption. These issues can, in turn,
> > slow
> > > > > down test execution, increase Pulsar CI build durations, and cause
> > > > > flakiness.  A significant source of memory leaks in Pulsar tests
> > stems
> > > from
> > > > > thread leaks.
> > > > >
> > > > > After the PR is merged, it will be easy to detect thread leaks
> since
> > > the
> > > > > build will add warnings to the summary view for the GitHub Actions
> > > build
> > > > > run. An example can be seen in the PR build run:
> > > > > https://github.com/apache/pulsar/actions/runs/6680066364?pr=21450
> .
> > > > > There will be more detailed information in the "Report detected
> > thread
> > > > > leaks" build step, for example
> > > > >
> > >
> >
> https://github.com/apache/pulsar/actions/runs/6680066364/job/18153890519?pr=21450#step:16:23
> > > > > .
> > > > >
> > > > > Please review the PR https://github.com/apache/pulsar/pull/21450
> so
> > > that
> > > > > we can continue to get rid of the remaining thread leaks in the
> > future
> > > and
> > > > > keep the tests cleaner and less flaky.
> > > > >
> > > > > -Lari
> > > > >
> > > >
> > >
> >
>

Re: Reporting and tooling to detect thread leaks in Pulsar tests

2023-10-29 Thread Asaf Mesika

Lari, I know there is a way to add a Job summary, so we can write it
there - do you think this can increase visibility?

On Sun, Oct 29, 2023 at 4:53 AM Lari Hotari  wrote:

> Hi all,
>
> I have submitted a PR (https://github.com/apache/pulsar/pull/21450) which
> includes changes to add reporting and tooling to detect thread leaks in
> Pulsar tests.
>
> It should be ensured in each test that resources created by the test are
> properly cleaned up. Failing to do so can lead to memory leaks and, in some
> instances, unnecessary CPU consumption. These issues can, in turn, slow
> down test execution, increase Pulsar CI build durations, and cause
> flakiness.  A significant source of memory leaks in Pulsar tests stems from
> thread leaks.
>
> After the PR is merged, it will be easy to detect thread leaks since the
> build will add warnings to the summary view for the GitHub Actions build
> run. An example can be seen in the PR build run:
> https://github.com/apache/pulsar/actions/runs/6680066364?pr=21450 .
> There will be more detailed information in the "Report detected thread
> leaks" build step, for example
> https://github.com/apache/pulsar/actions/runs/6680066364/job/18153890519?pr=21450#step:16:23
> .
>
> Please review the PR https://github.com/apache/pulsar/pull/21450 so that
> we can continue to get rid of the remaining thread leaks in the future and
> keep the tests cleaner and less flaky.
>
> -Lari
>
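
One way to detect leaked threads, sketched under assumptions: snapshot the
live threads before a test, then report any thread that is still alive
afterwards and was not in the snapshot. `ThreadLeakSketch` is a hypothetical
helper and is not taken from the PR above, whose actual implementation may
differ.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not the code from the PR.
public final class ThreadLeakSketch {

    /** Snapshot of the threads that are currently alive in this JVM. */
    static Set<Thread> liveThreads() {
        return new HashSet<>(Thread.getAllStackTraces().keySet());
    }

    /** Threads alive now that were not present in the earlier snapshot. */
    static Set<Thread> leakedSince(Set<Thread> before) {
        Set<Thread> leaked = liveThreads();
        leaked.removeAll(before);
        leaked.removeIf(t -> !t.isAlive()); // drop threads that just exited
        return leaked;
    }

    public static void main(String[] args) {
        Set<Thread> before = liveThreads();
        // ... run a test that may start threads and forget to stop them ...
        for (Thread t : leakedSince(before)) {
            System.out.println("Possible thread leak: " + t.getName());
        }
    }
}
```

A real test listener would likely also filter out known JVM or framework
threads before reporting, since those can appear between the two snapshots.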

Re: [DISCUSS] Roll up project status for pulsar-helm-chart

2023-10-24 Thread Asaf Mesika

Tison, can we mark this repo as suggested?

On Tue, Aug 8, 2023 at 12:24 PM Matteo Merli  wrote:

> Thanks Tison,
>
> I fully agree that we should have a clear representation of the actual
> status of the Helm chart, so that users can have the correct expectation.
>
> In particular, I think this Helm chart should provide basic
> functionality out of the box and serve as a template/example that
> one can tweak to specific needs for production environments.
>
> Since this thread got started, there was no one stepping up, so I guess
> your prediction was also correct :) and reinforces the idea that we
> should reduce a bit the scope and expectation for this Helm chart.
>
>
> --
> Matteo Merli
> 
>
>
> On Wed, Jul 19, 2023 at 11:26 AM tison  wrote:
>
>> Hi Pulsar devs and users,
>>
>> The pulsar-helm-chart[1] was initially developed in the main repo[2] and
>> later moved to its own repo in 2018[3].
>>
>> During the past years, it has received little attention in either
>> development or maintenance, while the Pulsar ecosystem has grown multiple
>> alternatives for distributed Pulsar deployment via Helm charts or
>> Kubernetes operators -
>>
>> 1. https://github.com/streamnative/charts
>> 2. https://github.com/streamnative/pulsar-operators
>> 3. https://github.com/streamnative/terraform-helm-charts
>> 4. https://github.com/datastax/pulsar-helm-chart
>> 5. https://github.com/datastax/kaap
>>
>> Almost all of the ecosystem projects have better maturity than the
>> upstream
>> one. Although, we in the upstream still recommend the pulsar-helm-chart as
>> the "official" helm chart among the README[4] and docs[5].
>>
>> Of course, it's by definition the "official" one. But such an advertisement
>> can mislead Pulsar users to choose a half-unmaintained project over better
>> implemented and maintained projects.
>>
>> The upstream community, when it doesn't have the bandwidth to maintain the
>> repo, doesn't have to take the place of an official helm chart. And I saw
>> a
>> pull request proposing to update the README of pulsar-helm-chart[6].
>>
>> I approved that PR and understand that our community makes decisions on
>> mailing lists. Thus, here is the discussion thread to roll up the project
>> status for pulsar-helm-chart.
>>
>> My suggestion is -
>>
>> 1. Accept the PR to update README reflecting the project status and remove
>> the "official" advertisement. It is not wrong, but it can mislead our
>> users
>> as described above.
>> 2. Correspondingly update the word in the docs on the Pulsar website.
>>
>> There can be some arguments that we can pick up the project again and
>> develop and maintain it - this is good.
>>
>> However, generally talk is cheap and real effort is slow to apply. I
>> totally appreciate anyone who is willing to maintain the
>> pulsar-helm-chart,
>> but let's not block the description updates with such an argument.
>> Instead, update the description, and change it back when the development
>> and maintenance really happen. It also reflects the low traffic during the
>> past few years.
>>
>> Looking forward to your feedback :)
>>
>> Best,
>> tison.
>>
>> [1] https://github.com/apache/pulsar-helm-chart
>> [2]
>> https://github.com/apache/pulsar/tree/master/deployment/kubernetes/helm
>> [3] https://github.com/apache/pulsar-helm-chart/graphs/contributors
>> [4]
>>
>> https://github.com/apache/pulsar-helm-chart/blob/73fe688a439c3ab9b56f2d249f16505292391f4b/README.md
>> [5] https://pulsar.apache.org/docs/3.0.x/deploy-kubernetes/
>> [6] https://github.com/apache/pulsar-helm-chart/pull/367
>>
>

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-10-22 Thread Asaf Mesika

Replied in PR.


On Thu, Oct 19, 2023 at 3:51 PM Girish Sharma 
wrote:

> Hi,
> Currently, there are only 2 kinds of publish rate limiters - polling based
> and precise. Users have an option to use either one of them in the topic
> publish rate limiter, but the resource group rate limiter only uses polling
> one.
>
> There are challenges with both the rate limiters and the fact that we can't
> use precise rate limiter in the resource group level.
>
> Thus, in order to support custom rate limiters, I've created the PIP-310
>
> This is the discussion thread. Please go through the PIP and provide your
> inputs.
>
> Link - https://github.com/apache/pulsar/pull/21399
>
> Regards
> --
> Girish Sharma
>

Re: [DISCUSS] PIP-309: Adding Pulsar Client Stats Reporter

2023-10-22 Thread Asaf Mesika

I've replied in the PR itself.

On Fri, Oct 20, 2023 at 2:24 AM Ying  wrote:

> Hi dev,
>
> Currently, Pulsar Client can provide recorded stats for both Producer and
> Consumer, but not all stats are fixed values during the statsInterval. So
> start the PIP-309 to add the Pulsar Client Stats Reporter to allow all
> stats being reported precisely at the end of each interval.
>
> Please share your thoughts!
>
> Ref:
> PIP-309 https://github.com/apache/pulsar/pull/21393
>
> Thanks,
> ywango
>

Re: [DISCUSS] Consistent code style (esp. ws/indent) and autotools

2023-09-20 Thread Asaf Mesika

Hang, we can do this piecemeal - folder by folder, so we can throttle the
number of PRs to make them easy to review.
To make it sustainable, perhaps we can do this only on master?

Regarding git blame - you have the same issue when you view a file that has
been heavily modified in one commit, right?
It shouldn't be an issue - Show History in IntelliJ, pick the commit just
before, right click -> Annotate - boom, you have the git blame for the
previous commit.

I agree we need a reliable Pulsar, but I think a program doing the
modification is likely to produce 0 bugs - it's not human based.

Also, if we are afraid to make changes, Pulsar will not get better.

Remember, Pulsar is at a very early stage of adoption. Now is *exactly* the
time to do it.

Also keep in mind: reliability comes when code is readable.


On Tue, Sep 5, 2023 at 5:56 AM Hang Chen  wrote:

> >While I can agree that a consistent style can help I don’t agree that it
> is necessary. If the compiler understands the code then IMO we are good.
>
> I agree with Dave's idea. My concern is how much value this change
> will bring to Pulsar. What's more, it will bring other burdens, such
> as PR review, PR cherry-pick, git blames. The main goal of Pulsar is
> to improve the reliability.
>
> -1 for this change.
>
> Best,
> Hang
>
> On Mon, Sep 4, 2023 at 19:24 徐昀泽 wrote:
> >
> > Well, I’m just back to this thread.
> >
> > Now I’m +1 to this extremely huge change, but to be more friendly to developers,
> > we should document the workarounds for the git blame issue. And we should apply
> > the spotless tool to every active branch.
> >
> > > On Sep 3, 2023, at 19:43, Asaf Mesika  wrote:
> > >
> > > I couldn't stress how much I oppose the sentence "If the compiler
> > > understands the code then IMO we are good."
> > >
> > > Sinan is right: This project needs to take calculated risks in order to
> > > move forward to be better.
> > > Yes I agree prioritizing is super important, since Pulsar has *so many*
> > > fronts to be better at.
> > >
> > > We need more people on this thread, to get a wide angle on this IMO.
> > >
> > >
> > > On Sat, Sep 2, 2023 at 7:27 AM Dave Fisher 
> wrote:
> > >
> > >> While I can agree that a consistent style can help I don’t agree that
> it
> > >> is necessary. If the compiler understands the code then IMO we are
> good.
> > >>
> > >> I am a bit of a dinosaur since I have keypunched code on cards in my
> > >> career. I’ve played with writing interpreters and specialized
> languages.
> > >>
> > >> But I’m -0 and if the project prefers strict code style then that is
> fine
> > >> too!
> > >>
> > >> If anyone agrees with me know that part of consensus building is to
> > >> provide opinions and accept divergent results.
> > >>
> > >> Best,
> > >> Dave
> > >>
> > >> PS. If tisun wants to put on their superhero cape and convert the code
> > >> base then let’s acknowledge that AND let’s consider all of the PRs
> that are
> > >> now effectively closed.
> > >>
> > >> Sent from my iPhone
> > >>
> > >>> On Sep 1, 2023, at 8:57 PM, SiNan Liu 
> wrote:
> > >>>
> > >>> Consistent code style is crucial for a large project. In order to
> make
> > >>> Pulsar better, I believe we shouldn't be afraid of "risks".
> > >>> By introducing Spotless, we can permanently resolve the issue of
> > >>> inconsistent code style and ensure that all contributors adhere to
> this
> > >>> rule moving forward.
> > >>> If we don't make these changes now, we might have to deal with
> changes in
> > >>> over 3000 files today and potentially over 5000 files tomorrow.
> > >>>
> > >>> Thanks,
> > >>> sinan
> > >>>
> > >>>
> > >>> Dave Fisher  于2023年9月1日周五 12:19写道：
> > >>>
> > >>>> -0. I think it will introduce too much change. We have classes that
> are
> > >>>> quite large and having to fix code style after making a small
> correction
> > >>>> seems like a waste of volunteer energy.
> > >>>>
> > >>>> Best,
> > >>>> Dave
> > >>>>
> > >>>> Sent from my iPhone
> > >>>>
> > >>>>>> On Aug 31, 2023, at 9:05 PM, Zixuan Liu 
> wrot

Re: [DISCUSS] Consistent code style (esp. ws/indent) and autotools

2023-09-03 Thread Asaf Mesika

I can't stress enough how much I oppose the sentence "If the compiler
understands the code then IMO we are good."

Sinan is right: This project needs to take calculated risks in order to
move forward to be better.
Yes I agree prioritizing is super important, since Pulsar has *so many*
fronts to be better at.

We need more people on this thread, to get a wide angle on this IMO.


On Sat, Sep 2, 2023 at 7:27 AM Dave Fisher  wrote:

> While I can agree that a consistent style can help I don’t agree that it
> is necessary. If the compiler understands the code then IMO we are good.
>
> I am a bit of a dinosaur since I have keypunched code on cards in my
> career. I’ve played with writing interpreters and specialized languages.
>
> But I’m -0 and if the project prefers strict code style then that is fine
> too!
>
> If anyone agrees with me know that part of consensus building is to
> provide opinions and accept divergent results.
>
> Best,
> Dave
>
> PS. If tisun wants to put on their superhero cape and convert the code
> base then let’s acknowledge that AND let’s consider all of the PRs that are
> now effectively closed.
>
> Sent from my iPhone
>
> > On Sep 1, 2023, at 8:57 PM, SiNan Liu  wrote:
> >
> > Consistent code style is crucial for a large project. In order to make
> > Pulsar better, I believe we shouldn't be afraid of "risks".
> > By introducing Spotless, we can permanently resolve the issue of
> > inconsistent code style and ensure that all contributors adhere to this
> > rule moving forward.
> > If we don't make these changes now, we might have to deal with changes in
> > over 3000 files today and potentially over 5000 files tomorrow.
> >
> > Thanks,
> > sinan
> >
> >
> > On Fri, Sep 1, 2023 at 12:19 Dave Fisher wrote:
> >
> >> -0. I think it will introduce too much change. We have classes that are
> >> quite large and having to fix code style after making a small correction
> >> seems like a waste of volunteer energy.
> >>
> >> Best,
> >> Dave
> >>
> >> Sent from my iPhone
> >>
> >>>> On Aug 31, 2023, at 9:05 PM, Zixuan Liu  wrote:
> >>>
> >>> This idea looks good to me, but we need to format all codebase to
> >>> avoid conflicts during cherry picking.
> >>>
> >>> +1
> >>>
> >>> On Thu, Aug 31, 2023 at 20:39 Asaf Mesika wrote:
> >>>>
> >>>> opentelemetry-java employs Spotless for coding style.
> >>>> You run "./gradlew spotlessCheck" and it shows the problems.
> >>>> You run "./gradlew spotlessApply" to automatically fix it.
> >>>>
> >>>> It also uses errorprone to detect bugs.
> >>>>
> >>>> I wonder if we can run it only in GitHub PRs, so we can instruct it to run
> >>>> only on files you have changed / added. This way we "throttle" the style
> >>>> changes through the files touched.
> >>>>
> >>>>
> >>>>
> >>>>> On Tue, Aug 15, 2023 at 8:31 PM tison  wrote:
> >>>>>
> >>>>> These have been discussed that -
> >>>>>
> >>>>> 1.  Cherrypick: we will reformat for all maintained branches.
> >>>>> 2. Commit noise: metadata to filter out during blame.
> >>>>> 3. PR rebased: invincible, while we don't have a large backlog or
> >> active
> >>>>> large pending PR.
> >>>>>
> >>>>> But if our critical mass agree this is not a good tradeoff, there is
> no
> >>>>> issue to "resolve".
> >>>>>
> >>>>> On Wed, Aug 16, 2023 at 00:50 Enrico Olivelli wrote:
> >>>>>
> >>>>>> Tison,
> >>>>>>
> >>>>>> On Tue, Aug 15, 2023, 16:56 tison wrote:
> >>>>>>
> >>>>>>> A demonstration of the change set -
> >>>>>>> https://github.com/apache/pulsar/pull/20974
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> While I think it is great to start with Spotless for little projects or
> >>>>>> when you start from scratch,
> >>>>>> applying a patch that changes 3,000+ files will make it very hard to rebase
> >>>>>> all the pending PRs and also to cherry-pick changes to older branches
> >>>>>

Re: [DISCUSS] Consistent code style (esp. ws/indent) and autotools

2023-08-31 Thread Asaf Mesika

opentelemetry-java employs Spotless for coding style.
You run "./gradlew spotlessCheck" and it shows the problems.
You run "./gradlew spotlessApply" to automatically fix it.

It also uses errorprone to detect bugs.

I wonder if we can run it only in GitHub PRs, so we can instruct it to run
only on files you have changed / added. This way we "throttle" the style
changes through the files touched.
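
The "only files you have changed" idea can be sketched roughly as follows. This
is a minimal Python illustration, not a Spotless feature description: it
assumes the PR build can obtain the changed paths (for example from
`git diff --name-only` against the base branch) and then hands only the touched
Java files to the style check. The function name and sample paths are
hypothetical. Spotless also documents a `ratchetFrom` setting aimed at this use
case, which would be worth checking before building anything custom.

```python
from typing import Iterable, List

JAVA_SUFFIX = ".java"


def files_to_check(changed_paths: Iterable[str]) -> List[str]:
    """Keep only the Java sources a PR touched, dropping blanks and duplicates."""
    seen = set()
    selected = []
    for raw in changed_paths:
        path = raw.strip()
        if path.endswith(JAVA_SUFFIX) and path not in seen:
            seen.add(path)
            selected.append(path)
    return selected


if __name__ == "__main__":
    # Stand-in for the output of `git diff --name-only <base>...HEAD` in a PR build.
    diff_output = """\
pulsar-broker/src/main/java/Foo.java
README.md
pulsar-broker/src/main/java/Foo.java
pulsar-client/src/main/java/Bar.java
"""
    for path in files_to_check(diff_output.splitlines()):
        print(path)
```

With this approach the formatted portion of the code base grows only as files
are touched, at the cost of mixing style changes into functional PRs.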



On Tue, Aug 15, 2023 at 8:31 PM tison  wrote:

> These have been discussed that -
>
> 1.  Cherrypick: we will reformat for all maintained branches.
> 2. Commit noise: metadata to filter out during blame.
> 3. PR rebased: invincible, while we don't have a large backlog or active
> large pending PR.
>
> But if our critical mass agree this is not a good tradeoff, there is no
> issue to "resolve".
>
> On Wed, Aug 16, 2023 at 00:50 Enrico Olivelli wrote:
>
> > Tison,
> >
> > On Tue, Aug 15, 2023, 16:56 tison wrote:
> >
> > > A demonstration of the change set -
> > > https://github.com/apache/pulsar/pull/20974
> >
> >
> >
> > While I think it is great to start with Spotless for little projects or
> > when you start from scratch,
> > applying a patch that changes 3,000+ files will make it very hard to rebase
> > all the pending PRs and also to cherry-pick changes to older branches that
> > we support.
> >
> > I think that this is not a good option for Pulsar at this stage.
> >
> > Maybe if we had a configuration that doesn't change so many files
> >
> > Enrico
> >
> > >
> > >
> > > If we go forward this direction, it should be picked to branch-3.0 and
> > > branch-3.1. Earlier versions can be ported on demand and I'm glad to
> > > volunteer doing that.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > On Mon, Jul 10, 2023 at 10:00 PengHui Li wrote:
> > >
> > > > My concern is how much value it will add.
> > > > From my experience, it's fine. The code style
> > > > is not consistent but doesn't affect my code reading
> > > > and writing much. But it might introduce risks and we
> > > > need to pay much effort to the code review.
> > > >
> > > > Regards,
> > > > Penghui
> > > >
> > > > On Wed, Jul 5, 2023 at 7:39 PM tison  wrote:
> > > >
> > > > > ... which seems a GitHub only extension -
> > > > > https://github.blog/changelog/2022-03-24-ignore-commits-in-the-blame-view-beta/
> > > > >
> > > > > Best,
> > > > > tison.
> > > > >
> > > > >
> > > > > On Wed, Jul 5, 2023 at 19:38 tison wrote:
> > > > >
> > > > > > For the git blame issue, I also found this practice in StreamPark[1].
> > > > > >
> > > > > > cc @Yunze.
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > > > [1]
> > > > > > https://github.com/apache/incubator-streampark/blob/cac931ae289e0753892279336e1c4e70e5f7d7c6/.git-blame-ignore-revs
> > > > > >
> > > > > >
> > > > > > On Fri, Jun 30, 2023 at 13:03 Kiryl Valkovich wrote:
> > > > > >
> > > > > > >> My mistake. It looks like, for some reason, Spotless supports
> > > > > > >> .editorconfig only for ktlint.
> > > > > >>
> > > > > >> Best,
> > > > > >> Kiryl
> > > > > >>
> > > > > >> From: Kiryl Valkovich 
> > > > > >> Date: Friday, June 30, 2023 at 6:45 AM
> > > > > >> To: dev@pulsar.apache.org 
> > > > > >> Subject: Re: [DISCUSS] Consistent code style (esp. ws/indent)
> and
> > > > > >> autotools
> > > > > >> Hi,
> > > > > >>
> > > > > >> tison, are you going to use .editorconfig for specifying indent
> > > rules?
> > > > > >>
> > > > > >> https://editorconfig.org/
> > > > > >>
> > > > > >> It looks like Spotless supports it.
> > > > > >> https://github.com/diffplug/spotless/issues/734
> > > > > >>
> > > > > > >> Flink and many other ASF projects use it.
> > > > > > >> https://github.com/apache/flink/blob/21eba4ca4cb235a2189c94cdbf3abcec5cde1e6e/.editorconfig
> > > > > > >> https://github.com/search?q=org%3Aapache%20.editorconfig=code
> > > > > >>
> > > > > >> Unlike Spotless, the .editorconfig works out of the box in most
> of
> > > the
> > > > > >> modern code editors.
> > > > > >>
> > > > > >> Best,
> > > > > >> Kiryl
> > > > > >>
> > > > > >> From: tison 
> > > > > >> Date: Friday, June 30, 2023 at 3:58 AM
> > > > > >> To: Dev 
> > > > > >> Subject: [DISCUSS] Consistent code style (esp. ws/indent) and
> > > > autotools
> > > > > >> Hi,
> > > > > >>
> > > > > >> I'd like to propose applying a consistent code style (especially
> > > > > >> whitespace
> > > > > >> and indent) with an autotool Spotless.
> > > > > >>
> > > > > >> // Background
> > > > > >>
> > > > > >> Over and over we argue contributors reformat their patch
> manually
> > > for
> > > > > >> checkstyle violations, or even whitespace changes that are not
> > > > detected
> > > > > by
> > > > > >> checkstyle. [1]
> > > > > >>
> > > > > >> A common reason is that such style-only changes increase the
> > burden
> > > to
> > > > > do
> > > > > >> cherry-pick if a later bug fix is made around the code while we
> > > often
> > > > do
> > > > > >> not pick the style change barely or even

Re: [VOTE] PIP-264: Enhanced OTel-based metric system

2023-08-31 Thread Asaf Mesika

Thank you all for your review and corresponding votes.

The PIP vote has passed with 3 binding +1 votes by Matteo, Lari and Hang.


On Wed, Aug 30, 2023 at 3:40 PM Lari Hotari  wrote:

> +1 (binding)
>
> -Lari
>
> On Mon, Aug 28, 2023 at 5:55 PM Asaf Mesika  wrote:
>
> > Hi,
> >
> > I'm very happy to start the vote process for PIP-264.
> >
> > PIP is located at https://github.com/apache/pulsar/pull/21080.
> >
> > The PIP was at the discussion stage from April 27th (~4 months). I want to
> > express my sincere gratitude to Matteo, Hang and Lari for taking the time
> > to read through the *entire* PIP, validate the solution, and help me make
> > the PIP better. I also thank Penghui for bearing with my many questions
> > and helping through the construction and validation of the solution.
> >
> > I believe this PIP to be a cornerstone for the successful adoption of
> > Pulsar and the welfare of the existing user base.
> >
> > Thanks!
> >
> > Asaf
> >
>

[VOTE] PIP-264: Enhanced OTel-based metric system

2023-08-28 Thread Asaf Mesika

Hi,

I'm very happy to start the vote process for PIP-264.

PIP is located at https://github.com/apache/pulsar/pull/21080.

The PIP was at the discussion stage from April 27th (~4 months). I want to
express my sincere gratitude to Matteo, Hang and Lari for taking the time
to read through the *entire* PIP, validate the solution, and help me make
the PIP better. I also thank Penghui for bearing with my many questions
and helping through the construction and validation of the solution.

I believe this PIP to be a cornerstone for the successful adoption of
Pulsar and the welfare of the existing user base.

Thanks!

Asaf

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-08-28 Thread Asaf Mesika

I've relocated the PIP content from the issue (
https://github.com/apache/pulsar/issues/20197) to a PR (
https://github.com/apache/pulsar/pull/21080) so I could add a TOC and also be
in line with the new process.



On Mon, Aug 28, 2023 at 5:46 PM Asaf Mesika  wrote:

> Thanks for taking the time to review the document - *highly appreciated*.
> I've inlined my comments below.
>
>
> On Mon, Aug 21, 2023 at 12:19 PM Hang Chen  wrote:
>
>> Hi Asaf,
>> Thanks for bringing up the great proposal.
>>
>> After reading this proposal, I have the following questions.
>> 1. This proposal will introduce a big breaking change in Pulsar,
>> especially from a code perspective. I’m interested in how to support both
>> the old and new implementations at the same time, step by step?
>>
>> >We will keep the current metric system as is, and add a new layer of
>> metrics using OpenTelemetry Java SDK. All of Pulsar’s metrics will also be
>> created using OpenTelemetry. A feature flag will allow enabling
>> OpenTelemetry metrics (init, recording and exporting). All the features and
>> changes described here will be done only in the OpenTelemetry layer,
>> allowing to keep the old version working until you’re ready to switch to
>> the OTel (OpenTelemetry) implementation. In the far future, once OTel usage
>> has stabilized and become widely adopted, we’ll deprecate the current metric
>> system and eventually remove it. We will also make sure there is a feature
>> flag to turn off the current Prometheus-based metric system.
>>
>>
> Current metrics code remains as is, untouched.
> I'm adding new code, using OpenTelemetry API and SDK. The code in most
> cases will read the existing variables (like msgsReceived), and in other
> cases will set up its own new objects like Counter, Histogram and *also*
> record values to them.
> You can take a look at the revised PIP, as I've added a tiny code sample to
> give an idea of how it will look. Look here
> <https://github.com/apache/pulsar/blob/6ec0bde4127a54ab8e8bb67fb091c932fa2952a4/pip/pip-264.md#consolidating-to-opentelemetry>
> .
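
The layering described above (existing counters stay untouched, a new layer
reads them behind a feature flag) can be illustrated with a small sketch. This
is plain Python with no OpenTelemetry dependency, and it is not the PIP's code:
the class names, the metric name and the callback mechanism are all
hypothetical stand-ins for an observable instrument that reads an existing
variable at export time.

```python
from typing import Callable, Dict


class TopicStats:
    """Stand-in for existing broker state; msgs_received mirrors msgsReceived."""

    def __init__(self) -> None:
        self.msgs_received = 0

    def record_message(self) -> None:
        self.msgs_received += 1  # existing code path, left untouched


class OtelLayer:
    """Hypothetical new layer: registers callbacks that read existing variables."""

    def __init__(self, enabled: bool) -> None:
        self.enabled = enabled  # the feature flag
        self._callbacks: Dict[str, Callable[[], int]] = {}

    def observe(self, name: str, read: Callable[[], int]) -> None:
        """Register a reader for an existing variable; a no-op when disabled."""
        if self.enabled:
            self._callbacks[name] = read

    def collect(self) -> Dict[str, int]:
        """Called at export time; returns an empty dict when the flag is off."""
        return {name: read() for name, read in self._callbacks.items()}


if __name__ == "__main__":
    stats = TopicStats()
    otel = OtelLayer(enabled=True)
    otel.observe("pulsar.messages.received", lambda: stats.msgs_received)
    stats.record_message()
    stats.record_message()
    print(otel.collect())
```

Because the new layer only reads state at collection time, turning the flag
off leaves the existing recording path exactly as it was.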
>
>
>
>> 2. We introduced Group and filter logic in the metric system, and I
>> have the following concerns.
>> - We need to add protection logic or pre-validation for the group and
>> filter rules to avoid user misconfiguration causing a huge performance
>> impact on Pulsar brokers
>>
>>
> Good call. I've added a note in the PIP that we will reject any filter
> rules update if the expected number of data points exceeds a certain
> threshold. I left this as a detail to be specified in the sub-PIP.
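
A rough sketch of what such a pre-validation could look like, in Python. The
threshold value, the way the expected number of data points is estimated, and
all names here are assumptions made for illustration; the message above
explicitly leaves those details to the sub-PIP.

```python
MAX_DATA_POINTS = 1_000_000  # assumed threshold; the real value is left to the sub-PIP


class FilterRuleRejected(Exception):
    """Raised when a filter rules update would expose too many data points."""


def expected_data_points(topics: int, instruments_per_topic: int, kept_fraction: float) -> int:
    """Rough estimate: one data point per (topic, instrument) pair the rules keep."""
    return int(topics * instruments_per_topic * kept_fraction)


def validate_update(
    topics: int,
    instruments_per_topic: int,
    kept_fraction: float,
    limit: int = MAX_DATA_POINTS,
) -> int:
    """Return the estimate if acceptable, otherwise reject the update."""
    estimate = expected_data_points(topics, instruments_per_topic, kept_fraction)
    if estimate > limit:
        raise FilterRuleRejected(
            f"rules would expose ~{estimate} data points (limit {limit})"
        )
    return estimate


if __name__ == "__main__":
    print(validate_update(topics=4_000, instruments_per_topic=50, kept_fraction=1.0))
    try:
        validate_update(topics=1_000_000, instruments_per_topic=50, kept_fraction=0.5)
    except FilterRuleRejected as err:
        print("rejected:", err)
```

Rejecting at update time keeps a misconfigured rule set from ever reaching the
metric collection path.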
>
>> - We need to support exposing all the topic-level metrics when the
>> Pulsar cluster just has thousands of topics
>>
> I've added a new goal: "- New system should support the maximum
> supported number of topics in current system (i.e. 4k topics) without
> filtering"
>
>
>> - Even though we introduced group and filter for the metrics, we still
>> can’t resolve the large number of metrics exposed to Prometheus. Exposing
>> a large amount of data (100MB+) through an HTTP endpoint is
>> ineffective. We can consider exposing that metric data via a Pulsar topic
>> and developing a Pulsar to Prometheus connector to write Pulsar metric
>> data to Prometheus in streaming mode instead of batch mode to reduce
>> the performance impact
>>
>
> As I wrote in my PIP, if you find yourself exporting 100MB or more of
> metric data *every* 30 seconds you will suffer from:
> * High cost of the TSDB holding that (e.g. Prometheus, Cortex, VictoriaMetrics)
> * Query timeouts, since there is too much data to read
>
> Also, the bottleneck is not transfer time over the wire. It's mostly the
> memory needed by any TSDB to hold it for at least 2 hours before flushing
> it to disk - this is the most expensive of all.
>
> At 100MB response size, filtering and grouping are a must.
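
To give a sense of scale for the numbers above, here is the arithmetic as a
small Python sketch. It only multiplies the figures quoted in the message
(100MB per response, one response every 30 seconds, held for 2 hours). The
result is the raw response volume, not actual TSDB memory: real TSDBs compress
samples, so this should be read as an illustration of magnitude under those
assumptions.

```python
RESPONSE_MB = 100            # size of one metrics response, from the message above
SCRAPE_INTERVAL_S = 30       # one response every 30 seconds
HOLD_SECONDS = 2 * 60 * 60   # data held for at least 2 hours before flushing


def scrapes_held(hold_s: int = HOLD_SECONDS, interval_s: int = SCRAPE_INTERVAL_S) -> int:
    """Number of responses collected during the hold window."""
    return hold_s // interval_s


def raw_volume_mb(response_mb: int = RESPONSE_MB) -> int:
    """Raw (uncompressed) response volume accumulated during the hold window."""
    return scrapes_held() * response_mb


if __name__ == "__main__":
    print(scrapes_held(), "scrapes,", raw_volume_mb(), "MB of raw responses")
```

That is 240 responses and roughly 24,000 MB of raw text per broker over two
hours, which is why reducing the number of series matters more than the
transport.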
>
>
>
>>
>> - Group and filter logic uses regular expressions extensively in
>> rules. Regular expression parsing and matching are CPU and time
>> intensive operations. We have a push-down filter to reduce the number of
>> generated metrics, but still can’t solve the regular expression matching
>> issues. If the user provides a complex regular expression for a group and
>> filter rule, the metric generating thread will be the bottleneck and
>> will block other threads if we use a synchronous call.
>>
>>
> I plan to use caching, as I wrote in the PIP. Roughly (instrument,
> attributes) -> boolean. It's basically as if we are adding one boolean to
> PersistentTopic class - it has so many properties and size added is
> negligib
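
The caching idea, a map from (instrument, attributes) to a boolean, can be
sketched as below. This is a minimal Python illustration under stated
assumptions: a single "drop" regular expression applied to a `topic` attribute
stands in for the real rule set, and the class and attribute names are
hypothetical. The point is only that the regular expression runs once per
distinct key, after which the decision is a dictionary lookup.

```python
import re
from typing import Dict, FrozenSet, Pattern, Tuple

Attributes = FrozenSet[Tuple[str, str]]  # hashable attribute set, e.g. {("topic", "...")}


class FilterCache:
    """Caches the keep/drop decision per (instrument, attributes) pair."""

    def __init__(self, drop_topic_pattern: str) -> None:
        self._pattern: Pattern[str] = re.compile(drop_topic_pattern)
        self._decisions: Dict[Tuple[str, Attributes], bool] = {}
        self.regex_evaluations = 0  # exposed only to show the effect of caching

    def is_kept(self, instrument: str, attributes: Attributes) -> bool:
        key = (instrument, attributes)
        cached = self._decisions.get(key)
        if cached is None:
            # First time this pair is seen: evaluate the rule and remember the result.
            self.regex_evaluations += 1
            topic = dict(attributes).get("topic", "")
            cached = self._pattern.search(topic) is None
            self._decisions[key] = cached
        return cached


if __name__ == "__main__":
    cache = FilterCache(r"^persistent://tenant/ns/tmp-")
    attrs = frozenset({("topic", "persistent://tenant/ns/tmp-1")})
    for _ in range(1000):
        cache.is_kept("messages.received", attrs)
    print("regex evaluations:", cache.regex_evaluations)
```

A real implementation would also need to drop the cached decisions whenever
the rules are updated.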

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-08-28 Thread Asaf Mesika

, there is no real breaking change. There will
> be a
> > switch to choose the existing metrics or the new ones. The dashboards
> will
> > be updated and provided.
> >
> > At the same time, the best sure way to motivate users to switch or not
> > adopt a platform is to stick with confusing/inconsistent APIs/Metrics.
> >
> >
> > --
> > Matteo Merli
> > 
> >
> >
> > On Wed, Jun 14, 2023 at 6:10 PM Devin Bost  wrote:
> >
> > > > Thanks for the details, Devin. Curious - 'We' stands for which company?
> > >
> > > What do you mean? I was quoting Rajan when I said, "we."
> > >
> > >
> > > Devin G. Bost
> > >
> > >
> > > On Wed, Jun 14, 2023 at 10:02 AM Asaf Mesika 
> > > wrote:
> > >
> > > > Thanks for the details, Devin. Curious - 'We' stands for which company?
> > > >
> > > > Can you take a look at my previous response to see if it answers the
> > > > concern you raised?
> > > >
> > > > Thanks!
> > > >
> > > >
> > > > On Wed, Jun 14, 2023 at 1:49 PM Devin Bost 
> wrote:
> > > >
> > > > > > Hi,
> > > > > >
> > > > > > Are we proposing a change to break existing metrics compatibility
> > > > > > (prometheus)? If that is the case then it's a big red flag as it
> will
> > > > be
> > > > > a
> > > > > > pain for any company to upgrade Pulsar as monitoring is THE most
> > > > > important
> > > > > > part of the system and we don't even want to break compatibility
> for
> > > > any
> > > > > > small things to avoid interruption for users that are using
> Pulsar
> > > > > system.
> > > > > > I think it's always good to enhance a system by maintaining
> > > > compatibility
> > > > > > and I would be fine if we can introduce new metrics API without
> > > causing
> > > > > ANY
> > > > > > interruption to existing metrics API. But if we can't maintain
> > > > > > compatibility then it's a big red flag and not acceptable for the
> > > > Pulsar
> > > > > > community.
> > > > >
> > > > > Proposing a large breaking change (even if it's crucial) is the
> single
> > > > > fastest way to motivate your users to migrate to a different
> platform.
> > > I
> > > > > wish it wasn't the case, but it's the cold reality.
> > > > >
> > > > > With that said, I'm a big proponent of Open Telemetry. I did a big
> > > video
> > > > a
> > > > > while back that some of you may remember on the use of Open Tracing
> > > > (before
> > > > > it was merged into Open Telemetry). Open Telemetry has gained
> > > > considerable
> > > > > momentum in the industry since then.
> > > > >
> > > > > I'm also very interested in a solution to the metrics problem.
> I've run
> > > > > into the scalability issues with metrics in production, and I've
> been
> > > > very
> > > > > concerned about the metrics bottlenecks around our ability to
> deliver
> > > our
> > > > > promises around supporting large numbers of topics. One of the big
> > > > > advantages of Pulsar over Kafka is supposed to be that topics are
> > > cheap,
> > > > > but as it stands, our current metrics design gets seriously in the
> way
> > > of
> > > > > that. Generally speaking, I'm open to solutions, especially if they
> > > align
> > > > > us with a growing industry standard.
> > > > >
> > > > > - Devin
> > > > >
> > > > >
> > > > > On Wed, Jun 14, 2023, 3:28 AM Enrico Olivelli  >
> > > > wrote:
> > > > >
> > > > > > On Wed, Jun 14, 2023, 04:33 Rajan Dhabalia wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Are we proposing a change to break existing metrics
> compatibility
> > > > > > > (prometheus)? If that is the case then it's a big red flag as
> it
> > > will
> > > > > be
> > > > > > a
> > > > &

Re: [VOTE] PIP-268: Add support of topic stats/stats-internal using

2023-08-03 Thread Asaf Mesika

So, I'm trying to summarize the motivation you have for this feature and
what we know about it, so we can proceed with the discussion.
I think Pulsar users should be happy using Pulsar and not hit a
"wall". On the other hand, there are cons to adding it:
* Exposing more in the API adds more surface area, so it must be done
with good motivation.
* Adding endpoints will create more backlog - another endpoint we need to
add to all the SDKs other than Java (we have 5 of those officially)

Problem #1: I'm using an SNI proxy to access Pulsar brokers. It's only
configured for the binary protocol port, thus I can't use it to access the
admin HTTP API.
Discussion:
* What exactly prevents you from using HTTPS with an SNI proxy? I've been
reading up on SNI proxies and Istio as a specific implementation of SNI, and
I haven't read about any issue with doing it.
It seems you create a TCP connection to port 443, and the proxy then creates a
connection for you to the broker you specified in the protocol.

Problem #2: Doing many calls to getStats using the current HTTP server
implementation is not possible. I've tried it and it fails after Y
concurrent requests, or the throughput is X requests/sec and I need more.
Discussion:
* We agreed after a lengthy discussion that the bottleneck is in fact
the framework implementing the REST server. We can replace Jersey with
something else which doesn't do a blocking call when writing the result JSON
back to the HTTP connection. This is the limiting factor from my analysis
of the code.
Doing that refactor will help all Pulsar Admin HTTP users.

Let's try to continue based on that summary, unless I missed something.
Rajan, Penghui?

On Thu, Aug 3, 2023 at 5:07 AM Rajan Dhabalia  wrote:

> >> If so, I think it could be a good reason for introducing binary
> protocol support here. For the security sensitive users like financial
> applications. Usually they will try to reduce the dependencies (fewer
> dependencies, fewer CVEs) and the exposed service endpoints. For example,
> the Flink connector also has a pulsar-admin dependency, but some of the
> users want to remove it.
>
> Yes, that could be another usecase to have such API available for users.
> So, I would like to bump up this discussion again and see if we have any
> other suggestions or concerns as we have multiple users who need it and
> would like to move forward with this API to serve those usecases.
>
> Thanks,
> Rajan
>
> On Mon, Jun 26, 2023 at 9:03 PM PengHui Li  wrote:
>
> > Hi Rajan,
> >
> > Thanks for the explanation
> >
> > > This feature helps them to avoid multiple different extra
> > efforts
> >
> > If I understand correctly, you want to say users don't want to
> > add the pulsar-admin dependency, or the cluster doesn't want
> > to expose the REST API externally (to anyone other than the cluster admin
> > or tenant admin)?
> >
> > If so, I think it could be a good reason for introducing binary
> > protocol support here. For the security sensitive users like financial
> > application.
> >
> > Usually they will try to reduce the dependencies (less dependencies, less
> > CVEs)
> > and the exposed service endpoints. For example, the flink connector
> > also have pulsar-admin dependency but some of the users want to
> > remove it.
> >
> > I want to say that from the perspective of improving performance,
> > it may not be more convincing than the above reason.
> >
> > Thanks,
> > Penghui
> >
> >
> > On Mon, Jun 26, 2023 at 3:37 PM Rajan Dhabalia 
> > wrote:
> >
> > > > I do not deny that binary protocol has performance advantages. But
> > maybe
> > > the bottleneck is not the protocol level for now.
> > >
> > > Well, sure. We had serious issue with performance in past over http but
> > > this feature we would mainly like to introduce it now for the
> > applications
> > > to enhance api user accessibility experience where we have these
> multiple
> > > usecases where applications with large number of topics and high fanout
> > > consumers would like to fetch stats and stats-internal to retrieve
> > various
> > > metadata for application startup and managing application state based
> on
> > > managed-ledge states. You can think of producer/consumer stats api
> which
> > is
> > > used by many usecases in different scenarios of monitoring or state
> > > management. This feature helps them to avoid multiple different extra
> > > efforts and performance considerations, which helps to give clean and
> > easy
> > > experience for their application.
> > > I am open to hear about alternative of json string and response schema
> > > definition but keeping similar response-schema as admin-api response
> for
> > > stats/stats-internal helps to give consistent view of stats across all
> > APIs
> > > and transferring json format helps to skip transformation stats
> > definition
> > > and we don't have to make any wire protocol changes whenever we add any
> > new
> > > field or state in response which makes this protocol response agnostic
> > and
> > > do not require

Can we archive https://github.com/apache/pulsar-release?

2023-06-27 Thread Asaf Mesika

Seems quite neglected

Re: New pip process reminder

2023-06-21 Thread Asaf Mesika

On Wed, Jun 21, 2023 at 10:27 AM Zixuan Liu  wrote:

> I think we can reference https://www.apache.org/foundation/voting.html
>
> > Votes on code modifications follow a different model. In this scenario,
> a negative vote constitutes a veto , which the voting group (generally the
> PMC of a project) cannot override. Again, this model may be modified by a
> lazy consensus declaration when the request for a vote is raised, but the
> full-stop nature of a negative vote does not change. Under normal (non-lazy
> consensus) conditions, the proposal requires three positive votes and no
> negative votes in order to pass; if it fails to garner the requisite amount
> of support, it doesn't. Then the proposer either withdraws the proposal or
> modifies the code and resubmits it, or the proposal simply languishes as an
> open issue until someone gets around to removing it.
>
> It seems that there is no need for three binding votes for code
> modifications. If I am wrong, please point it out.
>
I believe you may be wrong.

Lazy Consensus is described here
<https://www.apache.org/foundation/voting.html#LazyConsensus> as:

> Lazy consensus is simply an announcement of 'silence gives assent.' When
> someone wants to determine the sense of the community this way, they might
> do so with a mail message such as:
> "The patch below fixes bug #8271847; if no-one objects within three
> days, I'll assume lazy consensus and commit it."
> You cannot apply lazy consensus to code changes when the review-then-commit
> <https://www.apache.org/foundation/glossary.html#ReviewThenCommit> policy
> is in effect.


My understanding is that for the PIP process, we are using a
review-then-commit policy, which actually means we can't use lazy consensus.

The definition of Lazy Consensus given here
<https://www.apache.org/foundation/glossary.html#LazyConsensus> is:

> A decision-making policy which assumes general consent if no responses are
> posted within a defined period. For example, "I'm going to commit this by
> lazy consensus if no-one objects within the next three days." Also see
> Consensus Approval
> <https://www.apache.org/foundation/glossary.html#ConsensusApproval>,
> Majority Approval
> <https://www.apache.org/foundation/glossary.html#MajorityApproval>, and
> the description of the voting process
> <https://www.apache.org/foundation/voting.html>.



So to summarize, a PIP needs to follow the rule that "the proposal requires
three positive votes and no negative votes in order to pass".
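
The rule as summarized above can be written down as a tiny check. This is a
Python sketch under simplifying assumptions: only binding votes are counted,
a binding -1 blocks the vote, and fractional votes such as -0 are treated as
neutral. It is an illustration of the summary, not an official statement of
the Apache or Pulsar voting procedure.

```python
from typing import Iterable, Tuple

REQUIRED_BINDING_PLUS_ONES = 3

Vote = Tuple[int, bool]  # (value in {+1, 0, -1}, is_binding)


def vote_passes(votes: Iterable[Vote]) -> bool:
    """True when there are at least three binding +1s and no binding -1."""
    votes = list(votes)
    binding_plus = sum(1 for value, binding in votes if binding and value == 1)
    binding_minus = any(binding and value == -1 for value, binding in votes)
    return binding_plus >= REQUIRED_BINDING_PLUS_ONES and not binding_minus


if __name__ == "__main__":
    print(vote_passes([(1, True), (1, True), (1, True)]))
    print(vote_passes([(1, True), (1, True), (1, False)]))
    print(vote_passes([(1, True), (1, True), (1, True), (-1, True)]))
```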


> Thanks,
> Zixuan
>
> On Wed, Jun 21, 2023 at 14:59 Asaf Mesika wrote:
> >
> > I'm not a committer or PMC member, so I can't comment on this.
> >
> > I am curious to know the difference between other Apache projects and
> other
> > foundation projects, such as CNCF, if you know about it.
> > Do you think the Apache Foundation's view on individuals, not part of a
> > commercial entity, does not live up to today's state of affairs?
> >
> > On Tue, Jun 20, 2023 at 10:40 PM Rajan Dhabalia 
> > wrote:
> >
> > > Hi,
> > >
> > > > (" a lazy majority of at least 3 binding +1s votes")
> > >
> > > I don't think it's fair at this stage where majority Pulsar committers
> are
> > > mostly part of one enterprise and only their PIP/PRs are moving
> forward and
> > > PR/PIP created by other community members get blocked or not reviewed
> > > without any major reasons. I can list down many different examples but
> I
> > > don't want to start that destructive discussion for now but I strongly
> ask
> > > to help other community members to let them contribute to Pulsar so,
> we can
> > > grow Pulsar community and let Pulsar be at the stage where it has
> > > committers from various different institutions and we have good number
> of
> > > reviewers to review PIP/PR on time.
> > > Right now, there are many examples where PRs are sitting unreviewed
> for a
> > > long time and we have to fix it first by encouraging and having more
> > > committers/reviewers across multiple organizations as a part of the
> Pulsar
> > > community. So, this is not the right time to restrict and this is
> > > indirectly making it difficult for many Pulsar committers and
> contributors
> > > who don't belong to specific enterprises.
> > >
> > > Thanks,
> > > Rajan
> > >
> > >
> > >
> > >
> > > On Tue, Jun 20, 2023 at 12:14 PM Asaf Mesika 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > This is just a reminder that PMC/Committers can only merge the PIP PR
> > > when
> > > > the vote thread is concluded and in a positive manner, as described
> (" a
> > > > lazy
> > > > majority of at least 3 binding +1s votes")
> > > >
> > > > So please, before clicking that merge button, double-check those two
> > > > conditions
> > > >
> > > > Thanks!
> > > >
> > > > Asaf
> > > >
> > >
>

Re: [DISCUSS] Pluggable Pulsar Functions runtime to support new runtimes

2023-06-21 Thread Asaf Mesika

Lari, would it be possible to explain in more detail the pain points
you're describing?

You say processing messages individually is slow; hence, processing them in
batches is better. I guess it's especially useful if you need to group a
batch based on a key. What I don't understand is how the framework today
limits you from using something like a reactive client which does the
batching inside.

On Tue, Jun 20, 2023 at 10:33 AM Lari Hotari  wrote:

> Dear Pulsar Community Members,
>
> I would like to initiate a discussion on making the Pulsar Functions
> runtime "pluggable". In doing so, we can ensure that the addition of new
> runtime types becomes more straightforward.
>
> This use case will allow us to add support for Pulsar Functions based on
> various platforms such as:
>
> * Pulsar Client Reactive
> * Node.js / JavaScript
> * WebAssembly (WASM)
> * Spring Pulsar & Reactive Spring
>
> One of the weak points in the current Pulsar Functions runtime is the
> default handling of messages individually. Individual message processing
> can be slow and inefficient in cases where the main function of the
> Pulsar Function (or Sink) is to do backend API calls.
>
> Although pipelining (processing multiple in-flight messages) is possible
> in current Pulsar Functions and Sinks, it often leads to complex and
> error-prone solutions, especially when there's a need to combine
> key-based ordered processing with retry and backoff implementations.
>
> The Reactive Pulsar Client provides an inbuilt solution for implementing
> pipelining. With its ReactiveMessagePipelineBuilder, we can configure
> concurrency levels with key-ordered processing support. This capability
> could potentially eliminate the need to use key-shared subscriptions to
> scale Pulsar processing. If a reactive Pulsar Function were primarily to
> serve as a router for API calls, we could adjust the concurrency level
> to hundreds or even thousands, provided the backend could handle the
> load.
>
> With a pluggable Pulsar Functions runtime, we could introduce new
> runtime types without the need for implementing each type in the
> upstream project. This strategy could likely lead to new opportunities
> for innovative ideas and contributions in this field.
>
> I am interested to know your thoughts on making the Pulsar Functions
> runtime pluggable so that we can add new runtime types.
>
> Best Regards,
>
> -Lari
>
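
For readers who want a concrete picture of the "pipelining with key-ordered processing" idea in the quoted proposal, here is a minimal sketch in plain Python asyncio. It does not use the Reactive Pulsar Client or ReactiveMessagePipelineBuilder; the function names, the hash-based worker assignment, and the fake backend call are all assumptions made for illustration only.

```python
import asyncio
from collections import defaultdict


async def process_key_ordered(messages, handler, concurrency=4):
    """Handle (key, value) messages concurrently while keeping per-key order.

    Each key is pinned to one of `concurrency` worker queues by hash, so
    messages that share a key are handled one after another by the same worker.
    """
    queues = [asyncio.Queue() for _ in range(concurrency)]
    results = defaultdict(list)

    async def worker(queue):
        while True:
            item = await queue.get()
            if item is None:  # sentinel: no more work for this worker
                return
            key, value = item
            results[key].append(await handler(key, value))

    workers = [asyncio.create_task(worker(q)) for q in queues]
    for key, value in messages:
        queues[hash(key) % concurrency].put_nowait((key, value))
    for q in queues:
        q.put_nowait(None)
    await asyncio.gather(*workers)
    return dict(results)


async def fake_backend_call(key, value):
    # Hypothetical stand-in for a slow backend API call.
    await asyncio.sleep(0.001)
    return value * 2


def run_demo():
    messages = [("a", 1), ("b", 10), ("a", 2), ("c", 100), ("b", 20), ("a", 3)]
    return asyncio.run(
        process_key_ordered(messages, fake_backend_call, concurrency=3)
    )
```

Raising `concurrency` increases the number of in-flight calls without reordering messages that share a key, which is the property the proposal attributes to the reactive pipeline.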

Re: [DISCUSS] PIP-267: Support multi-topic messageId deserialization to ack messages

2023-06-21 Thread Asaf Mesika

I'll continue this on Slack #dev and write the summary here.

Just to clarify any misunderstanding: My intention is to make Pulsar PIPs
readable by anyone, which means adding the required background information
and explaining your idea in a way people can understand.

In light of this goal, I've introduced a PIP template to make it clear what
is missing and also switched to PRs to make discussion easier than in the
mailing list, thus making participation easier for everyone, which means
more feedback ==> clearer proposals.



On Tue, Jun 20, 2023 at 11:17 PM Rajan Dhabalia 
wrote:

> Hi Asaf,
>
> I really don't know what's your concern but it seems you don't have much
> understanding about Pulsar client/server protocol or you really would like
> to block the PIP. I tried to answer your concerns but let me try again to
> add more context about the implementation if that something can help you:
> this PIP makes change only in protobuf of message-id which is in
> implementation named as MessageIdData and it uses to serialize and
> deserialize messageId for the users. and this PIP is adding a new field to
> support messageId deserialization for partition-topic or multi-consumer
> topics.
> Now, does it impact wire protocol and will the client start sending this
> newly added field topic-name to broker? then answer is no because while
> sending ack command to broker client creates messageID where it doesn't set
> this field [1] and this new field only used during message
> serialization/deserialization when client app calls
> toByteArray()/fromByteArray() methods. so, this should not add any n/w
> overhead for the payload when client sends ack command to broker.
> [1]
>
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/protocol/Commands.java#L1018
>
> I am not sure if that helps you to answer the question or I should try to
> talk about Pulsar client-server protocol implementation here but we can
> help you in slack#dev channel if you have more implementation questions.
>
> Thanks,
> Rajan
>
>
>
> On Tue, Jun 20, 2023 at 11:31 AM Asaf Mesika 
> wrote:
>
> > On Tue, Jun 20, 2023 at 9:39 AM Rajan Dhabalia 
> > wrote:
> >
> > > > So you say in that sentence that you will add the topic name into
> > > MessageIdData. MessageIdData is defined in PulsarApi.proto and is
> > > transferred over the wire, so how can you add the topic to this class
> > > without changing the wire protocol?
> > > Yes, the client creates a separate MessageId while creating a serialized
> > > payload for acking where it doesn't set or send topicname and it won't
> > > change the payload.
> > >
> > >
> > But it contradicts what you wrote in the design doc. I'm sorry, but I don't
> > get it.
> > Can you please help me understand this by elaborating so anyone, including
> > me, can fully understand it?
> > Preferably all your answers should be injected into the document, of
> > course.
> >
> > Thanks!
> >
> > Asaf
> >
> >
> >
> > > Thanks,
> > > Rajan
> > >
> > > On Mon, Jun 19, 2023 at 5:45 AM Asaf Mesika 
> > wrote:
> > >
> > > > First, let me add some data that should be added to the Background section
> > > > of the PIP since I had to reverse engineer the code to understand that,
> > > > which is the opposite of the goal of a design document.
> > > >
> > > >
> > > > Pulsar Broker has a binary protocol, which allows the client to consume
> > > > messages, acknowledge them, and much more. The protocol comprises Commands
> > > > containing the data needed to apply that Command on the broker side. Many
> > > > commands allow a consumer (client) to acknowledge messages, among them:
> > > > CommandSendReceipt, CommandSend, CommandAck, and more. All those commands
> > > > use the message type MessageIdData to specify the details of the message to
> > > > acknowledge.
> > > >
> > > > Here's what this data structure looks like:
> > > > message MessageIdData {
> > > > required uint64 ledgerId = 1;
> > > > required uint64 entryId = 2;
> > > > optional int32 partition = 3 [default = -1];
> > > > optional int32 batch_index = 4 [default = -1];
> > > > repeated int64 ack_set = 5;
> > > > optional int32 batch_size = 6;
> > > >
> > > > // For the chunk message id, we need to specify the first c

Re: New pip process reminder

2023-06-21 Thread Asaf Mesika

I'm not a committer or PMC member, so I can't comment on this.

I am curious to know the difference between other Apache projects and other
foundation projects, such as CNCF, if you know about it.
Do you think the Apache Foundation's view on individuals, not part of a
commercial entity, does not live up to today's state of affairs?

On Tue, Jun 20, 2023 at 10:40 PM Rajan Dhabalia 
wrote:

> Hi,
>
> > (" a lazy majority of at least 3 binding +1s votes")
>
> I don't think it's fair at this stage where majority Pulsar committers are
> mostly part of one enterprise and only their PIP/PRs are moving forward and
> PR/PIP created by other community members get blocked or not reviewed
> without any major reasons. I can list down many different examples but I
> don't want to start that destructive discussion for now but I strongly ask
> to help other community members to let them contribute to Pulsar so, we can
> grow Pulsar community and let Pulsar be at the stage where it has
> committers from various different institutions and we have good number of
> reviewers to review PIP/PR on time.
> Right now, there are many examples where PRs are sitting unreviewed for a
> long time and we have to fix it first by encouraging and having more
> committers/reviewers across multiple organizations as a part of the Pulsar
> community. So, this is not the right time to restrict and this is
> indirectly making it difficult for many Pulsar committers and contributors
> who don't belong to specific enterprises.
>
> Thanks,
> Rajan
>
>
>
>
> On Tue, Jun 20, 2023 at 12:14 PM Asaf Mesika 
> wrote:
>
> > Hi,
> >
> > This is just a reminder that PMC/Committers can only merge the PIP PR when
> > the vote thread is concluded and in a positive manner, as described (" a lazy
> > majority of at least 3 binding +1s votes")
> >
> > So please, before clicking that merge button, double-check those two
> > conditions
> >
> > Thanks!
> >
> > Asaf
> >
>

New pip process reminder

2023-06-20 Thread Asaf Mesika

Hi,

This is just a reminder that PMC members and committers can only merge the
PIP PR once the vote thread has concluded with a positive result, as
described ("a lazy majority of at least 3 binding +1 votes").

So please, before clicking that merge button, double-check those two
conditions.

Thanks!

Asaf
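
As a quick illustration of the two conditions above, the check can be written as a small predicate. This is only a sketch: the function name and parameters are hypothetical, and the "more binding +1 than binding -1" clause is an assumption based on the usual ASF definition of a lazy majority rather than something stated in this thread.

```python
def pip_can_be_merged(vote_concluded, binding_plus_ones, binding_minus_ones=0):
    """Return True when a PIP PR may be merged under the rule quoted above.

    Condition 1: the vote thread has concluded.
    Condition 2: the result is positive, i.e. a lazy majority with at least
    3 binding +1 votes (assumed here to also require more binding +1 votes
    than binding -1 votes).
    """
    lazy_majority = (
        binding_plus_ones >= 3 and binding_plus_ones > binding_minus_ones
    )
    return vote_concluded and lazy_majority
```

Non-binding votes, such as the ones cast elsewhere in this archive, do not enter the count.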

Re: [VOTE] PIP-267: Support multi-topic messageId deserialization to ack messages

2023-06-20 Thread Asaf Mesika

-1 (non-binding)

The reason I'm asking all these questions on the DISCUSS is that I still
haven't managed to understand how you plan to solve the pain described.
Not to mention the lack of information in the design document, which I
mentioned in my replies to the discussion.

This DISCUSS thread is not resolved yet from my point of view.
The design document is not clear to me at all.

Hence I would like to continue to understand it in the discussion thread.

On Tue, Jun 20, 2023 at 10:00 AM Rajan Dhabalia 
wrote:

>  Hi.
>
> Pulsar api provides MessageId interface which is generally used by producer
> and consumer applications to manage topic offset. Sometimes, these
> applications would like to serialize and deserialize messageIds,
> specifically consumer app which would like to persist messageId and ack
> with those messageIds by deserializing them. However, right now Pulsar
> doesn't support correct deserialization of multi-topic or partitioned-topic
> because of that `acknowledge` API call fails for those topics with below
> error:
> "Only TopicMessageId is allowed to acknowledge for a multi-topics consumer"
>
> Please visit PIP for any suggestions:
> https://github.com/apache/pulsar/issues/20221
>
> This PIP is created with PR: https://github.com/apache/pulsar/pull/19944
>
> Thanks,
> Rajan
>

Re: [DISCUSS] PIP-267: Support multi-topic messageId deserialization to ack messages

2023-06-20 Thread Asaf Mesika

On Tue, Jun 20, 2023 at 9:39 AM Rajan Dhabalia  wrote:

> > So you say in that sentence that you will add the topic name into
> MessageIdData. MessageIdData is defined in PulsarApi.proto and is
> transferred over the wire, so how can you add the topic to this class
> without changing the wire protocol?
> Yes, the client creates a separate MessageId while creating a serialized
> payload for acking where it doesn't set or send topicname and it won't
> change the payload.
>
>
But it contradicts what you wrote in the design doc. I'm sorry, but I don't
get it.
Can you please help me understand this by elaborating so anyone, including
me, can fully understand it?
Preferably all your answers should be injected into the document, of course.

Thanks!

Asaf



> Thanks,
> Rajan
>
> On Mon, Jun 19, 2023 at 5:45 AM Asaf Mesika  wrote:
>
> > First, let me add some data that should be added to the Background section
> > of the PIP since I had to reverse engineer the code to understand that,
> > which is the opposite of the goal of a design document.
> >
> > 
> > Pulsar Broker has a binary protocol, which allows the client to consume
> > messages, acknowledge them, and much more. The protocol comprises Commands
> > containing the data needed to apply that Command on the broker side. Many
> > commands allow a consumer (client) to acknowledge messages, among them:
> > CommandSendReceipt, CommandSend, CommandAck, and more. All those commands
> > use the message type MessageIdData to specify the details of the message to
> > acknowledge.
> >
> > Here's what this data structure looks like:
> > message MessageIdData {
> > required uint64 ledgerId = 1;
> > required uint64 entryId = 2;
> > optional int32 partition = 3 [default = -1];
> > optional int32 batch_index = 4 [default = -1];
> > repeated int64 ack_set = 5;
> > optional int32 batch_size = 6;
> >
> > // For the chunk message id, we need to specify the first chunk message id.
> > optional MessageIdData first_chunk_message_id = 7;
> > }
> >
> > The key fields are the ledgerID at which the message is contained and
> > entryId, which indicates the offset inside the ledger (message number).
> >
> > The client uses a class named MessageIdData which is the auto-generated
> > code representing the message MessageIdData.
> > -
> >
> > Now, in the design, you wrote:
> >
> > > Therefore, we need to add topic-name into MessageIdData and allow
> > > multi-topic/partitioned topic to deserialize message correctly so, API like
> > > acknowledge can perform as expected.
> >
> >
> > So you say in that sentence that you will add the topic name into
> > MessageIdData.
> > MessageIdData is defined in PulsarApi.proto and is transferred over the
> > wire, so how can you add the topic to this class without changing the wire
> > protocol?
> >
> >
> >
> >
> >
> > On Fri, Jun 16, 2023 at 10:47 PM Rajan Dhabalia 
> > wrote:
> >
> > > Yes, the topic name will not be transferred and it's not part of the wire
> > > protocol. Message uses MessageID protobuf data-structure to serialize and
> > > deserialize MessageId and it doesn't change any behavior nor will transfer
> > > any additional fields to the broker. and I would not like to introduce any
> > > additional data-structure as that will create data copy, field
> > > inconsistencies, and more garbage due to more object allocation and that's
> > > something we would like to avoid.
> > >
> > > Thanks,
> > > Rajan
> > >
> > > On Mon, May 15, 2023 at 6:18 AM PengHui Li  wrote:
> > >
> > > > I think the topic name will not be transmitted to the broker.
> > > > The client side used the class generated by the protobuf message.
> > > > Or, we can create another class to avoid coupling issues, but it
> > > > will introduce more changes and copy data from one structure
> > > > to another. For the long-term, I think it should be a good way if
> > > > we don't have blockers with this solution. Because I don't think
> > > > there is a higher priority in the long run than keeping the protocol clear.
> > > >
> > > > If the above options are not feasible. At least, we should clarify
> > > > it in the proposal and add comments in the proto file to avoid
> > > > other clients transmitting the topic name to the broker.
> > > >
> > > >

Re: Improving the usability of the bookie Isolation feature

2023-06-19 Thread Asaf Mesika

I want to add only one step to your plan.
If you introduce this flag in Y.X, then in Y.(X+1), let's remove this flag
and keep the "true" value as the behavior.
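
To make the behavior under discussion concrete, here is a minimal sketch of the bookie selection described in the proposal quoted below, with and without the proposed strict flag. The function name, its parameters, and the `needed` threshold are hypothetical; this is not the broker's actual placement code.

```python
def candidate_bookies(all_bookies, groups, primary, secondary=None,
                      strict=False, needed=2):
    """Return the bookies a namespace may use under an affinity policy.

    groups maps a group name to the bookies registered in that group.
    Non-strict mode mirrors the behavior described in the proposal: bookies
    known to belong to other groups are excluded, so bookies without any
    group information remain eligible. Strict mode only accepts bookies
    from the primary group, then the secondary group, and otherwise fails.
    """
    primary_set = set(groups.get(primary, []))
    if not strict:
        in_other_groups = {
            bookie
            for name, members in groups.items()
            if name != primary
            for bookie in members
        }
        return [b for b in all_bookies if b not in in_other_groups]

    candidates = [b for b in all_bookies if b in primary_set]
    if len(candidates) < needed and secondary is not None:
        secondary_set = set(groups.get(secondary, []))
        candidates += [
            b for b in all_bookies
            if b in secondary_set and b not in primary_set
        ]
    if len(candidates) < needed:
        raise RuntimeError("not enough bookies in primary/secondary groups")
    return candidates


ALL_BOOKIES = [f"bk{i}" for i in range(1, 16)]
GROUPS = {
    "default": [f"bk{i}" for i in range(1, 6)],
    "share": [f"bk{i}" for i in range(6, 11)],
}
```

With `primary="share"`, the non-strict call returns bk6 to bk15 (the newly added, ungrouped bookies leak in), while `strict=True` returns only bk6 to bk10.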


On Mon, Jun 19, 2023 at 4:57 AM horizonzy  wrote:

> Background
>
> Pulsar has two features:
>
> - The first feature allows users to set group and rack information for
>   bookies using pulsar-admin bookies set-bookie-rack.
>
> Here, users assign bookie1 to bookie5 to the default group and bookie6 to
> bookie10 to the share group using commands. They don't care about rack
> information; they only care about which group a bookie belongs to.
>
> default={bookie1:3181=BookieInfoImpl(rack=default-rack,
> hostname=null), bookie2:3181=BookieInfoImpl(rack=default-rack,
> hostname=null), bookie3:3181=BookieInfoImpl(rack=default-rack,
> hostname=null), bookie4:3181=BookieInfoImpl(rack=default-rack,
> hostname=null), bookie5:3181=BookieInfoImpl(rack=default-rack,
> hostname=null)}
>
> _shared_={bookie6:3181=BookieInfoImpl(rack=default-rack,
> hostname=null), bookie7:3181=BookieInfoImpl(rack=default-rack,
> hostname=null), bookie8:3181=BookieInfoImpl(rack=default-rack,
> hostname=null), bookie9:3181=BookieInfoImpl(rack=default-rack,
> hostname=null), bookie10:3181=BookieInfoImpl(rack=default-rack,
> hostname=null)}
>
>
> - The second feature allows users to set the priority of traffic for a
>   namespace, where traffic is directed to the primary group first and then to
>   the secondary group. Users can set this priority using pulsar-admin
>   ns-isolation-policy set --namespaces public/default --primary "group"
>   --secondary "group".
>
> Here, users set the primary group of the /public/default namespace to
> "share" using a command.
>
> {
>   "bookkeeperAffinityGroupPrimary" : "share"
> }
>
> After this work is completed, all traffic under the /public/default
> namespace will be directed to bookie6-10 in the "share" group.
>
> Drawbacks
>
> After a period of time, users added some new bookies [bk11, bk12, bk13,
> bk14, bk15] to the bookie cluster and found that some traffic under the
> /public/default namespace was directed to the newly added machines. After
> investigation, we eventually found that this was a defect in the working
> mechanism of bookkeeperAffinityGroupPrimary.
>
> *bookkeeperAffinityGroupPrimary work mechanism*
>
> All bookies in the cluster: bk1-bk15.
>
> Here are the steps the broker follows to pick bookies.
>
> 1. Get the bookie rack info config: default: [bk1, bk2, bk3, bk4, bk5];
>    share: [bk6, bk7, bk8, bk9, bk10]
> 2. Exclude the bookies which are not the bookkeeperAffinityGroupPrimary
>    (share).
> 3. Exclude the default group bookies [bk1, bk2, bk3, bk4, bk5].
> 4. Pick bookies from the remaining bookies [bk6, bk7, bk8, bk9, bk10, bk11,
>    bk12, bk13, bk14, bk15]
>
> Therefore, some traffic may go to bk11-bk15, which is not what the users
> expect. The reason is that the new bookies, bk11 to bk15, did not have rack
> information set and were not part of any group.
>
> We provided a workaround for users to set the rack information for bk11 to
> bk15 in advance using the command pulsar-admin bookies set-bookie-rack
> before starting them. After users adopted this workaround, the traffic
> worked as expected.
>
> For users, it may be a bit inconvenient as they need to set rack information
> in advance before bringing new bookies online. In scenarios where there are
> strict limitations on traffic, if the bookie operation and maintenance
> personnel overlook this step, it could cause problems.
>
> Improvement
>
> I would like to introduce a new configuration strict for
> bookkeeperAffinityGroupPrimary. The default value for this configuration is
> false, which means that for old users upgrading to the new version, the
> logic will remain the same and bookies without rack information will not be
> constrained.
>
> If users manually set strict to true using the command pulsar-admin
> ns-isolation-policy set --namespaces public/default --primary "group"
> --secondary "group" --strict true, when the broker selects a bookie, it
> will only choose from the bookies in the primary group. If there are not
> enough bookies in the primary group, it will choose from the bookies in the
> secondary group. If there are not enough bookies in either group, an
> exception will be thrown. Bookies without rack information set using
> pulsar-admin
> bookies set-bookie-rack will not be selected.
>
> Compatibility
>
> When users upgrade from the old version to the new version, the working
> mechanism of bookkeeperAffinityGroupPrimary remains the same as before.
> When users upgrade to the new version and set strict to true using the
> command pulsar-admin ns-isolation-policy set --namespaces public/default
> --primary "group" --secondary "group" --strict true, and then roll back to
> the old version, the broker should be able to correctly parse the
> ns-isolation-policy

Re: [VOTE] PIP-275: Introduce topicOrderedExecutorThreadNum to deprecate numWorkerThreadsForNonPersistentTopic in configuration

2023-06-19 Thread Asaf Mesika

+1 (non-binding)

On Mon, Jun 19, 2023 at 9:19 AM 丛搏  wrote:

> +1(binding)
>
> Thanks,
> Bo
>
> On Mon, Jun 19, 2023 at 2:04 PM houxiaoyu  wrote:
> >
> > Hi, community:
> >
> > This thread is to start a vote for PIP-275: Introduce
> > topicOrderedExecutorThreadNum to deprecate
> > numWorkerThreadsForNonPersistentTopic in configuration.
> >
> > Discussion thread:
> > https://lists.apache.org/thread/hx8v824v5wdoz3kn44s4t9pzgfnqkt1o
> > PIP-PR: https://github.com/apache/pulsar/pull/20507
> >
> > Sincerely
> > Xiaoyu Hou
>

Re: [DISCUSS] PIP-267: Support multi-topic messageId deserialization to ack messages

2023-06-19 Thread Asaf Mesika

First, let me add some data that should be added to the Background section
of the PIP since I had to reverse engineer the code to understand that,
which is the opposite of the goal of a design document.

Pulsar Broker has a binary protocol, which allows the client to consume
messages, acknowledge them, and much more. The protocol comprises Commands
containing the data needed to apply that Command on the broker side. Many
commands allow a consumer (client) to acknowledge messages, among them:
CommandSendReceipt, CommandSend, CommandAck, and more. All those commands
use the message type MessageIdData to specify the details of the message to
acknowledge.

Here's what this data structure looks like:
message MessageIdData {
    required uint64 ledgerId = 1;
    required uint64 entryId = 2;
    optional int32 partition = 3 [default = -1];
    optional int32 batch_index = 4 [default = -1];
    repeated int64 ack_set = 5;
    optional int32 batch_size = 6;

    // For the chunk message id, we need to specify the first chunk message id.
    optional MessageIdData first_chunk_message_id = 7;
}

The key fields are ledgerId, which identifies the ledger containing the
message, and entryId, which indicates the offset inside the ledger (message
number).

The client uses a class named MessageIdData which is the auto-generated
code representing the message MessageIdData.
-

Now, in the design, you wrote:

> Therefore, we need to add topic-name into MessageIdData and allow
> multi-topic/partitioned topic to deserialize message correctly so, API like
> acknowledge can perform as expected.

So you say in that sentence that you will add the topic name into
MessageIdData.
MessageIdData is defined in PulsarApi.proto and is transferred over the
wire, so how can you add the topic to this class without changing the wire
protocol?
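
To pin down what is being asked, here is a minimal sketch of the behavior claimed in the replies: the topic name travels only in the client-side serialized form and is dropped when the ack payload is built. Everything here is an assumption for illustration: JSON stands in for protobuf encoding, and `topic_name`, `to_byte_array`, `from_byte_array`, and `build_ack_payload` are hypothetical names, not the Pulsar client API.

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class MessageIdData:
    ledger_id: int
    entry_id: int
    partition: int = -1
    batch_index: int = -1
    topic_name: Optional[str] = None  # hypothetical field added by the PIP


def to_byte_array(msg_id):
    """Client-side persistence form: keeps the topic name."""
    return json.dumps(asdict(msg_id)).encode("utf-8")


def from_byte_array(data):
    """Restore a message id, including its topic name, from persisted bytes."""
    return MessageIdData(**json.loads(data.decode("utf-8")))


def build_ack_payload(msg_id):
    """Wire form sent to the broker: a fresh id with no topic name set."""
    wire_id = replace(msg_id, topic_name=None)
    fields = {k: v for k, v in asdict(wire_id).items() if v is not None}
    return json.dumps(fields).encode("utf-8")
```

If the design works this way, the field exists in the shared message definition but is never populated on the ack path, which is the distinction the design document would need to spell out.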

On Fri, Jun 16, 2023 at 10:47 PM Rajan Dhabalia 
wrote:

> Yes, the topic name will not be transferred and it's not part of the wire
> protocol. Message uses MessageID protobuf data-structure to serialize and
> deserialize MessageId and it doesn't change any behavior nor will transfer
> any additional fields to the broker. and I would not like to introduce any
> additional data-structure as that will create data copy, field
> inconsistencies, and more garbage due to more object allocation and that's
> something we would like to avoid.
>
> Thanks,
> Rajan
>
> On Mon, May 15, 2023 at 6:18 AM PengHui Li  wrote:
>
> > I think the topic name will not be transmitted to the broker.
> > The client side used the class generated by the protobuf message.
> > Or, we can create another class to avoid coupling issues, but it
> > will introduce more changes and copy data from one structure
> > to another. For the long-term, I think it should be a good way if
> > we don't have blockers with this solution. Because I don't think
> > there is a higher priority in the long run than keeping the protocol clear.
> >
> > If the above options are not feasible. At least, we should clarify
> > it in the proposal and add comments in the proto file to avoid
> > other clients transmitting the topic name to the broker.
> >
> > Thanks,
> > Penghui
> >
> > On Fri, May 12, 2023 at 5:59 PM Asaf Mesika 
> wrote:
> >
> > > I don't get it - you say msgId is a data structure contained within
> > > MessageId implementation, right? I presume msgId is the data structure the
> > > client transmit to the server, so that means you are transmitting topic to
> > > the server?
> > >
> > >
> > > On Fri, May 12, 2023 at 7:45 AM Rajan Dhabalia 
> > > wrote:
> > >
> > > > Thank you for sharing your knowledge about the PIP which should be created
> > > > before PR and I think everyone in the community knows about it. but you can
> > > > check the PR for context which was blocked for sometime and we decided to
> > > > create PIP with proto changes.
> > > >
> > > > This PIP/PR tries to fix the issue where partitioned topic fails while
> > > > acking deserialized messageId. topic name will be part of MsgIdData which
> > > > is the data-structure used by messageID to store msgID context along with
> > > > partition, batching, and other metadata. topic name will be attached only
> > > > when the user tries to serialize and deserialize the messageId which will
> > > > be purely client side implementation and in other cases it will not be
> > > > transmitted to server. Also, partitioned topic's abstract concept for user
> > > > and messageID must be also remai

Re: [VOTE] PIP-276: Add metric `pulsar_topic_load_times`

2023-06-19 Thread Asaf Mesika

Voting +1 (non-binding)

On Fri, Jun 16, 2023 at 12:23 PM guo jiwei  wrote:

> @Asaf Thanks, I have addressed the comment.
>
> Regards
> Jiwei Guo (Tboy)
>
>
> On Fri, Jun 16, 2023 at 3:55 AM Asaf Mesika  wrote:
>
> > -1 (non-binding)
> >
> > I'm perfectly ok with the idea; just please fix the document. It looks too
> > messy. Even 1 paragraph changes can look neat and clean.
> > I left notes in the draft PR you opened for the pip.
> >
> > I'll change my non-binding vote once that's done.
> >
> > On Thu, Jun 15, 2023 at 11:07 AM guo jiwei  wrote:
> >
> > > Hi, community:
> > > The metrics are all started with `pulsar_`, so that both users and
> > > operators can quickly find the metrics of the entire system through
> > > this prefix. However, due to some other reasons, it was found that
> > > `topic_load_times` was missing the prefix, so want to get it right.
> > > In the master branch :
> > > *  `pulsar_topic_load_times`: Add this new metric which has the same
> > > meaning as `topic_load_times`
> > > *  `topic_load_times`:  Mark this metric as deprecated and remove it in
> > > the next version
> > >
> > > PIP: https://github.com/apache/pulsar/pull/20518
> > >
> > > Regards
> > > Jiwei Guo (Tboy)
> > >
> >
>

Re: [VOTE] PIP-276: Add metric `pulsar_topic_load_times`

2023-06-15 Thread Asaf Mesika

-1 (non-binding)

I'm perfectly ok with the idea; just please fix the document. It looks too
messy. Even one-paragraph changes can look neat and clean.
I left notes in the draft PR you opened for the pip.

I'll change my non-binding vote once that's done.

On Thu, Jun 15, 2023 at 11:07 AM guo jiwei  wrote:

> Hi, community:
> The metrics are all started with `pulsar_`, so that both users and
> operators can quickly find the metrics of the entire system through
> this prefix. However, due to some other reasons, it was found that
> `topic_load_times` was missing the prefix, so want to get it right.
> In the master branch :
> *  `pulsar_topic_load_times`: Add this new metric which has the same
> meaning as `topic_load_times`
> *  `topic_load_times`:  Mark this metric as deprecated and remove it in
> the next version
>
> PIP: https://github.com/apache/pulsar/pull/20518
>
> Regards
> Jiwei Guo (Tboy)
>
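
The deprecation step in the quoted proposal amounts to exposing the same samples under both names for one release. A minimal sketch, assuming the metric is summary-like with `_count` and `_sum` series and a single `cluster` label; the function name and label are hypothetical and this is not Pulsar's metrics code.

```python
def render_topic_load_metrics(samples_ms, cluster="standalone"):
    """Render topic load times under the new and the deprecated metric name.

    Emits Prometheus-style text lines so dashboards using either name keep
    working until the deprecated name is removed.
    """
    count = len(samples_ms)
    total = sum(samples_ms)
    lines = []
    for name in ("pulsar_topic_load_times", "topic_load_times"):
        lines.append(f'{name}_count{{cluster="{cluster}"}} {count}')
        lines.append(f'{name}_sum{{cluster="{cluster}"}} {total}')
    return "\n".join(lines)
```

Once the deprecated name is dropped, only the tuple of names changes, and existing queries need the `pulsar_` prefix added.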

Re: Pulsar roadmap

2023-06-14 Thread Asaf Mesika

Fully understand.

How about a roadmap that primarily conveys general plans we see and agree
upon as a community?
Of course, it wouldn't contain any dates (not even years).

Maybe we can use that to reflect the "half-ness" / progress bar of certain
features that were left unfinished?

Here's an example from stuff I saw in recent years:

1. https://prometheus.io/docs/introduction/roadmap/

2. As this is brainstorming, there is also an option to do it on the GitHub
board, but I'm wondering whether it's good or bad.
https://github.com/thanos-io/thanos/projects/8

WDYT?

On Wed, Jun 14, 2023 at 11:19 AM Enrico Olivelli 
wrote:

> Asaf,
>
> On Sun, Jun 4, 2023, 3:37 PM Asaf Mesika  wrote:
>
> > Hi,
> >
> >
> > Do we have a place we manage future Pulsar roadmap? Big ticket items,
> > smaller ticket ones?
> > I mean, I know we have GitHub issues, but that’s a forest.
> > I was wondering if we have created a way to display those as a roadmap, or
> > hierarchy?
> >
>
> In my experience with OSS projects in the ASF I have seen a few times
> tentatives of setting up 'roadmaps'. Every time I have seen failures.
>
> This is because an OSS project in the ASF lives thanks to volunteers and
> contributions.
>
> There is no central government and you cannot set a roadmap.
>
> When you are in a company with a strict organization you can set deadlines
> and direction, this cannot happen in OSS.
>
> Many times we see that there are interesting features that are not fully
> completed.
>
> Don't get me wrong, I think that it will be great to have a direction but
> simply this cannot happen when you only have volunteers.
>
> We have the PIP process and we have our convention and best practices but
> nobody can predict which features will be added in the mid/long term.
>
> My two cents
> Enrico
>
>
>
> > Thanks,
> >
> > Asaf
> >
>

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-06-14 Thread Asaf Mesika

Thanks for the details, Devin. Curious: which company does 'we' stand for?

Can you take a look at my previous response to see if it answers the
concern you raised?

Thanks!


On Wed, Jun 14, 2023 at 1:49 PM Devin Bost  wrote:

> > Hi,
> >
> > Are we proposing a change to break existing metrics compatibility
> > (prometheus)? If that is the case then it's a big red flag as it will be a
> > pain for any company to upgrade Pulsar as monitoring is THE most important
> > part of the system and we don't even want to break compatibility for any
> > small things to avoid interruption for users that are using Pulsar system.
> > I think it's always good to enhance a system by maintaining compatibility
> > and I would be fine if we can introduce new metrics API without causing ANY
> > interruption to existing metrics API. But if we can't maintain
> > compatibility then it's a big red flag and not acceptable for the Pulsar
> > community.
>
> Proposing a large breaking change (even if it's crucial) is the single
> fastest way to motivate your users to migrate to a different platform. I
> wish it wasn't the case, but it's the cold reality.
>
> With that said, I'm a big proponent of Open Telemetry. I did a big video a
> while back that some of you may remember on the use of Open Tracing (before
> it was merged into Open Telemetry). Open Telemetry has gained considerable
> momentum in the industry since then.
>
> I'm also very interested in a solution to the metrics problem. I've run
> into the scalability issues with metrics in production, and I've been very
> concerned about the metrics bottlenecks around our ability to deliver our
> promises around supporting large numbers of topics. One of the big
> advantages of Pulsar over Kafka is supposed to be that topics are cheap,
> but as it stands, our current metrics design gets seriously in the way of
> that. Generally speaking, I'm open to solutions, especially if they align
> us with a growing industry standard.
>
> - Devin
>
>
> On Wed, Jun 14, 2023, 3:28 AM Enrico Olivelli  wrote:
>
> > On Wed, Jun 14, 2023, 4:33 AM Rajan Dhabalia  wrote:
> >
> > > Hi,
> > >
> > > Are we proposing a change to break existing metrics compatibility
> > > (prometheus)? If that is the case then it's a big red flag as it will be a
> > > pain for any company to upgrade Pulsar as monitoring is THE most important
> > > part of the system and we don't even want to break compatibility for any
> > > small things to avoid interruption for users that are using Pulsar system.
> > > I think it's always good to enhance a system by maintaining compatibility
> > > and I would be fine if we can introduce new metrics API without causing ANY
> > > interruption to existing metrics API. But if we can't maintain
> > > compatibility then it's a big red flag and not acceptable for the Pulsar
> > > community.
> > >
> >
> > I agree.
> >
> > If it is possible to export data in a way that is compatible with Prometheus
> > without adding too much overhead then I would support this work.
> >
> > About renaming the metrics: we can do it only if the changes for users are
> > as trivial as replacing the queries in the Grafana dashboard or in alerting
> > systems.
> >
> > Asaf, do you have a prototype? Built over any version of Pulsar?
> >
> > Also, it would be very useful to start an initiative to collect the list of
> > metrics that people really use in production, especially for automated
> > alerts.
> >
> > In my experience you usually care about:
> > - in/out traffic (rates, bytes...)
> > - number of producers, consumers, topics, subscriptions...
> > - backlog
> > - jvm metrics
> > - function custom metrics
> >
> >
> > Enrico
> >
> >
> >
> >
> > > Thanks,
> > > Rajan
> > >
> > > On Sun, May 21, 2023 at 9:01 AM Asaf Mesika wrote:
> > >
> > > > Thanks for the reply, Enrico.
> > > > Completely agree.
> > > > This made me realize my TL;DR wasn't talking about export.
> > > > I added this to it:
> > > >
> > > > ---
> > > > Pulsar OTel Metrics will support exporting as Prometheus HTTP endpoint
> > > > (`/metrics` but different port) for backward compatibility and also OTLP,
> > > > so you can push the metrics to OTel Collector and from there ship it to any
> > > > destina

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-06-14 Thread Asaf Mesika

On Wed, Jun 14, 2023 at 11:28 AM Enrico Olivelli wrote:

> On Wed, Jun 14, 2023, 04:33 Rajan Dhabalia wrote:
>
> > Hi,
> >
> > Are we proposing a change to break existing metrics compatibility
> > (prometheus)? If that is the case then it's a big red flag as it will be a
> > pain for any company to upgrade Pulsar as monitoring is THE most important
> > part of the system and we don't even want to break compatibility for any
> > small things to avoid interruption for users that are using Pulsar system.
> > I think it's always good to enhance a system by maintaining compatibility
> > and I would be fine if we can introduce new metrics API without causing ANY
> > interruption to existing metrics API. But if we can't maintain
> > compatibility then it's a big red flag and not acceptable for the Pulsar
> > community.
> >
>
> I agree.
>
> If it is possible to export data in a way that is compatible with Prometheus
> without adding too much overhead then I would support this work.
>
> About renaming the metrics: we can do it only if the changes for users are
> as trivial as replacing the queries in the Grafana dashboard or in alerting
> systems.
>

Can you look at the answer I gave in the prior response and see if it
answers your comments?

>
> Asaf, do you have a prototype? Built over any version of Pulsar?
>

No. The idea of the parent PIP is to agree on the solution's direction
before I start. We're talking about a year of work.


>
> Also, it would be very useful to start an initiative to collect the list of
> metrics that people really use in production, especially for automated
> alerts.
>
> In my experience you usually care about:
> - in/out traffic (rates, bytes...)
> - number of producers, consumers, topics, subscriptions...
> - backlog
> - jvm metrics
> - function custom metrics
>

I am trying to understand. Do you mean I can use this list as the default set
of exposed metrics once the new filter mechanism is introduced?


>
> Enrico
>
>
>
>
> > Thanks,
> > Rajan
> >
> > On Sun, May 21, 2023 at 9:01 AM Asaf Mesika wrote:
> >
> > > Thanks for the reply, Enrico.
> > > Completely agree.
> > > This made me realize my TL;DR wasn't talking about export.
> > > I added this to it:
> > >
> > > ---
> > > Pulsar OTel Metrics will support exporting as Prometheus HTTP endpoint
> > > (`/metrics` but different port) for backward compatibility and also OTLP,
> > > so you can push the metrics to OTel Collector and from there ship it to any
> > > destination.
> > > ---
> > >
> > > OTel supports two kinds of exporter: Prometheus (HTTP) and OTLP (push).
> > > We'll just configure to use them.
> > >
> > >
> > >
> > > On Mon, May 15, 2023 at 10:35 AM Enrico Olivelli wrote:
> > >
> > > > Asaf,
> > > > thanks for contributing in this area.
> > > > Metrics are a fundamental feature of Pulsar.
> > > >
> > > > Currently I find it very awkward to maintain metrics, and also I see
> > > > it as a problem to support only Prometheus.
> > > >
> > > > Regarding your proposal, IIRC in the past someone else proposed to
> > > > support other metrics systems and they have been suggested to use a
> > > > sidecar approach,
> > > > that is to add something next to Pulsar services that served the
> > > > metrics in the preferred format/way.
> > > > I find that the sidecar approach is too inefficient and I am not
> > > > proposing it (but I wanted to add this reference for the benefit of
> > > > new people on the list).
> > > >
> > > > I wonder if it would be possible to keep compatibility with the
> > > > current Prometheus based metrics.
> > > > Now Pulsar reached a point in which it is widely used by many
> > > > companies and also with big clusters,
> > > > telling people that they have to rework all the infrastructure related
> > > > to metrics because we don't support Prometheus anymore or because we
> > > > changed radically the way we publish metrics
> > > > It is a step that seems too hard from my point of view.
> > > >
> > > > Currently I believe that compatibility is more important than
> > > > versatility, and if we want to introduce new (and far better) features
> > > > we must take it into account.
> > > >
> > > > So my point is that I generally

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-06-14 Thread Asaf Mesika

In my proposal, I do not suggest deleting the existing metric system. You
still have the same Prometheus /metrics endpoint exposing metrics exactly as
you have today.
I plan to add an additional metric system based on OTel.
You will be able to consume it via its native exporters:
1. Prometheus - there will be a new server listening on a new port serving
the `/metrics` endpoint, serving only the metrics defined in OTel.
2. OTLP (gRPC, HTTP) - This pushes the metrics out at a defined interval to
a URL in OTLP format.

Once you make the switch, of course, things will break, as we're changing
many things, but you can have them both turned on, so you can slowly
convert your observability system.
The stock Grafana dashboards will have another version for OTel.

It answers your last sentence, I believe.

On Wed, Jun 14, 2023 at 5:32 AM Rajan Dhabalia  wrote:

> Hi,
>
> Are we proposing a change to break existing metrics compatibility
> (prometheus)? If that is the case then it's a big red flag as it will be a
> pain for any company to upgrade Pulsar as monitoring is THE most important
> part of the system and we don't even want to break compatibility for any
> small things to avoid interruption for users that are using Pulsar system.
> I think it's always good to enhance a system by maintaining compatibility
> and I would be fine if we can introduce new metrics API without causing ANY
> interruption to existing metrics API. But if we can't maintain
> compatibility then it's a big red flag and not acceptable for the Pulsar
> community.
>
> Thanks,
> Rajan
>
> On Sun, May 21, 2023 at 9:01 AM Asaf Mesika  wrote:
>
> > Thanks for the reply, Enrico.
> > Completely agree.
> > This made me realize my TL;DR wasn't talking about export.
> > I added this to it:
> >
> > ---
> > Pulsar OTel Metrics will support exporting as Prometheus HTTP endpoint
> > (`/metrics` but different port) for backward compatibility and also OTLP,
> > so you can push the metrics to OTel Collector and from there ship it to any
> > destination.
> > ---
> >
> > OTel supports two kinds of exporter: Prometheus (HTTP) and OTLP (push).
> > We'll just configure to use them.
> >
> >
> >
> > On Mon, May 15, 2023 at 10:35 AM Enrico Olivelli wrote:
> >
> > > Asaf,
> > > thanks for contributing in this area.
> > > Metrics are a fundamental feature of Pulsar.
> > >
> > > Currently I find it very awkward to maintain metrics, and also I see
> > > it as a problem to support only Prometheus.
> > >
> > > Regarding your proposal, IIRC in the past someone else proposed to
> > > support other metrics systems and they have been suggested to use a
> > > sidecar approach,
> > > that is to add something next to Pulsar services that served the
> > > metrics in the preferred format/way.
> > > I find that the sidecar approach is too inefficient and I am not
> > > proposing it (but I wanted to add this reference for the benefit of
> > > new people on the list).
> > >
> > > I wonder if it would be possible to keep compatibility with the
> > > current Prometheus based metrics.
> > > Now Pulsar reached a point in which it is widely used by many
> > > companies and also with big clusters,
> > > telling people that they have to rework all the infrastructure related
> > > to metrics because we don't support Prometheus anymore or because we
> > > changed radically the way we publish metrics
> > > It is a step that seems too hard from my point of view.
> > >
> > > Currently I believe that compatibility is more important than
> > > versatility, and if we want to introduce new (and far better) features
> > > we must take it into account.
> > >
> > > So my point is that I generally support the idea of opening the way to
> > > Open Telemetry, but we must have a way to not force all of our users
> > > to throw away their alerting systems, dashboards and know-how in
> > > troubleshooting Pulsar problems in production and dev
> > >
> > > Best regards
> > > Enrico
> > >
> > > On Mon, May 15, 2023 at 02:17 Dave Fisher wrote:
> > > >
> > > >
> > > >
> > > > > On May 10, 2023, at 1:01 AM, Asaf Mesika 
> > > wrote:
> > > > >
> > > > > On Tue, May 9, 2023 at 11:29 PM Dave Fisher 
> > wrote:
> > > > >
> > > > >>
> > > > >>
> > > > >>>> On May 8, 2023, at 2:49 AM, Asaf Mesika 
> >

Re: [VOTE] PIP-274: Add metric `pulsar_topic_load_times`

2023-06-13 Thread Asaf Mesika

Your PIP number is wrong. It would be better to close this thread and open a
new one. I also recommend including a link to the PIP PR.

On Mon, Jun 12, 2023 at 9:15 AM guo jiwei  wrote:

> Hi, community:
> The metrics all start with `pulsar_`, so that both users and
> operators can quickly find the metrics of the entire system through
> this prefix. However, due to some other reasons, it was found that
> `topic_load_times` was missing the prefix, so we want to get it right.
> In the master branch:
> *  `pulsar_topic_load_times`: Add this new metric, which has the same
> meaning as `topic_load_times`
> *  `topic_load_times`: Mark this metric as deprecated and remove it in
> the next version
>
>
>
> Regards
> Jiwei Guo (Tboy)
>
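
For readers of the archive, the migration described in this thread (expose the
prefixed name while keeping the old name as deprecated) can be sketched as
follows. This is only an illustration in Python, not Pulsar's Java
implementation; the function `expand_metric`, the mapping
`LEGACY_TO_PREFIXED`, and the `include_deprecated` switch are hypothetical
names introduced here.

```python
# Hypothetical sketch: expose one measurement under both the new prefixed
# name and the deprecated legacy name during the migration window.
LEGACY_TO_PREFIXED = {"topic_load_times": "pulsar_topic_load_times"}


def expand_metric(name, value, include_deprecated=True):
    """Return the (name, value) samples to expose for one measurement."""
    new_name = LEGACY_TO_PREFIXED.get(name, name)
    samples = [(new_name, value)]
    # Keep the old name only while it is still marked as deprecated,
    # so existing dashboards and alerts continue to work.
    if include_deprecated and new_name != name:
        samples.append((name, value))
    return samples
```

Setting `include_deprecated=False` models the later version in which the
legacy name is removed.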

Re: [DISCUSS] PIP-263: Just auto-create no-partitioned DLQ And Prevent auto-create a DLQ for a DLQ

2023-06-11 Thread Asaf Mesika

I agree with you. `wasAutoCreated` seems a bit like a patch in light of the
recent issues.


On Fri, Jun 9, 2023 at 9:03 AM Michael Marshall wrote:

> > In general, these problems hint to me that we need better definitions
> > for "system" and "special" topics that break the rules of auto created
> > topics in well defined ways.
>
> This point reminds me of this discussion from a year and a half ago:
>
> https://lists.apache.org/thread/qgbpzr6o3k5rqbs2jvpkdh8hr9jpnw39
>
> For PIP 124, I challenged a feature based on the question of whether
> the DLQ is a special topic or not.
>
> Ultimately, we determined it is a special topic (see Matteo's response
> later in that thread).
>
> I think it is interesting because it feels like we continue to find
> exceptions.
>
> Just this week, we found yet another and lamented the lack of design
> for these topics:
> https://github.com/apache/pulsar/pull/20514#issuecomment-1579937937.
>
> I mention those points because I am concerned that creating a DLQ
> specific piece of boolean metadata in the topic properties does not
> move us toward a unified and extensible abstraction.
>
> Unless we want to continue to infer everything from topic names, it
> seems like we almost need a "topic type" enum to capture the many
> types of special topics we have. Type has the same issues I mentioned
> in my last email, but it could help create a more generic solution.
>
> > ### Properties of Topic
> > ```properties
> > "wasAutoCreated": boolean
> > ```
>
> I don't have the answer yet, but I think we should spend more time
> working on a general solution for topic classification before adding
> per topic metadata.
>
> I'd be interested to know what others think on the subject.
>
> Thanks,
> Michael
>
> On Thu, Jun 1, 2023 at 2:08 AM Yubiao Feng wrote:
> >
> > Hi Michael, Enrico
> >
> > > The identifier is the name. The topics have `-DLQ` and `-RETRY`
> > > as suffixes, which makes them "special" (which reminds me of this
> > > thread [0]). We could choose to always make these non partitioned
> > > or to make them configurable in some other way.
> >
> > I changed the design like this:
> >
> > ### Properties of Topic
> > ```properties
> > "wasAutoCreated": boolean
> > ```
> >
> > - If the topic name ends with "-RETRY" or "-DLQ", only non-partitioned
> >   topics will be automatically created
> > - If the property `wasAutoCreated` of the topic is true and the topic name
> >   ends with "-RETRY" or "-DLQ", retry topics and dead letter queues
> >   will no longer be created for this topic.
> >
> > What do you think?
> >
> > Thanks
> > Yubiao Feng
> >
> > On Thu, Jun 1, 2023 at 12:48 PM Michael Marshall wrote:
> >
> > > We've had several bugs related to this topic of auto created
> > > partitioned topics recently. These three PRs come to mind:
> > >
> > > * https://github.com/apache/pulsar/pull/20370
> > > * https://github.com/apache/pulsar/pull/20392
> > > * https://github.com/apache/pulsar/pull/20397
> > >
> > > Those PRs rely on server side inference based on the topic name to
> > > know that a topic should not be a partitioned topic.
> > >
> > > In general, these problems hint to me that we need better definitions
> > > for "system" and "special" topics that break the rules of auto created
> > > topics in well defined ways.
> > >
> > > I agree with Enrico that "purposeOfAutoCreatedTopic" will create overhead.
> > >
> > > > I agree with you, but now the problem is that we still need
> > > > an identifier to say that it is a DLQ. Do you have some
> > > > suggestions?
> > >
> > > The identifier is the name. The topics have `-DLQ` and `-RETRY` as
> > > suffixes, which makes them "special" (which reminds me of this thread
> > > [0]). We could choose to always make these non partitioned or to make
> > > them configurable in some other way.
> > >
> > > Or, we could consider giving producers and consumers the option to
> > > configure a `partitionCountHint`. This could be used by the broker
> > > during auto creation to create a partitioned topic with x partitions
> > > when greater than 0 and to create a non-partitioned topic when 0. This
> > > moves the logic out of inference on topic names and into the realm of
> > > client configuration. However, it could create hard to debug
> > > scenarios.
> > >
> > > Thanks,
> > > Michael
> > >
> > > [0] https://lists.apache.org/thread/yrkf88jjpjzhmk6hy15ynnk3l6n96l9w
> > >
> > > On Wed, May 31, 2023 at 7:42 AM Yubiao Feng wrote:
> > > >
> > > > Hi @Enrico
> > > >
> > > > > I think that it is better to add flags like:
> > > > > - allowAutoTopicCreation: default "true", if "false" the broker
> > > > >   won't create the topic in any case
> > > > > - autoTopicCreationMode: undefined/partitioned/non-partitioned
> > > >
> > > > I agree with you, but now the problem is that we still need
> > > > an identifier to say that it is a DLQ. Do you have some
> > > > suggestions?
> > > >
> > > > Thanks
> > > >
> > > > Yubiao Feng
> > > >
> > > >
>
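
The name-based rules discussed in this thread (topics ending in `-DLQ` or
`-RETRY` are auto-created as non-partitioned, and an auto-created DLQ/retry
topic does not get its own DLQ or retry topic) can be sketched as below. This
is a simplified illustration in Python, not broker code: the function names
are hypothetical, a partition count of 0 is assumed to mean "non-partitioned",
and partition-suffixed topic names are not handled.

```python
# Hypothetical sketch of the name-based inference discussed in the thread.
SPECIAL_SUFFIXES = ("-DLQ", "-RETRY")


def is_special_topic(topic_name):
    """True when the topic name marks it as a DLQ or retry topic."""
    return topic_name.endswith(SPECIAL_SUFFIXES)


def auto_create_partitions(topic_name, default_partitions):
    """Partition count to use on auto-creation; 0 means non-partitioned."""
    if is_special_topic(topic_name):
        return 0
    return default_partitions


def may_create_dlq_for(topic_name, was_auto_created):
    """Refuse to create a DLQ or retry topic for an auto-created special topic."""
    return not (was_auto_created and is_special_topic(topic_name))
```

As noted in the thread, inference from names keeps finding exceptions, so a
rule like this is a stopgap rather than a general topic classification.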

Re: [VOTE] PIP-272

2023-06-08 Thread Asaf Mesika

After several ping pongs over the PR, I'm changing my vote:

+1 (non-binding)

Just please make sure to review all pending grammar suggestions.

On Tue, Jun 6, 2023 at 10:53 PM Asaf Mesika  wrote:

> I'm sorry for not getting back to you sooner: Weekend, work, etc.
>
> I vote -1 (non-binding), as I found something rather disturbing in the
> design.
>
> Can we please go back to the discussion (in the PR)?
>
> Rui Fu - I see you voted +1 (binding) - what do you think about the
> comments I've made?
>
> Thanks!
>
>
> On Tue, Jun 6, 2023 at 8:07 AM Rui Fu  wrote:
>
>> +1
>>
>> Best,
>>
>> Rui Fu
>> On Jun 6, 2023 at 09:50 +0800, Pengcheng Jiang wrote:
>> > Hello, community:
>> >
>> > This thread is to start a vote for PIP-272: Add stateStorageConfig to
>> > WorkerConfig.
>> >
>> > Discussion thread:
>> > https://lists.apache.org/thread/pwfv7nj64frfnbw7jfydzx8my15b3lj6
>> > PR: https://github.com/apache/pulsar/pull/20455
>> >
>> > Sincerely
>> > Pengcheng Jiang
>>
>

Re: [VOTE] PIP-272

2023-06-06 Thread Asaf Mesika

I'm sorry for not getting back to you sooner: Weekend, work, etc.

I vote -1 (non-binding), as I found something rather disturbing in the
design.

Can we please go back to the discussion (in the PR)?

Rui Fu - I see you voted +1 (binding) - what do you think about the
comments I've made?

Thanks!

On Tue, Jun 6, 2023 at 8:07 AM Rui Fu  wrote:

> +1
>
> Best,
>
> Rui Fu
> On Jun 6, 2023 at 09:50 +0800, Pengcheng Jiang wrote:
> > Hello, community:
> >
> > This thread is to start a vote for PIP-272: Add stateStorageConfig to
> > WorkerConfig.
> >
> > Discussion thread:
> > https://lists.apache.org/thread/pwfv7nj64frfnbw7jfydzx8my15b3lj6
> > PR: https://github.com/apache/pulsar/pull/20455
> >
> > Sincerely
> > Pengcheng Jiang
>

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-06-06 Thread Asaf Mesika

pic groups are completely unrelated to bundles or load.
Even if you come up with a unique design that introduces an abstraction on
top of topics, it probably won't be user controlled as groups - it will
probably be automatic. I hope I managed to explain in the PIP and here why
the user controlling which topics are included in each group is essential.
Right now, there is nothing in motion in that regard, and as I explained,
it's unrelated to this proposal.



> (3) Perhaps the issues of 100s of metrics should be turned to being
> intentional about what metrics are helpful and then slowly switching to
> those that the whole community finds are most helpful in their operations.
>
I think most of the time you don't need 90% of the metrics and you want
them filtered, until that one time you are facing a severe issue and you
want all the metrics you can get to solve it - hence you take them back out
of the filter.
I don't think we can delete those metrics.


>
> Only by doing these three steps carefully in the Open on this list in the
> Community can there be enough consensus that the whole change is acceptable
> for Pulsar 4.0 in 18 months.
>

I am doing exactly that, very carefully: I've taken 11 months to design this.
It's as open as it can be. I shared my intent before I started back in July
in the community meetings. I published a preliminary idea doc in October to
the community, and then went heads down to solve this huge challenge and
wrote the PIP, released at the end of April.

Thanks,

Asaf


>
> Best,
> Dave
>
> > On May 21, 2023, at 9:00 AM, Asaf Mesika  wrote:
> >
> > Thanks for the reply, Enrico.
> > Completely agree.
> > This made me realize my TL;DR wasn't talking about export.
> > I added this to it:
> >
> > ---
> > Pulsar OTel Metrics will support exporting as Prometheus HTTP endpoint
> > (`/metrics` but different port) for backward compatibility and also OTLP,
> > so you can push the metrics to OTel Collector and from there ship it to any
> > destination.
> > ---
> >
> > OTel supports two kinds of exporter: Prometheus (HTTP) and OTLP (push).
> > We'll just configure to use them.
> >
> >
> >
> > On Mon, May 15, 2023 at 10:35 AM Enrico Olivelli wrote:
> >
> >> Asaf,
> >> thanks for contributing in this area.
> >> Metrics are a fundamental feature of Pulsar.
> >>
> >> Currently I find it very awkward to maintain metrics, and also I see
> >> it as a problem to support only Prometheus.
> >>
> >> Regarding your proposal, IIRC in the past someone else proposed to
> >> support other metrics systems and they have been suggested to use a
> >> sidecar approach,
> >> that is to add something next to Pulsar services that served the
> >> metrics in the preferred format/way.
> >> I find that the sidecar approach is too inefficient and I am not
> >> proposing it (but I wanted to add this reference for the benefit of
> >> new people on the list).
> >>
> >> I wonder if it would be possible to keep compatibility with the
> >> current Prometheus based metrics.
> >> Now Pulsar reached a point in which it is widely used by many
> >> companies and also with big clusters,
> >> telling people that they have to rework all the infrastructure related
> >> to metrics because we don't support Prometheus anymore or because we
> >> changed radically the way we publish metrics
> >> It is a step that seems too hard from my point of view.
> >>
> >> Currently I believe that compatibility is more important than
> >> versatility, and if we want to introduce new (and far better) features
> >> we must take it into account.
> >>
> >> So my point is that I generally support the idea of opening the way to
> >> Open Telemetry, but we must have a way to not force all of our users
> >> to throw away their alerting systems, dashboards and know-how in
> >> troubleshooting Pulsar problems in production and dev
> >>
> >> Best regards
> >> Enrico
> >>
> >> On Mon, May 15, 2023 at 02:17 Dave Fisher wrote:
> >>>
> >>>
> >>>
> >>>> On May 10, 2023, at 1:01 AM, Asaf Mesika 
> >> wrote:
> >>>>
> >>>> On Tue, May 9, 2023 at 11:29 PM Dave Fisher  wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>>>> On May 8, 2023, at 2:49 AM, Asaf Mesika 
> >> wrote:
> >>>>>>
> >>>>>> Your feedback made me realized I need to add "TL;D

Re: [DISCUS] PIP-273: Add metric prefix for `topic_load_times`

2023-06-04 Thread Asaf Mesika

You will break anyone who has been using this since 2017.
Perhaps add a new metric with the correct name and mark the old one
deprecated. It could then be removed in the next LTS version?

How are these things managed?

On Sat, Jun 3, 2023 at 2:34 AM Dave Fisher  wrote:

> There are two PIP-273s. Whichever was last please update.
>
> Sent from my iPhone
>
> > On Jun 2, 2023, at 2:41 PM, Michael Marshall wrote:
> >
> > +1. Can we also add a test to verify that this kind of addition isn't
> > possible in the future?
> >
> > On a procedure note, I created PIP 273 earlier this week. I think this
> > is PIP 274.
> >
> > Thanks,
> > Michael
> >
> >> On Fri, Jun 2, 2023 at 2:40 AM Yubiao Feng wrote:
> >>
> >> Hi Jiwei
> >>
> >> +1
> >>
> >> Thanks
> >> Yubiao Feng
> >>
> >>> On Fri, Jun 2, 2023 at 3:23 PM guo jiwei  wrote:
> >>>
> >>> Hi, community:
> >>> The metrics all start with `pulsar_`, so that both users and
> >>> operators can quickly find the metrics of the entire system through
> >>> this prefix. However, due to some other reasons, it was found that
> >>> `topic_load_times` was missing the prefix, so we want to get it right.
> >>> In master release, change:
> >>> *  `topic_load_times` -> `pulsar_topic_load_times`
> >>>
> >>>
> >>>
> >>> Regards
> >>> Jiwei Guo (Tboy)
> >>>
>
>

Pulsar roadmap

2023-06-04 Thread Asaf Mesika

Hi,


Do we have a place where we manage the future Pulsar roadmap? Big-ticket
items, smaller-ticket ones?
I mean, I know we have GitHub issues, but that’s a forest.
I was wondering if we have created a way to display those as a roadmap or
hierarchy?

Thanks,

Asaf

Re: [DISUCSS] Fix and republish the pulsar-all image for Pulsar 3.0.0

2023-06-01 Thread Asaf Mesika

I understand that you can't build such a release process because of Apache
foundation rules, which make it mandatory to release something you built on
your own machine.


On Wed, May 31, 2023 at 5:54 PM Enrico Olivelli  wrote:

> On Wed, May 31, 2023 at 16:50 Michael Marshall wrote:
> >
> > The best solution is to make the build repeatable. IIUC, that will let
> > us build all of the artifacts using CI instead of personal machines,
> > which removes this class of errors.
> >
> > That being said, I don't know how much effort it would be to achieve.
> >
> > Thanks,
> > Michael
> >
> > On Tue, May 30, 2023 at 2:24 AM Zike Yang  wrote:
> > >
> > > Hi, Enrico
> > >
> > > > When we ran the VOTE and we provided the docker images, were they
> > > already broken ?
> > >
> > > > Actually, they are not broken unless we use the new features of Pulsar 3.0.0.
>
> It is not Pulsar 3 but Pulsar 2.11.
> So for users this is a problem
>
> If we are able to prepare the docker image from the RELEASED tarball
> and connectors then I am fine with it.
> If you have to rebuild from the sources I strongly believe that we
> cannot do it without a proper release process
>
> Enrico
>
>
>
> > >
> > > I think we need something like verification test scripts, to verify
> > > the release candidate. For example, we use the image provided in the
> > > RC to run the integration tests. And we need to make sure that we have
> > > tested the newly added feature for the RC docker image.
> > >
> > > What do you think?
> > >
> > > BR,
> > > Zike Yang
> > >
> > > On Tue, May 30, 2023 at 3:14 PM Zike Yang  wrote:
> > > >
> > > > Hi, Asaf
> > > >
> > > > > How do you suggest we prevent it from happening next time?
> > > >
> > > > > I have pushed a PR to fix it: https://github.com/apache/pulsar/pull/20435
> > > > > This PR specifies the correct image name for `pulsar` image to build pulsar-all.
> > > >
> > > > Note that, in the release of Pulsar 3.0, we build the docker image by
> > > > executing the following command instead of the `docker/build.sh`:
> > > > ```
> > > > > mvn install -DUBUNTU_MIRROR=http://azure.archive.ubuntu.com/ubuntu/ \
> > > > > -DskipTests \
> > > > > -Pdocker -Pdocker-push \
> > > > > -Ddocker.platforms=linux/amd64,linux/arm64 \
> > > > > -Ddocker.organization=snzkyang \
> > > > > -pl docker/pulsar,docker/pulsar-all
> > > > ```
> > > > > I think to take it a step further, we could fix these scripts (build.sh
> > > > > and publish.sh) and use the shell scripts to build the image.
> > > >
> > > > I have verified the PR, and it works well. Please see more detail in
> > > > the PR description.
> > > >
> > > > Thanks,
> > > > Zike Yang
> > > >
> > > > > On Mon, May 29, 2023 at 9:50 PM Enrico Olivelli wrote:
> > > > >
> > > > > I am really worried about the process.
> > > > >
> > > > > When we ran the VOTE and we provided the docker images, were they
> > > > > already broken ?
> > > > >
> > > > > In any case we cannot overwrite those images, they have been cached
> > > > > all over the world now.
> > > > >
> > > > > It is safer to cut a new 3.0.1 release  and run a VOTE.
> > > > >
> > > > > Maybe we can remove the old images, forever
> > > > >
> > > > > Enrico
> > > > >
> > > > > > On Mon, May 29, 2023 at 13:55 Asaf Mesika wrote:
> > > > > >
> > > > > > Good catch!
> > > > > >
> > > > > > How do you suggest we prevent it from happening next time?
> > > > > >
> > > > > > > On Mon, May 29, 2023 at 1:34 PM Zike Yang wrote:
> > > > > >
> > > > > > > Hi, all
> > > > > > >
> > > > > > > > Recently, we found an issue with the `pulsar-all:3.0.0` image. The
> > > > > > > > pulsar library included in `pulsar-all:3.0.0` is the version of
> > > > > > > > 2.11.0:
> > > > > > >
> > > > > > > ```
> > > > &g

Re: [DISCUSS] PIP-272 Add a `StateStoreConfig` to the `WorkerConfig`

2023-05-31 Thread Asaf Mesika

Pengcheng, would you be willing to be the inaugural PIP in our PIP
submission process?
Yesterday, we officially moved from the GitHub issue to a markdown file for
PIP submissions.

For you, it basically means moving your proposal to a markdown file and
submitting a PR (and deleting the content in the GitHub issue, leaving just
a link; next time there is no need to open a GitHub issue).

The process is described step by step here:
https://github.com/apache/pulsar/blob/master/pip/README.md

Thanks!

Asaf


On Wed, May 31, 2023 at 12:55 AM Neng Lu  wrote:

> thanks for the improvements, +1
>
> On Tue, May 30, 2023 at 2:20 AM Pengcheng Jiang wrote:
>
> > Hi Mesika:
> >
> > Thanks for the suggestions, I updated the pip, and for the remaining
> > questions:
> >
> > 5. yes, all config goes through arguments instead of a file
> > 6. it should be a JSON string that can be deserialized to a
> >    `Map<String, Object>`, updated in pip
> > 7. it should be `pulsar-admin functions localrun` command, updated in pip
> > 8. the `stateStorageServiceUrl` won't be touched
> >
> > Sincerely
> > Pengcheng Jiang
> >
> > On Mon, May 29, 2023 at 19:53, Asaf Mesika wrote:
> >
> > > Hi Pengcheng,
> > >
> > > Looks like a solid improvement, definitely helping people using their own
> > > state store.
> > >
> > > I have a few comments:
> > >
> > > 1. Background knowledge should explain what is a state storage
> > > 2. Move problem description from Background Knowledge to Motivation.
> > >
> > > I'm quoting the template to understand what should be included in
> > > the Background knowledge section:
> > >
> > > 
> > >
> > > 3. `WorkerConfig` - explain briefly what is Worker and how it differs from
> > > Broker. Should be in background knowledge section.
> > >
> > > 4. Background knowledge should explain briefly what is a runtime and
> > > runtime factory.
> > >
> > > 5.
> > >
> > > Add a new cli argument to JavaInstanceStarter and LocalRunner so
> > > > process runtime can pass state related config to them
> > >
> > >
> > > Today all config goes through arguments and not a file?
> > >
> > > 6. `--stateStorageConfig`
> > >   What format is the expected value?
> > >
> > > 7. `functions local run`
> > >  What is this?
> > >
> > > 8. Are you keeping `stateStorageServiceUrl`? Maybe people rely on it?
> > >
> > > 9. Don't forget to include link to discussion thread using Apache Pony Mail
> > >
> > > On Mon, May 29, 2023 at 10:44 AM Rui Fu  wrote:
> > >
> > > > Hi Pengcheng,
> > > >
> > > > Thanks for bringing this up, the PIP lgtm, +1.
> > > >
> > > > Best,
> > > >
> > > > Rui Fu
> > > > On May 29, 2023 at 13:52 +0800, Enrico Olivelli wrote:
> > > > > Looks good
> > > > > +1
> > > > >
> > > > > Enrico
> > > > >
> > > > > On Mon, May 29, 2023, 04:47 Pengcheng Jiang wrote:
> > > > >
> > > > > > Dear Pulsar community,
> > > > > >
> > > > > > I created a pip to make pulsar functions' `StateStoreProvider` configurable
> > > > > > with custom configurations: https://github.com/apache/pulsar/issues/20419
> > > > > >
> > > > > > Any feedback and suggestions are welcome
> > > > > >
> > > > > > Sincerely
> > > > > > Pengcheng Jiang
> > > > > >
> > > >
> > >
> >
>
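
Answer 6 in this thread says the `--stateStorageConfig` value is a JSON string
that deserializes to a string-keyed map. A minimal sketch of that parsing
step, written in Python for illustration only (the real code is Java, and the
function name `parse_state_storage_config` is hypothetical), could look like
this:

```python
import json


def parse_state_storage_config(raw):
    """Parse a --stateStorageConfig value into a dict of string keys."""
    # Treat a missing or empty argument as "no extra configuration".
    if raw is None or raw.strip() == "":
        return {}
    parsed = json.loads(raw)
    # Only a JSON object maps onto a string-keyed map; reject lists, numbers, etc.
    if not isinstance(parsed, dict):
        raise ValueError("stateStorageConfig must be a JSON object")
    return parsed
```

Rejecting non-object JSON early gives a clearer error than failing later
inside the state store provider.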

PLEASE NOTE - from now on PIPs are PR based

2023-05-30 Thread Asaf Mesika

Hi,

I've completed all steps necessary for implementing "PIP-265: PR-based
system for managing and reviewing PIPs" (
https://github.com/apache/pulsar/issues/20207).

PLEASE NOTE, from now on, PIPs are to be submitted *through Pull Request*.

You can read the process detail here:
https://github.com/apache/pulsar/blob/master/pip/README.md

I've marked the PIP issue deprecated, so if you use it you'll see
instructions pointing to the new PIP process. I don't anticipate people will
submit PIPs through GitHub issues anymore - thanks Michael for that
suggestion.

Let's try to help future contributors to follow the new process.

Thanks!

Asaf

Re: [DISCUSS] Fix and republish the pulsar-all image for Pulsar 3.0.0

2023-05-29 Thread Asaf Mesika

Good catch!

How do you suggest we prevent it from happening next time?
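
One possible guard, as a minimal sketch only: compare the version embedded in
the Pulsar jar names inside the image against the tag being published, and
fail the release if they differ. The function name and the jar-name pattern
are assumptions for illustration, based only on the `ls lib/` output quoted
below; this is not the actual docker build script.

```python
import re

# Assumed jar naming, taken from the `ls lib/` output quoted in this thread,
# e.g. org.apache.pulsar-pulsar-broker-2.11.0.jar
JAR_VERSION = re.compile(
    r"^org\.apache\.pulsar-.+?-(\d+(?:\.\d+)+(?:-[\w.]+)?)\.jar$"
)


def mismatched_jars(jar_names, expected_version):
    """Return Pulsar jar names whose embedded version differs from expected_version."""
    bad = []
    for name in jar_names:
        name = name.strip()
        match = JAR_VERSION.match(name)
        # Jars that do not look like Pulsar jars are ignored, not flagged.
        if match and match.group(1) != expected_version:
            bad.append(name)
    return bad


# Listing copied from the report quoted below.
listing = [
    "org.apache.pulsar-pulsar-broker-2.11.0.jar",
    "org.apache.pulsar-pulsar-broker-auth-sasl-2.11.0.jar",
    "org.apache.pulsar-pulsar-broker-common-2.11.0.jar",
]
print(mismatched_jars(listing, "3.0.0"))
```

Run against the listing from the report, this flags all three broker jars for
tag 3.0.0, and none for tag 2.11.0.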

On Mon, May 29, 2023 at 1:34 PM Zike Yang  wrote:

> Hi, all
>
> Recently, we found an issue with the `pulsar-all:3.0.0` image. The
> Pulsar library included in `pulsar-all:3.0.0` is version 2.11.0:
>
> ```
> docker run apachepulsar/pulsar-all:3.0.0 ls lib/ | grep pulsar-broker
>
> org.apache.pulsar-pulsar-broker-2.11.0.jar
> org.apache.pulsar-pulsar-broker-auth-sasl-2.11.0.jar
> org.apache.pulsar-pulsar-broker-common-2.11.0.jar
> ```
>
> The root cause is that we use `apachepulsar/pulsar:latest` to build
> the `pulsar-all` image. But at the time of building Pulsar 3.0.0,
> `apachepulsar/pulsar:latest` was pointing to version 2.11.0.
>
> Therefore, `pulsar-all:3.0.0` is actually version 2.11.0 of
> Pulsar, but with 3.0.0 connectors and offloaders.
>
> Please see more detail in this issue:
> https://github.com/apache/pulsar/issues/20420
>
> I have rebuilt the `pulsar-all:3.0.0` image:
>
> https://hub.docker.com/layers/snzkyang/pulsar-all/3.0.0/images/sha256-833ea988bce8c704b179cc4c9c38fac8980e108b0bc67454e06c22927990b169?context=explore
>
> Please help verify it, and check whether there are any other problems
> with the image.
>
> I'm going to publish the image to the `apachepulsar` organization to
> replace the old one. But before we do that, do we need a Vote or other
> ways to reach a consensus? Is there any problem if we replace the old
> image?
>
> Besides, I will also fix the docker build script to avoid similar issues.
>
> Thanks,
> Zike Yang
>

Re: [DISCUSS] PIP-272 Add a `StateStoreConfig` to the `WorkerConfig`

2023-05-29 Thread Asaf Mesika

Hi Pengcheng,

Looks like a solid improvement, definitely helping people using their own
state store.

I have a few comments:

1. Background knowledge should explain what a state storage is.
2. Move the problem description from Background Knowledge to Motivation.

I'm quoting the template to understand what should be included in
the Background knowledge section:

3. `WorkerConfig` - explain briefly what a Worker is and how it differs from
a Broker. This should be in the background knowledge section.

4. Background knowledge should explain briefly what a runtime and a
runtime factory are.

5.

> Add a new cli argument to JavaInstanceStarter and LocalRunner so
> process runtime can pass state related config to them

Does all config go through arguments today, and not a file?

6. `--stateStorageConfig`
  What format is the expected value?

7. `functions local run`
 What is this?

8. Are you keeping `stateStorageServiceUrl`? Maybe people rely on it?

9. Don't forget to include a link to the discussion thread using Apache Pony Mail

On Mon, May 29, 2023 at 10:44 AM Rui Fu  wrote:

> Hi Pengcheng,
>
> Thanks for bringing this up, the PIP lgtm, +1.
>
> Best,
>
> Rui Fu
> On May 29, 2023 at 13:52 +0800, Enrico Olivelli wrote:
> > Looks good
> > +1
> >
> > Enrico
> >
> > On Mon, May 29, 2023, 04:47 Pengcheng Jiang wrote:
> >
> > > Dear Pulsar community,
> > >
> > > I created a pip to make pulsar functions' `StateStoreProvider`
> configurable
> > > with custom configurations:
> https://github.com/apache/pulsar/issues/20419
> > >
> > > Any feedback and suggestions are welcome
> > >
> > > Sincerely
> > > Pengcheng Jiang
> > >
>

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-28 Thread Asaf Mesika

I pushed the first PR defining the new process:
https://github.com/apache/pulsar/pull/20418

On Wed, May 10, 2023 at 7:46 PM Asaf Mesika  wrote:

> The documentation has severe issues with diagrams in general, today.
> There is no standard way yet to do it. We have all kinds of ways to do
> diagrams, resulting in an inconsistent look for the documentation.
> I think it deserves its own discussion/PIP/issue.
>
> Regardless, I think it's part of a PIP to add documentation to describe
> the feature.
>
>
> On Wed, May 10, 2023 at 3:58 PM Dave Fisher  wrote:
>
>>
>>
>> Sent from my iPhone
>>
>> > On May 10, 2023, at 12:01 AM, Asaf Mesika 
>> wrote:
>> >
>> > On Tue, May 9, 2023 at 8:03 PM Dave Fisher  wrote:
>> >
>> >>
>> >>
>> >>>> On May 9, 2023, at 5:47 AM, Asaf Mesika 
>> wrote:
>> >>>
>> >>>> On Tue, May 9, 2023 at 5:18 AM Hang Chen 
>> wrote:
>> >>>
>> >>>> Thanks for driving this discussion.
>> >>>>
>> >>>> I agree to change the proposal discussion from issue and dev mail
>> list
>> >>>> to PR. It will be easier to review and comment, especially for large
>> >>>> proposals.
>> >>>> I have two questions about this change.
>> >>>> - Some proposals contain images, and putting those images into Pulsar
>> >>>> main repo will make the git db large. What's more, some images can be
>> >>>> up to several MBs
>> >>>>
>> >>>
>> >>> That's a great point, and we must address it in the PIP.
>> >>> How about we say that you only use:
>> >>> 1. Mermaid <https://mermaid.js.org/#/> - a tiny language for creating
>> >>> drawings. GitHub supports this language in code blocks and renders it
>> >>> correctly.
>> >>
>> >> Does Docusaurus support Mermaid? The design documents for a PIP should
>> be
>> >> available for easy inclusion in pulsar-site.
>> >>
>> >
>> > Do we have plans to have a section dedicated to displaying PIPs on the
>> > website?
>>
>> We should make sure it is easy to convert a PIP into user documentation
>> of what is finally merged.
>>
>> Best,
>> Dave
>> >
>> >
>> >>
>> >>> 2. Use SVG files which will be located in a folder named after the pip
>> >>> issue number. SVG are vector graphics saved as text. For diagrams,
>> >>> they should be ok in size, and compress well.
>> >>>
>> >>> I think Mermaid should be enough for all drawings needed for
>> illustration
>> >>> of design document purposes. WDYT?
>> >>
>> >> I think that any reasonable format should be OK, but easily editable
>> >> versions should be preferred. All modern tools ought to be able to
>> export
>> >> SVG and all modern browsers render them.
>> >>
>> >> Best,
>> >> Dave
>> >>
>> >>>
>> >>>
>> >>>
>> >>>> - After merging one proposal, if we want to update the content, do we
>> >>>> need to discuss it in the dev mail list or just push one PR to update
>> >>>> it?
>> >>>>
>> >>>
>> >>> Does it happen often?
>> >>> I guess if the change is not big, it's ok just to do PR.
>> >>>
>> >>> I can clarify that as well, if it is agreed upon.
>> >>>
>> >>>
>> >>>>
>> >>>> Thanks,
>> >>>> Hang
>> >>>>
>> >>>>
>> >>>> On Mon, May 8, 2023 at 18:06, PengHui Li wrote:
>> >>>>>
>> >>>>> Thanks for driving the improvements in proposal managing and
>> reviewing.
>> >>>>> The proposal looks good to me. I have only one question about the
>> dir
>> >>>> name
>> >>>>> for the pips.
>> >>>>>
>> >>>>> For now, we have
>> >>>>> https://github.com/apache/pulsar/tree/master/wiki/proposals
>> >>>>> Is it better to use the existing one, or rename the existing one to
>> >>>>> "pip"? I mean, we'd better not use two dirs for proposals.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Penghui
>> >>>>>
>> >>>>> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika 
>> >>>> wrote:
>> >>>>>
>> >>>>>> Ping, in case it was lost in the barrage of mails
>> >>>>>>
>> >>>>>> On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika <
>> asaf.mes...@gmail.com>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> I've summarized all comments from
>> >>>>>>> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
>> >>>> into a
>> >>>>>>> PIP.
>> >>>>>>>
>> >>>>>>> PIP: https://github.com/apache/pulsar/issues/20207
>> >>>>>>> <https://github.com/apache/pulsar/issues/20207>
>> >>>>>>>
>> >>>>>>> I'm leaving this discussion open for 2-3 days to make sure I
>> haven't
>> >>>>>>> missed a comment, and proceed to vote, since we had most of the
>> >>>>>> discussion
>> >>>>>>> already in the link provided above.
>> >>>>>>>
>> >>>>>>> Thanks!
>> >>>>>>>
>> >>>>>>> Asaf
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> >>
>>
>>

Re: [VOTE] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-22 Thread Asaf Mesika

Who can help me update the wiki to mark this vote as passed?

On Mon, May 22, 2023 at 4:20 PM Asaf Mesika  wrote:

> Thank you all for your votes.
>
> Summary
> Binding votes: Penghui, Hang, Enrico, Mattisson
> Non binding: Max, Zike
> Binding no vote (0): Dave Fisher
>
> Passed with 4 binding +1 and 2 non-binding +1 and 1 binding 0.
>
> I'll publish a separate email to start with the new process once I finish
> with the PRs
>
> On Mon, May 22, 2023 at 3:27 PM Enrico Olivelli 
> wrote:
>
>> +1 (binding)
>>
>> Enrico
>>
>> On Sun, May 21, 2023 at 18:05 Asaf Mesika wrote:
>> >
>> > We're not dropping. We're allowing both.
>> > Quoting from PIP below.
>> > Are you ok with this, Enrico?
>> > Again - we're trying this out, and we'll iterate after we've lived with
>> > it for a while.
>> >
>> >
>> >
>> >   1. People discuss using PR comments, each in its own threaded comment.
>> >   General comments can be made either as replies in the mailing list or
>> >   as a general comment in the PR. After 10 PIPs done this way we’ll be
>> >   able to see what people gravitate towards and what’s more convenient,
>> >   and consider refining that.
>> >
>> >
>> > On Tue, May 16, 2023 at 8:55 AM Enrico Olivelli 
>> wrote:
>> >
>> > > Dave,
>> > > I don't think that we are dropping the discussion on the mailing list.
>> > > In that case I would cast a -1.
>> > >
>> > > Enrico
>> > >
>> > > On Tue, May 16, 2023 at 06:27 Dave Fisher wrote:
>> > > >
>> > > > -0 (binding) having discussions on the mailing list makes it much
>> easier
>> > > to understand decisions years in the future. Just yesterday I reviewed
>> > > email threads in another project from 16 years ago.
>> > > >
>> > > > Best,
>> > > > Dave
>> > > >
>> > > > Sent from my iPhone
>> > > >
>> > > > > On May 10, 2023, at 3:52 AM, Asaf Mesika 
>> > > wrote:
>> > > > >
>> > > > > Hi,
>> > > > >
>> > > > > I'm starting the vote process for PIP-265.
>> > > > >
>> > > > > Link: https://github.com/apache/pulsar/issues/20207
>> > > > >
>> > > > > Thanks!
>> > > > >
>> > > > > Asaf
>> > > >
>> > >
>>
>

Re: [VOTE] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-22 Thread Asaf Mesika

Thank you all for your votes.

Summary
Binding votes: Penghui, Hang, Enrico, Mattisson
Non binding: Max, Zike
Binding no vote (0): Dave Fisher

Passed with 4 binding +1 and 2 non-binding +1 and 1 binding 0.

I'll publish a separate email to start with the new process once I finish
with the PRs

On Mon, May 22, 2023 at 3:27 PM Enrico Olivelli  wrote:

> +1 (binding)
>
> Enrico
>
> On Sun, May 21, 2023 at 18:05 Asaf Mesika wrote:
> >
> > We're not dropping. We're allowing both.
> > Quoting from PIP below.
> > Are you ok with this, Enrico?
> > Again - we're trying this out, and we'll iterate after we've lived with
> > it for a while.
> >
> >
> >
> >   1. People discuss using PR comments, each in its own threaded comment.
> >   General comments can be made either as replies in the mailing list or
> >   as a general comment in the PR. After 10 PIPs done this way we’ll be
> >   able to see what people gravitate towards and what’s more convenient,
> >   and consider refining that.
> >
> >
> > On Tue, May 16, 2023 at 8:55 AM Enrico Olivelli 
> wrote:
> >
> > > Dave,
> > > I don't think that we are dropping the discussion on the mailing list.
> > > In that case I would cast a -1.
> > >
> > > Enrico
> > >
> > > On Tue, May 16, 2023 at 06:27 Dave Fisher wrote:
> > > >
> > > > -0 (binding) having discussions on the mailing list makes it much
> easier
> > > to understand decisions years in the future. Just yesterday I reviewed
> > > email threads in another project from 16 years ago.
> > > >
> > > > Best,
> > > > Dave
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On May 10, 2023, at 3:52 AM, Asaf Mesika 
> > > wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I'm starting the vote process for PIP-265.
> > > > >
> > > > > Link: https://github.com/apache/pulsar/issues/20207
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Asaf
> > > >
> > >
>

Re: [DISCUSS] PIP-270 Add config to set metadata size threshold for compression.

2023-05-22 Thread Asaf Mesika

Hi,

I have a couple of questions:

> current we support to set compression on ManagedLedger and ManagedCursor
> metadata by managedLedgerInfoCompressionType and
> managedCursorInfoCompressionType

Can you please elaborate on what the metadata of ManagedLedger and
ManagedCursor is?
Where is that metadata saved?
Can you briefly explain the managed ledger's role - it's the underlying
object for topics, right?

I would be grateful if you can expand the PIP, so the reader will have
answers to those questions when reading it.

On Thu, May 11, 2023 at 7:28 PM Enrico Olivelli  wrote:

> Makes sense to me.
> Thanks for your contribution
>
> Enrico
>
> On Thu, May 11, 2023 at 18:01 lifepuzzlefun wrote:
> >
> > Dear Pulsar community,
> >
> >
> > I created a PIP that aims to add configuration for size-based metadata
> > compression.
> >
> >
> > https://github.com/apache/pulsar/issues/20307
> >
> >
> > We welcome your feedback and suggestions on this proposal.
> >
>

Re: [VOTE] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-21 Thread Asaf Mesika

We're not dropping. We're allowing both.
Quoting from PIP below.
Are you ok with this, Enrico?
Again - we're trying this out, and we'll iterate after we've lived with it
for a while.

   1. People discuss using PR comments, each in its own threaded comment.
   General comments can be made either as replies in the mailing list or as
   a general comment in the PR. After 10 PIPs done this way we’ll be able to
   see what people gravitate towards and what’s more convenient, and
   consider refining that.

On Tue, May 16, 2023 at 8:55 AM Enrico Olivelli  wrote:

> Dave,
> I don't think that we are dropping the discussion on the mailing list.
> In that case I would cast a -1.
>
> Enrico
>
> On Tue, May 16, 2023 at 06:27 Dave Fisher wrote:
> >
> > -0 (binding) having discussions on the mailing list makes it much easier
> to understand decisions years in the future. Just yesterday I reviewed
> email threads in another project from 16 years ago.
> >
> > Best,
> > Dave
> >
> > Sent from my iPhone
> >
> > > On May 10, 2023, at 3:52 AM, Asaf Mesika 
> wrote:
> > >
> > > Hi,
> > >
> > > I'm starting the vote process for PIP-265.
> > >
> > > Link: https://github.com/apache/pulsar/issues/20207
> > >
> > > Thanks!
> > >
> > > Asaf
> >
>

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-21 Thread Asaf Mesika

Thanks for the reply, Enrico.
Completely agree.
This made me realize my TL;DR wasn't talking about export.
I added this to it:

---
Pulsar OTel Metrics will support exporting via a Prometheus HTTP endpoint
(`/metrics`, but on a different port) for backward compatibility, and also
via OTLP, so you can push the metrics to an OTel Collector and from there
ship them to any destination.
---

OTel supports two kinds of exporters: Prometheus (HTTP) and OTLP (push).
We'll just configure Pulsar to use them.
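
To make the Prometheus side concrete, here is a minimal sketch of how a
histogram is expected to look in the Prometheus text exposition format:
bucket counters are cumulative, and the bucket upper bound goes in an `le`
label. The metric name, bucket bounds and helper function are made up for
illustration; this is not Pulsar or OTel code.

```python
import bisect


def render_histogram(name, bounds, observations):
    """Render observations as a Prometheus-style histogram in text format."""
    # Per-bucket counts; the last slot is the implicit +Inf bucket.
    counts = [0] * (len(bounds) + 1)
    for value in observations:
        # bisect_left finds the first bound >= value ("le" = less or equal).
        counts[bisect.bisect_left(bounds, value)] += 1

    lines, running = [], 0
    for bound, count in zip(list(bounds) + ["+Inf"], counts):
        running += count  # cumulative: each bucket includes all smaller ones
        lines.append(f'{name}_bucket{{le="{bound}"}} {running}')
    lines.append(f"{name}_sum {sum(observations)}")
    lines.append(f"{name}_count {len(observations)}")
    return "\n".join(lines)


print(render_histogram("publish_latency_ms", [1, 5, 10], [0.5, 3, 3, 7, 42]))
```

Because the bucket counters only ever grow, a scraper can compute rates and
estimate quantiles from them on its own side.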



On Mon, May 15, 2023 at 10:35 AM Enrico Olivelli 
wrote:

> Asaf,
> thanks for contributing in this area.
> Metrics are a fundamental feature of Pulsar.
>
> Currently I find it very awkward to maintain metrics, and also I see
> it as a problem to support only Prometheus.
>
> Regarding your proposal, IIRC in the past someone else proposed to
> support other metrics systems and they were advised to use a
> sidecar approach,
> that is, to add something next to Pulsar services that serves the
> metrics in the preferred format/way.
> I find that the sidecar approach is too inefficient and I am not
> proposing it (but I wanted to add this reference for the benefit of
> new people on the list).
>
> I wonder if it would be possible to keep compatibility with the
> current Prometheus based metrics.
> Pulsar has now reached a point at which it is widely used by many
> companies, including with big clusters.
> Telling people that they have to rework all the infrastructure related
> to metrics because we don't support Prometheus anymore, or because we
> radically changed the way we publish metrics,
> is a step that seems too hard from my point of view.
>
> Currently I believe that compatibility is more important than
> versatility, and if we want to introduce new (and far better) features
> we must take it into account.
>
> So my point is that I generally support the idea of opening the way to
> Open Telemetry, but we must have a way to not force all of our users
> to throw away their alerting systems, dashboards and know-how in
> troubleshooting Pulsar problems in production and dev
>
> Best regards
> Enrico
>
> On Mon, May 15, 2023 at 02:17 Dave Fisher wrote:
> >
> >
> >
> > > On May 10, 2023, at 1:01 AM, Asaf Mesika 
> wrote:
> > >
> > > On Tue, May 9, 2023 at 11:29 PM Dave Fisher  wrote:
> > >
> > >>
> > >>
> > >>>> On May 8, 2023, at 2:49 AM, Asaf Mesika 
> wrote:
> > >>>
> > >>> Your feedback made me realize I need to add a "TL;DR" section, which I
> > >>> just added.
> > >>>
> > >>> I'm quoting it here. It gives a brief summary of the proposal, which
> > >>> requires up to 5 min of read time, helping you get a high level
> > >>> picture before you dive into the background/motivation/solution.
> > >>>
> > >>> --
> > >>> TL;DR
> > >>>
> > >>> Working with Metrics today as a user or a developer is hard and has
> > >>> many severe issues.
> > >>>
> > >>> From the user perspective:
> > >>>
> > >>>  - One of Pulsar's strongest features is "cheap" topics, so you can
> > >>>  easily have 10k - 100k topics per broker. Once you do that, you
> > >>>  quickly learn that the amount of metrics you export via "/metrics"
> > >>>  (Prometheus style endpoint) becomes really big. The cost to store
> > >>>  them becomes too high, queries time out, or even the "/metrics"
> > >>>  endpoint itself times out.
> > >>>  The only option Pulsar gives you today is all-or-nothing filtering
> > >>>  and very crude aggregation. You switch metrics from topic
> > >>>  aggregation level to namespace aggregation level. You can also turn
> > >>>  off producer and consumer level metrics. You end up doing all of it,
> > >>>  leaving you "blind", looking at the metrics from a namespace level,
> > >>>  which is too high level. You end up conjuring all kinds of scripts
> > >>>  on top of the topic stats endpoint to glue together some aggregated
> > >>>  metrics view for the topics you need.
> > >>>  - Summaries (a metric type giving you quantiles like p95), which are
> > >>>  used in Pulsar, can't be aggregated across topics / brokers due to
> > >>>  their inherent design.
> > >>>

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-14 Thread Asaf Mesika

Bumping, as I have received only one piece of partial feedback so far on the PIP.

Any feedback is highly appreciated!

Thanks!

Asaf

On Wed, May 10, 2023 at 11:00 AM Asaf Mesika  wrote:

>
>
> On Tue, May 9, 2023 at 11:29 PM Dave Fisher  wrote:
>
>>
>>
>> > On May 8, 2023, at 2:49 AM, Asaf Mesika  wrote:
>> >
>> > Your feedback made me realize I need to add a "TL;DR" section, which I
>> > just added.
>> >
>> > I'm quoting it here. It gives a brief summary of the proposal, which
>> > requires up to 5 min of read time, helping you get a high level picture
>> > before you dive into the background/motivation/solution.
>> >
>> > --
>> > TL;DR
>> >
>> > Working with Metrics today as a user or a developer is hard and has many
>> > severe issues.
>> >
>> > From the user perspective:
>> >
>> >   - One of Pulsar's strongest features is "cheap" topics, so you can
>> >   easily have 10k - 100k topics per broker. Once you do that, you quickly
>> >   learn that the amount of metrics you export via "/metrics" (Prometheus
>> >   style endpoint) becomes really big. The cost to store them becomes too
>> >   high, queries time out, or even the "/metrics" endpoint itself times out.
>> >   The only option Pulsar gives you today is all-or-nothing filtering and
>> >   very crude aggregation. You switch metrics from topic aggregation level
>> >   to namespace aggregation level. You can also turn off producer and
>> >   consumer level metrics. You end up doing all of it, leaving you "blind",
>> >   looking at the metrics from a namespace level, which is too high level.
>> >   You end up conjuring all kinds of scripts on top of the topic stats
>> >   endpoint to glue together some aggregated metrics view for the topics
>> >   you need.
>> >   - Summaries (a metric type giving you quantiles like p95), which are
>> >   used in Pulsar, can't be aggregated across topics / brokers due to
>> >   their inherent design.
>> >   - Plugin authors spend too much time on defining and exposing metrics to
>> >   Pulsar, since the only interface Pulsar offers is writing your metrics
>> >   yourself as UTF-8 bytes in Prometheus Text Format to a byte stream
>> >   interface given to you.
>> >   - Pulsar histograms are exported in a way that is not conformant with
>> >   Prometheus, which means you can't get the p95 quantile on such
>> >   histograms, making them very hard to use in day-to-day life.
>>
>> What version of DataSketches is used to produce the histogram? Is it
>> still an old Yahoo one, or are we using an updated one from Apache
>> DataSketches?
>>
>> Seems like this is a single PR/small PIP for 3.1?
>
>
> Histograms are a list of buckets, each of which is a counter.
> A Summary is a collection of values collected over a time window; at the
> end of the window you get the quantiles of those values (p95, p50, ...),
> and those are exported from Pulsar.
>
> Pulsar histograms do not use DataSketches. They are just counters.
> They do not adhere to the Prometheus format since:
> a. The counter is expected to be cumulative, but Pulsar resets each bucket
> counter to 0 every 1 min
> b. The bucket upper range is expected to be written as an attribute "le"
> but today it is encoded in the name of the metric itself.
>
> Fixing this is a breaking change, hence hard to make in any small release.
> This is why it's part of this PIP: so many things will break, and all
> of them will break on a separate layer (OTel metrics), hence not breaking
> anyone without their consent.
>
>
>
>>
>>
>> >   - Too many metrics are rates which also delta-reset at every interval
>> >   you configure in Pulsar and on restart, instead of relying on cumulative
>> >   (ever growing) counters and letting Prometheus use its rate function.
>> >   - and many more issues
>> >
>> > From the developer perspective:
>> >
>> >   - There are 4 different ways to define and record metrics in Pulsar:
>> >   Pulsar's own metrics library, the Prometheus Java Client, the Bookkeeper
>> >   metrics library, and plain native Java SDK objects (AtomicLong, ...).
>> >   It's very confusing for the developer and creates inconsistencies for
>> >   the end user (e.g. Summary is different in each).
>> >   - Patching your metrics into "/metrics" Prometheus endpoint is

Re: [DISCUSS] PIP-268: Add support of topic stats/stats-internal using client api

2023-05-14 Thread Asaf Mesika

If I dive into the exact cause of the performance implications in the
current Admin HTTP API:

Do you think the root cause is the Jersey implementation of
`AsyncResponse.resume(stats)`, which takes a thread from a thread pool,
serializes the object and then performs a blocking I/O write of the JSON
string?

If I compare that with Netty HTTP, it would write the same string using
async I/O, without blocking a thread.

Given the normally large response size of those objects, the response
headers or request headers are negligible in terms of performance impact.

In terms of accepting connections, both Netty and Jetty have async I/O
implementations.

Comparing Jetty and Jersey with Netty-based binary TCP, if we end up writing
a JSON string, the only difference I see is the blocking I/O of writing the
response.
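
A toy model of the blocking vs. async write difference, as a sketch only: the
same responses are "written" to slow clients, once by a small thread pool
where each write holds a thread until it completes, and once on an event loop
where a pending write holds no thread. The sleep stands in for a slow socket
write; the pool size, response count and delay are arbitrary, and nothing
here measures Jersey, Jetty or Netty.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

WRITE_DELAY = 0.05  # stand-in for a slow client draining a large JSON response
RESPONSES = 8
POOL_SIZE = 2


def blocking_write(_):
    time.sleep(WRITE_DELAY)  # the pool thread is held for the whole write


def run_blocking():
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        list(pool.map(blocking_write, range(RESPONSES)))
    return time.perf_counter() - start


async def async_write(_):
    await asyncio.sleep(WRITE_DELAY)  # the event loop is free while waiting


async def _run_all():
    await asyncio.gather(*(async_write(i) for i in range(RESPONSES)))


def run_async():
    start = time.perf_counter()
    asyncio.run(_run_all())
    return time.perf_counter() - start


blocking_s, async_s = run_blocking(), run_async()
print(f"blocking pool: {blocking_s:.2f}s, event loop: {async_s:.2f}s")
```

In this model the blocking variant needs several rounds of writes because
only POOL_SIZE writes can be in flight at once, while the event loop overlaps
all of them. Serialization cost is deliberately left out, since it would be
the same in both variants.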

WDYT?


On Fri, May 12, 2023 at 7:29 AM Rajan Dhabalia  wrote:

> Communicating over binary protocol is more scalable and performant than
> HTTP. Admin API over http has a long history of bottleneck and performance
> issues which could also sometimes be a bottleneck for lookup requests and
> that was the reason we introduced lookup over binary protocol as well. We
> have multiple usecases which require fetching stats with relatively higher
> rate and definitely we would like to avoid it over http which could be a
> bottleneck for those applications or could be for others.
> This PIP doesn't mention security so, let's not misinterpret the usecases.
> Sometimes, pulsar is deployed behind the proxy and potentially used SNI
> routing proxy which can't be used as http proxy and we would like to let
> users access stats for a given topic using the same broker-service url
> rather than having separate http endpoints. So, this api addresses scale,
> performance, and use accessibility in pulsar.
>
> Thanks,
> Rajan
>
> On Thu, May 11, 2023 at 6:24 AM Asaf Mesika  wrote:
>
> > Before I dive into the PIP, I have several questions on the background
> > provided below:
> >
> >
> > On Tue, May 9, 2023 at 9:08 AM Rajan Dhabalia 
> > wrote:
> >
> > > Hi,
> > >
> > > Right now, Pulsar provides the topic's stats and stats-internal over
> HTTP
> > > admin API, and this stats data is used by user applications and also by
> > > Pulsar internal components such as Pulsar-functions to derive the
> certain
> > > states of the applications.
> > > for example, there are use cases where the application wants to check
> the
> > > topic's backlog, subscription's state (readPosition, list of
> > > subscriptions), numberOfEntriesSinceFirstNotAckedMessage, etc to
> > bootstrap
> > > the application or handle the application’s resiliency and state
> > > dynamically. Applications can retrieve this stats information by using
> > the
> > > broker’s admin HTTP APIs.
> > >
> > > However, stats retrieval over the HTTP API doesn’t work well when users
> > > want to access this API at a higher scale: a large number of
> > > application nodes calling it over HTTP can overload brokers, sometimes
> > > making a broker unresponsive and impacting admin API performance. It
> > > also becomes difficult when Pulsar is deployed in the cloud behind an
> > > SNI proxy and applications also want to access large-scale stats
> > > information periodically over different HTTP ports. Instead, it would
> > > be better if applications could fetch stats over the same binary
> > > protocol, for scalability and accessibility reasons.
> > >
> >
> > Why do you think using a binary protocol instead of HTTP would make it
> > more performant to respond to multiple calls at once?
> > Same question, but for the security issue - why do you think the HTTP port
> > of the admin API is harder to access than the binary protocol port?
> >
> >
> >
> >
> > >
> > > Therefore, there are multiple use cases where producer/consumer
> > > applications need stats information for topics using the client library
> > > over binary protocol. Hence, this PIP introduces client API for
> producers
> > > and consumers to access topic stats/internal-stats information which
> can
> > be
> > > used by applications as needed.
> > >
> > > Please visit and review the PIP:
> > > https://github.com/apache/pulsar/issues/20265
> > >
> > >
> > > Thanks,
> > >
> > > Rajan
> > >
> >
>

Re: [DISCUSS] PIP-267: Support multi-topic messageId deserialization to ack messages

2023-05-12 Thread Asaf Mesika

I don't get it - you say msgId is a data structure contained within the
MessageId implementation, right? I presume msgId is the data structure the
client transmits to the server, so does that mean you are transmitting the
topic to the server?


On Fri, May 12, 2023 at 7:45 AM Rajan Dhabalia  wrote:

> Thank you for sharing your knowledge about the PIP which should be created
> before PR and I think everyone in the community knows about it. but you can
> check the PR for context which was blocked for sometime and we decided to
> create PIP with proto changes.
>
> This PIP/PR tries to fix the issue where a partitioned topic fails while
> acking a deserialized messageId. The topic name will be part of MsgIdData,
> which is the data structure used by messageId to store the msgId context
> along with partition, batching, and other metadata. The topic name will be
> attached only when the user serializes and deserializes the messageId,
> which is a purely client-side implementation; in other cases it will not
> be transmitted to the server. Also, a partitioned topic is an abstract
> concept for the user, and messageId must remain abstract as well: users
> must not need to know about the different implementations of messageId.
> Our goal is to maintain that abstraction without telling the user about
> MessageIdImpl or TopicMessageIdImpl.
>
> Thanks,
> Rajan
>
>
> On Thu, May 11, 2023 at 7:29 AM Asaf Mesika  wrote:
>
> > Hi Rajan,
> >
> > A few comments on the PIP, as I couldn't understand it fully because some
> > pieces of information are missing.
> >
> > First, I would like to remind you of the rules that appear at the
> > beginning of the PIP template:
> >
> > 
> >
> >
> > In this specific case
> > 1. I would include an explanation and detail the fields of the data
> > structures you mentioned, such as MessageIdImpl and MessageIdData.
> > 2. I would not put a PR as the design section, since then I need to read
> > code to understand the exact solution details.
> >
> > You wrote:
> >
> > > Pulsar api provides MessageId
> > > <https://github.com/apache/pulsar/blob/master/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/MessageId.java>
> > > interface which is generally used by producer and consumer applications
> > > to manage topic offset.
> >
> >
> > I think it's used to allow consumers to acknowledge (can be per message),
> > so offset is the wrong terminology here.
> > For producers, I'm not sure exactly what its usage is. Maybe they need to
> > refer to this message later when reading via the Reader interface.
> > I would correct this section.
> >
> > > However, right now Pulsar doesn't support correct deserialization of
> > > multi-topic or partitioned-topic because of that `acknowledge` API call
> > > fails for those topics with below error
> >
> >
> > You're saying that the acknowledgement API method signature receives
> > MessageId, but does not receive TopicMessageId?
> >
> > I have a few questions on that:
> >
> > 1. The acknowledgement API is part of the Pulsar binary protocol. Is your
> > plan to alter that protocol so it will also include the topic field as
> > part of the message ID?
> >
> > 2. I think your PIP needs to explain the following items, which are
> > missing as context:
> > - There are two implementations of the MessageId interface, one for topic
> > and one for partitioned topic.
> > - The problem is that the serialization/deserialization method is used
> > mainly for translating the ID into the binary protocol, which only
> > requires the ID (ledger ID, entry ID).
> > - The reason for that is that once you have created a consumer, it has a
> > topic attached to it. Transferring the topic for the ack is redundant.
> >
> > All of this needs to be in the background.
> >
> > I have several ideas on solving that, which IMO should mainly be in the
> > client level, but I must get answers to the questions above before I can
> > continue.
> >
> > Last note
> > You have basically placed a link to a pull request as the design solution
> > (high-level/detailed design).
> > The whole idea of the design is that you describe the solution without
> > resorting to code.
> > IMO you should amend the design, state the goal briefly, and have a
> > high-level design section which contains 1-2 short paragraphs describing
> > exactly your solution.
> >
> > Thanks,
> >
> > Asaf
> >
> >
> >
> > On Tue, May 9, 2023 at 3:24 PM Yunze Xu  wrote:
> >

Re: [DISCUSS] Add checklist for PMC binding vote of PIP

2023-05-12 Thread Asaf Mesika

We have the whole thread answering that question, Yunze.

On Fri, May 12, 2023 at 9:29 AM Yunze Xu  wrote:

> I found I just misunderstood the "checklist" you mean. I thought it's
> more like a "summary" of a proposal. So I thought you wanted the
> reviewers to give a summary list and select which of them are
> understood. But why do we need a checklist? Is there any reason that
> any item of the list is not selected?
>
> Thanks,
> Yunze
>
>
>
> On Wed, May 10, 2023 at 3:32 PM Asaf Mesika  wrote:
> >
> > Hi Yunze,
> >
> > Thanks for the feedback.
> >
> > I re-read your comments 3 times and I can't seem to be able to understand
> > your key points in the matter of the checklist, so I have some
> > clarification questions:
> >
> > 1. You said you reviewed PIP-261, remembered the checklist proposal, but
> > couldn't add it. Can you explain why?
> > 2. Why would the author of a PIP give you a checklist for their vote? Can
> > you please expand on that?
> > I completely agree that if the author of a PIP needs to add a checklist, it
> > will be a burden, hence I don't see the reason for it and didn't suggest it.
> > 3. You say you want the PIP process to be more friendly to contributors.
> >  a) Can you please explain which changes you propose to make it more
> > friendly?
> >  b) The checklist is for the voters (mainly PMC members), not the PIP
> > authors. Why would adding the checklist create any burden for the PIP
> > author and make the PIP process unfriendly?
> >
> > 4. In the 2nd paragraph, if I try to summarize, you say it's hard to avoid
> > changes between the implementation of the PIP and the PIP itself. Also,
> > it's hard to review a PIP implementation since it's divided into many PRs.
> > Can you please explain the connection between this and a checklist for
> > voters on a PIP?
> >
> > 5. You said a checklist won't solve the key difficulties you described for
> > a huge PIP.
> >  You are correct. It won't. It's not the goal of the checklist to solve
> > those at all.
> >
> >  My main goal in the checklist is to make sure that a person, having
> > basic Pulsar user knowledge, can read the PIP and fully understand it.
> >
> > You think the checklist doesn't serve that goal?
> >
> > I think for huge PIPs it's even more important that the PIP will be
> > coherent for the reader and supply all background knowledge.
> >
> > 6. I agree with you that implementation can avoid following the design, but
> > it's a completely different problem we need to solve, unrelated to the
> > checklist goal. Let's open a separate discussion for it to brainstorm.
> >
> > 7. "A complicated proposal could not be understood by many reviewers. If
> > the author left the community, it could be a hard job to maintain it."
> >
> >   This is exactly what I want to avoid.
> >   When you vote +1, you must make sure most people reading it can
> > understand it. If they can't, let's help the author make it so. It must be
> > the minimum bar for any PIP.
> >   The checklist is to remind you of that.
> >   If the design is easily understandable, you have made the
> > implementation 10x easier to follow and maintain when the authors leave the
> > project.
> >
> >
> >
> >
> >
> > On Tue, May 9, 2023 at 9:39 PM Yunze Xu 
> > wrote:
> >
> > > I cannot agree more with Dave's comments.
> > >
> > > I just reviewed PIP-261 and PIP-264 yesterday. When I gave +1 to
> > > PIP-261, I recalled this thread so I'm wondering if I can add a
> > > checklist. Eventually, I did not do that. IMO, it's the author's
> > > responsibility to give a checklist for authors to choose for his/her
> > > proposal. However, it burdens the new contributors to the community.
> > > PIPs should be more friendly to new contributors. That's also my
> > > perspective to Rajan's concern: we should still require a PIP for
> > > changes of metrics or configurations, but the process should be more
> > > friendly to new contributors.
> > >
> > > When I reviewed PIP-264, I recalled PIP-45 and PIP-192 as well, while
> > > PIP-264 is much more huge than them. Accidentally, I was developing
> > > KoP for 2.8.0 (not released) when PIP-45 was in progress. It's really
> > > annoying to see the interfaces changed again and again in the master
> > > branch. The partner developers maintain their own version of Pulsar

Re: [VOTE] PIP-270 Add config to set metadata size threshold for compression

2023-05-11 Thread Asaf Mesika

30 minutes is not enough time to read a pip :)



On Thu, 11 May 2023 at 19:04 lifepuzzlefun  wrote:

> Hello Pulsar community,
>
> This thread is to start a vote on PIP-270: Add config to set metadata size
> threshold for compression
>
>
> Discussion thread:
> https://lists.apache.org/thread/6930c74m31rflrql9y3dpjmm0sbccqkb
> Issue:
> https://github.com/apache/pulsar/issues/20307
> Voting will be open for at least 48 hours. Thanks!
> : - )
>
>

Re: [DISCUSS] PIP-267: Support multi-topic messageId deserialization to ack messages

2023-05-11 Thread Asaf Mesika

Hi Rajan,

A few comments on the PIP, as I couldn't fully understand it because some
pieces of information are missing.

First, I would like to remind you of the rules that exist at the beginning
of the PIP template:

In this specific case:
1. I would include an explanation and detail the data structure fields of the
objects you mentioned, such as MessageIdImpl and MessageIdData.
2. I would not put a PR as the design section, since then I need to read code to
understand the exact solution details.

You wrote:

> Pulsar api provides MessageId
> 
>  interface
> which is generally used by producer and consumer applications to manage
> topic offset.

I think it's used to allow consumers to acknowledge (can be per message), so
offset is the wrong terminology here.
For producers, I'm not sure exactly what its usage is. Maybe they need it to
refer to this message later when reading via the Reader interface.
I would correct this section.

> However, right now Pulsar doesn't support correct deserialization of
> multi-topic or partitioned-topic because of that `acknowledge` API call
> fails for those topics with below error

You're saying that the acknowledgement API method signature receives
MessageId, but does not receive TopicMessageId?

I have a few questions on that:

1. The acknowledgement API is part of the Pulsar binary protocol. Is your plan
to alter that protocol so it will also include the topic field as part of
the message ID?

2. I think your PIP needs to explain the following items which are missing
as context:
- There are two implementations of the MessageId interface, one for topic and
one for partitioned topic.
- The problem is that the serialization/deserialization method is used
mainly for translating the ID into the binary protocol, which only requires
the ID (ledger ID, entry ID).
- The reason for that is that once you created a consumer, it has a topic
attached to it. Transferring the topic for the ack is redundant.
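
To make the items above concrete, here is a minimal Python sketch. Everything
in it is an assumption made for illustration: the fixed two-integer layout,
the Consumer and MultiTopicConsumer classes and their method names are
hypothetical and are not Pulsar's actual encoding or client API.

```python
import struct


def serialize_message_id(ledger_id, entry_id):
    """Pack only (ledger ID, entry ID) as two big-endian signed 64-bit ints.

    Hypothetical layout for illustration, not Pulsar's real encoding.
    """
    return struct.pack(">qq", ledger_id, entry_id)


def deserialize_message_id(data):
    """Return the (ledger_id, entry_id) tuple stored in the bytes."""
    return struct.unpack(">qq", data)


class Consumer:
    """Hypothetical consumer bound to exactly one topic."""

    def __init__(self, topic):
        self.topic = topic
        self.acked = []

    def acknowledge(self, data):
        # The topic comes from the consumer, not from the serialized ID.
        ledger_id, entry_id = deserialize_message_id(data)
        self.acked.append((self.topic, ledger_id, entry_id))


class MultiTopicConsumer:
    """Hypothetical consumer that wraps one Consumer per topic."""

    def __init__(self, topics):
        self.consumers = {topic: Consumer(topic) for topic in topics}

    def acknowledge(self, data, topic=None):
        # The bytes carry no topic, so the caller must supply it.
        if topic is None:
            raise ValueError("serialized ID has no topic; cannot route the ack")
        self.consumers[topic].acknowledge(data)


stored = serialize_message_id(12, 3)

single = Consumer("topic-a")
single.acknowledge(stored)  # works: the consumer already knows its topic

multi = MultiTopicConsumer(["topic-a", "topic-b"])
multi.acknowledge(stored, topic="topic-b")  # works only with the topic given
```

With a single-topic consumer the serialized ID is enough, because the topic
is implied. With several topics behind one consumer, the same bytes cannot be
routed unless the topic travels with the ID or is supplied separately.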

All of this needs to be in the background.

I have several ideas on solving that, which IMO should mainly be in the
client level, but I must get answers to the questions above before I can
continue.

Last note
You have basically placed a link to a pull request as the design solution
(high-level/detailed design).
The whole idea of the design is that you describe the solution without
resorting to code.
IMO you should amend the design, state the goal briefly, and have a
high-level design section which contains 1-2 short paragraphs describing exactly
your solution.

Thanks,

Asaf

On Tue, May 9, 2023 at 3:24 PM Yunze Xu  wrote:

> I'm talking about whether to add a new separate API. I'm concerned
> about whether existing applications would be affected, no matter if
> the existing implementation has the limitation. If yes, we should
> document it in the PIP so that users can know that.
>
> > it's a new optional field which would not break the compatibility
>
> And yes, I just confirmed it with simple demos in my local env. So I'm
> +1 to this proposal.
>
> Thanks,
> Yunze
>
> On Tue, May 9, 2023 at 3:05 PM Rajan Dhabalia 
> wrote:
> >
> > Well, there are multiple things: it's a new optional field which would not
> > break the compatibility; also, current messageId serialization and
> > deserialization anyway only impact the multi-topic consumer, which is already
> > broken or has the limitation; and adding a new separate API for partitioned
> > topics is not only not acceptable but creates too much confusion for users,
> > who would use separate ack APIs for non-partitioned and partitioned topics,
> > and that also breaks the partitioned topic abstraction, which we would like
> > to avoid.
> >
> > Thanks,
> > Rajan
> >
> > On Mon, May 8, 2023 at 11:27 PM Yunze Xu  wrote:
> >
> > > It seems that `TopicMessageIdImpl#toByteArray` now could serialize the
> > > optional topic field to the bytes. I didn't test it now but I have a
> > > concern about whether it would bring a breaking change.
> > >
> > > Assuming there are two applications (let's say A and B) based on an
> > > older Pulsar client, A writes serialized bytes into a file, B reads
> > > bytes from the file and parses it to a MessageId. If A upgraded its
> > > Pulsar client to the latest while B did not, what would happen? Could
> > > > B still get the correct MessageId, or would the bytes fail to be
> > > > parsed?
> > >
> > > P.S. it's better to add the API changes and potential breaking changes
> > > in the proposal.
> > >
> > > Thanks,
> > > Yunze
> > >
> > > On Tue, May 9, 2023 at 1:59 PM Rajan Dhabalia 
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > Pulsar api provides MessageId interface which is generally used by
> > > producer
> > > > and consumer applications to manage topic offset. Sometimes, these
> > > > applications would like to serialize and deserialize messageIds,
> > > > specifically consumer app which would like to persist messageId and
> ack
> > > > with

Re: [DISCUSS] PIP-268: Add support of topic stats/stats-internal using client api

2023-05-11 Thread Asaf Mesika

Before I dive into the PIP, I have several questions on the background
provided below:


On Tue, May 9, 2023 at 9:08 AM Rajan Dhabalia  wrote:

> Hi,
>
> Right now, Pulsar provides the topic's stats and stats-internal over HTTP
> admin API, and this stats data is used by user applications and also by
> Pulsar internal components such as Pulsar-functions to derive the certain
> states of the applications.
> for example, there are use cases where the application wants to check the
> topic's backlog, subscription's state (readPosition, list of
> subscriptions), numberOfEntriesSinceFirstNotAckedMessage, etc to bootstrap
> the application or handle the application’s resiliency and state
> dynamically. Applications can retrieve this stats information by using the
> broker’s admin HTTP APIs.
>
> However, stats retrieval over HTTP API doesn’t work well in use cases when
> users would like to access this API at a higher scale when a large number
> of application nodes would like to use it over HTTP which could overload
> brokers and sometimes makes brokers unresponsive and impacts admin API
> performance. It also becomes difficult when Pulsar is deployed in the cloud
> behind the SNI proxy and applications also want to access large-scale stats
> information periodically over different HTTP ports. Instead it would be
> better if applications can fetch stats over the same binary protocol for
> scalability and accessibility reasons.
>

Why do you think using a binary protocol instead of HTTP would make it more
performant when responding to multiple calls at once?
Same question, but for the security issue: why do you think the HTTP port
of the admin API is harder to access than the binary protocol port?




>
> Therefore, there are multiple use cases where producer/consumer
> applications need stats information for topics using the client library
> over binary protocol. Hence, this PIP introduces client API for producers
> and consumers to access topic stats/internal-stats information which can be
> used by applications as needed.
>
> Please visit and review the PIP:
> https://github.com/apache/pulsar/issues/20265
>
>
> Thanks,
>
> Rajan
>

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Asaf Mesika

The documentation has severe issues with diagrams in general, today.
There is no standard way yet to do it. We have all kinds of ways to do
diagrams, resulting in an inconsistent look for the documentation.
I think it deserves its own discussion/PIP/issue.

Regardless, I think it's part of a PIP to add documentation to describe the
feature.


On Wed, May 10, 2023 at 3:58 PM Dave Fisher  wrote:

>
>
> Sent from my iPhone
>
> > On May 10, 2023, at 12:01 AM, Asaf Mesika  wrote:
> >
> > On Tue, May 9, 2023 at 8:03 PM Dave Fisher  wrote:
> >
> >>
> >>
> >>>> On May 9, 2023, at 5:47 AM, Asaf Mesika 
> wrote:
> >>>
> >>>> On Tue, May 9, 2023 at 5:18 AM Hang Chen  wrote:
> >>>
> >>>> Thanks for driving this discussion.
> >>>>
> >>>> I agree to change the proposal discussion from issue and dev mail list
> >>>> to PR. It will be easier to review and comment, especially for large
> >>>> proposals.
> >>>> I have two questions about this change.
> >>>> - Some proposals contain images, and putting those images into Pulsar
> >>>> main repo will make the git db large. What's more, some images can be
> >>>> up to several MBs
> >>>>
> >>>
> >>> That's a great point, and we must address it in the PIP.
> >>> How about we say that you only use:
> >>> 1. Mermaid <https://mermaid.js.org/#/> - it's a tiny language to
> create
> >>> drawings? GitHub supports this language on code highlight and renders
> it
> >>> correctly.
> >>
> >> Does Docusaurus support Mermaid? The design documents for a PIP should
> be
> >> available for easy inclusion in pulsar-site.
> >>
> >
> > Do we have plans to have a section dedicated to displaying PIPs on the
> > website?
>
> We should make sure it is easy to convert a PIP into user documentation of
> what is finally merged.
>
> Best,
> Dave
> >
> >
> >>
> >>> 2. Use SVG files which will be located in a folder named after the pip
> >>> issue number. SVG are vector graphics saved as text. For diagrams,
> >>> they should be ok in size, and compress well.
> >>>
> >>> I think Mermaid should be enough for all drawings needed for
> illustration
> >>> of design document purposes. WDYT?
> >>
> >> I think that any reasonable format should be OK, but easily editable
> >> versions should be preferred. All modern tools ought to be able to
> export
> >> SVG and all modern browsers render them.
> >>
> >> Best,
> >> Dave
> >>
> >>>
> >>>
> >>>
> >>>> - After merging one proposal, if we want to update the content, do we
> >>>> need to discuss it in the dev mail list or just push one PR to update
> >>>> it?
> >>>>
> >>>
> >>> Does it happen often?
> >>> I guess if the change is not big, it's ok just to do PR.
> >>>
> >>> I can clarify that as well, if it is agreed upon.
> >>>
> >>>
> >>>>
> >>>> Thanks,
> >>>> Hang
> >>>>
> >>>>
> >>>> On Mon, May 8, 2023 at 6:06 PM PengHui Li  wrote:
> >>>>>
> >>>>> Thanks for driving the improvements in proposal managing and
> reviewing.
> >>>>> The proposal looks good to me. I have only one question about the dir
> >>>> name
> >>>>> for the pips.
> >>>>>
> >>>>> For now, we have
> >>>> https://github.com/apache/pulsar/tree/master/wiki/proposals
> >>>>> Is it better to use the existing one? Or change the existing one to
> >>>> "pip".
> >>>>> I mean, we'd better not use two dirs for proposals.
> >>>>>
> >>>>> Thanks,
> >>>>> Penghui
> >>>>>
> >>>>> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika 
> >>>> wrote:
> >>>>>
> >>>>>> Ping, in case it was lost in the barrage of mails
> >>>>>>
> >>>>>> On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika  >
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I've summarized all comments from
> >>>>>>> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
> >>>> into a
> >>>>>>> PIP.
> >>>>>>>
> >>>>>>> PIP: https://github.com/apache/pulsar/issues/20207
> >>>>>>> <https://github.com/apache/pulsar/issues/20207>
> >>>>>>>
> >>>>>>> I'm leaving this discussion open for 2-3 days to make sure I
> haven't
> >>>>>>> missed a comment, and proceed to vote, since we had most of the
> >>>>>> discussion
> >>>>>>> already in the link provided above.
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>>
> >>>>>>> Asaf
> >>>>>>>
> >>>>>>
> >>>>
> >>
> >>
>
>

[VOTE] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Asaf Mesika

Hi,

I'm starting the vote process for PIP-265.

Link: https://github.com/apache/pulsar/issues/20207

Thanks!

Asaf

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Asaf Mesika

I've also updated the doc regarding images, adding:

### Handling images
Since documents will now reside as files in git, we need to avoid large
image files.
Hence, we'll specify to authors that images need to be created using the
[mermaidJS](https://mermaid.js.org/#/) diagram language, which GitHub
supports rendering. It covers 99% of the cases. For the 1% case, they can
use the small-file-size SVG format, and make sure the file is 1k-5k in size.

Since I haven't seen any other blocker for this PIP discussed, and this issue
has been in discussion for almost 2 months, I'll open the vote for the PIP.

Thanks!

Asaf

On Wed, May 10, 2023 at 10:00 AM Asaf Mesika  wrote:

>
>
> On Tue, May 9, 2023 at 8:03 PM Dave Fisher  wrote:
>
>>
>>
>> > On May 9, 2023, at 5:47 AM, Asaf Mesika  wrote:
>> >
>> > On Tue, May 9, 2023 at 5:18 AM Hang Chen  wrote:
>> >
>> >> Thanks for driving this discussion.
>> >>
>> >> I agree to change the proposal discussion from issue and dev mail list
>> >> to PR. It will be easier to review and comment, especially for large
>> >> proposals.
>> >> I have two questions about this change.
>> >> - Some proposals contain images, and putting those images into Pulsar
>> >> main repo will make the git db large. What's more, some images can be
>> >> up to several MBs
>> >>
>> >
>> > That's a great point, and we must address it in the PIP.
>> > How about we say that you only use:
>> > 1. Mermaid <https://mermaid.js.org/#/> - it's a tiny language to create
>> > drawings? GitHub supports this language on code highlight and renders it
>> > correctly.
>>
>> Does Docusaurus support Mermaid? The design documents for a PIP should be
>> available for easy inclusion in pulsar-site.
>>
>
> Do we have plans to have a section dedicated to displaying PIPs on the
> website?
>
>
>>
>> > 2. Use SVG files which will be located in a folder named after the pip
>> > issue number. SVG are vector graphics saved as text. For diagrams,
>> > they should be ok in size, and compress well.
>> >
>> > I think Mermaid should be enough for all drawings needed for
>> illustration
>> > of design document purposes. WDYT?
>>
>> I think that any reasonable format should be OK, but easily editable
>> versions should be preferred. All modern tools ought to be able to export
>> SVG and all modern browsers render them.
>>
>> Best,
>> Dave
>>
>> >
>> >
>> >
>> >> - After merging one proposal, if we want to update the content, do we
>> >> need to discuss it in the dev mail list or just push one PR to update
>> >> it?
>> >>
>> >
>> > Does it happen often?
>> > I guess if the change is not big, it's ok just to do PR.
>> >
>> > I can clarify that as well, if it is agreed upon.
>> >
>> >
>> >>
>> >> Thanks,
>> >> Hang
>> >>
>> >>
>> >> On Mon, May 8, 2023 at 6:06 PM PengHui Li  wrote:
>> >>>
>> >>> Thanks for driving the improvements in proposal managing and
>> reviewing.
>> >>> The proposal looks good to me. I have only one question about the dir
>> >> name
>> >>> for the pips.
>> >>>
>> >>> For now, we have
>> >> https://github.com/apache/pulsar/tree/master/wiki/proposals
>> >>> Is it better to use the existing one? Or change the existing one to
>> >> "pip".
>> >>> I mean, we'd better not use two dirs for proposals.
>> >>>
>> >>> Thanks,
>> >>> Penghui
>> >>>
>> >>> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika 
>> >> wrote:
>> >>>
>> >>>> Ping, in case it was lost in the barrage of mails
>> >>>>
>> >>>> On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika 
>> >>>> wrote:
>> >>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> I've summarized all comments from
>> >>>>> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
>> >> into a
>> >>>>> PIP.
>> >>>>>
>> >>>>> PIP: https://github.com/apache/pulsar/issues/20207
>> >>>>> <https://github.com/apache/pulsar/issues/20207>
>> >>>>>
>> >>>>> I'm leaving this discussion open for 2-3 days to make sure I haven't
>> >>>>> missed a comment, and proceed to vote, since we had most of the
>> >>>> discussion
>> >>>>> already in the link provided above.
>> >>>>>
>> >>>>> Thanks!
>> >>>>>
>> >>>>> Asaf
>> >>>>>
>> >>>>
>> >>
>>
>>

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-10 Thread Asaf Mesika

On Tue, May 9, 2023 at 11:29 PM Dave Fisher  wrote:

>
>
> > On May 8, 2023, at 2:49 AM, Asaf Mesika  wrote:
> >
> > Your feedback made me realize I need to add a "TL;DR" section, which I
> > just added.
> >
> > I'm quoting it here. It gives a brief summary of the proposal, which
> > requires up to 5 min of read time, helping you get a high level picture
> > before you dive into the background/motivation/solution.
> >
> > --
> > TL;DR
> >
> > Working with Metrics today as a user or a developer is hard and has many
> > severe issues.
> >
> > From the user perspective:
> >
> >   - One of Pulsar strongest feature is "cheap" topics so you can easily
> >   have 10k - 100k topics per broker. Once you do that, you quickly learn
> that
> >   the amount of metrics you export via "/metrics" (Prometheus style
> endpoint)
> >   becomes really big. The cost to store them becomes too high, queries
> >   time-out or even "/metrics" endpoint it self times out.
> >   The only option Pulsar gives you today is all-or-nothing filtering and
> >   very crude aggregation. You switch metrics from topic aggregation
> level to
> >   namespace aggregation level. Also you can turn off producer and
> consumer
> >   level metrics. You end up doing it all leaving you "blind", looking at
> the
> >   metrics from a namespace level which is too high level. You end up
> >   conjuring all kinds of scripts on top of topic stats endpoint to glue
> some
> >   aggregated metrics view for the topics you need.
> >   - Summaries (metric type giving you quantiles like p95) which are used
> >   in Pulsar, can't be aggregated across topics / brokers due its inherent
> >   design.
> >   - Plugin authors spend too much time on defining and exposing metrics
> to
> >   Pulsar since the only interface Pulsar offers is writing your metrics
> by
> >   your self as UTF-8 bytes in Prometheus Text Format to byte stream
> interface
> >   given to you.
> >   - Pulsar histograms are exported in a way that is not conformant with
> >   Prometheus, which means you can't get the p95 quantile on such
> histograms,
> >   making them very hard to use in day to day life.
>
> What version of DataSketches is used to produce the histogram? Is it still
> an old Yahoo one, or are we using an updated one from Apache DataSketches?
>
> Seems like this is a single PR/small PIP for 3.1?


Histograms are a list of buckets, each of which is a counter.
A summary is a collection of values collected over a time window, from which,
at the end, you get a calculation of the quantiles of those values (p95, p50),
and those are exported from Pulsar.
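
A small Python sketch of why this difference matters when aggregating across
brokers. The sample latencies, bucket bounds and helper names are made up for
illustration, and the nearest-rank quantile is just one simple definition,
not necessarily the one Pulsar uses.

```python
import math


def quantile(values, q):
    """Nearest-rank quantile: smallest value with at least q of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def bucket_counts(values, upper_bounds):
    """Count values into buckets; the last slot is the overflow (+Inf) bucket."""
    counts = [0] * (len(upper_bounds) + 1)
    for value in values:
        for i, bound in enumerate(upper_bounds):
            if value <= bound:
                counts[i] += 1
                break
        else:
            counts[-1] += 1
    return counts


broker_a = [1, 2, 3, 4, 100]  # made-up latencies in ms
broker_b = [5, 6, 7, 8, 9]
bounds = [5, 10, 50]

# Histogram buckets are counters, so per-broker buckets can simply be added.
summed_buckets = [
    a + b
    for a, b in zip(bucket_counts(broker_a, bounds), bucket_counts(broker_b, bounds))
]
merged_buckets = bucket_counts(broker_a + broker_b, bounds)

# Summary quantiles are already-computed values; averaging them is not the
# quantile of the combined data.
averaged_p95 = (quantile(broker_a, 0.95) + quantile(broker_b, 0.95)) / 2
true_p95 = quantile(broker_a + broker_b, 0.95)

print(summed_buckets == merged_buckets)
print(averaged_p95, true_p95)
```

Adding the bucket counters from two brokers gives exactly the buckets of the
combined data, while combining the two exported p95 values does not give the
p95 of the combined data.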

Pulsar histograms do not use DataSketches. They are just counters.
They do not adhere to Prometheus since:
a. The counter is expected to be cumulative, but Pulsar resets each bucket
counter to 0 every 1 min.
b. The bucket upper range is expected to be written as an attribute "le",
but today it is encoded in the name of the metric itself.
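
For reference, a rough Python sketch of what a Prometheus-conformant
histogram looks like: counters that never reset, and the bucket upper bound
carried in the "le" label. The class, the metric name
pulsar_example_latency_ms and the bounds are hypothetical, made up for
illustration; this is not Pulsar or OTel code.

```python
class CumulativeHistogram:
    """Histogram whose bucket counters only ever grow (they are never reset)."""

    def __init__(self, name, upper_bounds):
        self.name = name
        self.upper_bounds = sorted(upper_bounds)
        # One counter per finite bucket, plus one overflow bucket.
        self.counts = [0] * (len(self.upper_bounds) + 1)
        self.total = 0

    def observe(self, value):
        """Record one value in the first bucket whose bound is >= value."""
        self.total += value
        for i, bound in enumerate(self.upper_bounds):
            if value <= bound:
                self.counts[i] += 1
                return
        self.counts[-1] += 1

    def render(self):
        """Return Prometheus text format lines, with the bound in the `le` label."""
        lines = []
        running = 0
        for bound, count in zip(self.upper_bounds, self.counts):
            running += count  # each `le` bucket includes all smaller buckets
            lines.append(f'{self.name}_bucket{{le="{bound}"}} {running}')
        running += self.counts[-1]
        lines.append(f'{self.name}_bucket{{le="+Inf"}} {running}')
        lines.append(f"{self.name}_sum {self.total}")
        lines.append(f"{self.name}_count {running}")
        return lines


hist = CumulativeHistogram("pulsar_example_latency_ms", [5, 10, 50])
for latency in (1, 7, 7, 100):
    hist.observe(latency)

lines = hist.render()
print("\n".join(lines))
```

Because the counters only grow, the scraper can compute rates and quantiles
over any window it likes; with a reset every minute that information is lost.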

This is a breaking change, hence hard to make in any small release.
This is why it's part of this PIP: so many things will break, and all
of them will break on a separate layer (OTel metrics), hence it will not break
anyone without their consent.



>
>
> >   - Too many metrics are rates which also delta reset every interval you
> >   configure in Pulsar and restart, instead of relying on cumulative (ever
> >   growing) counters and let Prometheus use its rate function.
> >   - and many more issues
> >
> > From the developer perspective:
> >
> >   - There are 4 different ways to define and record metrics in Pulsar:
> >   Pulsar own metrics library, Prometheus Java Client, Bookkeeper metrics
> >   library and plain native Java SDK objects (AtomicLong, ...). It's very
> >   confusing for the developer and create inconsistencies for the end user
> >   (e.g. Summary for example is different in each).
> >   - Patching your metrics into "/metrics" Prometheus endpoint is
> >   confusing, cumbersome and error prone.
> >   - many more
> >
> > This proposal offers several key changes to solve that:
> >
> >   - Cardinality (supporting 10k-100k topics per broker) is solved by
> >   introducing a new aggregation level for metrics called Topic Metric
> Group.
> >   Using configuration, you specify for each topic its group (using
> >   wildcard/regex). This allows you to "zoom" out to a more detailed
> >   granularity level like groups instead of namespaces, which you control
> how
> >   many groups you'll have h

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-10 Thread Asaf Mesika

On Tue, May 9, 2023 at 9:33 PM Yunze Xu 
wrote:

> > Basically you have a full fledged metrics library objects: Meter, Gauge,
> Histogram, Counter.
>
> It sounds good, but not so attractive. Currently KoP implements its
> own metrics library objects. So after that, we need to leverage the
> similar classes from OTel.
>
Yes, well, that's the cost associated with the benefit, not the benefit
itself :)
You can see we have many goals to solve in this PIP, among them some
serious pain people suffer today as users:
* Inability to observe 10k topics per broker and above (One of the key
advantages Pulsar has is many topics).
* Very expensive metrics. 100 UTS per topic. That's expensive, even for 1k
topics per broker.

So the PIP goal is to solve those (and many more).
The cost is that we need to make some heavy breaking changes, among them
Pulsar plugin authors like KoP (you) will need to spend time migrating
their code. You are correct.

The attractive part is solving the pains of the user I described.
For existing plugin authors, OTel is not attractive, yes.
For future plugins authors, IMO, very attractive, since you remove a lot of
work they need to do for something so basic in today's world, such as
metrics.

Is the cost worth it? This is what I'm trying to figure out from multiple
people's feedback.

> I want to talk a little more beyond that. IIUC, this proposal wants to
> replace the current metrics systems with the OTel. But for most
> developers and maintainers, the most important thing that they cared
> about might be how many changes could it bring?For example, currently
> the Grafana dashboards have been widely used. How many changes could
> it bring? Do users need to learn completely different dashboards? I
> asked this question before but it's not answered. Then I found the
> "Breaking changes" section. So many breaking changes are usually not
> acceptable.
>

The dashboards will not be changed in the way they look and their
semantics. Each panel remains.
The changes are internal to each panel, which means the queries will change
since the metric name will slightly change.

For the users, they will import the new dashboard, if they used it as is.
If someone created a custom dashboard, yes, they will have to invest some
time to upgrade it.
I think it's 1 hour tops to make the fixes.

I will edit the PIP to clarify that.

Dashboards are mostly a user issue, so why do you think it's related to
developers and maintainers?

Regarding "so many breaking changes are usually not acceptable": I'm new to
this community, hence I raise that here.
Do you find the amount of breaking changes not worth the huge benefit to
the users of Pulsar?
Do you have any suggestion to obtain the same benefit with fewer breaking
changes?

Please bear in mind, all changes are happening in a separate layer of OTel,
*co-existing* with the current metric system layer.
I'm not breaking anything until you make the switch.

> I see you listed a lot of problems for the current design. I think
> each of them needs a PIP or at least a PR to resolve if a breaking
> change could be made. Why not solve them one by one in Pulsar?
>
That's precisely what I wrote in the PIP:
* It's a master PIP.
* Many sections will turn into sub PIPs

Meaning, each problem I mentioned would be solved one by one (in Pulsar, of
course).

The reason for introducing this PIP (the master PIP) is to make sure we
first have an agreement from the community of developers and users before
we go and spend such a huge amount of work. The 2nd reason is that the PIP was
done to ensure all sub PIPs will align, so that nothing will surprise us
and we don't find out after 1 year of work that we have stumbled into a wall
which we can't pass. The master PIP gives you that guarantee.

> Thanks,
> Yunze
>
> On Mon, May 8, 2023 at 12:53 AM Asaf Mesika  wrote:
> >
> > On Sun, May 7, 2023 at 4:23 PM Yunze Xu 
> > wrote:
> >
> > > I'm excited to learn much more about metrics when I started reading
> > > this proposal. But I became more and more frustrated when I found
> > > there is still too much content left even if I've already spent much
> > > time reading this proposal. I'm wondering how much time did you expect
> > > reviewers to read through this proposal? I just recalled the
> > > discussion you started before [1]. Did you expect each PMC member that
> > > gives his/her +1 to read only parts of this proposal?
> > >
> >
> > I estimated around 2 hours needed for a reviewer.
> > I hate it being so long, but I simply couldn't find a way to downsize it
> > more. Furthermore, I consulted with my colleagues including Matteo, but
> we
> > couldn't see a way to scope it down.
> > Why? Because once you begin this journey, you need to know how it's going
> >

Re: [DISCUSS] Add checklist for PMC binding vote of PIP

2023-05-10 Thread Asaf Mesika

 helping them figure how to better make their PR.
> >
> > 3. For minor PIPs this is too much. Minor PIPs should be easy.
> >
> > 4. For master PIPs like your OTel nothing here is enough. Experience
> with PIP-45 and PIP-192 is that there will be breakage, divergence, and not
> everyone will agree on the result. You worked for 11 months in apparent
> secrecy, yet seemingly ignored Lari’s similar open discussion about scaling
> which occurred in the same time frame.
> >
> > Being overly dependent on rules is not a replacement for open discussion.
> >
> > Sorry if this seems harsh, but this is what I think as an individual.
> >
> > The ASF has a saying “Community over Code”
> >
> > Best,
> > Dave
> >
> > Sent from my iPhone
> >
> > > On May 7, 2023, at 9:55 AM, Asaf Mesika  wrote:
> > >
> > > I understand that Dave, and hence I only started a discussion.
> > > What do you think of last reply I made there?
> > >
> > >
> > >> On Sun, May 7, 2023 at 5:31 PM Dave Fisher 
> wrote:
> > >>
> > >>
> > >>
> > >> Sent from my iPhone
> > >>
> > >>>> On Apr 18, 2023, at 5:14 AM, Asaf Mesika 
> wrote:
> > >>>
> > >>> The problem I'm trying to solve is: lack of ability to understand
> PIPs.
> > >>> PIPs I had the chance of reading lack:
> > >>> * Background information: It should contain all background
> information
> > >>> necessary to understand the problem and the solution
> > >>> * Clarity: It should be written in a coherent and easy to understand
> way.
> > >>>
> > >>> I thought this could improve using 2 ways:
> > >>> 1. Define a clear template for PIPs - this should solve all the
> missing
> > >>> information. This is in progress.
> > >>> 2. Provide a checklist to verify the +1 voter check those 3 things:
> > >>> background information, clarity, solid technical solution.
> > >>>
> > >>> Both Enrico and Yunze say, if I understand correctly, that the +1
> voter
> > >>> checks those 3 things implicitly.
> > >>> Yet when I try to learn Pulsar by reading historical PIPs, I find
> some
> > >>> lacking on those things (clarity, background information) making it
> super
> > >>> hard for me to get onboard into Pulsar.
> > >>>
> > >>> Another aspect worth noting is: community increase. In my own
> opinion,
> > >>> documents with clarity and enough background information produce a
> > >> feeling
> > >>> of quality - high quality. Making Pulsar PIPs clear and have all
> > >>> information to understand them will help grow Pulsar adoption.
> > >>>
> > >>> Maybe incremental improvements are better.. If I understand
> correctly,
> > >> both
> > >>> Enrico and Yunze - you are ok with having a summary template, but
> have it
> > >>> non-required?
> > >>>
> > >>> Enrico - Regarding previous suggestions. Root cause - help make
> Pulsar
> > >>> better from my own perspective. Some suggestions may be super bad
> > >>> suggestions and hopefully some will be good :)
> > >>> This specific one - I validated with the PMC members in the weekly
> zoom
> > >>> meeting roughly 3 weeks ago, and got +1 across the board (we had 5
> > >> people).
> > >>> I did it since I felt it was a touchy subject.
> > >>
> > >> Nothing discussed in that meeting was a decision. PMC Members in the
> > >> community meeting are not making PMC decisions. Decisions are ONLY
> made
> > >> here. Whatever you may think I said my intent was for you to start
> this
> > >> discussion and only that.
> > >>
> > >> Best,
> > >> Dave
> > >>
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Asaf
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>> On Tue, Apr 18, 2023 at 9:15 AM Yunze Xu
> 
> > >>>> wrote:
> > >>>>
> > >>>> Basically I think describing how much work the reviewer did to give
> > >>>> his +1 is good. Just like the vote for a release, each +1 follows
> with
> > >>>> the verifi

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-10 Thread Asaf Mesika

On Tue, May 9, 2023 at 8:03 PM Dave Fisher  wrote:

>
>
> > On May 9, 2023, at 5:47 AM, Asaf Mesika  wrote:
> >
> > On Tue, May 9, 2023 at 5:18 AM Hang Chen  wrote:
> >
> >> Thanks for driving this discussion.
> >>
> >> I agree to change the proposal discussion from issue and dev mail list
> >> to PR. It will be easier to review and comment, especially for large
> >> proposals.
> >> I have two questions about this change.
> >> - Some proposals contain images, and putting those images into Pulsar
> >> main repo will make the git db large. What's more, some images can be
> >> up to several MBs
> >>
> >
> > That's a great point, and we must address it in the PIP.
> > How about we say that you only use:
> > 1. Mermaid <https://mermaid.js.org/#/> - it's a tiny language to create
> > drawings? GitHub supports this language on code highlight and renders it
> > correctly.
>
> Does Docusaurus support Mermaid? The design documents for a PIP should be
> available for easy inclusion in pulsar-site.
>

Do we have plans for a section on the website dedicated to displaying PIPs?


>
> > 2. Use SVG files which will be located in a folder named after the pip
> > issue number. SVG are vector graphics saved as text. For diagrams,
> > they should be ok in size, and compress well.
> >
> > I think Mermaid should be enough for all drawings needed for illustration
> > of design document purposes. WDYT?
>
> I think that any reasonable format should be OK, but easily editable
> versions should be preferred. All modern tools ought to be able to export
> SVG and all modern browsers render them.
>
> Best,
> Dave
>
> >
> >
> >
> >> - After merging one proposal, if we want to update the content, do we
> >> need to discuss it in the dev mail list or just push one PR to update
> >> it?
> >>
> >
> > Does it happen often?
> > I guess if the change is not big, it's ok just to do PR.
> >
> > I can clarify that as well, if it is agreed upon.
> >
> >
> >>
> >> Thanks,
> >> Hang
> >>
> >>
> >> PengHui Li wrote on Mon, May 8, 2023 at 18:06:
> >>>
> >>> Thanks for driving the improvements in proposal managing and reviewing.
> >>> The proposal looks good to me. I have only one question about the dir
> >> name
> >>> for the pips.
> >>>
> >>> For now, we have
> >> https://github.com/apache/pulsar/tree/master/wiki/proposals
> >>> Is it better to use the existing one? Or change the existing one to
> >> "pip".
> >>> I mean, we'd better not use two dirs for proposals.
> >>>
> >>> Thanks,
> >>> Penghui
> >>>
> >>> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika 
> >> wrote:
> >>>
> >>>> Ping, in case it was lost in the barrage of mails
> >>>>
> >>>> On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika 
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I've summarized all comments from
> >>>>> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
> >> into a
> >>>>> PIP.
> >>>>>
> >>>>> PIP: https://github.com/apache/pulsar/issues/20207
> >>>>>
> >>>>> I'm leaving this discussion open for 2-3 days to make sure I haven't
> >>>>> missed a comment, and proceed to vote, since we had most of the
> >>>> discussion
> >>>>> already in the link provided above.
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>> Asaf
> >>>>>
> >>>>
> >>
>
>

Re: [DISCUSS] Add checklist for PMC binding vote of PIP

2023-05-09 Thread Asaf Mesika

On Sun, May 7, 2023 at 8:58 PM Dave Fisher  wrote:

> You asked. Here it is.
>
> 1. You brushed aside Enrico’s concerns with that comment. It was not
> subtle.
>

I don't understand. Enrico wrote:
"+1 to writing a clear and very brief summary of the consideration you have
to take before casting your vote.
-1 to requiring this checklist when we cast a vote"

I changed it from required to optional.

So why do you say I brushed them aside?


>
> 2. I think the project should pay more attention to Rajan’s concerns about
> new contributors being either ignored or told they need a PIP for what
> seems to them a trivial change. We lose contributors. We need to handle
> that more gently by helping them figure how to better make their PR.
>

Rajan did not reply to the suggestion for a vote checklist. Are you
referring to something else?


> 3. For minor PIPs this is too much. Minor PIPs should be easy.
>

Are you referring to the PIP template we recently merged?

I don't have any ideas on how to tackle this.
I think it's OK for people to write a very short description for each
section and to delete a section that seems unrelated, especially if it's a
small PIP.


>
> 4. For master PIPs like your OTel nothing here is enough. Experience with
> PIP-45 and PIP-192 is that there will be breakage, divergence, and not
> everyone will agree on the result. You worked for 11 months in apparent
> secrecy, yet seemingly ignored Lari’s similar open discussion about scaling
> which occurred in the same time frame.
>

I personally haven't seen a single mail about scaling *metrics* to handle a
massive number of topics, or about the multitude of problems with them.
I did see emails about trying to solve Pulsar's ability to handle 1M
topics, but that's tangential, since metrics have to be fixed regardless of
which solution is chosen.

Secrecy?
- I posted a big Google Doc to the community detailing all the existing
problems I found with existing metric system, and pitched my idea to solve
it there.
  I posted it in Slack as well since I really needed feedback on it.
  This happened 4 months after I started (out of the 11 months).
- I talked about it twice I believe in the Pulsar Summit bi-weekly meetings.
- I conducted a huge POC for all the months after that, trying to see if my
ideas would actually hold, and if the OpenTelemetry community could pitch in
and go in the direction I was thinking of. I didn't want to post anything
until I was sure it was a valid direction.

So nothing was secret about it.

Back to the topic: the checklist is aimed not at anomalous PIPs but at
the majority of them.



>
> Being overly dependent on rules is not a replacement for open discussion.
>

My suggestion was to make the checklist optional, so it's not a rule, but
just a suggestion.


>
> Sorry if this seems harsh, but this is what I think as an individual.
>
> The ASF has a saying “Community over Code”
>

I'm trying to suggest ways which, in my opinion, would make the community
better.
I'm OK with getting concrete feedback on why those ways do not achieve that.




> Best,
> Dave
>
> Sent from my iPhone
>
> > On May 7, 2023, at 9:55 AM, Asaf Mesika  wrote:
> >
> > I understand that Dave, and hence I only started a discussion.
> > What do you think of last reply I made there?
> >
> >
> >> On Sun, May 7, 2023 at 5:31 PM Dave Fisher 
> wrote:
> >>
> >>
> >>
> >> Sent from my iPhone
> >>
> >>>> On Apr 18, 2023, at 5:14 AM, Asaf Mesika 
> wrote:
> >>>
> >>> The problem I'm trying to solve is: lack of ability to understand
> PIPs.
> >>> PIPs I had the chance of reading lack:
> >>> * Background information: It should contain all background information
> >>> necessary to understand the problem and the solution
> >>> * Clarity: It should be written in a coherent and easy to understand
> way.
> >>>
> >>> I thought this could improve using 2 ways:
> >>> 1. Define a clear template for PIPs - this should solve all the missing
> >>> information. This is in progress.
> >>> 2. Provide a checklist to verify the +1 voter check those 3 things:
> >>> background information, clarity, solid technical solution.
> >>>
> >>> Both Enrico and Yunze say, if I understand correctly, that the +1 voter
> >>> checks those 3 things implicitly.
> >>> Yet when I try to learn Pulsar by reading historical PIPs, I find some
> >>> lacking on those things (clarity, background information) making it
> super
> >>> hard for me to get onboard into Pulsar.
> >>>
> >>> Another aspect worth noting is: community increase. In my own opinion,
> >>> documents

Re: Making Pulsar JIRA read only

2023-05-09 Thread Asaf Mesika

Thanks!

On Mon, May 8, 2023 at 9:39 AM Zili Chen  wrote:

> Filed a ticket at https://issues.apache.org/jira/browse/INFRA-24567
>
> On 2023/05/07 14:20:58 Dave Fisher wrote:
> > Hi -
> >
> > +1 to this. And yes all that’s needed is an INFRA JIRA with a link to
> this thread.
> >
> > Best,
> > Dave
> >
> > Sent from my iPhone
> >
> > > On May 7, 2023, at 3:35 AM, tison  wrote:
> > >
> > > Or, with some more explicit agreement here, we can open a JIRA ticket
> to
> > > ASF INFRA to ask them to perform the action in case I don't know how
> to do
> > > the archive actually :P
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > tison wrote on Sun, May 7, 2023 at 18:15:
> > >
> > >> +1 to make the JIRA project read-only.
> > >>
> > >> I don't have the permission to perform, though. IMO this can go
> through a
> > >> lazy consensus process that any PMC member can help archive the
> project
> > >> while we can always reopen on demand.
> > >>
> > >> Best,
> > >> tison.
> > >>
> > >>
> > >> Asaf Mesika wrote on Sun, May 7, 2023 at 17:51:
> > >>
> > >>> Ping, in case it was lost in the barrage of mails
> > >>>
> > >>>> On Mon, May 1, 2023 at 9:11 AM Asaf Mesika 
> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> I understand JIRA was used from incubation for issue tracking and at
> > >>> some
> > >>>> point GitHub was also used.
> > >>>>
> > >>>> Since GitHub is mostly used, how about we make JIRA read only ?
> > >>>>
> > >>>> We can close all tickets there and then RO.
> > >>>>
> > >>>> (Thanks Dave for background info on this).
> > >>>>
> > >>>> Asaf
> > >>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> >
> >
>

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-09 Thread Asaf Mesika

o
you can work with existing system, and/or OTel, until gradually deprecating
existing system.

It's a big breaking change for Pulsar users on many fronts: names,
semantics, configuration. Read the end of this doc to learn, at a high
level, exactly what will change for the user.

In my opinion, it will make the Pulsar user experience so much better that
users will want to migrate to it, despite the breaking change.

This was a very short summary. You are most welcome to read the full
design document below and give feedback, so we can make it better.

On Sun, May 7, 2023 at 7:52 PM Asaf Mesika  wrote:

>
>
> On Sun, May 7, 2023 at 4:23 PM Yunze Xu 
> wrote:
>
>> I'm excited to learn much more about metrics when I started reading
>> this proposal. But I became more and more frustrated when I found
>> there is still too much content left even if I've already spent much
>> time reading this proposal. I'm wondering how much time did you expect
>> reviewers to read through this proposal? I just recalled the
>> discussion you started before [1]. Did you expect each PMC member that
>> gives his/her +1 to read only parts of this proposal?
>>
>
> I estimated around 2 hours needed for a reviewer.
> I hate it being so long, but I simply couldn't find a way to downsize it
> more. Furthermore, I consulted with my colleagues including Matteo, but we
> couldn't see a way to scope it down.
> Why? Because once you begin this journey, you need to know how it's going
> to end.
> What I ended up doing, is writing all the crucial details for review in
> the High Level Design section.
> It's still a big, hefty section, but I don't think I can step out or let
> anyone else change Pulsar so invasively without the full extent of the
> change.
>
> I don't think it's wise to read parts.
> I did my very best effort to minimize it, but the scope is simply big.
> Open for suggestions, but it requires reading all the PIP :)
>
> Thanks a lot Yunze for dedicating any time to it.
>
>
>
>
>>
>> Let's talk back to the proposal, for now, what I mainly learned and
>> are concerned about mostly are:
>> 1. Pulsar has many ways to expose metrics. It's not unified and confusing.
>> 2. The current metrics system cannot support a large amount of topics.
>> 3. It's hard for plugin authors to integrate metrics. (For example,
>> KoP [2] integrates metrics by implementing the
>> PrometheusRawMetricsProvider interface and it indeed needs much work)
>>
>> Regarding the 1st issue, this proposal chooses OpenTelemetry (OTel).
>>
>> Regarding the 2nd issue, I scrolled to the "Why OpenTelemetry?"
>> section. It's still frustrating to see no answer. Eventually, I found
>>
>
> OpenTelemetry isn't the solution for large amount of topic.
> The solution is described at
> "Aggregate and Filtering to solve cardinality issues" section.
>
>
>
>> the explanation in the "What we need to fix in OpenTelemetry -
>> Performance" section. It seems that we still need some enhancements in
>> OTel. In other words, currently OTel is not ready for resolving all
>> these issues listed in the proposal but we believe it will.
>>
>
> Let me rephrase "believe" --> we work together with the maintainers to do
> it, yes.
> I am open for any other suggestion.
>
>
>
>>
>> As for the 3rd issue, from the "Integrating with Pulsar Plugins"
>> section, the plugin authors still need to implement the new OTel
>> interfaces. Is it much easier than using the existing ways to expose
>> metrics? Could metrics still be easily integrated with Grafana?
>>
>
> Yes, it's way easier.
> Basically you have a full fledged metrics library objects: Meter, Gauge,
> Histogram, Counter.
> No more Raw Metrics Provider, writing UTF-8 bytes in Prometheus format.
> You get namespacing for free with Meter name and version.
> It's way better than current solution and any other library.
>
>
>>
>> That's all I am concerned about at the moment. I understand, and
>> appreciate that you've spent much time studying and explaining all
>> these things. But, this proposal is still too huge.
>>
>
> I appreciate your effort a lot!
>
>
>
>>
>> [1] https://lists.apache.org/thread/04jxqskcwwzdyfghkv4zstxxmzn154kf
>> [2]
>> https://github.com/streamnative/kop/blob/master/kafka-impl/src/main/java/io/streamnative/pulsar/handlers/kop/stats/PrometheusMetricsProvider.java
>>
>> Thanks,
>> Yunze
>>
>> On Sun, May 7, 2023 at 5:53 PM Asaf Mesika  wrote:
>> >
>> > I'm very appreciative for feedback from multiple pulsar users a

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-09 Thread Asaf Mesika

On Mon, May 8, 2023 at 1:06 PM PengHui Li  wrote:

> Thanks for driving the improvements in proposal managing and reviewing.
> The proposal looks good to me. I have only one question about the dir name
> for the pips.
>
> For now, we have
> https://github.com/apache/pulsar/tree/master/wiki/proposals
> Is it better to use the existing one? Or change the existing one to "pip".
> I mean, we'd better not use two dirs for proposals.
>
>
I fixed the PIP by adding:

> This will replace the existing `wiki/proposal/PIP.md`, which will be merged into
> the README.
>
> All links pointing to that `PIP.md` file (from wiki, pulsar-site) will be
> amended to point to the new README file.






> Thanks,
> Penghui
>
> On Sun, May 7, 2023 at 5:52 PM Asaf Mesika  wrote:
>
> > Ping, in case it was lost in the barrage of mails
> >
> > On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika 
> > wrote:
> >
> > > Hi,
> > >
> > > I've summarized all comments from
> > > https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh into
> a
> > > PIP.
> > >
> > > PIP: https://github.com/apache/pulsar/issues/20207
> > >
> > > I'm leaving this discussion open for 2-3 days to make sure I haven't
> > > missed a comment, and proceed to vote, since we had most of the
> > discussion
> > > already in the link provided above.
> > >
> > > Thanks!
> > >
> > > Asaf
> > >
> >
>

Re: [VOTE] PIP-261: Restructure Getting Started section

2023-05-09 Thread Asaf Mesika

Who can help me update https://github.com/apache/pulsar/wiki
to show that it has passed?


On Tue, May 9, 2023 at 5:01 PM Asaf Mesika  wrote:

> The vote has passed with 4 binding +1 votes and 1 non-binding +1 vote.
>
>
> Binding:
> Yunze Xu
> Hang Chen
> Yu
> Penghui
>
> Non-binding:
> Tison
>
> Thank you all,
>
> Asaf
>
> On Sun, May 7, 2023 at 12:20 PM Asaf Mesika  wrote:
>
>> Hi,
>>
>> PIP-261 has been open for quite some time and garnered feedback from 3
>> people, which was incorporated into the PIP.
>>
>> It is time to start the vote.
>>
>> PIP: https://github.com/apache/pulsar/issues/19912
>>
>> Thanks!
>>
>> Asaf
>>
>

Re: [VOTE] PIP-261: Restructure Getting Started section

2023-05-09 Thread Asaf Mesika

The vote has passed with 4 binding +1 votes and 1 non-binding +1 vote.


Binding:
Yunze Xu
Hang Chen
Yu
Penghui

Non-binding:
Tison

Thank you all,

Asaf

On Sun, May 7, 2023 at 12:20 PM Asaf Mesika  wrote:

> Hi,
>
> PIP-261 has been open for quite some time and garnered feedback from 3
> people, which was incorporated into the PIP.
>
> It is time to start the vote.
>
> PIP: https://github.com/apache/pulsar/issues/19912
>
> Thanks!
>
> Asaf
>

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-09 Thread Asaf Mesika

On Tue, May 9, 2023 at 5:18 AM Hang Chen  wrote:

> Thanks for driving this discussion.
>
> I agree to change the proposal discussion from issue and dev mail list
> to PR. It will be easier to review and comment, especially for large
> proposals.
> I have two questions about this change.
> - Some proposals contain images, and putting those images into Pulsar
> main repo will make the git db large. What's more, some images can be
> up to several MBs
>

That's a great point, and we must address it in the PIP.
How about we say that you may only use:
1. Mermaid <https://mermaid.js.org/#/> - a tiny language for creating
drawings. GitHub supports it in Markdown code blocks and renders it
correctly.
2. SVG files, located in a folder named after the PIP issue number. SVG
files are vector graphics saved as text. For diagrams, they should be OK
in size, and they compress well.

I think Mermaid should be enough for all the drawings needed to illustrate
a design document. WDYT?



> - After merging one proposal, if we want to update the content, do we
> need to discuss it in the dev mail list or just push one PR to update
> it?
>

Does it happen often?
I guess if the change is not big, it's OK to just open a PR.

I can clarify that in the PIP as well, if it is agreed upon.


>
> Thanks,
> Hang
>
>
> PengHui Li wrote on Mon, May 8, 2023 at 18:06:
> >
> > Thanks for driving the improvements in proposal managing and reviewing.
> > The proposal looks good to me. I have only one question about the dir
> name
> > for the pips.
> >
> > For now, we have
> https://github.com/apache/pulsar/tree/master/wiki/proposals
> > Is it better to use the existing one? Or change the existing one to
> "pip".
> > I mean, we'd better not use two dirs for proposals.
> >
> > Thanks,
> > Penghui
> >
> > On Sun, May 7, 2023 at 5:52 PM Asaf Mesika 
> wrote:
> >
> > > Ping, in case it was lost in the barrage of mails
> > >
> > > On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I've summarized all comments from
> > > > https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh
> into a
> > > > PIP.
> > > >
> > > > PIP: https://github.com/apache/pulsar/issues/20207
> > > >
> > > > I'm leaving this discussion open for 2-3 days to make sure I haven't
> > > > missed a comment, and proceed to vote, since we had most of the
> > > discussion
> > > > already in the link provided above.
> > > >
> > > > Thanks!
> > > >
> > > > Asaf
> > > >
> > >
>

Re: [VOTE] PIP-261: Restructure Getting Started section

2023-05-09 Thread Asaf Mesika

On Mon, May 8, 2023 at 12:31 PM PengHui Li  wrote:

> I have one question about the ready-made applications under the Getting
> Started section.
> Should we consider putting the application's source code into another repo?
> So that users can fork the repo to play the application and check how it
> works.
> Otherwise, users must follow the steps to build the application by
> themselves.
>

I agree. I quote from the PIP:

> Each tutorial will have a link to a repository containing the full example
> if they j




>
> But it's not a blocker for this proposal. It could be a separate
> discussion.
>
> +1 (binding)
> Penghui
>
> On Mon, May 8, 2023 at 9:28 AM Yu  wrote:
>
> > +1 (binding)
> > It's time to have a frictionless onboarding experience!
> >
> > On Mon, May 8, 2023 at 9:21 AM Hang Chen  wrote:
> >
> > > +1 (binding)
> > >
> > > Best,
> > > Hang
> > >
> > > > tison wrote on Sun, May 7, 2023 at 20:32:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > N.B. IIRC PMC members have binding votes.
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > >
> > > > > Yunze Xu wrote on Sun, May 7, 2023 at 20:12:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > Thanks,
> > > > > Yunze
> > > > >
> > > > > On Sun, May 7, 2023 at 5:20 PM Asaf Mesika 
> > > wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > PIP-261 has been open for quite some time and garnered feedback from 3
> > > > > > > people, which was incorporated into the PIP.
> > > > > >
> > > > > > It is time to start the vote.
> > > > > >
> > > > > > PIP: https://github.com/apache/pulsar/issues/19912
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Asaf
> > > > >
> > >
> >
>

Re: [DISCUSS] Add checklist for PMC binding vote of PIP

2023-05-07 Thread Asaf Mesika

I understand that, Dave, which is why I only started a discussion.
What do you think of the last reply I made there?


On Sun, May 7, 2023 at 5:31 PM Dave Fisher  wrote:

>
>
> Sent from my iPhone
>
> > On Apr 18, 2023, at 5:14 AM, Asaf Mesika  wrote:
> >
> > The problem I'm trying to solve is: lack of ability to understand PIPs.
> > PIPs I had the chance of reading lack:
> > * Background information: It should contain all background information
> > necessary to understand the problem and the solution
> > * Clarity: It should be written in a coherent and easy to understand way.
> >
> > I thought this could improve using 2 ways:
> > 1. Define a clear template for PIPs - this should solve all the missing
> > information. This is in progress.
> > 2. Provide a checklist to verify the +1 voter check those 3 things:
> > background information, clarity, solid technical solution.
> >
> > Both Enrico and Yunze say, if I understand correctly, that the +1 voter
> > checks those 3 things implicitly.
> > Yet when I try to learn Pulsar by reading historical PIPs, I find some
> > lacking on those things (clarity, background information) making it super
> > hard for me to get onboard into Pulsar.
> >
> > Another aspect worth noting is: community increase. In my own opinion,
> > documents with clarity and enough background information produce a
> feeling
> > of quality - high quality. Making Pulsar PIPs clear and have all
> > information to understand them will help grow Pulsar adoption.
> >
> > Maybe incremental improvements are better.. If I understand correctly,
> both
> > Enrico and Yunze - you are ok with having a summary template, but have it
> > non-required?
> >
> > Enrico - Regarding previous suggestions. Root cause - help make Pulsar
> > better from my own perspective. Some suggestions may be super bad
> > suggestions and hopefully some will be good :)
> > This specific one - I validated with the PMC members in the weekly zoom
> > meeting roughly 3 weeks ago, and got +1 across the board (we had 5
> people).
> > I did it since I felt it was a touchy subject.
>
> Nothing discussed in that meeting was a decision. PMC Members in the
> community meeting are not making PMC decisions. Decisions are ONLY made
> here. Whatever you may think I said my intent was for you to start this
> discussion and only that.
>
> Best,
> Dave
>
> >
> > Thanks,
> >
> > Asaf
> >
> >
> >
> >
> >
> >
> >> On Tue, Apr 18, 2023 at 9:15 AM Yunze Xu 
> >> wrote:
> >>
> >> Basically I think describing how much work the reviewer did to give
> >> his +1 is good. Just like the vote for a release, each +1 follows with
> >> the verifications he did, e.g. here [1] is a vote for Pulsar 2.11.1
> >> candidate 1:
> >>
> >>> • Built from the source package (maven 3.8.6 OpenJDK 17.0)
> >>> • Ran binary package standalone with pub/sub
> >>> ...
> >>
> >> But I don't think forcing the rule is good. The proposal could
> >> sometimes be not so complicated. From my personal experience,
> >> sometimes I vote my +1 just because I think it's good and there is no
> >> serious problem. If you want me to vote again with the checklist, I
> >> might still not have an idea of what I should write, unless there is a
> >> template and I filled the template. Only if the proposal is somehow
> >> complicated will the checklist be meaningful, like the PIP-192, which
> >> is a very complicated proposal.
> >>
> >>> Moreover, this checklist can ensure that all participants have
> >> thoroughly reviewed the PIP,
> >>
> >> Regarding this point from Xiangying, I want to repeat a similar
> >> thought [2] for the previous discussion.
> >>
> >> IF ANYONE WANTS, HE CAN STILL COPY A CHECKLIST FROM OTHERS AND JUST
> >> MAKE SOME SLIGHT CHANGES.
> >>
> >> Forcing a checklist won't change anything if there is a PMC that gave
> >> his vote without any careful review. It just makes the rule more
> >> complicated. If you don't trust a PMC, no rule could restrict him.
> >> Rules only make him a better game player.
> >>
> >> In addition, when a reviewer approves a PR, should he add a checklist
> >> as well, instead of a simple LGTM or +1? Huge PRs appear more often
> >> than complicated proposals.
> >>
> >> In conclusion, I am +0 to this suggestion. If this suggestion is
> >> passed, I will follow it well. B

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-07 Thread Asaf Mesika

On Sun, May 7, 2023 at 4:23 PM Yunze Xu 
wrote:

> I was excited to learn much more about metrics when I started reading
> this proposal. But I became more and more frustrated when I found
> there was still too much content left even after I had already spent much
> time reading it. I'm wondering how much time you expected
> reviewers to spend reading through this proposal. I just recalled the
> discussion you started before [1]. Did you expect each PMC member that
> gives his/her +1 to read only parts of this proposal?
>

I estimated that a reviewer needs around 2 hours.
I hate it being so long, but I simply couldn't find a way to downsize it
more. Furthermore, I consulted with my colleagues including Matteo, but we
couldn't see a way to scope it down.
Why? Because once you begin this journey, you need to know how it's going
to end.
What I ended up doing is writing all the crucial details for review in the
High Level Design section.
It's still a big, hefty section, but I don't think I, or anyone else, should
change Pulsar so invasively without laying out the full extent of the
change.

I don't think it's wise to read only parts of it.
I did my very best to minimize it, but the scope is simply big. I'm open
to suggestions, but giving them requires reading the whole PIP :)

Thanks a lot Yunze for dedicating any time to it.

>
> Let's get back to the proposal. For now, what I mainly learned and
> am mostly concerned about is:
> 1. Pulsar has many ways to expose metrics. It's not unified and it's confusing.
> 2. The current metrics system cannot support a large amount of topics.
> 3. It's hard for plugin authors to integrate metrics. (For example,
> KoP [2] integrates metrics by implementing the
> PrometheusRawMetricsProvider interface and it indeed needs much work)
>
> Regarding the 1st issue, this proposal chooses OpenTelemetry (OTel).
>
> Regarding the 2nd issue, I scrolled to the "Why OpenTelemetry?"
> section. It's still frustrating to see no answer. Eventually, I found
>

OpenTelemetry isn't the solution for a large number of topics.
The solution is described in the
"Aggregate and Filtering to solve cardinality issues" section.
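
To give a feel for the general idea of aggregation plus filtering, here is a
minimal sketch. It is not the design from the PIP, only an illustration under
my own assumptions: the function names, the `keep_topics` filter and the sample
topics are all hypothetical. Per-topic values are rolled up into one series per
namespace, and only explicitly selected topics keep their own series, so the
number of exported series no longer grows with the topic count.

```python
from collections import defaultdict


def namespace_of(topic):
    """Extract 'tenant/namespace' from a name like persistent://tenant/ns/topic."""
    _, _, path = topic.partition("://")
    tenant, namespace, _ = path.split("/", 2)
    return f"{tenant}/{namespace}"


def aggregate(per_topic, keep_topics=frozenset()):
    """Roll per-topic values up to namespace level.

    Topics listed in keep_topics (a hypothetical filter) keep their own
    series in addition to contributing to their namespace total.
    """
    series = defaultdict(int)
    for topic, value in per_topic.items():
        series[("namespace", namespace_of(topic))] += value
        if topic in keep_topics:
            series[("topic", topic)] += value
    return dict(series)


# Made-up sample data: three topics in two namespaces.
per_topic = {
    "persistent://acme/orders/created": 10,
    "persistent://acme/orders/cancelled": 4,
    "persistent://acme/billing/invoices": 7,
}
rolled_up = aggregate(per_topic, keep_topics={"persistent://acme/orders/created"})
```

With this sample data the three topic-level values collapse into two
namespace-level series plus the one topic that was explicitly kept.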

> the explanation in the "What we need to fix in OpenTelemetry -
> Performance" section. It seems that we still need some enhancements in
> OTel. In other words, currently OTel is not ready for resolving all
> these issues listed in the proposal but we believe it will.
>

Let me rephrase "believe" --> we are working together with the maintainers
to do it, yes.
I am open to any other suggestions.

>
> As for the 3rd issue, from the "Integrating with Pulsar Plugins"
> section, the plugin authors still need to implement the new OTel
> interfaces. Is it much easier than using the existing ways to expose
> metrics? Could metrics still be easily integrated with Grafana?
>

Yes, it's way easier.
Basically you get the objects of a full-fledged metrics library: Meter, Gauge,
Histogram, Counter.
No more Raw Metrics Provider, writing UTF-8 bytes in Prometheus format.
You get namespacing for free with the Meter name and version.
It's way better than the current solution and any other library.
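
As a rough illustration of the difference, here is a minimal sketch. It does
not use the OpenTelemetry API: the `Counter` class, `render_prometheus` and the
metric name `plugin_requests_total` are hypothetical names made up for this
example. The point it shows is that plugin code only records values on an
instrument object, while serializing to the Prometheus text format (the part a
raw metrics provider has to hand-write) lives in one place outside the plugin.

```python
from collections import defaultdict


class Counter:
    """Hypothetical stand-in for a metrics-library counter (not the OTel API)."""

    def __init__(self, name, description):
        self.name = name
        self.description = description
        self._values = defaultdict(int)  # keyed by a sorted tuple of attributes

    def add(self, amount, attributes):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self._values[tuple(sorted(attributes.items()))] += amount

    def collect(self):
        """Return (attributes, value) pairs for an exporter to serialize."""
        return [(dict(key), value) for key, value in self._values.items()]


def render_prometheus(counter):
    """Serialize one counter to the Prometheus text exposition format."""
    lines = [
        f"# HELP {counter.name} {counter.description}",
        f"# TYPE {counter.name} counter",
    ]
    for attributes, value in counter.collect():
        labels = ",".join(f'{k}="{v}"' for k, v in sorted(attributes.items()))
        lines.append(f"{counter.name}{{{labels}}} {value}")
    return "\n".join(lines) + "\n"


# Plugin code only touches the instrument; serialization lives elsewhere.
requests = Counter("plugin_requests_total", "Requests handled by the plugin")
requests.add(1, {"topic": "orders"})
requests.add(2, {"topic": "orders"})
requests.add(5, {"topic": "payments"})
exposition = render_prometheus(requests).encode("utf-8")
```

In a real integration the exporter would be supplied by the metrics library, so
a plugin author would write only the last few lines.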

>
> That's all I am concerned about at the moment. I understand, and
> appreciate that you've spent much time studying and explaining all
> these things. But, this proposal is still too huge.
>

I appreciate your effort a lot!

>
> [1] https://lists.apache.org/thread/04jxqskcwwzdyfghkv4zstxxmzn154kf
> [2]
> https://github.com/streamnative/kop/blob/master/kafka-impl/src/main/java/io/streamnative/pulsar/handlers/kop/stats/PrometheusMetricsProvider.java
>
> Thanks,
> Yunze
>
> On Sun, May 7, 2023 at 5:53 PM Asaf Mesika  wrote:
> >
> > I'm very appreciative for feedback from multiple pulsar users and devs on
> > this PIP, since it has dramatic changes suggested and quite extensive
> > positive change for the users.
> >
> >
> > On Thu, Apr 27, 2023 at 7:32 PM Asaf Mesika wrote:
> >
> > > Hi all,
> > >
> > > I'm very excited to release a PIP I've been working on in the past 11
> > > months, which I think will be immensely valuable to Pulsar, which I
> > > like so much.
> > >
> > > PIP: https://github.com/apache/pulsar/issues/20197
> > >
> > > I'm quoting here the preface:
> > >
> > > === QUOTE START ===
> > >
> > > Roughly 11 months ago, I started working on solving the biggest issue
> > > with Pulsar metrics: the lack of ability to monitor a pulsar broker
> > > with a large topic count: 10k, 100k, and future support of 1M. This
> > > started by mapping the existing functionality and then enumerating all
> > > the problems I saw (all
>

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

2023-05-07 Thread Asaf Mesika

I'm very appreciative of feedback from multiple Pulsar users and devs on
this PIP, since it suggests dramatic changes and a quite extensive
positive change for users.


On Thu, Apr 27, 2023 at 7:32 PM Asaf Mesika  wrote:

> Hi all,
>
> I'm very excited to release a PIP I've been working on in the past 11
> months, which I think will be immensely valuable to Pulsar, which I like so
> much.
>
> PIP: https://github.com/apache/pulsar/issues/20197
>
> I'm quoting here the preface:
>
> === QUOTE START ===
>
> Roughly 11 months ago, I started working on solving the biggest issue with
> Pulsar metrics: the lack of ability to monitor a pulsar broker with a large
> topic count: 10k, 100k, and future support of 1M. This started by mapping
> the existing functionality and then enumerating all the problems I saw (all
> documented in this doc
> <https://docs.google.com/document/d/1vke4w1nt7EEgOvEerPEUS-Al3aqLTm9cl2wTBkKNXUA/edit?usp=sharing>
> ).
>
> This PIP is a parent PIP. It aims to gradually solve (using sub-PIPs) all
> the current metric system's problems and provide the ability to monitor a
> broker with a large topic count, which is currently lacking. As a parent
> PIP, it will describe each problem and its solution at a high level,
> leaving fine-grained details to the sub-PIPs. The parent PIP ensures all
> solutions align and does not contradict each other.
>
> The basic building block to solve the monitoring ability of large topic
> count is aggregating internally (to topic groups) and adding fine-grained
> filtering. We could have shoe-horned it into the existing metric system,
> but we thought adding that to a system already ingrained with many problems
> would be wrong and hard to do gradually, as so many things will break. This
> is why the second-biggest design decision presented here is consolidating
> all existing metric libraries into a single one - OpenTelemetry
> <https://opentelemetry.io/>. The parent PIP will explain why
> OpenTelemetry was chosen out of existing solutions and why it far exceeds
> all other options. I’ve been working closely with the OpenTelemetry
> community in the past eight months: brain-storming this integration, and
> raising issues, in an effort to remove serious blockers to make this
> migration successful.
>
> I made every effort to summarize this document so that it can be concise
> yet clear. I understand it is an effort to read it and, more so, provide
> meaningful feedback on such a large document; hence I’m very grateful for
> each individual who does so.
>
> I think this design will help improve the user experience immensely, so it
> is worth the time spent reading it.
>
>
> === QUOTE END ===
>
>
> Thanks!
>
> Asaf Mesika
>

Re: [DISCUSS] Add checklist for PMC binding vote of PIP

2023-05-07 Thread Asaf Mesika

Ping, in case it was lost in the barrage of mails

On Sun, Apr 30, 2023 at 3:54 PM Asaf Mesika  wrote:

> Is it ok if we use the following vote template? Per comments above, it
> will be optional, yet recommended.
>
> +1 (binding)
>
> [v] PIP has all sections detailed in the PIP template (Background,
> motivation, etc.)
> [v] A person having basic Pulsar user knowledge, can read the PIP and
> fully understand it
> [v] I read PIP and validated it technically
>
>
>
> On Wed, Apr 19, 2023 at 6:44 AM Yunze Xu wrote:
>
>> >  you are ok with having a summary template, but have it non-required?
>>
>> Yes to me.
>>
>> In addition, I think the root cause of the problems you met is that
>> some PIPs have low quality. They are not clear and friendly to others.
>> A good proposal should not require reviewers to have deep knowledge of
>> a specific domain. I think what PMC members should do to improve it is
>> to cast the -1 to those ambiguous proposals until they become clear.
>>
>> Thanks,
>> Yunze
>>
>> On Tue, Apr 18, 2023 at 8:14 PM Asaf Mesika wrote:
>> >
>> > The problem I'm trying to solve is: lack of ability to understand PIPs.
>> > PIPs I had the chance of reading lack:
>> > * Background information: It should contain all background information
>> > necessary to understand the problem and the solution
>> > * Clarity: It should be written in a coherent and easy to understand
>> > way.
>> >
>> > I thought this could improve using 2 ways:
>> > 1. Define a clear template for PIPs - this should solve all the missing
>> > information. This is in progress.
>> > 2. Provide a checklist to verify the +1 voter check those 3 things:
>> > background information, clarity, solid technical solution.
>> >
>> > Both Enrico and Yunze say, if I understand correctly, that the +1 voter
>> > checks those 3 things implicitly.
>> > Yet when I try to learn Pulsar by reading historical PIPs, I find some
>> > lacking on those things (clarity, background information) making it
>> > super hard for me to get onboard into Pulsar.
>> >
>> > Another aspect worth noting is: community increase. In my own opinion,
>> > documents with clarity and enough background information produce a
>> > feeling of quality - high quality. Making Pulsar PIPs clear and have all
>> > information to understand them will help grow Pulsar adoption.
>> >
>> > Maybe incremental improvements are better.. If I understand correctly,
>> > both Enrico and Yunze - you are ok with having a summary template, but
>> > have it non-required?
>> >
>> > Enrico - Regarding previous suggestions. Root cause - help make Pulsar
>> > better from my own perspective. Some suggestions may be super bad
>> > suggestions and hopefully some will be good :)
>> > This specific one - I validated with the PMC members in the weekly zoom
>> > meeting roughly 3 weeks ago, and got +1 across the board (we had 5
>> > people).
>> > I did it since I felt it was a touchy subject.
>> >
>> > Thanks,
>> >
>> > Asaf
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Tue, Apr 18, 2023 at 9:15 AM Yunze Xu wrote:
>> >
>> > > Basically I think describing how much work the reviewer did to give
>> > > his +1 is good. Just like the vote for a release, each +1 follows with
>> > > the verifications he did, e.g. here [1] is a vote for Pulsar 2.11.1
>> > > candidate 1:
>> > >
>> > > > • Built from the source package (maven 3.8.6 OpenJDK 17.0)
>> > > > • Ran binary package standalone with pub/sub
>> > > > ...
>> > >
>> > > But I don't think forcing the rule is good. The proposal could
>> > > sometimes be not so complicated. From my personal experience,
>> > > sometimes I vote my +1 just because I think it's good and there is no
>> > > serious problem. If you want me to vote again with the checklist, I
>> > > might still not have an idea of what I should write, unless there is a
>> > > template and I filled the template. Only if the proposal is somehow
>> > > complicated will the checklist be meaningful, like the PIP-192, which
>> > > is a very complicated proposal.
>> > >
>> > > > Moreover, this checklist can ensure that all participants have
>> > > > thoroughly reviewed the PIP

Re: [DISCUSS] PIP-265: PR-based system for managing and reviewing PIPs

2023-05-07 Thread Asaf Mesika

Ping, in case it was lost in the barrage of mails

On Sun, Apr 30, 2023 at 10:38 AM Asaf Mesika  wrote:

> Hi,
>
> I've summarized all comments from
> https://lists.apache.org/thread/5kpddlfh5xdbsjmv47ymnk3z6wd92jbh into a
> PIP.
>
> PIP: https://github.com/apache/pulsar/issues/20207
>
> I'm leaving this discussion open for 2-3 days to make sure I haven't
> missed a comment, and proceed to vote, since we had most of the discussion
> already in the link provided above.
>
> Thanks!
>
> Asaf
>

Re: Making Pulsar JIRA read only

2023-05-07 Thread Asaf Mesika

Ping, in case it was lost in the barrage of mails

On Mon, May 1, 2023 at 9:11 AM Asaf Mesika  wrote:

> Hi,
>
> I understand JIRA was used from incubation for issue tracking and at some
> point GitHub was also used.
>
> Since GitHub is mostly used, how about we make JIRA read only?
>
> We can close all tickets there and then make it read only.
>
> (Thanks Dave for background info on this).
>
> Asaf
>
>
>
