Re: [DISCUSS] Optimizing the Method of Estimating Message Backlog Size in Pulsar

2024-03-27 Thread Girish Sharma
Hi Xiangying,


> In the current implementation, the backlog size is estimated from the
> mark delete position to the last confirmed position, whereas the backlog
> message count is the number of messages from the mark delete position
> to the last confirmed position, minus the count of individually
> acknowledged messages. The inconsistency between these two could
> potentially confuse users.
>

While confusing, it is somewhat accurate. Since acked and unacked messages
can be part of the same ledger, we can't delete that entire ledger until
all messages of the ledger are acked - so it does contribute to the size of
the backlog from a disk perspective.
There might be some optimization possible - we could try to figure out all
completely acked ledgers from the markDeletePosition to the latest offset
and subtract their size - but what's the ROI there?
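For illustration, that optimization could look roughly like the following sketch (Python, all names and structures hypothetical, not actual Pulsar internals):

```python
def estimate_backlog_size(ledgers, mark_delete_ledger, individually_acked):
    """Estimate backlog size from the mark-delete position onward,
    skipping ledgers whose entries are all individually acked.

    ledgers: ordered list of (ledger_id, size_bytes, entry_count) tuples.
    individually_acked: dict mapping ledger_id -> set of acked entry ids.
    All names here are illustrative, not actual Pulsar APIs.
    """
    backlog = 0
    for ledger_id, size, entries in ledgers:
        if ledger_id < mark_delete_ledger:
            continue  # already deletable, not part of the backlog
        acked = individually_acked.get(ledger_id, set())
        if len(acked) == entries:
            continue  # fully acked ledger: safe to exclude from the estimate
        backlog += size  # partially acked: whole ledger still held on disk
    return backlog

ledgers = [(1, 100, 10), (2, 200, 10), (3, 300, 10)]
acked = {2: set(range(10))}  # ledger 2 fully (individually) acked
print(estimate_backlog_size(ledgers, 2, acked))  # 300 (only ledger 3 counts)
```

Note the scan is linear in the number of ledgers between the mark-delete position and the last confirmed entry, which is exactly why the ROI question above matters.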

So I would say that in your proposal, option 2 (current) is more accurate
(while not being the best) than option 1.

Regards
-- 
Girish Sharma


Re: [DISCUSS] PIP-345: Optimize finding message by timestamp

2024-03-14 Thread Girish Sharma
One suggestion: I think you can make do with storing just the begin
timestamp. Any search utilizing these values will work the same way with
just one of those timestamps as it would with both begin and end.
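To illustrate (a rough Python sketch with made-up structures, not the PIP's actual design): locating the candidate ledger for a target publish time needs only each ledger's begin timestamp, since a ledger's time range effectively ends where the next ledger's begins.

```python
import bisect

def find_start_ledger(begin_timestamps, target):
    """Given the begin (first-entry) timestamp of each ledger in order,
    return the index of the ledger that may contain the first message
    with publish time >= target. Illustrative only, not Pulsar code."""
    # bisect_right counts how many ledgers begin at or before target;
    # the candidate is the last of those (or the first ledger overall).
    i = bisect.bisect_right(begin_timestamps, target)
    return max(i - 1, 0)

begins = [100, 200, 300, 400]
print(find_start_ledger(begins, 250))  # 1 (the ledger beginning at 200)
print(find_start_ledger(begins, 50))   # 0 (before the first ledger)
```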

Any particular reason you need both the timestamps?

Regards

On Fri, Mar 15, 2024, 9:39 AM 太上玄元道君  wrote:

> bump
>
> On Sun, Mar 10, 2024 at 06:41, 太上玄元道君 wrote:
>
> > Hi Pulsar community,
> >
> > A new PIP is opened, this thread is to discuss PIP-345: Optimize finding
> > message by timestamp.
> >
> > Motivation:
> > Finding message by timestamp is widely used in Pulsar:
> > * It is used by the `pulsar-admin` tool to get the message id by
> > timestamp, expire messages by timestamp, and reset cursor.
> > * It is used by the `pulsar-client` to reset the subscription to a
> > specific timestamp.
> > * And also used by the `expiry-monitor` to find the messages that are
> > expired.
> > Even though the current implementation is correct and uses binary search
> > to speed things up, it's still not efficient *enough*.
> > The current implementation is to scan all the ledgers to find the message
> > by timestamp.
> > This is a performance bottleneck, especially for large topics with many
> > messages.
> > Say, if there is a topic which has 1m entries, through the binary search,
> > it will take 20 iterations to find the message.
> > In some extreme cases, it may lead to a timeout, and the client will not
> > be able to seek by timestamp.
> >
> > PIP: https://github.com/apache/pulsar/pull/22234
> >
> > Your feedback is very important to us, please take a moment to review the
> > proposal and provide your thoughts.
> >
> > Thanks,
> > Tao Jiuming
> >
>


Re: [DISCUSS] Retire pulsar-all Docker image and spin-off Python Functions runtime

2024-03-07 Thread Girish Sharma
+1
We have recently been struggling with building a Pulsar image in house
(lots of app-sec constraints, etc.). A much reduced, minimal image would
certainly help there.

Any estimates on the size reduction in the base pulsar image after removal
of the Python-related content? Is there scope for further slimming down the
base pulsar image by removing anything non-essential to running a broker
(or a bookie or ZK)?

Regards

On Thu, Mar 7, 2024 at 11:19 PM Neng Lu  wrote:

> +1
>
> This can reduce the image size significantly and thus improve the
> efficiency and reduce the cost.
>
> On Tue, Mar 5, 2024 at 11:25 PM Enrico Olivelli 
> wrote:
>
> > +1
> >
> > Great idea
> >
> > Enrico
> >
> > On Wed, Mar 6, 2024, 08:23 Zixuan Liu  wrote:
> >
> > > +1
> > >
> > > This is a good idea, and then we must provide a document on building
> > > one's own connector image and Python functions runtime image.
> > >
> > > Thanks,
> > > Zixuan
> > >
> > > On Wed, Mar 6, 2024 at 07:04, Matteo Merli  wrote:
> > >
> > > > The docker image `pulsar-all` is a convenience image that is created
> on
> > > top
> > > > of the base `pulsar` image, including all the Pulsar IO connectors as
> > > well
> > > > as the tiered storage offloaders.
> > > >
> > > > The Dockerfile for `pulsar-all` can be found here:
> > > > https://github.com/apache/pulsar/blob/master/docker/pulsar-all/Dockerfile
> > > >
> > > > The resulting image is very big:
> > > >
> > > > ```
> > > > apachepulsar/pulsar-all   3.1.2   3d1aa250bf6c   2 months ago   3.68GB
> > > > ```
> > > >
> > > > This poses a challenge in many ways:
> > > >  1. Our CI pipeline needs to build these images and cache them across
> > > > different stages of the pipeline
> > > >  2. It takes a lot of time for release managers to build and push these
> > > > images to Docker Hub
> > > >  3. Users using this image in production see very long download times,
> > > > something that can affect the availability of the system (eg: more
> > > > chances of a 2nd broker crashing if a restart takes a very long time).
> > > >  4. It's very unlikely that one user will require all the connectors;
> > > > most likely, they would use just 2-3 of them.
> > > >
> > > > The problem is that `pulsar-all` was introduced at a time when there
> > > > were ~3 Pulsar IO connectors. Right now we do have 35 connectors, with
> > > > a 1.9 GB total size.
> > > >
> > > > The proposal here is to drop this image altogether. Users will be able
> > > > to construct their own targeted images in a very simple way:
> > > >
> > > > ```
> > > > FROM apachepulsar/pulsar:latest
> > > > RUN mkdir -p connectors && \
> > > >     cd connectors && \
> > > >     wget https://downloads.apache.org/pulsar/pulsar-3.2.0/connectors/pulsar-io-elastic-search-3.2.0.nar
> > > > ```
> > > >
> > > >
> > > >
> > > > ### Pulsar Functions Python Runtime
> > > >
> > > > In order to support the Python functions runtime, we have been
> > > > including in the Pulsar base image quite a few dependencies, from the
> > > > `pulsar-client` Python SDK to gRPC, which is quite a heavy package with
> > > > many transitive dependencies.
> > > >
> > > > Given that the vast majority would be using the `pulsar` base image to
> > > > run brokers and not Python functions, it would make sense to split the
> > > > Python support into a different image, like `pulsar-functions-python`,
> > > > which extends from the base image and adds all the needed Python
> > > > dependencies.
> > > >
> > > > This way it will be very easy for users to select the appropriate image
> > > > and we wouldn't be carrying a large number of useless Python
> > > > dependencies to users who don't need them.
> > > >
> > > >
> > > > What are people's opinions with respect to this?
> > > >
> > > > Matteo
> > > >
> > > > --
> > > > Matteo Merli
> > > > 
> > > >
> > >
> >
>
>
> --
> Best Regards,
> Neng
>


-- 
Girish Sharma


Re: (Apache committer criteria) [ANNOUNCE] New Committer: Asaf Mesika

2024-03-06 Thread Girish Sharma
On Thu, Mar 7, 2024 at 11:38 AM Yunze Xu  wrote:

> Regarding PIP-332 and PIP 310, similar to PIP-337, there is no
> discussion mail in the dev mail list. David left a comment [1] in
>

There is for 310 -
https://lists.apache.org/thread/13ncst2nc311vxok1s75thl2gtnk7w1t


Regards
-- 
Girish Sharma


Re: Pulsar Version upgrade guidelines

2024-03-06 Thread Girish Sharma
Bumping this up.
I cannot be the only one confused by these questions.

Pulsar is at a stage where users have to constantly upgrade due to
stability or feature needs. The answers to the questions I am asking should
help everyone planning upgrades from 2.x to 3.x and other combinations.

Regards

On Fri, Feb 16, 2024 at 6:31 PM Girish Sharma 
wrote:

> There have been a few discussions in the past on the Slack channel, and I
> recently also started a similar thread [0] regarding whether we can skip
> certain releases while upgrading towards Pulsar 3.0 and beyond. I am
> starting this dev mailing list discussion to get some more input.
>
> As per official release policy [1] itself, there are some open questions:
>
> *Before 3.0, upgrade should be done linearly through each feature version.
>> For example, when upgrading from 2.8 to 2.10, it is important to upgrade to
>> 2.9 before going to 2.10. *
>>
>
> This is a very clear statement. Although lengthy, it makes sense to limit
> the scope of OSS to test upgrades from and to every version.
>
> *Starting from 3.0, additionally, live upgrade/downgrade between one LTS
>> and the next one is guaranteed. For example, *
>>
>
> What does this exactly entail? Does it only mean that I can do 3.0.x <->
> 4.0.x? The example just below is misleading from that perspective:
>
>> *3.0 -> 4.0 -> 3.0 is OK;
>> 3.2 -> 4.0 -> 3.2 is OK;
>> 3.2 -> 4.4 -> 3.2 is OK;
>> 3.2 -> 5.0 is not OK.*
>
>
> This gives the impression that it is possible to upgrade from any 3.x
> version to any 3.x or 4.x version, including rollbacks. Are we testing this
> as new 3.x versions are released?
>
> To add to the confusion, the blog post [2] of 3.2 release mentions this
>
> *For the 3.2 series, you should be able to upgrade from version 3.1 or
>> downgrade from the subsequently released version 3.3. If you are currently
>> using an earlier version, please ensure that you upgrade to version 3.1
>> before proceeding further.*
>
>
> This is confusing now. So 3.2 -> 4.0 would be possible but 3.0 -> 3.2
> isn't? Why is 3.2 -> 4.4 possible then?
>
> I wish to see the community's take on this in order to align the
> recommendation.
>
> [0] https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1705392242948349
> [1]
> https://pulsar.apache.org/contribute/release-policy/#compatibility-between-releases
> [2]
> https://pulsar.apache.org/blog/2024/02/12/announcing-apache-pulsar-3-2/
>
> --
> Girish Sharma
>


-- 
Girish Sharma


Re: [DISCUSS] Migrate CLI parser from jcommander to picocli

2024-02-21 Thread Girish Sharma
+1 (non-binding)
It has been a pain trying to figure out what's the exact sub-param name
given that some of them are 20+ characters long.


On Wed, Feb 21, 2024 at 3:47 PM Julien Jakubowski
 wrote:

> Thanks for this proposal! That should significantly improve the user
> experience with the CLI.
>
>
> > On Feb 21, 2024, at 06:08, Zixuan Liu  wrote:
> >
> > Hi Pulsar Community,
> >
> > The Pulsar CLI (pulsar-admin, pulsar-client, pulsar-shell, and so on) uses
> > jcommander [1] as its CLI parser, which is an awesome project, but the
> > maintainer is not active and it cannot keep up with modern CLI
> > features (auto-completion, sub-commands, native images, suggested
> > commands, and so on). Then I found the picocli [2] project, which meets
> > these needs and is active and powerful. For a comparison, please see [3].
> >
> > Error prompt:
> > ```
> > bin/pulsar-admin clusters update cluster-a -b
> >
> > # jcommander
> > Need to provide just 1 parameter
> >
> > # picocli
> > Unknown option: '-b'
> > ```
> >
> > Suggest commands:
> > ```
> > bin/pulsar-admin cluste
> >
> > # jcommander
> > Expected a command, got cluste
> >
> > # picocli
> > Unmatched argument at index 0: 'cluste'
> > Did you mean: pulsar-admin clusters?
> > ```
> >
> > What do you think about migrating the CLI parser from jcommander to picocli?
> >
> > Thanks,
> > Zixuan
> >
> > [1] - https://github.com/cbeust/jcommander
> > [2] - https://picocli.info/
> > [3] - https://github.com/remkop/picocli/wiki/picocli-vs-JCommander
>


-- 
Girish Sharma


Re: Ability to decrease partition count in pulsar

2024-02-20 Thread Girish Sharma
Hello Asaf, thank you for taking a look at this. I will have a formal PIP
sometime by the end of March. I am trying to close out the rate-limiting
PIPs first.

On Sun, Feb 18, 2024 at 3:47 PM Asaf Mesika  wrote:

> Hey Girish,
>
> First, I say that I *love* this proposal and, in general, those types of
> proposals.
> This is what strides Pulsar towards being an even more next-generation
> messaging system.
>
> I read and have a few questions and brainstorming ideas popping into my
> mind:
>
> 1. The current design basically says: Let’s have a read-only toggle (flag)
> for each partition. When I decrease the partitions from, say, 2 to 1, then
> if the partitions were “billing-0” and “billing-1”, now “billing-1” will be
> marked read-only, and eventually, the client will only produce messages to
> “billing-0”. After 1 hour, I can scale it back to 2 partitions, and now
> “billing-1” will be toggled back to read-only=false.
>

This is true. But it is probably only an extension of a problem that
already exists today - in case you scale up a 3-day-retention topic from 2
to 3 partitions and start a new subscription from the beginning, you will
see a drastic time difference between the messages of the older partitions
and the newer ones.
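As a rough model of the read-only toggle (Python, purely illustrative; Pulsar's actual producer routing and hashing differ):

```python
def hash_key(key):
    # stable toy hash; real Pulsar clients use e.g. JavaStringHash/Murmur3
    return sum(key.encode())

def route(key, partitions):
    """Pick a partition for a message key, skipping read-only partitions.
    `partitions` is a list of (name, read_only) pairs; a hypothetical
    model, not how Pulsar's producer router is actually implemented."""
    writable = [name for name, read_only in partitions if not read_only]
    if not writable:
        raise RuntimeError("no writable partitions")
    return writable[hash_key(key) % len(writable)]

# After decreasing from 2 to 1 partitions, billing-1 is marked read-only,
# so all new messages land on billing-0.
parts = [("billing-0", False), ("billing-1", True)]
print(route("customer-42", parts))  # billing-0
```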



>
> * I know you stated that ordered consumption is out of scope. The thing I
> fear here is that even for shared subscriptions, in which order doesn’t
> matter, it still feels a bit weird that when you consume from the
> beginning, you can suddenly consume messages that are 1 hour apart from
> each other, one after another. Something like:
>
> P0  | t1 | t3 | t7 | t10| t11| t13| t17|
>     +----+----+----+----+----+----+----+
> P1  | t2 | t4 | t6 | t9 | t12| t14| t16|
>     +----+----+----+----+----+----+----+
> P2  |    |    | t5 | t8 |    |    | t15|
>     +----+----+----+----+----+----+----+
>                         ^         ^
>                         RO        URO
>
>
> t5 - you scaled to 3 partitions.
> “R0” is when you change from 3 partitions to 2
> “URO” is when you change back to 3 partitions.
>
> When you consume this partitioned topic from the beginning, you will
> consume t15 mixed with t6 and t7, which can be hours apart.
>

Even if the messages are hours apart, they are still confined to the
ordering guarantees of a topic i.e. order is maintained within a partition
:)


>
> I understand this can happen today if you only add a partition and read
> from the beginning.
>

Exactly! Maybe there is a need to solve this, maybe not, as even Kafka has
similar behavior, although I am unaware whether they are having discussions
to do something about it.


> 2. If we keep ordered consumption out of scope, how do we keep the users
> from doing “wrong” things, like using failover type subscriptions on
> partitioned topics that have decreased their partitions? Topic and its
> partition count is a detached “entity” from its consumption type.
>
>
It would be very easy to add a live check to the topic update command: if
there are exclusive/failover subscriptions attached to the topic, then we
prevent the update. We should actually do this today anyway, as the issue
exists during partition count increases as well.


>
> I’m curious if you thought of implementing it following the pattern we have
> today for BK. When an ensemble changes, it simply adds the new ensemble to
> a list of ensembles, so you follow a chain of servers when you read from a
> ledger. You read from (b1,b2,b3) and then switch to (b1, b3, b5).
>
> What if a partitioned topic is exactly that? It is a chain of lists. Each
> list contains the topics (partitions).
> Something like:
> (billing-0-100, billing-1-101), (billing-0-102, billing-1-103,
> billing-2-104), (billing-0-105, billing-1-106)
>
> It’s only a direction - just wondering if something like that has been
> considered.
>
I believe this would be a very drastic change. I haven't looked in this
direction, but it would touch almost every aspect of the broker - from
dedupe to transactions and beyond. I think almost all of the broker-level
features rely on the fact that a partition will always be owned by a single
topic at any given time. This would lead to multiple active topics for a
single partition across brokers.
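For reference, the chain-of-lists idea Asaf describes could be modeled like this (a toy sketch with made-up names, not a design proposal):

```python
# Toy model of the "chain of partition lists" idea, analogous to how a
# BookKeeper ledger records a chain of ensembles. All names are made up.
chain = [
    ["billing-0-100", "billing-1-101"],                   # epoch 0: 2 partitions
    ["billing-0-102", "billing-1-103", "billing-2-104"],  # epoch 1: 3 partitions
    ["billing-0-105", "billing-1-106"],                   # epoch 2: back to 2
]

def current_partitions(chain):
    """Producers would write only to the newest epoch's partition list."""
    return chain[-1]

def all_partitions_for_read(chain):
    """A reader consuming from the beginning walks every epoch in order,
    like reading a ledger across its successive ensembles."""
    return [p for epoch in chain for p in epoch]

print(current_partitions(chain))            # ['billing-0-105', 'billing-1-106']
print(len(all_partitions_for_read(chain)))  # 7
```

As noted above, actually adopting anything like this would ripple through dedupe, transactions, and other broker features that assume one backing topic per partition.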


-- 
Girish Sharma


Re: [ANNOUNCE] New Committer: Asaf Mesika

2024-02-20 Thread Girish Sharma
Congratulations Asaf!


On Tue, Feb 20, 2024 at 11:22 PM Julien Jakubowski
 wrote:

> Congrats, Asaf!
>
> On Tue, Feb 20, 2024 at 5:51 PM Lari Hotari  wrote:
>
> > The Apache Pulsar Project Management Committee (PMC) has invited
> > Asaf Mesika https://github.com/asafm to become a committer and we
> > are pleased to announce that he has accepted.
> >
> > Welcome and Congratulations, Asaf Mesika!
> >
> > Please join us in congratulating and welcoming Asaf onboard!
> >
> > Best Regards,
> >
> > Lari Hotari
> > on behalf of the Pulsar PMC
> >
>


-- 
Girish Sharma


Pulsar Version upgrade guidelines

2024-02-16 Thread Girish Sharma
There have been a few discussions in the past on the Slack channel, and I
recently also started a similar thread [0] regarding whether we can skip
certain releases while upgrading towards Pulsar 3.0 and beyond. I am
starting this dev mailing list discussion to get some more input.

As per official release policy [1] itself, there are some open questions:

*Before 3.0, upgrade should be done linearly through each feature version.
> For example, when upgrading from 2.8 to 2.10, it is important to upgrade to
> 2.9 before going to 2.10. *
>

This is a very clear statement. Although lengthy, it makes sense to limit
the scope of OSS to test upgrades from and to every version.

*Starting from 3.0, additionally, live upgrade/downgrade between one LTS
> and the next one is guaranteed. For example, *
>

What does this exactly entail? Does it only mean that I can do 3.0.x <->
4.0.x? The example just below is misleading from that perspective:

> *3.0 -> 4.0 -> 3.0 is OK;
> 3.2 -> 4.0 -> 3.2 is OK;
> 3.2 -> 4.4 -> 3.2 is OK;
> 3.2 -> 5.0 is not OK.*


This gives the impression that it is possible to upgrade from any 3.x
version to any 3.x or 4.x version, including rollbacks. Are we testing this
as new 3.x versions are released?

To add to the confusion, the blog post [2] of 3.2 release mentions this

*For the 3.2 series, you should be able to upgrade from version 3.1 or
> downgrade from the subsequently released version 3.3. If you are currently
> using an earlier version, please ensure that you upgrade to version 3.1
> before proceeding further.*


This is confusing now. So 3.2 -> 4.0 would be possible but 3.0 -> 3.2
isn't? Why is 3.2 -> 4.4 possible then?

I wish to see the community's take on this in order to align the
recommendation.

[0] https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1705392242948349
[1]
https://pulsar.apache.org/contribute/release-policy/#compatibility-between-releases
[2] https://pulsar.apache.org/blog/2024/02/12/announcing-apache-pulsar-3-2/

-- 
Girish Sharma


Re: [DISCUSS] 2.10 & 2.11 EOL - pulsar.apache.org website shows that support has ended

2024-02-13 Thread Girish Sharma
Adding to the point that Alexander mentioned, should we think about making
the support cycle relative to the next release? I believe the intent of
having 6-month support and 3-month release windows is to actually have a
3-month overlap. Should we formalize that instead of calling the support
window 6 months? i.e. should the support of version 3.(x) be formalized as
"up to 3 months post release of 3.(x + 1)", and likewise?
For instance, currently, I am sure that barely anyone would have moved to
Pulsar 3.2 in their production systems, and 3.1 is already EOL. Even in a
super-fast-paced organization, upgrades do not happen that fast.
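Concretely, the rule I'm proposing could be expressed like this (illustrative Python; the dates and the 90-day overlap are assumptions for the example, not agreed policy):

```python
from datetime import date, timedelta

def eol_date(next_release_date, overlap_days=90):
    """EOL of version 3.x under the proposed rule: a fixed overlap after
    the *next* feature release ships, rather than a fixed 6-month window
    from 3.x's own release date."""
    return next_release_date + timedelta(days=overlap_days)

# e.g. if 3.2 shipped on 2024-02-12, 3.1's support would end ~3 months later
print(eol_date(date(2024, 2, 12)))  # 2024-05-12
```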

Regards

On Thu, Feb 1, 2024 at 10:47 PM Frank Kelly 
wrote:

> Lari, Matteo, Chris etc talked about this a good bit in the Community
> meeting today.
> What I was looking for, and what it seems Matteo was amenable to, was
> adding a blurb here
> https://pulsar.apache.org/contribute/release-policy/#supported-versions
>
> saying something like
> "Please plan according to these committed dates below. However, depending
> on the availability of resources and time and/or the severity of an issue
> (e.g. a very impactful CVE), some ad hoc releases may be possible going
> back some number of patch releases but these would be provided on a
> 'best-effort' basis."
>
> -Frank
>
> On Thu, Jan 25, 2024 at 12:56 PM Alexander Hall 
> wrote:
>
> > On a related note, according to the release policy page (
> > https://pulsar.apache.org/contribute/release-policy/#supported-versions
> ),
> > the 3.1 branch only has ~16 more days of support. I'm hoping that 3.2.0
> > gets the green light for release before then, because we really didn't
> get
> > much of a support overlap between the 3.1 and 3.2 releases.
> >
> > Thanks,
> >
> > Alex
> >
> > -Original Message-
> > From: Frank Kelly 
> > Sent: Thursday, January 25, 2024 10:44 AM
> > To: dev@pulsar.apache.org
> > Subject: '[External]'Re: [DISCUSS] 2.10 & 2.11 EOL - pulsar.apache.org
> > website shows that support has ended
> >
> >
> > Clarity around this would be useful as we just started the process of
> > upgrading from 2.10.3 to 2.11.3 I know 3.0 now has LTS but I not hoping
> to
> > have to do another update for a while
> > https://pulsar.apache.org/blog/2023/05/02/announcing-apache-pulsar-3-0/
> >
> > Frank
> >
> > On Thu, Jan 25, 2024 at 6:11 AM Lari Hotari  wrote:
> >
> > > Bumping this thread to the top. We need to find a resolution.
> > >
> > > -Lari
> > >
> > > On Sat, 20 Jan 2024 at 11:13, Lari Hotari  wrote:
> > > >
> > > > Hi,
> > > >
> > > > Our website shows that "active support" and "security support" have
> > > > ended on 11 Jan 2024 for 2.11 and on 18 Apr 2023 for 2.10. You can find
> > > > this information on our release policy page at
> > > > https://pulsar.apache.org/contribute/release-policy/#supported-versions .
> > > >
> > > > Does this mean that the Apache Pulsar PMC won't be driving more new
> > > > releases for branch-2.11 and branch-2.10? Are there exceptions?
> > > > Do we need to make a separate decision about 2.10 & 2.11 EOL?
> > > >
> > > > -Lari
> > > >
> > > > On 2023/12/19 06:25:20 Michael Marshall wrote:
> > > > > Hi Pulsar Community,
> > > > >
> > > > > Do we consider the 2.10 release line EOL? If not, is there a
> > > > > committer that would like to volunteer to release 2.10.6?
> > > > >
> > > > > We briefly discussed keeping 2.10 alive in June [0], and that was
> > > > > followed by a 2.10.5 release in July. Given that we already have
> > > > > 2.11, 3.0, 3.1, and now a discussion on 3.2, it seems
> > > > > unsustainable to keep
> > > > > 2.10 going much longer.
> > > > >
> > > > > Thanks,
> > > > > Michael
> > > > >
> > > > > [0]
> > > > > https://lists.apache.org/thread/w4jzk27qhtosgsz7l9bmhf1t7o9mxjhp
> > > > >
> > >
> >
>


-- 
Girish Sharma


Re: [DISCUSS][PIP-338] Add default lookup listener and fix inconsistency with listener's usage between different protocols

2024-02-12 Thread Girish Sharma
Lari, I think there is a big miss. This issue is not _just_ for admin REST
API calls. It exists for every produce/consume client as well, since in
most cases the lookup service being used is the HTTPLookupService, leading
to the same issue you are mentioning for admin API calls. I've left a
comment on the same here -
https://github.com/apache/pulsar/pull/22039#discussion_r1487332735

On Tue, Feb 13, 2024 at 12:53 PM Lari Hotari  wrote:

> On 2024/02/13 06:46:29 Girish Sharma wrote:
> > Personally, while this may be a much cleaner approach or may be not solve
> > the core issue at all, it is not what we are trying to achieve with our
> > PIP, which is basically only a PIP due to configuration changes involved,
> > but actually is a bug fix for a *bug we are facing in production right
> now.
> > *The same bug is also highlighted by you via the comment you have linked
> in
> > your [3] link
> > There are much more things to consider for the multiple bind address
> > approach and it deserves its own PIP. Specifically the comment on GH that
> > I've made showcasing a use case -
> > https://github.com/apache/pulsar/pull/22039#discussion_r1486400014
>
> The multiple bind address solution is already implemented for the Pulsar
> binary protocol as it is defined in PIP-95. What we are dealing with a gap
> in the solution, mainly about Pulsar Admin API http/https redirects.
> I replied in
> https://github.com/apache/pulsar/pull/22039#discussion_r1487282934 how
> this could possibly be achieved without too many changes.
> That's not a complete design since there would have to be a way to set the
> header value and documentation about how this should be achieved. One
> possibility is that there's a proxy server for the external address which
> sets this header value. (PIP-95 points into that direction)
>
> > Let's treat this PIP as a bug fix only. If needed, we can skip the PIP
> and
> > directly send a bug fix PR if that clears things here.
>
> I'm not sure if that's optimal. A PIP is usually issued when a public
> interface is changed. I agree that this is addressing gaps left in PIP-61 &
> PIP-95 and could be considered as an extension. However it's easier to add
> a new PIP than start extending old PIPs. The benefit of the PIP process is
> that it helps getting consensus about the design before someone puts a lot
> of effort in implementing something that would not be accepted.
> Since we don't have consensus, I think it's better to continue the
> discussion and also meet at the Pulsar community meeting to discuss this
> further. Hopefully others from the community also participate.
>
> > I disagree here. We have clearly identified that there is a bug in the
> > current code. We are trying to do a bug fix here. The goal is not to deep
>
> I'm not sure if it's a bug. The use case isn't covered by PIP-61/PIP-95.
> That is a gap in the design to me, not an implementation bug. However it
> doesn't really matter in the end what we call it. The end-to-end
> functionality isn't usable at the moment for all Pulsar Admin API calls.
>
> > dive into a much bigger design problem as we want to hotfix this ASAP in
> > our system, but also want an alignment with the community so as to not
> > maintain this patch locally, internally, for every version we upgrade to.
>
> Yes, that is a sensible goal. I'm pretty sure that we can achieve a
> solution that addresses the main gap in PIP-61 which is about the
> Pulsar Admin API redirects. It is also notable that there are gaps in the
> documentation of the advertised listeners in Pulsar [1]. The documentation
> also needs some more love. Contributions are more than welcome!
>
> -Lari
>
> 1 -
> https://pulsar.apache.org/docs/next/concepts-multiple-advertised-listeners/#advertised-listeners
>


-- 
Girish Sharma


Re: [DISCUSS][PIP-338] Add default lookup listener and fix inconsistency with listener's usage between different protocols

2024-02-12 Thread Girish Sharma
Hello Lari,

On Tue, Feb 13, 2024 at 12:04 PM Lari Hotari  wrote:

> Thanks for linking to the PIP-95 implementation PR #12056 [1]. I wasn't
> aware that it has been implemented.
>
> That PR is what I thought that was missing. It is the implementation to
> achieve what PIP-61 described as "only return the corresponding service
> URL" and the PIP-95 design of "use a unique bind address for each
> listener". It is possible that there are gaps, but I think that we should
> cover the possible gaps.
>
> > Maybe that's not entirely true. You can configure 100s of listeners for
> all
> > schemes/protocols, but the code only returns the internal or requested or
> > first address for all 4 schemes (pulsar, pulsar+ssl, http, https, and one
> > service url, which i am not sure why it is needed, maybe for backward
> > compatibility). So while, it's not exactly approach 2, it's also not
> purely
>
> I think that this is expected and what PIP-61 and PIP-95 describe.
> The main problem of PIP-61 and PIP-95 is that there isn't documentation of
> how to properly configure the solution. The important piece of
> "bindAddresses" implemented in PIP-95 was missing from PIP-61 in the first
> place and it's likely that PIP-95 isn't fully completed.
>
> The PR 12056 description is perhaps the best documentation of the feature
> and provides an example of how to use the configuration:
> ```
> bindAddresses=external:pulsar://0.0.0.0:6652,external:pulsar+ssl://0.0.0.0:6653
> bindAddress=0.0.0.0
> brokerServicePort=6650
> brokerServicePortTls=6651
> advertisedListeners=cluster:pulsar://broker-1.local:6650,cluster:pulsar+ssl://broker-1.local:6651,external:pulsar://broker-1.example.dev:6652,external:pulsar+ssl://broker-1.example.dev:6653
> internalListenerName=cluster
> ```
>
>
Personally, while this may be a much cleaner approach, or may not solve
the core issue at all, it is not what we are trying to achieve with our
PIP, which is basically only a PIP due to the configuration changes
involved, but is actually a bug fix for a *bug we are facing in production
right now.* The same bug is also highlighted by you via the comment you
have linked in your [3] link.
There are many more things to consider for the multiple-bind-address
approach, and it deserves its own PIP. See, specifically, the comment on
GH that I've made showcasing a use case -
https://github.com/apache/pulsar/pull/22039#discussion_r1486400014



> As it can be seen, the "external" listener is bound to different ports
> than the default "cluster" listener (listener for the brokerServicePort and
> brokerServicePortlTls is specified with internalListenerName).
>
> The clear gaps in the PIP-61 and PIP-95 solution are in the admin API.
> There are multiple APIs where a redirect is sent back to the client. The PR
> 12072 [2] covers the topic lookup over the admin API, but doesn't cover all
> other cases where the redirect happens. This is already pointed out in a
> comment [3] on that PR.
>
>
Let's treat this PIP as a bug fix only. If needed, we can skip the PIP and
directly send a bug fix PR if that clears things here.


> I think that PIP-338 should focus on covering the gaps and providing a
> design that is aligned with the current PIP-61 & PIP-95 implementation.
>
>
I disagree here. We have clearly identified that there is a bug in the
current code. We are trying to do a bug fix here. The goal is not to deep
dive into a much bigger design problem as we want to hotfix this ASAP in
our system, but also want an alignment with the community so as to not
maintain this patch locally, internally, for every version we upgrade to.



> -Lari
>
> 1 - https://github.com/apache/pulsar/pull/12056
> 2 - https://github.com/apache/pulsar/pull/12072
> 3 - https://github.com/apache/pulsar/pull/12072#issuecomment-921663472
>
>
-- 
Girish Sharma


Re: [DISCUSS][PIP-338] Add default lookup listener and fix inconsistency with listener's usage between different protocols

2024-02-12 Thread Girish Sharma
Reply inline, and also replied to the GH comment.

On Mon, Feb 12, 2024 at 9:37 PM Lari Hotari  wrote:

> The confusing detail is that in PIP-61 [1], the alternative that has been
> implemented in the Pulsar code base has been marked as the rejected
> alternative ("Return all advertised listeners(rejected)"). The preferred
> and proposed alternative "Only return the corresponding service URL" was
> never implemented.
>
>
Maybe that's not entirely true. You can configure 100s of listeners for all
schemes/protocols, but the code only returns the internal, requested, or
first address for all 4 schemes (pulsar, pulsar+ssl, http, https), plus one
service URL, which I am not sure why it is needed - maybe for backward
compatibility. So while it's not exactly approach 2, it's also not purely
approach 1. I can speculate and assume that in approach 1 the author meant
"one of each protocol", and that's actually what's implemented, but it's
not clearly mentioned in the PIP. It would be great if we could get some
input from the folks involved in PIP-61 and PIP-95. Also, the wiki says
PIP-95 [0] is something else, while the PRs referencing PIP-95 [1] in
commits point to something else [2]!


[0] -
https://github.com/apache/pulsar/wiki/PIP-95:-Transaction-coordinator-loading-mechanism
[1] - https://github.com/apache/pulsar/pull/12056
[2] - https://github.com/apache/pulsar/issues/12040
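A simplified model of that observed selection behavior (Python sketch of my reading of the code, not the actual broker implementation):

```python
def resolve_listener(listeners, requested=None, internal=None):
    """Pick the advertised listener a lookup response returns: the
    requested listener if named, else the internal listener, else the
    first one configured. `listeners` maps listener name -> {scheme: url}.
    Hypothetical model based on the behavior described above."""
    if requested is not None:
        if requested not in listeners:
            raise KeyError(f"listener {requested!r} not configured")
        return listeners[requested]
    if internal is not None and internal in listeners:
        return listeners[internal]
    return next(iter(listeners.values()))

listeners = {
    "cluster": {"pulsar": "pulsar://broker-1.local:6650"},
    "external": {"pulsar": "pulsar://broker-1.example.dev:6652"},
}
print(resolve_listener(listeners, requested="external")["pulsar"])
# pulsar://broker-1.example.dev:6652
```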

Regards


Re: Ability to decrease partition count in pulsar

2024-01-24 Thread Girish Sharma
Hello Lari, thanks for the comments. Replies inline.


On Wed, Jan 24, 2024 at 7:36 PM Lari Hotari  wrote:

> Hi Girish,
>
> Very useful proposal.
>
> Would it be possible to enable comments on the Google Doc? It's pretty
> hard to comment on the doc since copying is also disabled.
>
> I've enabled them now. Thank you for going through the doc.


> In the scope definition 4.2,
> "The initial scope is to target unordered consumption flows. Even in
> the current world, there are challenges with normal partition scale up
> for ordered consumption based topics, so keeping the partition scale
> down out of scope for that as well."
>
> If we don't care about ordered consumption and re-keying, I guess the
> feature isn't very hard to implement.
> Pulsar already contains the topic termination feature which will let
> consumers to consume messages while publishers cannot publish more
> messages. This is the "ready-only topic" feature that could be used as
> one of the building blocks for implementing the decrease of the
> partition count for a topic.
>

Yes, a terminated topic is already very close to the read-only topic,
barring the grace period and maybe the option of un-terminating a topic. I
will merge my read-only proposal with the existing terminate API/feature.


>
> For the final design, it would be great to have a design for ordered
> consumption flows. It might not be trivial to design it. I happened to
> be at a local Kafka meetup a few months ago and this particular
> challenge was discussed in the context of Kafka and how painful it is
> to handle manually and what problems could happen in production when
> large scale streaming applications assume that a specific key is
> contained in a specific partition.
>
> There's a similar challenge also when the number of partitions are
> increased so this problem isn't specific to decreasing partitions.
> In ordered consumption flows, there is most likely an ordering key and
> a specific key is assigned to a specific partition. If the partition
> count changes, there would have to be some rekeying/reassignment that
> happens.
>
I agree that this is an existing problem in both Kafka and Pulsar for both
partition count scale-up (and scale-down in Kafka via re-mapping). For that
reason, I've kept it out of scope. But what I would ensure is that adding
this new feature of partition scale-down does not increase the complexity
or difficulty of providing seamless partition-count changes for ordered
consumption in the future.


-- 
Girish Sharma


Re: Ability to decrease partition count in pulsar

2024-01-23 Thread Girish Sharma
Bumping this up! Hoping this can be discussed so that I can rule out any
fatal flaws in this approach.

Regards

On Fri, Jan 19, 2024 at 11:58 AM Girish Sharma 
wrote:

> Hello everyone,
>
> As a true cloud-native platform that supports scale up and scale down, I
> feel there is a need to be able to reduce the partition count in Pulsar to
> truly achieve scale down after events like sales (akin to Black Friday,
> etc.) or a huge temporary publish burst due to backfill.
>
> I looked through the archives (up to 2021) and did not find any prior
> discussion on the same topic.
>
> I have given this an initial thought to figure out what it would need to
> support such a feature in the lowest footprint possible. I am attaching the
> document explaining the need, requirements and initial high level details
> [0]. What I would like is to understand if the community also finds this
> feature helpful and does the approach described in the document have some
> fatal flaw? Summarizing the approach here as well:
>
>- Introduce an ability to convert a normal topic object into a
>read-only topic via admin api and an additional partitioned-topic metadata
>property (just like shadow source, etc)
>- Add logic to block produce but allow new consumers and dispatch call
>based on this flag
>- Add logic in GC to clean out read only topics when all of their
>ledgers expire (TTL/retention)
>
> Goal is that there is no data movement involved and no impact on existing
> partitions during this scale down.
>
> Looking forward to the discussion.
>
> [0]
> https://docs.google.com/document/d/1sbGQSwDihQftIRsxAXg5Zm4uxKQ0kRk9HadKYRFTswI/edit?usp=sharing
>
> Regards
> --
> Girish Sharma
>


-- 
Girish Sharma


Ability to decrease partition count in pulsar

2024-01-18 Thread Girish Sharma
Hello everyone,

As a true cloud-native platform that supports scale up and scale down, I
feel there is a need to be able to reduce the partition count in Pulsar to
truly achieve scale down after events like sales (akin to Black Friday,
etc.) or a huge temporary publish burst due to backfill.

I looked through the archives (up to 2021) and did not find any prior
discussion on the same topic.

I have given this an initial thought to figure out what it would need to
support such a feature in the lowest footprint possible. I am attaching the
document explaining the need, requirements and initial high level details
[0]. What I would like is to understand if the community also finds this
feature helpful and does the approach described in the document have some
fatal flaw? Summarizing the approach here as well:

   - Introduce an ability to convert a normal topic object into a read-only
   topic via admin api and an additional partitioned-topic metadata property
   (just like shadow source, etc)
   - Add logic to block produce but allow new consumers and dispatch call
   based on this flag
   - Add logic in GC to clean out read only topics when all of their
   ledgers expire (TTL/retention)

Goal is that there is no data movement involved and no impact on existing
partitions during this scale down.
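To make the proposed lifecycle above concrete, here is a toy model of a
single partition going through the proposed states (pure illustration in
Python, not actual Pulsar broker code; class and method names are
hypothetical, and ledger expiry is modeled simply as a drained backlog):

```python
from enum import Enum

class Mode(Enum):
    ACTIVE = "active"
    READ_ONLY = "read-only"   # proposed: produce blocked, consume allowed
    DELETED = "deleted"

class Partition:
    def __init__(self):
        self.mode = Mode.ACTIVE
        self.backlog = []

    def produce(self, msg):
        # Proposed behavior: the read-only flag blocks new publishes.
        if self.mode is not Mode.ACTIVE:
            raise RuntimeError("topic is read-only; produce is blocked")
        self.backlog.append(msg)

    def consume(self):
        # Existing and new consumers keep draining the backlog.
        if self.mode is Mode.DELETED:
            raise RuntimeError("topic deleted")
        return self.backlog.pop(0) if self.backlog else None

    def mark_read_only(self):
        self.mode = Mode.READ_ONLY

    def gc(self):
        # Proposed GC rule: delete a read-only partition once all of its
        # ledgers have expired (modeled here as an empty backlog).
        if self.mode is Mode.READ_ONLY and not self.backlog:
            self.mode = Mode.DELETED

p = Partition()
p.produce("m1")
p.mark_read_only()
print(p.consume())  # m1 -- consumers keep draining after the flag is set
p.gc()
print(p.mode)       # Mode.DELETED
```

The point of the sketch is that no data moves between partitions: the
scaled-down partition simply stops accepting writes and ages out.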

Looking forward to the discussion.

[0]
https://docs.google.com/document/d/1sbGQSwDihQftIRsxAXg5Zm4uxKQ0kRk9HadKYRFTswI/edit?usp=sharing

Regards
-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-12-15 Thread Girish Sharma
Closing this discussion thread and the PIP. Apart from the discussion
present in this thread, I presented the detailed requirements in a dev meet
on 23rd November, and the conclusion was that we will actually go ahead and
implement the requirements in Pulsar itself.
There was a prerequisite of refactoring the rate limiter codebase, which is
already covered by Lari in PIP-322.

I will be creating a new parent PIP soon about the high level requirements.

Thank you everyone who participated in the thread and the discussion on
23rd dev meeting.

Regards

On Thu, Nov 23, 2023 at 8:26 PM Girish Sharma 
wrote:

> I've captured our requirements in detail in this document -
> https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc/edit
> Added it to agenda document as well. Will join the meeting and discuss.
>
> Regards
>
> On Wed, Nov 22, 2023 at 10:49 PM Lari Hotari  wrote:
>
>> I have written a long blog post that contains the context, the summary
>> of my view point about PIP-310 and the proposal for proceeding:
>>
>> https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html
>>
>> Let's discuss this tomorrow in the Pulsar community meeting [1]. Let's
>> coordinate on Pulsar Slack's #dev channel if there are issues in joining
>> the meeting.
>> See you tomorrow!
>>
>> -Lari
>>
>> 1 - https://github.com/apache/pulsar/wiki/Community-Meetings
>>
>> On Mon, 20 Nov 2023 at 20:48, Lari Hotari  wrote:
>> >
>> > Hi Girish,
>> >
>> > replies inline and after that there are some updates about my
>> > preparation for the community meeting on Thursday. (there's
>> > https://github.com/lhotari/async-tokenbucket with a PoC for a
>> > low-level high performance token bucket implementation)
>> >
>> > On Sat, 11 Nov 2023 at 17:25, Girish Sharma 
>> wrote:
>> > > Actually, the capacity is meant to simulate that particular rate
>> limit. if
>> > > we have 2 buckets anyways, the one managing the fixed rate limit part
>> > > shouldn't generally have a capacity more than the fixed rate, right?
>> >
>> > There are multiple ways to model and understand a dual token bucket
>> > implementation.
>> > I view the 2 buckets in a dual token bucket implementation as separate
>> > buckets. They are like an AND rule, so if either bucket is empty,
>> > there will be a need to pause to wait for new tokens.
>> > Since we aren't working with code yet, these comments could be out of
>> context.
>> >
>> > > I think it can be done, especially with that one thing you mentioned
>> about
>> > > holding off filling the second bucket for 10 minutes.. but it does
>> become
>> > > quite complicated in terms of managing the flow of the tokens..
>> because
>> > > while we only fill the second bucket once every 10 minutes, after the
>> 10th
>> > > minute, it needs to be filled continuously for a while (the duration
>> we
>> > > want to support the bursting for).. and the capacity of this second
>> bucket
>> > > also is governed by and exactly matches the burst value.
>> >
>> > There might not be a need for this complexity of the "filling bucket"
>> > in the first place. It was more of a demonstration that it's possible
>> > to implement the desired behavior of limited bursting by tweaking the
>> > basic token bucket algorithm slightly.
>> > I'd rather avoid this additional complexity.
>> >
>> > > Agreed that it is much higher than a single topics' max throughput..
>> but
>> > > the context of my example had multiple topics lying on the same
>> > > broker/bookie ensemble bursting together at the same time because
>> they had
>> > > been saving up on tokens in the bucket.
>> >
>> > Yes, that makes sense.
>> >
>> > > always be a need to overprovision resources. You usually don't want to
>> > > > go beyond 60% or 70% utilization on disk, cpu or network resources
>> so
>> > > > that queues in the system don't start to increase and impacting
>> > > > latencies. In Pulsar/Bookkeeper, the storage solution has a very
>> > > > effective load balancing, especially for writing. In Bookkeeper each
>> > > > ledger (the segment) of a topic selects the "ensemble" and the
>> "write
>> > > > quorum", the set of bookies to write to, when the ledger is opened.
>> > > > The bookkeeper client could also ch

Re: [VOTE] PIP-325: Add command to abort transaction

2023-12-15 Thread Girish Sharma
Hello Ruihong,
you actually replied to the discussion thread itself.
Moreover, you should wait for the discussion thread to have some actual
discussion before starting the voting thread.

https://github.com/apache/pulsar/blob/master/pip/README.md

Regards

On Fri, Dec 15, 2023 at 6:37 PM ruihongzhou 
wrote:

> Hi community,
>
>
> This thread is to start a vote for PIP-325: Add command to abort
> transaction.
>
>
> PIP: https://github.com/apache/pulsar/pull/21731
>
>
> Related PR: https://github.com/apache/pulsar/pull/21630
>
> Discussion thread:
> https://lists.apache.org/thread/p559tsphr7kvbh2qqw8vsow0ylytonnz
>
> Ruihong



-- 
Girish Sharma


Re: [VOTE] PIP-322: Pulsar Rate Limiting Refactoring

2023-12-11 Thread Girish Sharma
+1 (non-binding)

On Mon, Dec 11, 2023 at 3:49 PM PengHui Li  wrote:

> +1 (binding)
>
> Just one minor comment on the proposal PR
>
> Regards,
> Penghui
>


-- 
Girish Sharma


Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

2023-12-06 Thread Girish Sharma
Hello Xiangying,


On Thu, Dec 7, 2023 at 6:32 AM Xiangying Meng  wrote:

> Hi Girish,
>
> What you are actually opposing is the implementation of true topic-level
> geo-replication. You believe that topics should be divided into different
> namespaces based on replication. Following this line of thought, what we
> should do is restrict the current topic-level replication settings, not
> allowing the replication clusters set at the topic level to exceed the
> range of replication clusters set in the namespace.
>

Yes, that's my viewpoint. In case that's not your viewpoint, then, in your
use cases, do you ever have more than one namespace inside a tenant?
With every property coming down to the topic level, it makes no sense for
the namespace hierarchy to exist anymore.


>
> One point that confuses me is that we provide a setting for topic-level
> replication clusters, but it can only be used to amend the namespace
> settings and cannot work independently. Isn't this also a poor design for
> Pulsar?
>

This feature was originally added to Pulsar without a PIP, and the PR [0]
also doesn't have much context around why it was needed, so I can't comment
on why it was added.
But my understanding is that even in a situation where topics are divided
into proper namespaces based on use cases and there is suddenly an
exceptional need for one existing topic to have less replication, then
instead of the long exercise of moving that topic to a new namespace, you
can use this feature.

[0] - https://github.com/apache/pulsar/pull/12136



>
> On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma 
> wrote:
>
> > Hello, replies inline.
> >
> > On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng 
> > wrote:
> >
> > > Hi Girish,
> > >
> > > Thank you for your explanation. Because Joe's email referenced the
> > current
> > > implementation of Pulsar, I misunderstood him to be saying that this
> > > current implementation is not good.
> > >
> > > A possible use case is where there is one or a small number of topics
> in
> > > the namespace that store important messages, which need to be
> replicated
> > to
> > > other clusters. Meanwhile, other topics only need to store data in the
> > > local cluster.
> > >
> >
> > Is it not possible to simply have the other topics in a namespace which
> > allows for that other cluster, while the local topics remain in the
> > namespace with local cluster needs? That seems to me like a proper use
> > case for two different namespaces, as the use case differs in both cases.
> >
> >
> >
> > >
> > > For example, only topic1 needs replication, while topic2 to topic100
> > > do not. According to the current implementation, we need to set
> > > replication clusters at the namespace level (e.g. cluster1 and
> > > cluster2), and then set the topic-level replication clusters
> > > (cluster1) for topic2 to topic100 to exclude them. It's hard to say
> > > that this is a good design.
> > >
> >
> > No, all you need is to put topic1 in namespace1 and topic2 to topic100 in
> > namespace2 . This is exactly what me and Joe were saying is a bad design
> > choice that you are clubbing all 100 topics in same namespace.
> >
> >
> >
> > >
> > > Best regards.
> > >
> > > On Wed, Dec 6, 2023 at 12:49 PM Joe F  wrote:
> > >
> > > > Girish,
> > > >
> > > > Thank you for making my point much better than I did ..
> > > >
> > > > -Joe
> > > >
> > > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <
> scrapmachi...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello Xiangying,
> > > > >
> > > > > I believe what Joe here is referring to as "application design" is
> > not
> > > > the
> > > > > design of pulsar or namespace level replication but the design of
> > your
> > > > > application and the dependency that you've put on topic level
> > > > replication.
> > > > >
> > > > > In general, I am aligned with Joe from an application design
> > > standpoint.
> > > > A
> > > > > namespace is supposed to represent a single application use case,
> > topic
> > > > > level override of replication clusters helps in cases where there
> > are a
> > > > few
> > > > > exceptional topics which do not need replication in all of the
> > > namespac

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

2023-12-06 Thread Girish Sharma
Hello, replies inline.

On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng  wrote:

> Hi Girish,
>
> Thank you for your explanation. Because Joe's email referenced the current
> implementation of Pulsar, I misunderstood him to be saying that this
> current implementation is not good.
>
> A possible use case is where there is one or a small number of topics in
> the namespace that store important messages, which need to be replicated to
> other clusters. Meanwhile, other topics only need to store data in the
> local cluster.
>

Is it not possible to simply have the other topics in a namespace which
allows for that other cluster, while the local topics remain in the
namespace with local cluster needs? That seems to me like a proper use case
for two different namespaces, as the use case differs in both cases.



>
> For example, only topic1 needs replication, while topic2 to topic100 do
> not. According to the current implementation, we need to set replication
> clusters at the namespace level (e.g. cluster1 and cluster2), and then set
> the topic-level replication clusters (cluster1) for topic2 to topic100 to
> exclude them. It's hard to say that this is a good design.
>

No, all you need is to put topic1 in namespace1 and topic2 to topic100 in
namespace2 . This is exactly what me and Joe were saying is a bad design
choice that you are clubbing all 100 topics in same namespace.



>
> Best regards.
>
> On Wed, Dec 6, 2023 at 12:49 PM Joe F  wrote:
>
> > Girish,
> >
> > Thank you for making my point much better than I did ..
> >
> > -Joe
> >
> > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma 
> > wrote:
> >
> > > Hello Xiangying,
> > >
> > > I believe what Joe here is referring to as "application design" is not
> > the
> > > design of pulsar or namespace level replication but the design of your
> > > application and the dependency that you've put on topic level
> > replication.
> > >
> > > In general, I am aligned with Joe from an application design
> standpoint.
> > A
> > > namespace is supposed to represent a single application use case, topic
> > > level override of replication clusters helps in cases where there are a
> > few
> > > exceptional topics which do not need replication in all of the
> namespace
> > > clusters. This helps in saving network bandwidth, storage, CPU, RAM etc
> > >
> > > But the reason why you've raised this PIP is to bring down the actual
> > > replication semantics at a topic level. Yes, namespace level still
> exists
> > > as per your PIP as well, but is basically left only to be a "default in
> > > case topic level is missing".
> > > This brings me to a very basic question - What's the use case that you
> > are
> > > trying to solve that needs these changes? Because, then what's stopping
> > us
> > > from bringing every construct that's at a namespace level (bundling,
> > > hardware affinity, etc) down to a topic level?
> > >
> > > Regards
> > >
> > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng 
> > > wrote:
> > >
> > > > Hi Joe,
> > > >
> > > > You're correct. The initial design of the replication policy leaves
> > room
> > > > for improvement. To address this, we aim to refine the cluster
> settings
> > > at
> > > > the namespace level in a way that won't impact the existing system.
> The
> > > > replication clusters should solely be used to establish full mesh
> > > > replication for that specific namespace, without having any other
> > > > definitions or functionalities.
> > > >
> > > > BR,
> > > > Xiangying
> > > >
> > > >
> > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F  wrote:
> > > >
> > > > > >if users want to change the replication policy for
> > > > > topic-n and do not change the replication policy of other topics,
> > they
> > > > need
> > > > > to change all the topic policy under this namespace.
> > > > >
> > > > > This PIP unfortunately  flows from  attempting to solve bad
> > application
> > > > > design
> > > > >
> > > > > A namespace is supposed to represent an application, and the
> > namespace
> > > > > policy is an umbrella for a similar set of policies  that applies
> to
> > > all
> > > > > topics.  The exceptions would be if a topic had a need for a
> 

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

2023-12-05 Thread Girish Sharma
Hello Xiangying,

I believe what Joe here is referring to as "application design" is not the
design of pulsar or namespace level replication but the design of your
application and the dependency that you've put on topic level replication.

In general, I am aligned with Joe from an application design standpoint. A
namespace is supposed to represent a single application use case; topic-level
override of replication clusters helps in cases where there are a few
exceptional topics which do not need replication in all of the namespace
clusters. This helps in saving network bandwidth, storage, CPU, RAM, etc.

But the reason why you've raised this PIP is to bring down the actual
replication semantics at a topic level. Yes, namespace level still exists
as per your PIP as well, but is basically left only to be a "default in
case topic level is missing".
This brings me to a very basic question - What's the use case that you are
trying to solve that needs these changes? Because, then what's stopping us
from bringing every construct that's at a namespace level (bundling,
hardware affinity, etc) down to a topic level?

Regards

On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng  wrote:

> Hi Joe,
>
> You're correct. The initial design of the replication policy leaves room
> for improvement. To address this, we aim to refine the cluster settings at
> the namespace level in a way that won't impact the existing system. The
> replication clusters should solely be used to establish full mesh
> replication for that specific namespace, without having any other
> definitions or functionalities.
>
> BR,
> Xiangying
>
>
> On Mon, Dec 4, 2023 at 1:52 PM Joe F  wrote:
>
> > >if users want to change the replication policy for
> > topic-n and do not change the replication policy of other topics, they
> need
> > to change all the topic policy under this namespace.
> >
> > This PIP unfortunately  flows from  attempting to solve bad application
> > design
> >
> > A namespace is supposed to represent an application, and the namespace
> > policy is an umbrella for a similar set of policies  that applies to all
> > topics.  The exceptions would be if a topic had a need for a deficit, The
> > case of one topic in the namespace sticking out of the namespace policy
> > umbrella is bad  application design in my opinion
> >
> > -Joe.
> >
> >
> >
> > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng 
> > wrote:
> >
> > > Hi Rajan and Girish,
> > > Thanks for your reply. About the question you mentioned, there is some
> > > information I want to share with you.
> > > >If anyone wants to setup different replication clusters then either
> > > >those topics can be created under different namespaces or defined at
> > topic
> > > >level policy.
> > >
> > > >And users can anyway go and update the namespace's cluster list to add
> > the
> > > >missing cluster.
> > > Because the replication clusters also mean the clusters where the
> > > topic can be created or loaded, the topic-level replication clusters
> > > can only be a subset of namespace-level replication clusters.
> > > Just as Girish mentioned, the users can update the namespace's
> > > cluster list to add the missing cluster.
> > > But there is a problem: because the replication clusters at the
> > > namespace level create full mesh replication for that namespace
> > > across the clusters defined in replication-clusters, if users want to
> > > change the replication policy for topic-n without changing the
> > > replication policy of other topics, they need to change the topic
> > > policy of every topic under this namespace.
> > >
> > > > Pulsar is being used by many legacy systems and changing its
> > > >semantics for specific usecases without considering consequences are
> > > >creating a lot of pain and incompatibility problems for other existing
> > > >systems and let's avoid doing it as we are struggling with such
> changes
> > > and
> > > >breaking compatibility or changing semantics are just not acceptable.
> > >
> > > This proposal will not introduce an incompatibility problem, because
> the
> > > default value of the namespace policy of allowed-clusters and
> > > topic-policy-synchronized-clusters are the replication-clusters.
> > >
> > > >Allowed clusters defined at tenant level
> > > >will restrict tenants to create namespaces in regions/clusters where
> > they
> > > >are not allowed.

Re: [DISCUSS] PIP-321 Split the responsibilities of namespace replication-clusters

2023-11-30 Thread Girish Sharma
Hi Xiangying,

Shouldn't the solution to the issue mentioned in #21564 [0] mostly be
around validating that topic-level replication clusters are a subset of
namespace-level replication clusters?
It would be a completely compatible change, as even today the case where a
topic has a cluster not defined in the namespace's replication-clusters
doesn't really work.
And users can anyway go and update the namespace's cluster list to add the
missing cluster.
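As an illustration, such a validation could look like the following sketch
(Python; the function name and error message are hypothetical, and the real
check would of course live in the broker's admin layer, not here):

```python
def validate_topic_replication_clusters(topic_clusters, namespace_clusters):
    # Hypothetical check: topic-level replication clusters must be a
    # subset of the namespace-level replication clusters.
    extra = set(topic_clusters) - set(namespace_clusters)
    if extra:
        raise ValueError(
            "replication clusters %s are not allowed by the namespace "
            "policy %s" % (sorted(extra), sorted(namespace_clusters)))

# Allowed: the topic replicates to a subset of the namespace's clusters.
validate_topic_replication_clusters({"us-east"}, {"us-east", "us-west"})
```

Rejecting the superset case at the admin API keeps today's semantics intact
while surfacing the misconfiguration to the user instead of silently not
replicating.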

As Rajan also mentioned, allowed-clusters field has a different
meaning/purpose.
Regards

On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng 
wrote:

> Hi, Pulsar Community
>
> I drafted a proposal to make the configuration of clusters at the namespace
> level clearer. This helps solve the problem of geo-replication not working
> correctly at the topic level.
>
> https://github.com/apache/pulsar/pull/21648
>
> I'm looking forward to hearing from you.
>
> BR
> Xiangying
>


-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-23 Thread Girish Sharma
I've captured our requirements in detail in this document -
https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc/edit
Added it to agenda document as well. Will join the meeting and discuss.

Regards

On Wed, Nov 22, 2023 at 10:49 PM Lari Hotari  wrote:

> I have written a long blog post that contains the context, the summary
> of my view point about PIP-310 and the proposal for proceeding:
>
> https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html
>
> Let's discuss this tomorrow in the Pulsar community meeting [1]. Let's
> coordinate on Pulsar Slack's #dev channel if there are issues in joining
> the meeting.
> See you tomorrow!
>
> -Lari
>
> 1 - https://github.com/apache/pulsar/wiki/Community-Meetings
>
> On Mon, 20 Nov 2023 at 20:48, Lari Hotari  wrote:
> >
> > Hi Girish,
> >
> > replies inline and after that there are some updates about my
> > preparation for the community meeting on Thursday. (there's
> > https://github.com/lhotari/async-tokenbucket with a PoC for a
> > low-level high performance token bucket implementation)
> >
> > On Sat, 11 Nov 2023 at 17:25, Girish Sharma 
> wrote:
> > > Actually, the capacity is meant to simulate that particular rate
> limit. if
> > > we have 2 buckets anyways, the one managing the fixed rate limit part
> > > shouldn't generally have a capacity more than the fixed rate, right?
> >
> > There are multiple ways to model and understand a dual token bucket
> > implementation.
> > I view the 2 buckets in a dual token bucket implementation as separate
> > buckets. They are like an AND rule, so if either bucket is empty,
> > there will be a need to pause to wait for new tokens.
> > Since we aren't working with code yet, these comments could be out of
> context.
> >
> > > I think it can be done, especially with that one thing you mentioned
> about
> > > holding off filling the second bucket for 10 minutes.. but it does
> become
> > > quite complicated in terms of managing the flow of the tokens.. because
> > > while we only fill the second bucket once every 10 minutes, after the
> 10th
> > > minute, it needs to be filled continuously for a while (the duration we
> > > want to support the bursting for).. and the capacity of this second
> bucket
> > > also is governed by and exactly matches the burst value.
> >
> > There might not be a need for this complexity of the "filling bucket"
> > in the first place. It was more of a demonstration that it's possible
> > to implement the desired behavior of limited bursting by tweaking the
> > basic token bucket algorithm slightly.
> > I'd rather avoid this additional complexity.
> >
> > > Agreed that it is much higher than a single topics' max throughput..
> but
> > > the context of my example had multiple topics lying on the same
> > > broker/bookie ensemble bursting together at the same time because they
> had
> > > been saving up on tokens in the bucket.
> >
> > Yes, that makes sense.
> >
> > > always be a need to overprovision resources. You usually don't want to
> > > > go beyond 60% or 70% utilization on disk, cpu or network resources so
> > > > that queues in the system don't start to increase and impacting
> > > > latencies. In Pulsar/Bookkeeper, the storage solution has a very
> > > > effective load balancing, especially for writing. In Bookkeeper each
> > > > ledger (the segment) of a topic selects the "ensemble" and the "write
> > > > quorum", the set of bookies to write to, when the ledger is opened.
> > > > The bookkeeper client could also change the ensemble in the middle of
> > > > a ledger due to some event like a bookie becoming read-only or
> > > >
> > >
> > > While it does do that on complete failure of bookie or a bookie disk,
> or
> > > broker going down, degradations aren't handled this well. So if all
> topics
> > > in a bookie are bursting due to the fact that they had accumulated
> tokens,
> > > then all it will lead to is breach of write latency SLA because at one
> > > point, the disks/cpu/network etc will start choking. (even after
> > > considering the 70% utilization i.e. 30% buffer)
> >
> > Yes.
> >
> > > That's only in the case of the default rate limiter where the
> tryAcquire
> > > isn't even implemented.. since the default rate limiter checks for
> breach
> > > only at a fixed rate rather than before every produce call. But in

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-11 Thread Girish Sharma
f view and what you envision.
>
> Thanks, I hope I was able to express most of it in these long emails.
> I'm having a break next week and after that I was thinking of
> summarizing this discussion from my viewpoint and then meet in the
> Pulsar community meeting on November 23rd to discuss the summary and
> conclusions and the path forward. Perhaps you could also prepare in a
>

Sounds like a plan!


> similar way where you summarize your viewpoints and we discuss this on
> Nov 23rd in the Pulsar community meeting together with everyone who is
> interested to participate. If we have completed the preparation before
> the meeting, we could possibly already exchange our summaries
> asynchronously before the meeting. Girish, Would this work for you?
>
Yes, we can exchange it before the 23rd. I can come back with my final
requirements and plan by the end of next week.

Regards
-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-10 Thread Girish Sharma
nd getting the ball
> rolling. Since you are operating Pulsar at scale, your contributions
> and feedback are very valuable in improving Pulsar's capacity management.
> I happen to have a different view of how a custom rate limiter
> implemented with the possible pluggable interface could help with
> overall capacity management in Pulsar.
> We need to go beyond PIP-310 in solving multi-tenant capacity
> management/SOP/SLA challenges with Pulsar. The resource groups work
> started with PIP-81 is a good start point, but there's a need to
> improve and revisit the design to be able to meet the competition, the
> closed source Confluent Kora.
>

As mentioned in the comment above, resource groups shouldn't be discussed
here; they are out of scope of the discussion involving partition-level
rate limiting, and I did not intend to bring them into the discussion.
I would like to see a comment on my proposal about the state of the Pulsar
rate limiter. I believe that's a true "meeting in the middle".


>
> Thanks for providing such detailed and useful feedback! I think that
> this has already been a valuable interaction.
>

Thank you for painstakingly replying to these long emails. This probably
is the longest thread on the Pulsar ML in recent times :)


>
> The improvements happen one step at a time. We can make things happen
> when we work together. I'm looking forward to that!
>

Would love to hear a plan from your point of view and what you envision.

>
> -Lari
>


-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-09 Thread Girish Sharma
Hello Lari, replies inline

On Thu, Nov 9, 2023 at 6:50 AM Lari Hotari  wrote:

> Hi Girish,
>
> replies inline.
>
> On Thu, 9 Nov 2023 at 00:29, Girish Sharma 
> wrote:
> > While the dual-rate dual token bucket looks promising, there is still a
> > challenge with respect to allowing a certain peak burst up to a longer
> > duration. I am explaining it below:
>
> > Assume a 10MBps topic. Bursting support of 1.5x up to 2 minutes, once
> > every 10-minute interval.
>
> It's possible to have many ways to model a dual token buckets.
> When there are tokens in the bucket, they are consumed as fast as
> possible. This is why there is a need for the second token bucket
> which is used to rate limit the traffic to the absolute maximum rate.
> Technically the second bucket rate limits the average rate for a short
> time window.
>
> I'd pick the first bucket for handling the 10MB rate.
> The capacity of the first bucket would be 15MB * 120 = 1800MB. The fill
> would happen in a special way. I'm not sure if Bucket4J has this at all.
> So describing the way of adding tokens to the bucket: the tokens in
> the bucket would remain the same when the rate is <10MB. As many
>

How is this special behavior (tokens in the bucket remaining the same when
the rate is <10MB) achieved? I would assume that to even figure out that the
rate is less than 10MB, there has to be some counter tracking it?


> tokens would be added to the bucket as are consumed by the actual
> traffic. The left-over tokens (10MB - actual rate) would go to a
> separate filling bucket that gets poured into the actual bucket every
> 10 minutes.
> This first bucket with this separate "filling bucket" would handle the
> bursting up to 1800MB.
>

But this isn't the requirement? Let's assume that the actual traffic has
been 5MB for a while and this 1800MB-capacity bucket is all filled up now.
What's the real use of that?


> The second bucket would solely enforce the 1.5x limit of 15MB rate
> with a small capacity bucket which enforces the average rate for a
> short time window.
> There's one nuance here. The bursting support will only allow bursting
> if the average rate has been lower than 10MBps for the tokens to use
> for the bursting to be usable.
> It would be possible that for example 50% of the tokens would be
> immediately available and 50% of the tokens are made available in the
> "filling bucket" that gets poured into the actual bucket every 10
> minutes. Without having some way to earn the burst, I don't think that
> there's a reasonable way to make things usable. The 10MB limit
>
> wouldn't have an actual meaning unless that is used to "earn" the
> tokens to be used for the burst.
>
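For concreteness, the scheme described in the quote above (a main burst bucket fed by a separate "filling bucket" that is poured every 10 minutes, plus a second bucket capping the absolute maximum rate) might look roughly like the sketch below. The numbers come from the 10MBps / 1.5x example in this thread; all class and field names are made up for illustration, and this is not Pulsar code:

```java
/**
 * Illustrative sketch of a dual token bucket with a separate "filling
 * bucket", for a 10MB/s topic allowed to burst at 1.5x (15MB/s) for up
 * to 2 minutes. Assumed names and behavior, not Pulsar code.
 */
class DualTokenBucket {
    static final long MB = 1024 * 1024;
    static final long AVG_RATE = 10 * MB;              // 10 MB/s sustained rate
    static final long MAX_RATE = 15 * MB;              // 15 MB/s absolute maximum
    static final long BURST_CAPACITY = MAX_RATE * 120; // 15MB * 120s = 1800MB

    long burstTokens = 0;   // main bucket, consumed when rate > AVG_RATE
    long fillingBucket = 0; // earns left-over tokens, poured every 10 minutes

    /** Called once per second with the bytes actually produced that second. */
    boolean tryConsume(long bytes) {
        if (bytes > MAX_RATE) {
            return false; // second bucket: never exceed the absolute max rate
        }
        if (bytes <= AVG_RATE) {
            // under the sustained rate: unused tokens are earned for later bursts
            fillingBucket = Math.min(fillingBucket + (AVG_RATE - bytes),
                                     BURST_CAPACITY - burstTokens);
            return true;
        }
        // bursting: the excess over AVG_RATE must come out of the burst budget
        long excess = bytes - AVG_RATE;
        if (burstTokens >= excess) {
            burstTokens -= excess;
            return true;
        }
        return false;
    }

    /** Scheduled every 10 minutes: pour earned tokens into the main bucket. */
    void pourFillingBucket() {
        burstTokens = Math.min(BURST_CAPACITY, burstTokens + fillingBucket);
        fillingBucket = 0;
    }

    public static void main(String[] args) {
        DualTokenBucket b = new DualTokenBucket();
        b.tryConsume(5 * MB);  // under the fixed rate: earns 5MB for later
        b.pourFillingBucket(); // 10-minute pour
        System.out.println("burst budget: " + b.burstTokens / MB + "MB");
    }
}
```

Note how this makes the "earning" semantics explicit: bursting above 10MB/s is only possible to the extent the producer previously stayed below 10MB/s, which is exactly the behavior debated below.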
>
I think this approach of thinking about the rate limiter - "earning the
right to burst by letting tokens remain in the bucket (by doing lower than
10MB for a while)" - does not fit well in a real-world messaging use case,
or even a theoretical one.
For a 10MB topic, if the actual produce has been, say, 5MB for a long
while, this shouldn't give that topic the right to burst to 15MB for as
long as tokens are present. This is purely because it will then start
stressing the network and bookie disks.
Imagine 100 such topics with a similar configuration of fixed+burst limits
that were doing way less than the fixed rate for the past couple of hours.
Now that they've earned enough tokens, if they all start bursting, this
will bring down the system, which is probably not capable of supporting
simultaneous peaks of all possible topics at all.

Now of course we can utilize a broker-level fixed rate limiter to not allow
the overall throughput of the system to go beyond a number, but at that
point all the earning semantics go for a toss anyway, since the behavior
would be unknown with respect to which topics are going through with
bursting and which are being blocked due to the broker-level fixed rate
limiting.

As such, letting topics loose would not sit well with any sort of SLA
guarantees to the end user.

Moreover, contrary to the token-earning logic, in reality a topic _should_
be allowed to burst up to the SOP/SLA as soon as produce starts in the
topic. It shouldn't _have_ to wait for tokens to fill up while it produces
below the fixed rate for a while before it is allowed to burst. This is
because there is no real benefit or reason to not let the topic do so, as
the hardware is already present and the topic is already provisioned
(partitions, broker spread) accordingly, assuming the burst.

In an algorithmic/academic/literature setting, the token bucket sounds
really promising, but a platform with SLAs to its users would not run like
that.



> In the current rate limiters in Pulsar, the implementation is not
> optimized to how Pulsar uses rate limiting. There's 

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-08 Thread Girish Sharma
r ways
> to solve this challenge where the Pulsar binary protocol doesn't have
> to be modified or the backpressure solution revisited in the broker.
> The proper solution requires doing both, adding a permit-based flow
>

When you say "adding a permit-based flow control" - even if this is
implemented, multiplexing is still an issue, as the TCP/IP channel itself is
put on pause at the Netty level. Is there any other way of rate limiting
and rejecting packets from a channel selectively, so as to contain the rate
limiting effect to only the specific partition out of all the partitions
being shared in the channel?

When I was doing poller vs precise testing with respect to CPU, network,
broker latencies etc., one of the reasons precise was much more CPU
efficient and had minimal impact on broker latencies was the fact that the
Netty channel was being paused precisely.


> control for producers and revisiting the broker side backpressure/flow
> control and therefore it won't happen quickly.
> Please go ahead and create a GH issue and share your context. That
> will be very helpful.
>
>
I will do so. Does this need a PIP? To reiterate, I will be opening a GH
issue along the lines of "don't share a connection to the broker across
producer objects".



> On the lowest level of unit tests, it will be helpful to have a
> solution where the clock source used in unit test can be run quickly
> so that simulated scenarios don't take a long time to run, but could
> cover the essential features.
>
>
Agreed.


> It might be hard to get started on the rate limiter implementation by
> looking at the existing code. The reason of the double methods in the
> existing interface is due to it covering 2 completely different ways
> of handling the rate limiting and trying to handle that in a single
> concept. What is actually desired for at least the producing rate
> limiting is that it's an asynchronous rate limiting where any messages
> that have already arrived to the broker will be handled. The token
> count could go to a negative value when implementing this in the token
> bucket algorithm. If it goes below 0, the producers for the topic
> should be backpressured by toggling the auto-read state and it should
> schedule a job to resume the auto-reading after there are at least a
> configurable amount of tokens available. There should be no need to
> use the scheduler to add tokens to the bucket. Whenever tokens are
> used, the new token count can be calculated based on the time since
> tokens were last updated, by default the average max rate defines how
> many tokens are added. The token count cannot increase larger than the
> token bucket capacity. This is how simple a plain token bucket
>

Are you suggesting that we first go ahead and convert the rate limiter in
Pulsar to a simple, single-token-bucket-based approach?
I personally do not see any benefit in this apart from code refactoring.
The precise rate limiter is basically doing that already - albeit refilling
only once every second, rather than distributing the tokens across that
second.
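For reference, the asynchronous token bucket described in the quoted text (tokens added lazily based on elapsed time with no scheduler, a token count that may go negative because already-arrived messages are always handled, auto-read toggled off on a negative balance and resumed by a scheduled job once enough tokens accumulate) could be sketched as below. All names and thresholds are illustrative assumptions, not Pulsar code:

```java
/**
 * Minimal sketch of an asynchronous token bucket: lazy time-based refill,
 * negative balance allowed, and pause/resume signals for toggling a Netty
 * channel's auto-read. Assumed names, not the Pulsar implementation.
 */
class AsyncTokenBucket {
    final long ratePerSecond;   // average max rate: tokens added per second
    final long capacity;        // bucket never fills beyond this
    final long resumeThreshold; // re-enable auto-read at this token level
    long tokens;
    long lastRefillNanos = System.nanoTime();

    AsyncTokenBucket(long ratePerSecond, long capacity, long resumeThreshold) {
        this.ratePerSecond = ratePerSecond;
        this.capacity = capacity;
        this.resumeThreshold = resumeThreshold;
        this.tokens = capacity;
    }

    /** Consume tokens for a message that has already arrived; may go negative. */
    synchronized void consume(long messageBytes) {
        refill();
        tokens -= messageBytes; // never rejected: the broker already holds it
    }

    /** True when producers should be backpressured (auto-read toggled off). */
    synchronized boolean shouldPause() {
        refill();
        return tokens < 0;
    }

    /** True when the scheduled resume job may re-enable auto-read. */
    synchronized boolean canResume() {
        refill();
        return tokens >= resumeThreshold;
    }

    private void refill() {
        // no scheduler: new tokens are computed from the elapsed time
        long now = System.nanoTime();
        long newTokens = (now - lastRefillNanos) * ratePerSecond / 1_000_000_000L;
        if (newTokens > 0) {
            tokens = Math.min(capacity, tokens + newTokens);
            lastRefillNanos = now;
        }
    }

    public static void main(String[] args) {
        AsyncTokenBucket bucket = new AsyncTokenBucket(1000, 1000, 500);
        bucket.consume(1500); // message already arrived: balance goes negative
        System.out.println(bucket.shouldPause()); // true -> toggle auto-read off
    }
}
```

The difference from the current precise limiter is that there is no once-per-second refill task at all; the balance is reconstructed from the clock on every check.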



> algorithm is. That could be a starting point until we start covering
> the advanced cases which require a dual token bucket algorithm.
> If someone has the bandwidth, it would be fine to start experimenting
> in this area with code to learn more.
>
>
I think this is a big assumption. Since this is a critical use case in my
organisation, I will have to contribute everything here myself. Now I do
understand that reviews can take time in the OSS world, but this can't be
left at "a simple token-based approach, and then letting anyone pick it up
and explore extending/enhancing it to the dual-token approach". This
probably is one of the main reasons why pluggability is important here :)


-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-08 Thread Girish Sharma
 find that whatever we have built/modified/improved is
still lacking.


>
> Let's keep the focus on improving the rate limiting in Pulsar core.
> The possible pluggable interface could follow.
>

I personally would like to work on improving the interface and making it
pluggable first. This is a smaller task both from a design and coding
perspective. Meanwhile, I will create another proposal for improving the
built-in rate limiter. Since we have had a lot of discussion about how to
improve the rate limiter, and I will continue discussing which rate limiter
works best in my opinion, I think we can be at liberty to take a bit of
extra time in discussing and closing the improved rate limiter design. Of
course, I will keep the interface definition in mind while proposing the
improved rate limiter and vice versa.

What are your thoughts here?


>
>
> -Lari
>
> On Wed, 8 Nov 2023 at 10:46, Girish Sharma 
> wrote:
> >
> > Hello Rajan,
> > I haven't updated the PIP with a better interface for PublishRateLimiter
> > yet as the discussion here in this thread went in a different direction.
> >
> > Personally, I agree with you that even if we choose one algorithm and
> > improve the built-in rate limiter, it still may not suit all use cases as
> > you have mentioned.
> >
> > On Asaf's comment on too many public interfaces in Pulsar and no other
> > Apache software having so many public interfaces - I would like to ask,
> has
> > that brought in any cons, though? For this particular use case, I feel like
> > having it as a public interface would actually improve the code quality
> > and design as the usage would be checked and changes would go through
> > scrutiny (unlike how the current PublishRateLimiter evolved unchecked).
> > Asaf - what are your thoughts on this? Are you okay with making the
> > PublishRateLimiter pluggable with a better interface?
> >
> >
> >
> >
> >
> > On Wed, Nov 8, 2023 at 5:43 AM Rajan Dhabalia 
> wrote:
> >
> > > Hi Lari/Girish,
> > >
> > > I am sorry for jumping late in the discussion but I would like to
> > > acknowledge the requirement of pluggable publish rate-limiter and I had
> > > also asked it during implementation of publish rate limiter as well.
> There
> > > are trade-offs between different rate-limiter implementations based on
> > > accuracy, n/w usage, simplification and user should be able to choose
> one
> > > based on the requirement. However, we don't have correct and extensible
> > > Publish rate limiter interface right now, and before making it
> pluggable we
> > > have to make sure that it should support any type of implementation for
> > > example: token based or sliding-window based throttling, support of
> various
> > > decaying functions (eg: exponential decay:
> > > https://en.wikipedia.org/wiki/Exponential_decay), etc.. I haven't seen
> > > such
> > > interface details and design in the PIP:
> > > https://github.com/apache/pulsar/pull/21399/. So, I would encourage to
> > > work
> > > towards building pluggable rate-limiter but current PIP is not ready
> as it
> > > doesn't cover such generic interfaces that can support different types
> of
> > > implementation.
> > >
> > > Thanks,
> > > Rajan
> > >
> > > On Tue, Nov 7, 2023 at 10:02 AM Lari Hotari 
> wrote:
> > >
> > > > Hi Girish,
> > > >
> > > > Replies inline.
> > > >
> > > > On Tue, 7 Nov 2023 at 15:26, Girish Sharma 
> > > > wrote:
> > > > >
> > > > > Hello Lari, replies inline.
> > > > >
> > > > > I will also be going through some textbook rate limiters (the one
> you
> > > > shared, plus others) and propose the one that at least suits our
> needs in
> > > > the next reply.
> > > >
> > > >
> > > > sounds good. I've been also trying to find more rate limiter
> resources
> > > > that could be useful for our design.
> > > >
> > Bucket4J documentation gives some good ideas and it shows how the
> > > > token bucket algorithm could be varied. For example, the "Refill
> > > > styles" section [1] is useful to read as an inspiration.
> > > > In network routers, there's a concept of "dual token bucket"
> > > > algorithms and by googling you can find both Cisco and Juniper
> > > > documentation referencing this.
> > > > I also asked ChatGPT-4 to explain "dual

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-08 Thread Girish Sharma
Hello Rajan,
I haven't updated the PIP with a better interface for PublishRateLimiter
yet as the discussion here in this thread went in a different direction.

Personally, I agree with you that even if we choose one algorithm and
improve the built-in rate limiter, it still may not suit all use cases as
you have mentioned.

On Asaf's comment on too many public interfaces in Pulsar and no other
Apache software having so many public interfaces - I would like to ask, has
that brought in any cons, though? For this particular use case, I feel like
having it as a public interface would actually improve the code quality
and design as the usage would be checked and changes would go through
scrutiny (unlike how the current PublishRateLimiter evolved unchecked).
Asaf - what are your thoughts on this? Are you okay with making the
PublishRateLimiter pluggable with a better interface?





On Wed, Nov 8, 2023 at 5:43 AM Rajan Dhabalia  wrote:

> Hi Lari/Girish,
>
> I am sorry for jumping late in the discussion but I would like to
> acknowledge the requirement of pluggable publish rate-limiter and I had
> also asked it during implementation of publish rate limiter as well. There
> are trade-offs between different rate-limiter implementations based on
> accuracy, n/w usage, simplification and user should be able to choose one
> based on the requirement. However, we don't have correct and extensible
> Publish rate limiter interface right now, and before making it pluggable we
> have to make sure that it should support any type of implementation for
> example: token based or sliding-window based throttling, support of various
> decaying functions (eg: exponential decay:
> https://en.wikipedia.org/wiki/Exponential_decay), etc.. I haven't seen
> such
> interface details and design in the PIP:
> https://github.com/apache/pulsar/pull/21399/. So, I would encourage to
> work
> towards building pluggable rate-limiter but current PIP is not ready as it
> doesn't cover such generic interfaces that can support different types of
> implementation.
>
> Thanks,
> Rajan
>
> On Tue, Nov 7, 2023 at 10:02 AM Lari Hotari  wrote:
>
> > Hi Girish,
> >
> > Replies inline.
> >
> > On Tue, 7 Nov 2023 at 15:26, Girish Sharma 
> > wrote:
> > >
> > > Hello Lari, replies inline.
> > >
> > > I will also be going through some textbook rate limiters (the one you
> > shared, plus others) and propose the one that at least suits our needs in
> > the next reply.
> >
> >
> > sounds good. I've been also trying to find more rate limiter resources
> > that could be useful for our design.
> >
> Bucket4J documentation gives some good ideas and it shows how the
> > token bucket algorithm could be varied. For example, the "Refill
> > styles" section [1] is useful to read as an inspiration.
> > In network routers, there's a concept of "dual token bucket"
> > algorithms and by googling you can find both Cisco and Juniper
> > documentation referencing this.
> > I also asked ChatGPT-4 to explain "dual token bucket" algorithm [2].
> >
> > 1 - https://bucket4j.com/8.6.0/toc.html#refill-types
> > 2 - https://chat.openai.com/share/d4f4f740-f675-4233-964e-2910a7c8ed24
> >
> > >>
> > >> It is bi-weekly on Thursdays. The meeting calendar, zoom link and
> > >> meeting notes can be found at
> > >> https://github.com/apache/pulsar/wiki/Community-Meetings .
> > >>
> > >
> > > Would it make sense for me to join this time given that you are
> skipping
> > it?
> >
> > Yes, it's worth joining regularly when one is participating in Pulsar
> > core development. There's usually a chance to discuss all topics that
> > Pulsar community members bring up to discussion. A few times there
> > haven't been any participants and in that case, it's good to ask on
> > the #dev channel on Pulsar Slack whether others are joining the
> > meeting.
> >
> > >>
> > >> ok. btw. "metrics" doesn't necessarily mean providing the rate limiter
> > >> metrics via Prometheus. There might be other ways to provide this
> > >> information for components that could react to this.
> > >> For example, it could a be system topic where these rate limiters emit
> > events.
> > >>
> > >
> > > Are there any other system topics than
> > `tenant/namespace/__change_events`? While it's an improvement over
> > querying metrics, it would still mean one consumer per namespace and
> would
> > form a cyclic dependency - for example, in case a broker is degrading due
> 

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-07 Thread Girish Sharma
t; sending in a way or another. If you'd be sending synchronously, it
> would be extremely inefficient if messages are sent one-by-one. The
> asynchronicity could also come from multiple threads in an
> application.
>
> You are right that a well homogenised workload reduces the severity of
> the multiplexing problem. That's also why the problem of multiplexing
> hasn't been fixed since the problem isn't visible and it's also very
> hard to observe.
> In a case where 2 producers with very different rate limiting options
> share a connection, it definitely is a problem in the current
>

Yes actually, I was talking to the team and we did observe a case where
there was a client app with 2 producer objects writing to two different
topics. When one of the topics was breaching its quota, the other one was
also observing rate limiting even though it was under quota. So eventually,
this surely needs to be solved. One short-term solution is to at least not
share connections across producer objects. But I still feel it's out of
scope of this discussion.

What do you think about the quick solution of not sharing connections
across producer objects? I can raise a github issue explaining the
situation.



-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-06 Thread Girish Sharma
tial implementation (poller), the next implementation
simply added more methods into the interface rather than actually using the
ones already existing.
For instance, there are both `tryAcquire` and `incrementPublishCount`
methods, there are both `checkPublishRate` and `isPublishRateExceeded`.
Then there is the issue of misplaced responsibilities: when it's precise,
the complete responsibility of checking and responding back with whether
it's rate limited or not lies with `PrecisePublishLimiter.java`, but when
it's poller-based, there is some logic inside `PublishRateLimiterImpl` and
the rest of the logic is spread across `AbstractTopic.java`.

> short and medium term blacklisting of topics based on breach of rate
> > limiter beyond a given SOP. I feel this is very very specific to our
> > organization right now to be included inside pulsar itself.
>
> This is outside of rate limiters. IIRC, there have been some
> discussions in the community that an API for blocking individual
> producers or producing to specific topics could be useful in some
> cases.
> An external component could observe metrics and control blocking if
> there's an API for doing so.
>
>
Actually, putting this in an external component that's based on metrics
is not a scalable or responsive solution.
First of all, it puts a lot of pressure on the metrics system (Prometheus),
where we would now be querying 1000s of metrics every minute/sub-minute
uselessly. Since, the majority of the time, not even a single topic may need
blacklisting, this is very inefficient.
Secondly, it makes the design such that this external component now needs
to stay in sync with the existing topics and the rates set inside
Pulsar's ZooKeeper. This also puts extra pressure on ZK reads.
Lastly, the response time for blacklisting a topic increases a lot in
this approach.

This would be a much simpler and more efficient model if it were reactive,
triggered directly from within the rate limiter component. It can be
fast and responsive, which is very critical when trying to protect the
system from abuse.



> > Actually, by default, and for 99% of the cases, multiplexing isn't an
> issue
> > assuming:
> > * A single producer object is producing to a single topic (one or more
> > partition)
> > * Produce is happening in a round robin manner (by default)
>
> Connection multiplexing issue is also a problem in the cases you
> listed since multiple partitions might be served by the same broker
> and the connection to that broker could be shared. Producing in a
> round-robin manner does not eliminate the issue because the sending
> process is asynchronous in the background. Therefore, it is a problem
>

It's only async if the producers use `sendAsync`. Moreover, even if it is
async in nature, practically it ends up being quite linear and
well-homogenised. I am speaking from the experience of running 1000s of
partitioned topics in production.


> Let's continue in getting into more details of the intended behavior
> and define what bursting really means and how we believe that solves a
> problem with "hot source" producers.
> Makes sense?
>
>
I am happy if the discussion concludes either way - a pluggable
implementation or a configurable rate limiter capturing 99% of use cases.
But from what I've seen, participation in OSS threads can be very random
and thus, I am afraid, it might take a while before more folks pitch in
their inputs and a clear direction of discussion is formed.


-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-04 Thread Girish Sharma
On Sat, Nov 4, 2023 at 9:21 PM Lari Hotari  wrote:

> One additional question:
>
> In your use case, do you have multiple producers concurrently producing to
> the same topic from different clients?
>
> The use case is challenging in the current implementation when using topic
> producing rate limiting. The problem is that the different producers will
> be able to send messages at very different rates since there isn't a
> solution to ensure fairness across multiple producers in the topic producer
> rate limiting solution. This is something that should be addressed when
> improving rate limiting.
>

We haven't personally seen this pattern where different logical producers
(app/object) produce to the same topic. I feel like this is an
anti-pattern and goes against the homogeneous nature of a topic.
Even if such use cases arrive, they can easily be handled by moving the
different producers to different topics and having the consumers subscribe
to more than one topic in case they need data from all of those topics.

Regards


>
> -Lari
>
> la 4. marrask. 2023 klo 17.24 Lari Hotari  kirjoitti:
>
> > Replies inline
> >
> > On Fri, 3 Nov 2023 at 20:48, Girish Sharma 
> > wrote:
> >
> >> Could you please elaborate more on these details? Here are some
> questions:
> >> > 1. What do you mean that it is too strict?
> >> > - Should the rate limiting allow bursting over the limit for some
> >> time?
> >> >
> >>
> >> That's one of the major use cases, yes.
> >>
> >
> > One possibility would be to improve the existing rate limiter to allow
> > bursting.
> > I think that Pulsar's out-of-the-box rate limiter should cover 99% of the
> > use cases instead of having one implementing their own rate limiter
> > algorithm.
> > The problems you are describing seem to be common to many Pulsar use
> > cases, and therefore, I think they should be handled directly in Pulsar.
> >
> > Optimally, there would be a single solution that abstracts the rate
> > limiting in a way where it does the right thing based on the declarative
> > configuration.
> > I would prefer that over having a pluggable solution for rate limiter
> > implementations.
> >
> > What would help is getting deeper in the design of the rate limiter
> > itself, without limiting ourselves to the existing rate limiter
> > implementation in Pulsar.
> >
> > In textbooks, there are algorithms such as "leaky bucket" [1] and "token
> > bucket" [2]. Both algorithms have several variations and in some ways
> they
> > are very similar algorithms but looking from the different point of view.
> > It would possibly be easier to conceptualize and understand a rate
> limiting
> > algorithm if common algorithm names and implementation choices mentioned
> in
> > textbooks would be referenced in the implementation.
> > It seems that a "token bucket" type of algorithm can be used to implement
> > rate limiting with bursting. In the token bucket algorithm, the size of
> the
> > token bucket defines how large bursts will be allowed. The design could
> > also be something where 2 rate limiters with different type of algorithms
> > and/or configuration parameters are combined to achieve a desired
> behavior.
> > For example, to achieve a rate limiter with bursting and a fixed maximum
> > rate.
> > By default, the token bucket algorithm doesn't enforce a maximum rate for
> > bursts, but that could be achieved by chaining 2 rate limiters if that is
> > really needed.
> >
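To make the chaining idea above concrete, here is a rough sketch of two chained token buckets: a large one whose capacity permits bursting, and a small fast-refilling one that caps the absolute maximum rate. Names and parameters are made up for illustration, and a real implementation would need to consume from both buckets atomically; this is not Pulsar code:

```java
/**
 * Illustrative sketch of chaining two token buckets: "burst" allows
 * bursting up to its capacity, "max" enforces the hard maximum rate over
 * roughly a 1-second window. Assumed names, not Pulsar code.
 */
class ChainedRateLimiter {
    final Bucket burst; // large capacity -> permits bursting
    final Bucket max;   // ~1 second of the hard maximum rate

    ChainedRateLimiter(long avgRate, long burstCapacity, long maxRate) {
        this.burst = new Bucket(avgRate, burstCapacity);
        this.max = new Bucket(maxRate, maxRate);
    }

    synchronized boolean tryAcquire(long n) {
        // the small bucket caps the peak rate even while the large bucket
        // still holds plenty of burst tokens; refund on partial failure
        if (!max.tryConsume(n)) {
            return false;
        }
        if (!burst.tryConsume(n)) {
            max.refund(n);
            return false;
        }
        return true;
    }

    static class Bucket {
        final long ratePerSec, capacity;
        long tokens;
        long lastNanos = System.nanoTime();

        Bucket(long ratePerSec, long capacity) {
            this.ratePerSec = ratePerSec;
            this.capacity = capacity;
            this.tokens = capacity;
        }

        boolean tryConsume(long n) {
            // lazy refill based on elapsed time, capped at the capacity
            long now = System.nanoTime();
            tokens = Math.min(capacity,
                    tokens + (now - lastNanos) * ratePerSec / 1_000_000_000L);
            lastNanos = now;
            if (tokens >= n) {
                tokens -= n;
                return true;
            }
            return false;
        }

        void refund(long n) {
            tokens = Math.min(capacity, tokens + n);
        }
    }

    public static void main(String[] args) {
        // avg 10 units/s, burst budget 100 units, hard max 20 units/s
        ChainedRateLimiter rl = new ChainedRateLimiter(10, 100, 20);
        System.out.println(rl.tryAcquire(20)); // true: burst within hard max
        System.out.println(rl.tryAcquire(5));  // false: hard-max bucket drained
    }
}
```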
> > The current Pulsar rate limiter implementation could be implemented in a
> > cleaner way, which would also be more efficient. Instead of having a
> > scheduler call a method once per second, I think that the rate limiter
> > could be implemented in a reactive way where the algorithm is implemented
> > without a scheduler.
> > I wonder if there are others that would be interested in getting down
> into
> > such implementation details?
> >
> > 1 - https://en.wikipedia.org/wiki/Leaky_bucket
> > 2 - https://en.wikipedia.org/wiki/Token_bucket
> >
> >
> > > 2. What type of data loss are you experiencing?
> >> >
> >>
> >> Messages produced by the producers which eventually get timed out due to
> >> rate limiting.
> >>
> >
> > Are you able to slow down producing on the client side? If that is
> > possible, there could be ways to improve ways to do client side back
> > pressure with Pulsar Client. Currently, the client doesn't expose this
> > in

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-04 Thread Girish Sharma
ting configuration parameters in the rate
> limiter.
> As mentioned above, I think we could be improving the existing rate limiter
> in Pulsar to cover 99% of the use case by making it stable and by including
> the bursting configuration options.
> Is there additional functionality you feel the rate limiter needs beyond
> bursting support?
>
>
There are a few other custom things. For example, there would be cases of
short and medium term blacklisting of topics based on breach of rate
limiter beyond a given SOP. I feel this is very very specific to our
organization right now to be included inside pulsar itself.


> One way to workaround the multiplexing problem would be to add a client
> side option for producers and consumers, where you could specify that the
> client picks a separate TCP/IP connection that is not shared and isn't from
> the connection pool.
> Preventing connection multiplexing seems to be the only way to make the
> current rate limiting deterministic and stable without adding the explicit
> flow control to the Pulsar binary protocol for producers.
>

Actually, by default, and for 99% of the cases, multiplexing isn't an issue
assuming:
* A single producer object is producing to a single topic (one or more
partition)
* Produce is happening in a round robin manner (by default)

Due to these assumptions, it is more than likely that all partitions are
doing uniform QPS and MBps, thus toggling auto-read off at the Netty layer
doesn't have that drastic an impact on the rate limiting aspect.



>
> Are there other community members with input on the design and
> implementation of an improved rate limiter?
> I’m eager to continue this conversation and work together towards a robust
> solution.
>

Again, I would love this to land in pieces so that the turnaround time for
actual usage is much faster. What do you suggest from that perspective?


>
> -Lari
>


-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Girish Sharma
Hello Lari, replies inline.


On Fri, Nov 3, 2023 at 11:13 PM Lari Hotari  wrote:

> Hi Girish,
>
> Thanks for the questions. I'll reply to them
>
> > does this sharing of the same tcp/ip connection happen across partitions
> as
> > well (assuming both the partitions of the topic are on the same broker)?
> > i.e. producer 127.0.0.1 for partition
> > `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> > partition `persistent://tenant/ns/topic0-partition1` share the same
> tcp/ip
> > connection assuming both are on broker-0 ?
>
> The Pulsar Java client would be sharing the same TCP/IP connection to a
> single broker when using the default setting of connectionsPerBroker = 1.
> It could be using a different connection if connectionsPerBroker > 1.
>
Thanks for clarifying this.

> Could you please elaborate more on these details? Here are some questions:
> 1. What do you mean that it is too strict?
> - Should the rate limiting allow bursting over the limit for some time?
>

That's one of the major use cases, yes.


> 2. What type of data loss are you experiencing?
>

Messages produced by the producers which eventually get timed out due to
rate limiting.


> 3. What is the root cause of the data loss?
>- Do you mean that the system performance degrades and data loss is due
> to not being able to produce from client to the broker quickly enough and
> data loss happens because messages cannot be forwarded from the client to
> the broker?
>

No, the system performance decreases in the case of poller-based rate
limiters. In the precise one, it's purely the broker pausing the Netty
channel's auto-read property. If the producer goes beyond the set
throughput for longer than the send timeout, then it starts observing
timeouts, and the timed-out messages are essentially lost.

As mentioned in my previous email, there has been discussions about
> improving producer flow control. One of the solution ideas that was
> discussed in a Pulsar community meeting in January was to add explicit flow
> control to producers, somewhat similar to how there are "permits" as the
> flow control for consumers. The permits would be based on byte size
> (instead of number of messages). With explicit flow control in the
> protocol, the rate limiting will also be effective and deterministic and
> the issues that Tao Jiuming was explaining could also be resolved. It also
> would solve the producer/consumer multiplexing on a single TCP/IP
> connection when flow control and rate limiting isn't based on the TCP/IP
> level (and toggling the Netty channel's auto read property).
>
I think the core implementation of how the broker fails fast at the time
of rate limiting (whether it is by pausing the netty channel or a new
permits based model) does not change the actual issue I am targeting.
Multiplexing has some impact on it - but yet again only limited, and it can
easily be fixed by the client by increasing the connections per broker.
Even after assuming both these things are somehow "fixed", the fact remains
that an absolutely strict rate limiter will lead to the above mentioned
data loss for bursts going above the limit, and that a poller based rate
limiter doesn't really rate limit anything, as it allows all produce in
the first interval of the next second.
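For illustration only (this is a sketch, not Pulsar's implementation, and all names are made up): a token-bucket limiter whose bucket capacity is larger than the steady-state rate gives exactly this kind of bounded burst tolerance, while a "strict" limiter is the special case where the capacity equals the rate:

```python
# Illustrative only (not Pulsar's implementation; all names are made up):
# a token bucket whose capacity exceeds the steady-state rate absorbs a
# short burst above the configured throughput, while a "strict" limiter
# is the special case where capacity == rate.

class TokenBucket:
    def __init__(self, rate_per_sec, burst_capacity):
        self.rate = rate_per_sec        # steady-state refill rate
        self.capacity = burst_capacity  # max stored tokens = allowed burst
        self.tokens = burst_capacity
        self.last = 0.0

    def try_acquire(self, amount, now):
        # Refill for the elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False  # a strict limiter would pause the channel here

# 100 (MB/s) steady rate. Start both buckets empty at t=0 and offer a
# 150 MB spike at t=1.5s:
bursty = TokenBucket(rate_per_sec=100, burst_capacity=150)
strict = TokenBucket(rate_per_sec=100, burst_capacity=100)
bursty.tokens = strict.tokens = 0.0
print(bursty.try_acquire(150, now=1.5))  # True: 150 tokens accrued
print(strict.try_acquire(150, now=1.5))  # False: refill capped at 100
```

With a bounded bucket, the producer can exceed the configured rate only for as long as the accumulated tokens last, so the spike is absorbed without either timing out messages or permitting unbounded overuse.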


> Let's continue discussion, since I think that this is an important
> improvement area. Together we could find a good solution that works for
> multiple use cases and addresses existing challenges in producer flow
> control and rate limiting.
>
> -Lari
>
> On 2023/11/03 11:16:37 Girish Sharma wrote:
> > Hello Lari,
> > Thanks for bringing this to my attention. I went through the links, but
> > does this sharing of the same tcp/ip connection happen across partitions
> > as well (assuming both the partitions of the topic are on the same
> > broker)? i.e. producer 127.0.0.1 for partition
> > `persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
> > partition `persistent://tenant/ns/topic0-partition1` share the same
> > tcp/ip connection assuming both are on broker-0 ?
> >
> > In general, the major use case behind this PIP for me and my organization
> > is about supporting produce spikes. We do not want to allocate absolute
> > maximum throughput for a topic which would not even be utilized 99.99% of
> > the time. Thus, for a topic that stays constantly at 100MBps and goes to
> > 150MBps only once in a blue moon, it's unwise to allocate 150MBps worth
> > of resources 100% of the time. The poller based rate limiter is also not
> > good here as it would allow over use of hardware without a check,
> > degrading the system.

Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Girish Sharma
Hello Tao,
As I understand, there is a fine balance between rate-limiting,
backpressure and not keeping clients waiting. Different use cases may need
different approaches to rate-limiting, and thus, making the rate limiter
customizable is my first step towards making Pulsar more adaptable to
different needs.

Regards

On Fri, Nov 3, 2023 at 5:42 PM 太上玄元道君  wrote:

> Hi Girish,
>
> There is also a discussion thread[1] about rate-limiting.
>
> I think there is some conflicts between some kind of rate-limiter and
> backpressure
>
> Take the fail-fast strategy as an example:
> Brokers have to reply to clients after receiving and decoding the message,
> but the broker also has a back-pressure mechanism. The broker cannot read
> messages while the channel has `disableAutoRead` set.
>
> So the rate-limiters have to adapt to back-pressure.
>
> Thanks,
> Tao Jiuming
>
> 2023年10月19日 20:51,Girish Sharma  写道:
>
> Hi,
> Currently, there are only 2 kinds of publish rate limiters - polling based
> and precise. Users have an option to use either one of them as the topic
> publish rate limiter, but the resource group rate limiter only uses the
> polling one.
>
> There are challenges with both of the rate limiters, and with the fact
> that we can't use the precise rate limiter at the resource group level.
>
> Thus, in order to support custom rate limiters, I've created the PIP-310
>
> This is the discussion thread. Please go through the PIP and provide your
> inputs.
>
> Link - https://github.com/apache/pulsar/pull/21399
>
> Regards
> --
> Girish Sharma
>


-- 
Girish Sharma


Re: [DISCUSS] PIP-310: Support custom publish rate limiters

2023-11-03 Thread Girish Sharma
Hello Lari,
Thanks for bringing this to my attention. I went through the links, but
does this sharing of the same tcp/ip connection happen across partitions as
well (assuming both the partitions of the topic are on the same broker)?
i.e. producer 127.0.0.1 for partition
`persistent://tenant/ns/topic0-partition0` and producer 127.0.0.1 for
partition `persistent://tenant/ns/topic0-partition1` share the same tcp/ip
connection assuming both are on broker-0 ?

In general, the major use case behind this PIP for me and my organization
is about supporting produce spikes. We do not want to allocate absolute
maximum throughput for a topic which would not even be utilized 99.99% of
the time. Thus, for a topic that stays constantly at 100MBps and goes to
150MBps only once in a blue moon, it's unwise to allocate 150MBps worth of
resources 100% of the time. The poller based rate limiter is also not good
here as it would allow over use of hardware without a check, degrading the
system.

@Asif, I have been sick these last 10 days, but will be updating the PIP
with the discussed changes early next week.

Regards

On Fri, Nov 3, 2023 at 3:25 PM Lari Hotari  wrote:

> Hi Girish,
>
> In order to address your problem described in the PIP document [1], it
> might be necessary to make improvements in how rate limiters apply
> backpressure in Pulsar.
>
> Pulsar uses mainly TCP/IP connection level controls for achieving
> backpressure. The challenge is that Pulsar can share a single TCP/IP
> connection across multiple producers and consumers. Because of this, there
> could be multiple producers and consumers and rate limiters operating on
> the same connection on the broker, and they will make conflicting decisions,
> which results in undesired behavior.
>
> Regarding the shared TCP/IP connection backpressure issue, Apache Flink
> had a somewhat similar problem before Flink 1.5. It is described in the
> "inflicting backpressure" section of this blog post from 2019:
>
> https://flink.apache.org/2019/06/05/flink-network-stack.html#inflicting-backpressure-1
> Flink solved the issue of multiplexing multiple streams of data on a
> single TCP/IP connection in Flink 1.5 by introducing its own flow control
> mechanism.
>
> The backpressure and rate limiting challenges have been discussed a few
> times in Pulsar community meetings over the past years. There was also a
> generic backpressure thread on the dev mailing list [2] in Sep 2022.
> However, we haven't really documented Pulsar's backpressure design and how
> rate limiters are part of the overall solution and how we could improve.
> I think it might be time to do so since there's a requirement to improve
> rate limiting. I guess that's the main motivation also for PIP-310.
>
> -Lari
>
> 1 - https://github.com/apache/pulsar/pull/21399/files
> 2 - https://lists.apache.org/thread/03w6x9zsgx11mqcp5m4k4n27cyqmp271
>
> On 2023/10/19 12:51:14 Girish Sharma wrote:
> > Hi,
> > Currently, there are only 2 kinds of publish rate limiters - polling
> > based and precise. Users have an option to use either one of them as the
> > topic publish rate limiter, but the resource group rate limiter only
> > uses the polling one.
> >
> > There are challenges with both of the rate limiters, and with the fact
> > that we can't use the precise rate limiter at the resource group level.
> >
> > Thus, in order to support custom rate limiters, I've created the PIP-310
> >
> > This is the discussion thread. Please go through the PIP and provide your
> > inputs.
> >
> > Link - https://github.com/apache/pulsar/pull/21399
> >
> > Regards
> > --
> > Girish Sharma
> >
>


-- 
Girish Sharma


[DISCUSS] PIP-310: Support custom publish rate limiters

2023-10-19 Thread Girish Sharma
Hi,
Currently, there are only 2 kinds of publish rate limiters - polling based
and precise. Users have an option to use either one of them as the topic
publish rate limiter, but the resource group rate limiter only uses the
polling one.

There are challenges with both of the rate limiters, and with the fact that
we can't use the precise rate limiter at the resource group level.
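As a rough illustration of the polling-based limiter's shortcoming (a hypothetical sketch, not Pulsar's actual code): a fixed-window counter that is only reset by a periodic poller both overshoots on large sends and hands out a full fresh quota at the very start of each window:

```python
# Hypothetical fixed-window ("poller" style) limiter: the counter is only
# reset at each polling tick, so the limit is enforced only after it has
# already been crossed, and a full quota is available again at the very
# start of the next window.

class FixedWindowLimiter:
    def __init__(self, permits_per_window):
        self.limit = permits_per_window
        self.used = 0

    def on_tick(self):        # invoked by the poller once per window
        self.used = 0

    def try_acquire(self, amount):
        if self.used >= self.limit:
            return False      # blocks only *after* the limit is crossed
        self.used += amount   # a single large acquire can overshoot
        return True

limiter = FixedWindowLimiter(permits_per_window=100)
print(limiter.try_acquire(99))   # True
print(limiter.try_acquire(50))   # True - used becomes 149, over the limit
print(limiter.try_acquire(1))    # False - rejected only now
limiter.on_tick()                # next window: full quota available again
print(limiter.try_acquire(100))  # True - burst at the start of the window
```

This is why the polling approach "doesn't really rate limit anything" over short intervals: measured over any sub-window period, throughput can be well above the configured rate.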

Thus, in order to support custom rate limiters, I've created the PIP-310

This is the discussion thread. Please go through the PIP and provide your
inputs.

Link - https://github.com/apache/pulsar/pull/21399

Regards
-- 
Girish Sharma


Re: [ANNOUNCE] Zili Chen (tison) as new PMC member in Apache Pulsar

2023-07-21 Thread Girish Sharma
Congratulations Tison!

Regards

On Fri, Jul 21, 2023 at 10:44 PM Nicolò Boschi  wrote:

> Congrats Tison!
>
>
> Il giorno ven 21 lug 2023 alle 19:05 Michael Marshall <
> mmarsh...@apache.org>
> ha scritto:
>
> > Hi Pulsar Community,
> >
> > The Apache Pulsar Project Management Committee (PMC) has invited Zili
> Chen
> > (https://github.com/tisonkun) as a member of the PMC and we are
> > pleased to announce that tison has accepted.
> >
> > tison is very active in the community by contributing and reviewing
> > many PRs, actively engaging on the mailing list, triaging GitHub
> > Issues, and helping out with the website.
> >
> > On behalf of the Pulsar PMC, welcome and congratulations to tison!
> >
> > Best,
> > Michael
> >
> --
> Nicolò Boschi
>


-- 
Girish Sharma


Re: Failover Subscription - consumer assignment logic discussion

2023-07-03 Thread Girish Sharma
Hello PengHui,

On Mon, Jul 3, 2023 at 8:39 PM PengHui Li  wrote:

>
> Got it, for the Failover subscription, the new consumer caused the active
> consumer shift. I think we can make some improvements to this part to make
> sure the new active consumer will only get messages after the previous
> active consumer acked all the received messages, unless the previous
> active consumer disconnected.
>

I think this will greatly help maintain the ordering guarantees per
partition.


>
> If all the consumers with the highest priority are disconnected, then
> the consumers with a lower priority will be picked. The Shared subscription
> has different behavior. It will select the lower priority consumer if all
> the highest priority consumers don't have available permits. I think the
> challenge for
> Failover subscription is the broker needs to shift the active consumer
> according
> to the available permits. But it could be considered in a different active
> consumer assigner implementation like Kafka's consumer group coordinator,
> you can have different policies.
>
Right, in the case of shared subs, the lower priority consumers are used
more often since permits are considered, and thus slow consumers are
detected quickly.

In Failover, the current logic can lead to a single remaining active
consumer consuming from all partitions, while multiple lower priority
consumers are on standby. That single higher priority consumer may not be
able to keep up with the topic throughput.
We also cannot directly use the same behavior as the Shared subscription
here because that would again lead to out-of-order delivery of messages.

I do not have a solution in mind here right now but I will come up with
something so that the load balancing can be better, utilizing lower
priority consumers as well.

I was also thinking that from a user perspective, one would assume that a
lower priority consumer is meant as a backup in case an active consumer
goes down - which is not how it actually works.

Regards



> Regards,
> Penghui
>
> On Mon, Jul 3, 2023 at 7:52 PM Girish Sharma 
> wrote:
>
> > Hello PengHui,
> > Thank you for the reply. Adding comments inline below with a few
> concerns.
> >
> > On Mon, Jul 3, 2023 at 4:38 PM PengHui Li  wrote:
> >
> > > Hi Girish,
> > >
> > > Thanks for raising the discussion.
> > >
> > > I can confirm that your understanding is correct, and the document
> > > is confusing. If there are four consumers connected to a partitioned
> > topic
> > > with two partitions, each partition will have four connected consumers
> > but
> > > only one active consumer. The document said two consumers are connected
> > > to each partition is wrong. We will try to improve the document, and
> your
> > > contribution is welcome if you want to improve it.
> > >
> > > Yes, the part where it shows only 2 consumers are connected is
> > misleading,
> > but from information point of view, it is still okay to show only 2 in
> the
> > visualization, as one is active and other one is backup (next in line)
> >
> > The confusion comes where it tries to indicate that the active consumers
> > are uniformly spread. i.e. in the example, consumers A and C are active
> > while in reality, consumers A and B are active.
> > Maybe there is scope of visualization improvement there.
> >
> >
> >
> > > For the consumer shift for the partition without active consumer
> > failures.
> > > I think it should be a load-balance consideration. Kafka has a consumer
> > > group coordinator, which can balance traffic between consumers. But
> > Pulsar
> > > doesn't have. So Pulsar has to re-assign the active consumer when the
> > > consumer
> > > leaves, no matter whether the consumer is active or not.
> > >
> >
> > From a code perspective, I do understand that it's tricky to ensure
> minimal
> > re-assignment.  But this should be highlighted in the documentation as it
> > has implications in terms of ordered consumption as described below.
> >
> >
> > > Frankly, it's not the best policy for all the cases. IMO, Pulsar also
> can
> > > have different
> > > policies for assigning active consumers for different requirements. Do
> > you
> > > have
> > > a real case that the unnecessary consumer shift will impact? Which will
> > > help us to
> > > understand the value of introducing different policies. All I can think
> > of
> > > at the moment
> > > are load balance (if the traffic of the partitions is far from each
> > other)
> > >

Re: Failover Subscription - consumer assignment logic discussion

2023-07-03 Thread Girish Sharma
Hello PengHui,
Thank you for the reply. Adding comments inline below with a few concerns.

On Mon, Jul 3, 2023 at 4:38 PM PengHui Li  wrote:

> Hi Girish,
>
> Thanks for raising the discussion.
>
> I can confirm that your understanding is correct, and the document
> is confusing. If there are four consumers connected to a partitioned topic
> with two partitions, each partition will have four connected consumers but
> only one active consumer. The document said two consumers are connected
> to each partition is wrong. We will try to improve the document, and your
> contribution is welcome if you want to improve it.
>
Yes, the part where it shows that only 2 consumers are connected is
misleading, but from an information point of view, it is still okay to show
only 2 in the visualization, as one is active and the other is the backup
(next in line).

The confusion comes where it tries to indicate that the active consumers
are uniformly spread. i.e. in the example, consumers A and C are active
while in reality, consumers A and B are active.
Maybe there is scope for visualization improvement there.



> For the consumer shift for the partition without active consumer failures.
> I think it should be a load-balance consideration. Kafka has a consumer
> group coordinator, which can balance traffic between consumers. But Pulsar
> doesn't have. So Pulsar has to re-assign the active consumer when the
> consumer
> leaves, no matter whether the consumer is active or not.
>

From a code perspective, I do understand that it's tricky to ensure minimal
re-assignment.  But this should be highlighted in the documentation as it
has implications in terms of ordered consumption as described below.


> Frankly, it's not the best policy for all the cases. IMO, Pulsar also can
> have different
> policies for assigning active consumers for different requirements. Do you
> have
> a real case that the unnecessary consumer shift will impact? Which will
> help us to
> understand the value of introducing different policies. All I can think of
> at the moment
> are load balance (if the traffic of the partitions is far from each other)
> and the duplicated
> messages when switching the active consumer.
>

Right, so currently I do see these challenges:

   - Unlike KEY_SHARED, there is no logic to start sending data to a newly
   assigned consumer *only after* the previous one acks up to a certain
   checkpoint.
      - This in turn leads to chances of out-of-order and duplicate
      consumption, where the in-queue messages of the older consumer may
      still be processed while the same messages are sent to the new
      consumer as well.
   - For any disconnected or newly added consumer, more than one partition
   gets affected, based on the index of the consumer which got removed.
   - What is the use of setting a consumer priority to anything below the
   highest? The code seems to only consider the highest priority consumers
   when spreading active consumers, and ignores any consumer with a priority
   lower than the highest among the connected consumers. This means those
   consumers would always sit idle as long as there is at least 1 consumer
   with a higher priority. For example, if ten consumers (consumer priority
   1 through 10) are connected to 10 partitions, all 10 partitions would
   send data to just one of the consumers at any given time.
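The priority behavior described in the last point can be sketched as follows (a hypothetical simplification of the broker's selection logic, not the actual code):

```python
# Hypothetical simplification of how a single active consumer is picked:
# only consumers at the highest priority level are candidates, so every
# lower-priority consumer idles while any higher-priority one is connected.
# (Following Pulsar's priorityLevel convention: lower number = higher
# priority. This is a sketch, not the broker's actual code.)

def pick_active(consumers, partition_index):
    """consumers: list of (name, priority_level) tuples."""
    top = min(level for _, level in consumers)
    candidates = [name for name, level in consumers if level == top]
    return candidates[partition_index % len(candidates)]

# Ten consumers with priority levels 1..10 on a 10-partition topic:
# every partition lands on the single consumer at level 1.
consumers = [(f"c{i}", i) for i in range(1, 11)]
active = {p: pick_active(consumers, p) for p in range(10)}
print(active)
```

Because the candidate list collapses to a single consumer, the modulo spreading has nothing to spread over, which is exactly the idle-standby effect described above.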

Regards



> Regards,
> Penghui
>
> On Mon, Jul 3, 2023 at 2:49 PM Girish Sharma 
> wrote:
>
> > Bumping this up. Would really like to discuss this in the community.
> >
> > Regards
> >
> > On Wed, Jun 28, 2023 at 11:49 PM Girish Sharma 
> > wrote:
> >
> > > Hi everyone, I am trying to understand the failover subscription logic
> a
> > > bit more in detail. Specifically, the doc
> > > <https://pulsar.apache.org/docs/3.0.x/concepts-messaging/#failover
> > >mention
> > > this part for partitioned topic:
> > >
> > >
> > >
> > > * If the number of partitions in a partitioned topic is less than the
> > > number of consumers:For example, in the diagram below, this partitioned
> > > topic has 2 partitions and there are 4 consumers.Each partition has 1
> > > active consumer and 1 stand-by consumer.*
> > >
> > >
> > >- *For p0, consumer A is the master consumer, while consumer B would
> > >be the next consumer in line to receive messages if consumer A is
> > >disconnected.*
> > >- *For p1, consumer C is the master consumer, while consumer D would
> > >be the next consumer in line to receive messages if consumer C is
> > >disconnected*.
> > >
> > > So, as per this, since all four (A,B,C,D) consumers make connection to
> > 

Re: Failover Subscription - consumer assignment logic discussion

2023-07-03 Thread Girish Sharma
Bumping this up. Would really like to discuss this in the community.

Regards

On Wed, Jun 28, 2023 at 11:49 PM Girish Sharma 
wrote:

> Hi everyone, I am trying to understand the failover subscription logic a
> bit more in detail. Specifically, the doc
> <https://pulsar.apache.org/docs/3.0.x/concepts-messaging/#failover>mention
> this part for partitioned topic:
>
>
>
> * If the number of partitions in a partitioned topic is less than the
> number of consumers:For example, in the diagram below, this partitioned
> topic has 2 partitions and there are 4 consumers.Each partition has 1
> active consumer and 1 stand-by consumer.*
>
>
>- *For p0, consumer A is the master consumer, while consumer B would
>be the next consumer in line to receive messages if consumer A is
>disconnected.*
>- *For p1, consumer C is the master consumer, while consumer D would
>be the next consumer in line to receive messages if consumer C is
>disconnected*.
>
> So, as per this, since all four (A,B,C,D) consumers make connection to
> both partitions p0 and p1, the consumers array size in
> AbstractDispatcherSingleActiveConsumer should be 4. Now based on the
> consumer index choosing logic spanning lines 126 - 130
> <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractDispatcherSingleActiveConsumer.java#L126-L130>
> , the consumer index assigned to p0 should be 0 (i.e. A) and to p1 should
> be 1 (i.e. B) . I am assuming here that all 4 consumers have the same
> priority.Now consider consumer B getting disconnected. remaining consumer
> array == (A,C,D) . In this case, p1 will get a new consumer using logic 1
> % 3 = 1 index i.e. consumer C now. p0's consumer would remain same i.e. 0
> % 3 = 0 i.e. A.
> Now next consider that consumer A also goes down. remaining consumer array
> == (C,D) In this case, p0 will get a new consumer -> 0%2 = 0 i.e.
> consumer C and p1 would now be shifted to 1%2 = 1 Consumer D . Even
> though p1's active consumer was untouched, p1 got a consumer shift.So I
> have couple of questions -
>
>- Am I missing something? Is my understanding of logic correct?
>- If yes, why does the doc say what it says? And why change p1's
>consumer uselessly in above example
>
>
> Regards
> --
> Girish Sharma
>


-- 
Girish Sharma


Failover Subscription - consumer assignment logic discussion

2023-06-28 Thread Girish Sharma
Hi everyone, I am trying to understand the failover subscription logic a
bit more in detail. Specifically, the doc
<https://pulsar.apache.org/docs/3.0.x/concepts-messaging/#failover>mention
this part for partitioned topic:



* If the number of partitions in a partitioned topic is less than the
number of consumers:For example, in the diagram below, this partitioned
topic has 2 partitions and there are 4 consumers.Each partition has 1
active consumer and 1 stand-by consumer.*


   - *For p0, consumer A is the master consumer, while consumer B would be
   the next consumer in line to receive messages if consumer A is
   disconnected.*
   - *For p1, consumer C is the master consumer, while consumer D would be
   the next consumer in line to receive messages if consumer C is disconnected*
   .

So, as per this, since all four (A,B,C,D) consumers make connection to both
partitions p0 and p1, the consumers array size in
AbstractDispatcherSingleActiveConsumer should be 4. Now based on the
consumer index choosing logic spanning lines 126 - 130
<https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractDispatcherSingleActiveConsumer.java#L126-L130>
, the consumer index assigned to p0 should be 0 (i.e. A) and to p1 should
be 1 (i.e. B). I am assuming here that all 4 consumers have the same
priority. Now consider consumer B getting disconnected: the remaining
consumer array == (A,C,D). In this case, p1 will get a new consumer using
logic 1 % 3 = 1, i.e. consumer C now. p0's consumer would remain the same,
i.e. 0 % 3 = 0, i.e. A.
Next consider that consumer A also goes down: the remaining consumer array
== (C,D). In this case, p0 will get a new consumer -> 0 % 2 = 0, i.e.
consumer C, and p1 would now be shifted to 1 % 2 = 1, i.e. consumer D. Even
though p1's active consumer was untouched, p1 got a consumer shift. So I
have a couple of questions -

   - Am I missing something? Is my understanding of logic correct?
   - If yes, why does the doc say what it says? And why change p1's
   consumer needlessly in the above example?
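The walkthrough above can be reproduced with a tiny simulation of the index selection (a sketch of the described behavior, not the broker code itself):

```python
# A tiny simulation of the walkthrough above: the active consumer for a
# partition is consumers[partitionIndex % consumers.size()], so removing
# one consumer can shift partitions whose active consumer was not removed.
# (Sketch of the described behavior, not the broker code itself.)

def assign(consumers, partitions=2):
    return {p: consumers[p % len(consumers)] for p in range(partitions)}

consumers = ["A", "B", "C", "D"]
print(assign(consumers))   # p0 -> A, p1 -> B

consumers.remove("B")      # (A, C, D): p1 moves to C, p0 keeps A
print(assign(consumers))

consumers.remove("A")      # (C, D): p0 -> C, and p1 shifts to D even
print(assign(consumers))   # though its active consumer C was untouched
```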


Regards
-- 
Girish Sharma


Re: [Discuss] Suggestion for a "clear" parameter in Pulsar-admin to simplify tenant and namespace cleanup

2023-04-24 Thread Girish Sharma
Hello Xiangying,
Thank you for the summary .

I would like to understand if this feature would delete the regex based
topics or namespaces from all available clusters in the global zookeeper,
or only from the local cluster where this command is run.

Regards

On Tue, Apr 25, 2023, 8:52 AM Xiangying Meng  wrote:

> Hi Girish,
>
> Thank you for raising concerns about the proposed feature. I would
> like to address the points you mentioned in your email.
>
> 1. I understand that one of your concerns is whether the proposed
> regex-based deletion feature would provide significant advantages over
> using a simple one-liner script to call the delete topic command for
> each topic. As Yubiao pointed out, using scripts to delete topics one
> by one can lead to increased network overhead and slow performance,
> particularly when dealing with a large number of topics. Implementing
> regex support for delete operations would provide a more efficient and
> convenient way to manage resources in Pulsar.
>
> 2. In addition to the benefits for testing purposes, we have
> communicated with business users of Pulsar and found that the proposed
> regex-based deletion feature can be helpful in production environments
> as well. For instance, it can be used to efficiently clean up
> subscriptions associated with deprecated services, ensuring better
> resource management and reducing clutter in the system.
>
> 3. As I suggested earlier, we can introduce a new option flag (e.g.,
> `--regex` or `--pattern`) to the existing `pulsar-admin topics delete`
> command to prevent breaking changes for users who have already used
> the command in their scripts. This would ensure backward compatibility
> while providing the new functionality for those who want to use regex
> for deletion.
>
> I hope this clears up any confusion and addresses your concerns.
> Please let me know if you have any further questions or suggestions.
>
> Best regards,
> Xiangying Meng
>
> On Mon, Apr 24, 2023 at 6:23 PM Girish Sharma 
> wrote:
> >
> > Hello Yubiao,
> > As per my understanding, this feature suggestion is intended to delete
> the
> > topics from all replicated clusters under the namespace. Thus, the
> example
> > you are providing may not be a good fit for this?
> >
> > Xiangying, please clarify if my understanding is incorrect.
> >
> > On Mon, Apr 24, 2023 at 3:24 PM Yubiao Feng
> >  wrote:
> >
> > > Hi Girish Sharma
> > >
> > > > What additional advantage would one get by using that approach
> > > > rather than simply using a one liner script to just call delete
> > > > topic for each of those topics if the list of topics is known.
> > >
> > > If users enabled `Geo-Replication` on a namespace in mistake(expected
> > > only to enable one topic),
> > > it is possible to create many topics on the remote cluster in one
> second.
> > >
> > > Not long ago, 10,000 topics were created per second because of this
> > > mistake. It took us a long time to
> > > remove these topics. We delete these topics in this way:
> > > ```
> > > cat topics_name_file | awk  '{system("bin/pulsar-admin topics delete
> "$0)}'
> > > )
> > > ```
> > > It deletes topics one by one.
> > >
> > > We conclude later that stress test tools such as `Jmeter` or `ab`
> should be
> > > used to delete so many topics.
> > >
> > > If Pulsar could provide these APIs, it would be better.
> > >
> > > Thanks
> > > Yubiao Feng
> > >
> > >
> > >
> > >
> > > On Wed, Apr 19, 2023 at 3:29 PM Girish Sharma  >
> > > wrote:
> > >
> > > > Hello Yubiao,
> > > >
> > > > What additional advantage would one get by using that approach rather
> > > than
> > > > simply using a one liner script to just call delete topic for each of
> > > those
> > > > topics if the list of topics is known.
> > > >
> > > > Regards
> > > >
> > > > On Wed, Apr 19, 2023 at 12:54 PM Yubiao Feng
> > > >  wrote:
> > > >
> > > > > In addition to these two, It is recommended to add a method to
> batch
> > > > delete
> > > > > topics, such as this:
> > > > >
> > > > > ```
> > > > > pulsar-admin topics delete-all-topics , 
> > > > >
> > > > > or
> > > > >
> > > > > pulsar-admin topics delete-all-topic  > > lists>
> > > > &

Re: [Discuss] Suggestion for a "clear" parameter in Pulsar-admin to simplify tenant and namespace cleanup

2023-04-24 Thread Girish Sharma
Hello Yubiao,
As per my understanding, this feature suggestion is intended to delete the
topics from all replicated clusters under the namespace. Thus, the example
you are providing may not be a good fit for this?

Xiangying, please clarify if my understanding is incorrect.

On Mon, Apr 24, 2023 at 3:24 PM Yubiao Feng
 wrote:

> Hi Girish Sharma
>
> > What additional advantage would one get by using that approach
> > rather than simply using a one liner script to just call delete
> > topic for each of those topics if the list of topics is known.
>
> If users enabled `Geo-Replication` on a namespace in mistake(expected
> only to enable one topic),
> it is possible to create many topics on the remote cluster in one second.
>
> Not long ago, 10,000 topics were created per second because of this
> mistake. It took us a long time to remove these topics. We deleted these
> topics in this way:
> ```
> cat topics_name_file | awk  '{system("bin/pulsar-admin topics delete "$0)}'
> )
> ```
> It deletes topics one by one.
>
> We concluded later that stress test tools such as `Jmeter` or `ab` should be
> used to delete so many topics.
>
> If Pulsar could provide these APIs, it would be better.
>
> Thanks
> Yubiao Feng
>
>
>
>
> On Wed, Apr 19, 2023 at 3:29 PM Girish Sharma 
> wrote:
>
> > Hello Yubiao,
> >
> > What additional advantage would one get by using that approach rather
> than
> > simply using a one liner script to just call delete topic for each of
> those
> > topics if the list of topics is known.
> >
> > Regards
> >
> > On Wed, Apr 19, 2023 at 12:54 PM Yubiao Feng
> >  wrote:
> >
> > > In addition to these two, It is recommended to add a method to batch
> > delete
> > > topics, such as this:
> > >
> > > ```
> > > pulsar-admin topics delete-all-topics , 
> > >
> > > or
> > >
> > > pulsar-admin topics delete-all-topic  lists>
> > > ```
> > >
> > > Thanks
> > > Yubiao Feng
> > >
> > > On Sat, Apr 15, 2023 at 5:37 PM Xiangying Meng 
> > > wrote:
> > >
> > > > Dear Apache Pulsar Community,
> > > >
> > > > I hope this email finds you well. I am writing to suggest a potential
> > > > improvement to the Pulsar-admin tool,
> > > >  which I believe could simplify the process of cleaning up tenants
> and
> > > > namespaces in Apache Pulsar.
> > > >
> > > > Currently, cleaning up all the namespaces and topics within a tenant
> or
> > > > cleaning up all the topics within a namespace requires several manual
> > > > steps,
> > > > such as listing the namespaces, listing the topics, and then deleting
> > > each
> > > > topic individually.
> > > > This process can be time-consuming and error-prone for users.
> > > >
> > > > To address this issue, I propose the addition of a "clear" parameter
> to
> > > the
> > > > Pulsar-admin tool,
> > > > which would automate the cleanup process for tenants and namespaces.
> > > Here's
> > > > a conceptual implementation:
> > > >
> > > > 1. To clean up all namespaces and topics within a tenant:
> > > > ``` bash
> > > > pulsar-admin tenants clear 
> > > > ```
> > > > 2. To clean up all topics within a namespace:
> > > > ```bash
> > > > pulsar-admin namespaces clear /
> > > > ```
> > > >
> > > > By implementing these new parameters, users would be able to perform
> > > > cleanup operations more efficiently and with fewer manual steps.
> > > > I believe this improvement would greatly enhance the user experience
> > when
> > > > working with Apache Pulsar.
> > > >
> > > > I'd like to discuss the feasibility of this suggestion and gather
> > > feedback
> > > > from the community.
> > > > If everyone agrees, I can work on implementing this feature and
> submit
> > a
> > > > pull request for review.
> > > >
> > > > Looking forward to hearing your thoughts on this.
> > > >
> > > > Best regards,
> > > > Xiangying
> > > >
> > >
> >
> >
> > --
> > Girish Sharma
> >
>


-- 
Girish Sharma


Re: [ANNOUNCE] Apache Pulsar 2.10.4 released

2023-04-20 Thread Girish Sharma
Hello Xiangying,
The parent pom - org.apache.pulsar:pulsar:2.10.4 doesn't exist on maven
central here -
https://repo.maven.apache.org/maven2/org/apache/pulsar/pulsar/

Fetching 2.10.4 pulsar-client is failing due to this.

Regards

On Wed, Apr 19, 2023 at 2:09 PM Zike Yang  wrote:

> Hi, Xiangying
>
> Thanks for the announcement.
> I think we also need to send this email to us...@pulsar.apache.org and
> annou...@apache.org.
>
> BR,
> Zike Yang
>
> On Wed, Apr 19, 2023 at 12:37 PM Xiangying Meng 
> wrote:
> >
> > The Apache Pulsar team is proud to announce Apache Pulsar version 2.10.4.
> >
> > Pulsar is a highly scalable, low latency messaging platform running on
> > commodity hardware. It provides simple pub-sub semantics over topics,
> > guaranteed at-least-once delivery of messages, automatic cursor
> management
> > for
> > subscribers, and cross-datacenter replication.
> >
> > For Pulsar release details and downloads, visit:
> >
> > https://pulsar.apache.org/download
> >
> > Release Notes are at:
> > https://pulsar.apache.org/release-notes
> >
> > We would like to thank the contributors that made the release possible.
> >
> > Regards,
> >
> > The Pulsar Team
>


-- 
Girish Sharma


Re: [Discuss] Suggestion for a "clear" parameter in Pulsar-admin to simplify tenant and namespace cleanup

2023-04-19 Thread Girish Sharma
Hello Yubiao,

What additional advantage would one get from that approach over a simple
one-liner script that calls delete topic for each of those topics, if the
list of topics is known?
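For context, the kind of one-liner meant here could look like the sketch below. This is a dry run only: `echo` stands in for the real `pulsar-admin` invocation, and the topic names are made up.

```shell
# Made-up topic list; in practice this would come from
# `pulsar-admin topics list <tenant>/<namespace>`.
topics='persistent://t1/ns1/topicA
persistent://t1/ns1/topicB'

# Dry run: build and print the delete command for each topic.
cmds=$(printf '%s\n' "$topics" | while read -r topic; do
  echo pulsar-admin topics delete "$topic"
done)
printf '%s\n' "$cmds"
```

Dropping the `echo` would execute the deletions for real.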

Regards

On Wed, Apr 19, 2023 at 12:54 PM Yubiao Feng
 wrote:

> In addition to these two, it is recommended to add a method to batch delete
> topics, such as this:
>
> ```
> pulsar-admin topics delete-all-topics , 
>
> or
>
> pulsar-admin topics delete-all-topic 
> ```
>
> Thanks
> Yubiao Feng
>
> On Sat, Apr 15, 2023 at 5:37 PM Xiangying Meng 
> wrote:
>
> > Dear Apache Pulsar Community,
> >
> > I hope this email finds you well. I am writing to suggest a potential
> > improvement to the Pulsar-admin tool,
> >  which I believe could simplify the process of cleaning up tenants and
> > namespaces in Apache Pulsar.
> >
> > Currently, cleaning up all the namespaces and topics within a tenant or
> > cleaning up all the topics within a namespace requires several manual
> > steps,
> > such as listing the namespaces, listing the topics, and then deleting
> each
> > topic individually.
> > This process can be time-consuming and error-prone for users.
> >
> > To address this issue, I propose the addition of a "clear" parameter to
> the
> > Pulsar-admin tool,
> > which would automate the cleanup process for tenants and namespaces.
> Here's
> > a conceptual implementation:
> >
> > 1. To clean up all namespaces and topics within a tenant:
> > ``` bash
> > pulsar-admin tenants clear 
> > ```
> > 2. To clean up all topics within a namespace:
> > ```bash
> > pulsar-admin namespaces clear /
> > ```
> >
> > By implementing these new parameters, users would be able to perform
> > cleanup operations more efficiently and with fewer manual steps.
> > I believe this improvement would greatly enhance the user experience when
> > working with Apache Pulsar.
> >
> > I'd like to discuss the feasibility of this suggestion and gather
> feedback
> > from the community.
> > If everyone agrees, I can work on implementing this feature and submit a
> > pull request for review.
> >
> > Looking forward to hearing your thoughts on this.
> >
> > Best regards,
> > Xiangying
> >
>


-- 
Girish Sharma


Re: [Discuss] Suggestion for a "clear" parameter in Pulsar-admin to simplify tenant and namespace cleanup

2023-04-15 Thread Girish Sharma
> However, the current goal is to keep the tenant and namespace intact while
> cleaning up their contents.
Ah, I see now. Yes, in that case a clear command is better. Will this
command also take into account the value of the broker config
`forceDeleteNamespaceAllowed` in case someone is clearing the owner tenant?

Regards

On Sat, Apr 15, 2023 at 3:39 PM Enrico Olivelli  wrote:

> The proposal sounds really useful, especially for automated testing.
> +1
>
> Enrico
>
> On Sat, Apr 15, 2023 at 12:07, Xiangying Meng
> wrote:
> >
> > Dear Girish,
> >
> > Thank you for your response and suggestion to extend the use of the
> > `boolean force` flag for namespaces and tenants.
> > I understand that the `force` flag is already implemented for deleting
> > topics, namespaces, and tenants,
> > and it provides a consistent way to perform these actions.
> >
> > However, the current goal is to keep the tenant and namespace intact
> while
> > cleaning up their contents.
> > In other words, I want to have a way to remove all topics within a
> > namespace or all namespaces and topics
> > within a tenant without actually deleting the namespace or tenant itself.
> >
> > To achieve this goal, I proposed adding a `clear` command for
> `namespaces`
> > and `tenants`.
> >
> > This approach would allow users to keep the tenant and namespace
> structures
> > in place
> > while cleaning up their contents.
> > I hope this clarifies my intention, and I would like to hear your
> thoughts
> > on this proposal.
> >
> > Best regards,
> > Xiangying
> >
> > On Sat, Apr 15, 2023 at 5:49 PM Girish Sharma 
> > wrote:
> >
> > > Hello Xiangying,
> > > This indeed is a cumbersome task to delete a filled namespace or
> tenant. We
> > > face this challenge in our organization where we use the multi-tenancy
> > > feature of pulsar heavily.
> > >
> > > I would like to suggest a different command to do this though..
> Similar to
> > > how you cannot delete a topic without deleting its
> > > subscribers/producers/consumers, unless we use the `boolean force`
> flag.
> > > Why not extend this to namespace and tenant as well and let the force
> param
> > > do the cleanup (which your suggested `clear` command would do).
> > >
> > > As of today, using force to delete a namespace just returns 405 saying
> > > broker doesn't allow force delete of namespace containing topics.
> > >
> > > Any thoughts?
> > >
> > > On Sat, Apr 15, 2023 at 3:07 PM Xiangying Meng 
> > > wrote:
> > >
> > > > Dear Apache Pulsar Community,
> > > >
> > > > I hope this email finds you well. I am writing to suggest a potential
> > > > improvement to the Pulsar-admin tool,
> > > >  which I believe could simplify the process of cleaning up tenants
> and
> > > > namespaces in Apache Pulsar.
> > > >
> > > > Currently, cleaning up all the namespaces and topics within a tenant
> or
> > > > cleaning up all the topics within a namespace requires several manual
> > > > steps,
> > > > such as listing the namespaces, listing the topics, and then deleting
> > > each
> > > > topic individually.
> > > > This process can be time-consuming and error-prone for users.
> > > >
> > > > To address this issue, I propose the addition of a "clear" parameter
> to
> > > the
> > > > Pulsar-admin tool,
> > > > which would automate the cleanup process for tenants and namespaces.
> > > Here's
> > > > a conceptual implementation:
> > > >
> > > > 1. To clean up all namespaces and topics within a tenant:
> > > > ``` bash
> > > > pulsar-admin tenants clear 
> > > > ```
> > > > 2. To clean up all topics within a namespace:
> > > > ```bash
> > > > pulsar-admin namespaces clear /
> > > > ```
> > > >
> > > > By implementing these new parameters, users would be able to perform
> > > > cleanup operations more efficiently and with fewer manual steps.
> > > > I believe this improvement would greatly enhance the user experience
> when
> > > > working with Apache Pulsar.
> > > >
> > > > I'd like to discuss the feasibility of this suggestion and gather
> > > feedback
> > > > from the community.
> > > > If everyone agrees, I can work on implementing this feature and
> submit a
> > > > pull request for review.
> > > >
> > > > Looking forward to hearing your thoughts on this.
> > > >
> > > > Best regards,
> > > > Xiangying
> > > >
> > >
> > >
> > > --
> > > Girish Sharma
> > >
>


-- 
Girish Sharma


Re: [Discuss] Suggestion for a "clear" parameter in Pulsar-admin to simplify tenant and namespace cleanup

2023-04-15 Thread Girish Sharma
Hello Xiangying,
This indeed is a cumbersome task to delete a filled namespace or tenant. We
face this challenge in our organization where we use the multi-tenancy
feature of pulsar heavily.

I would like to suggest a different command to do this, though. Similar to
how you cannot delete a topic without deleting its
subscribers/producers/consumers, unless we use the `boolean force` flag.
Why not extend this to namespace and tenant as well and let the force param
do the cleanup (which your suggested `clear` command would do).

As of today, using force to delete a namespace just returns 405 saying
broker doesn't allow force delete of namespace containing topics.

Any thoughts?
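To make the comparison concrete, here is a dry-run sketch (with `echo` standing in for real execution and made-up names) of the existing force flag on topics alongside the extension suggested here for namespaces:

```shell
# Dry run (echo stand-ins; tenant/namespace/topic names are made up).
# Existing: --force closes producers/consumers before deleting the topic.
existing=$(echo pulsar-admin topics delete --force persistent://t1/ns1/topicA)
# Suggested analogue: --force would also cascade the cleanup of all
# topics inside the namespace instead of returning a 405.
suggested=$(echo pulsar-admin namespaces delete --force t1/ns1)
printf '%s\n%s\n' "$existing" "$suggested"
```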

On Sat, Apr 15, 2023 at 3:07 PM Xiangying Meng  wrote:

> Dear Apache Pulsar Community,
>
> I hope this email finds you well. I am writing to suggest a potential
> improvement to the Pulsar-admin tool,
>  which I believe could simplify the process of cleaning up tenants and
> namespaces in Apache Pulsar.
>
> Currently, cleaning up all the namespaces and topics within a tenant or
> cleaning up all the topics within a namespace requires several manual
> steps,
> such as listing the namespaces, listing the topics, and then deleting each
> topic individually.
> This process can be time-consuming and error-prone for users.
>
> To address this issue, I propose the addition of a "clear" parameter to the
> Pulsar-admin tool,
> which would automate the cleanup process for tenants and namespaces. Here's
> a conceptual implementation:
>
> 1. To clean up all namespaces and topics within a tenant:
> ``` bash
> pulsar-admin tenants clear 
> ```
> 2. To clean up all topics within a namespace:
> ```bash
> pulsar-admin namespaces clear /
> ```
>
> By implementing these new parameters, users would be able to perform
> cleanup operations more efficiently and with fewer manual steps.
> I believe this improvement would greatly enhance the user experience when
> working with Apache Pulsar.
>
> I'd like to discuss the feasibility of this suggestion and gather feedback
> from the community.
> If everyone agrees, I can work on implementing this feature and submit a
> pull request for review.
>
> Looking forward to hearing your thoughts on this.
>
> Best regards,
> Xiangying
>


-- 
Girish Sharma


Re: [DISCUSS] We must change the way we review PIPs

2023-03-31 Thread Girish Sharma
On Fri, Mar 31, 2023 at 7:09 PM Enrico Olivelli  wrote:

> I agree that we should finally have PIPs committed to git somewhere.
>
> When a PIP is approved it can be committed, but we must run the VOTE
> and wait for 3 binding +1,
> this is hard to do with a PR.
> It already happened a few times that people committed patches related
> to unapproved PIPs.
>
>
GitHub actually does have ways to allow for this. The PIPs can live in a
separate branch where we set 3 reviews as the minimum required, and the
CODEOWNERS file can list the usernames of all PMC members who are eligible
for binding votes.
[image: image.png]
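As an illustration of that setup, the branch could combine a required-approvals rule (3 reviews) with a CODEOWNERS entry along these lines; the usernames below are placeholders, not actual PMC handles:

```
# CODEOWNERS on the PIP branch: every file under pip/ requires
# review from the listed (placeholder) PMC members.
pip/ @pmc-alice @pmc-bob @pmc-carol
```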

Regards


Re: [DISCUSS] We must change the way we review PIPs

2023-03-30 Thread Girish Sharma
+1 (non-binding .. ? )
I've already commented a couple of times (here and there) that the process
needs to be consolidated at a single place. This is a good and detailed
approach.
Not sure if there is a historical context behind keeping the discussion on
the dev mailing list.

Regards

On Fri, Mar 31, 2023 at 1:57 AM Asaf Mesika  wrote:

> Hi all,
>
> In the last 2 months, I've increased my PIP review time (I do it in
> cycles), and reviewed quite a few PIPs.
>
> My conclusion as a result of that:
>
> It's nearly impossible to review PIPs using a mailing list.
> We must fix it soon.
>
> *Why?*
> 1. Let's say you review the PIP and find 10 issues. Once you quote and
> comment on those ten points, you basically started 10 threads of
> conversations.
> After 2-3 ping pongs with quotes of quotes of quotes, it takes you forever
> to read each thread properly. You find your CTRL-F to search to find your
> original quote, and reply. Load it up again in your head, switching to the
> PIP tab to read it again.
> After 10 ping pongs, it becomes almost an impossible mission.
>
> I can say I'm 75% tired just from the process, not from the review itself.
>
> 2. It's non collaborative by design.
> After 10 ping pongs, the ability of someone to come and join the discussion
> is 0. They need to go through so many replies, which are half quotes, find
> the original reply, and browse to the PIP.
> It's no wonder people drop off the PIP review once we cross 5-6 replies.
> It's no wonder, nobody joins after 10 replies.
>
> 3. It's not open to the public. Non collaborative.
> You can't just get a link, and join the review. Not only because of (1) and
> (2). You need to join the mailing list. You don't get the past emails to
> reply. Just joining the list is a high enough bar for many people.
> I personally participated in review of proposals in OpenTelemetry in the
> last 6 months, even though I'm just an occasional user.  Why? They were
> conducted on GitHub PR , so it was easy for me - click a link and reply.
>
> 4. All over the place
> Sometimes people comment on the GitHub issue.
> Sometimes on the mailing list.
> Not a single place.
>
> 5. No history.
> Ok, finally the author was convinced. I can't see just the changes. They
> need to explicitly tell me something was changed.
>
> 6. Delete All.
> They can go crazy, after 1 year, edit the GitHub issue, delete all the
> text and write "Kafka is the king". No history, nobody can stop them. It's
> their issue.
>
> 7. Show me all the approved PIPs
> Hard to track it today, hard to maintain it updated.
>
> 8. Resolved comments
> Even though you managed to read all 35 replies so far, in reply 36 you see
> the author agreed to all 8 out of 10 suggestions. You have no idea of
> knowing that in advance. You just wasted 1 hour.
>
>
> *What do I suggest?*
>
> PR is the main tool we have that allows multiple threaded discussion.
> Git provides history. You can't delete it without approval from PMC
> members.
>
> 1. We'll create a folder named "pip" in the pulsar main repo. It will
> contain one markdown file for each PIP. The file will be named
> "pip-xxx.md".I will write below how to obtain XXX before you start.
> 2. To create a PIP, you grab "pip/template.md" and use it to compose your
> file in the pip folder.
> 3. You submit this file as a PR named "PIP-xxx: short description".
> 4. You create "[DISCUSS] PIP-xxx: short description" in the DEV mailing
> list and refer people to your PR, with short text explaining the gist of
> it.
> 5. People discuss using PR comments, each is its own threaded comment.
> 6. Comment was done discussion? They resolve it. This way you see what the
> pending discussions are at a glance.
> 7. Reached consensus? Good. Send "[VOTE] PIP-xxx: short description" on DEV
> mailing list.
> 8. PIP approved? Awesome. Push commit with link to vote.
> A PMC member will merge it.
> Merge == approved.
> PMC members can add a PIP label.
> 9. Rejected? All good. Close the PR.
>  Closed == Rejected.
>  It can't be deleted. All comments are still here.
>
> Before you start, you search Pull Requests with label PIP in GitHub (`is:pr
> "PIP-" in:title`)
> Take the biggest number and add 1.
> It is super rare to have two people create PR at the same time.
>
> *Show me all approved PIPs:*
> Search merged PRs labeled PIP.
> Look at "pip" folder
>
> *Show me rejected PIPs:*
> Search closed PRs with "PIP-" in title, or labeled PIP.
>
>
> This is very similar to how Apache BK works.
>
> WDYT?
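The numbering step described above (search PRs with `is:pr "PIP-" in:title`, take the biggest number, add one) can be sketched as follows; the PR titles are made up, and fetching them from GitHub is assumed to have already happened:

```python
import re

# Hypothetical titles returned by the GitHub PR search.
pr_titles = [
    "PIP-248: Short description",
    "PIP-251: Another proposal",
    "[improve] unrelated PR",
]

# Extract the PIP number from each matching title.
numbers = [int(m.group(1))
           for t in pr_titles
           if (m := re.match(r"PIP-(\d+)", t))]

# Biggest number plus one is the next free PIP number.
next_pip = max(numbers) + 1
print(next_pip)  # 252
```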
>


-- 
Girish Sharma


Re: [DISCUSS] Change PIP template

2023-02-27 Thread Girish Sharma
Hi Asaf,
I was referring to the PIP process, as a whole, as explained in
https://github.com/apache/pulsar/blob/master/wiki/proposals/PIP.md
Someone looking at the GitHub ticket would find an almost empty PIP GH issue
while the same PIP has had many discussions here on the ML.
There is scope for improvement in the process: either remove the first step
of creating the PIP over at GitHub and present the PIP directly in the first
mail of the thread here, or hold all discussions in GH.
Both the ML and GH are searchable and linkable for tracking purposes.

Regards

On Mon, Feb 27, 2023 at 6:23 PM Asaf Mesika  wrote:

> On Sun, Feb 26, 2023 at 2:49 PM Girish Sharma 
> wrote:
>
> > Good proposal Asaf.
> > I've also wondered why the PIP creation and discussion process is so
> > separated. The PIP discussion and voting starts off as a GitHub issue,
> but
> > all of its discussion happens here on the mailing list. Is there scope of
> > improvement in that process as well?
> >
>
> Not sure I follow. Can you outline the problem exactly?
>
>
> >
> > Regards
> >
> > On Sun, Feb 26, 2023 at 6:16 PM tison  wrote:
> >
> > > Hi Asaf,
> > >
> > > I agree that, generally, a PIP is written as a whole and paste as the
> > body.
> > > So +1 for your proposal.
> > >
> > > Additionally, I'm thinking of moving the doc of procedure (wiki/PIP.md)
> > to
> > > the contributions guide and use the new markdown template to supersede
> > the
> > > wiki/PIP-template.md. Then we don't need to hold the wiki folder.
> > >
> > > It can be an extended version to your proposal, so let's keep on your
> > > proposal in this thread. Just for your reference.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Asaf Mesika wrote on Sun, Feb 26, 2023 at 19:18:
> > >
> > > > Hi,
> > > >
> > > > I would like to suggest two changes I'd like to make to the PIP
> design
> > > > template:
> > > > 1. Remove the form - just have a markdown template fill the issue
> body
> > as
> > > > it is created.
> > > > 2. Change the PIP template structure
> > > >
> > > > == Removing the form
> > > >
> > > > Today, when you want to submit a PIP, you are required to fill out a
> > form
> > > > with boxes composed of 3-4 lines length.
> > > > It's not good because:
> > > > * It broadcasts to the author: we want a very small PIP, something
> that
> > > > fits those small boxes.
> > > > * It makes the PIP look like a bug, where you fill out fields.
> > > > * It doesn't allow having H2 headings, only H1 headings, thus
> limiting
> > > the
> > > > structure.
> > > >
> > > > A PIP is a design essentially, something 1-3 pages long. Thus,
> > > > people take the time to write it down. Preferably, they copy paste
> the
> > > body
> > > > of the PIP issue, and use it to fill in sections.
> > > >
> > > > My suggestion is to define an issue template using only markdown,
> > > without a
> > > > form.
> > > >
> > > > == Changing PIP Structure
> > > >
> > > > Today the structure of the PIP doc (pasted below), is missing a
> section
> > > and
> > > > generally aims to jump directly into API changes / code /
> > implementation.
> > > > This results in lots of back and forth emails in an attempt to get
> the
> > > > following essentials:
> > > > * All required background knowledge to understand the proposal
> > > > * A high level overview of the proposed solution
> > > > * Understanding how this proposal will be monitored
> > > > * What steps exactly I need to take if I revert to the previous
> > version.
> > > >
> > > > The structure I propose below aims to reduce that friction and get
> all
> > > PIP
> > > > aligned to provide that information.
> > > >
> > > > === Today's structure
> > > >
> > > > # Motivation
> > > > * "Explain why this change is needed, what benefits it would bring to
> > > > Apache Pulsar and what problem it's trying to solve."
> > > > # Goal
> > > > * "Define the scope of this proposal. Given the motivation stated
> > above,
> > > > what are the problems that this proposal is addressing and what other
> > > items
>

Re: [DISCUSS] Change PIP template

2023-02-26 Thread Girish Sharma
d the problem statement and what you plan to change *without*
> > resorting to a couple of hours of code reading just to start having a
> high
> > level understanding of the change.
> > * Provide links where possible if a person wants to dig deeper into the
> > background information.
> > * Explain what is the problem you're trying to solve - current situation.
> > * This section is the "Why" of your proposal.
> >
> > # Goals
> > ## Scope
> > * Describe the goals of your proposal, and why it benefits Apache Pulsar
> > ## Out of Scope
> > * Describe what you have decided to keep out of scope, perhaps left for a
> > different PIP/s.
> >
> > # High-level Design
> > * Describe in high level, end-to-end, the solution. This should be a few
> > paragraphs long as a guideline.
> > * Reading this would allow me to understand the solution from a bird's
> eye
> > view, end to end.
> > * DON'T put all the design in a Google Doc and share the link here, as it
> > won't last the test of time.
> >
> > # Detailed Design
> > * Describe in detail what you plan to do to achieve your high level
> design
> > * It should include the following if applicable:
> >   * REST API Changes
> >   * Protocol Changes
> >
> > # Monitoring
> > * Describe exactly what you will add to Pulsar allowing you to
> > monitor/observe this proposal?
> >   * If those are metrics, provide the names, description, labels and
> units
> >   * Explain what constitutes abnormal that I should pay attention to
> >
> > # Backward Compatibility
> > * Describe exact instructions if someone needs to revert from a version
> > containing it to a previous version
> >
> > # Alternatives
> > * Describe alternative design decisions and why you have not opted for
> them
> >
> > # General notes
> > * Any general notes you wish to make
> >
> > # Links (Updated afterwards)
> > * Mailing List discussion thread:
> > * Mailing List voting thread:
> >
> > ==
> > Would love to hear what you think about it, before opening a PR about
> this.
> >
>


-- 
Girish Sharma


Re: [DISCUSS] PIP-235: Add metric for subscription backlog size

2022-12-30 Thread Girish Sharma
Hello Xiao,

When you say that the meaning of the metric will change, could you
elaborate further? The PIP does not specify which labels you are proposing
for this new metric, so maybe you can add more details to the PIP.

Here is what I understood from your PIP description. You will add a new
metric `pulsar_subscription_back_log_size`, which reports backlog size at
topic-subscription granularity. This means the general use of the metric
would be `sum(pulsar_subscription_back_log_size) by (topic,
subscription[, ...])`.
If this understanding is correct, then when the config to enable this new
metric is true, then these two expressions:

*`sum(pulsar_storage_backlog_size) by (cluster, topic)`*
and
*`sum(pulsar_subscription_back_log_size) by (cluster, topic)`*
will always result in the same value.
The only difference is that the new metric will also allow you to do
*`sum(pulsar_subscription_back_log_size) by (cluster, topic, subscription)`*
to get the backlog size for each subscription of the topic.
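The aggregation relationship described above can be illustrated with a small sketch; the samples are made-up values of the proposed per-subscription metric, and the code simply mimics PromQL's `sum(...) by (cluster, topic)`:

```python
from collections import defaultdict

# Made-up samples of the proposed metric, keyed by label values:
# (cluster, topic, subscription) -> backlog size in bytes.
per_subscription = {
    ("c1", "t1", "sub-a"): 512,
    ("c1", "t1", "sub-b"): 256,
    ("c1", "t2", "sub-a"): 128,
}

# Equivalent of `sum(pulsar_subscription_back_log_size) by (cluster, topic)`:
# drop the subscription label and sum the remaining series.
by_topic = defaultdict(int)
for (cluster, topic, _subscription), value in per_subscription.items():
    by_topic[(cluster, topic)] += value

print(by_topic[("c1", "t1")])  # 768
print(by_topic[("c1", "t2")])  # 128
```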

Regards

On Fri, Dec 30, 2022 at 10:26 PM 萧 易客  wrote:

> Hi Sharma,
>
> If a metric has a different meaning under a different config, it confuses
> users because they need to check the config before using it. It's not best
> practice because it's easy to make a mistake if the user forgot or the
> config changed.
> For example, if a user has multiple clusters with different configs, the
> same metric has a different meaning on each cluster, which adds extra work
> to maintaining alarm rules.
>
> Yike Xiao
>
> ________
> 发件人: Girish Sharma 
> 发送时间: 2022年12月31日 0:07
> 收件人: dev@pulsar.apache.org 
> 主题: Re: [DISCUSS] PIP-235: Add metric for subscription back size
>
> How about adding a new `subscription` label to the same metric? It should
> not break any existing usage of the metric for those only aggregating on
> the topic or cluster label. The documentation can clearly mention that
> this is an add-on on top of the existing metric.
>
> Please pardon my mistake as I am new to this mailing list, but if adding
> labels to existing metrics is not acceptable as a general guideline in the
> Pulsar codebase, then this can go in its own new metric. It is just that
> this new metric would then again need the same labels (cluster,
> broker, topic etc) causing a lot of duplicate data.
> In our in-house usage of pulsar, metrics finally flow into a central
> platform and we have to be careful on the size (number of metrics and
> labels) that we are ingesting into this platform.
>
> Regards
>
> On Fri, Dec 30, 2022 at 8:55 PM 萧 易客  wrote:
>
> > Hi pulsar community,
> >
> > Motivation
> > Now we have pulsar_storage_backlog_size for the topic backlog size; a user
> > can create an alarm rule like pulsar_storage_backlog_size > THRESHOLD.
> > Typically this alarm is going to notify the corresponding subscription
> > owner, but it needs an extra process to identify the subscriptions whose
> > backlog size exceeds the threshold. So we could add a new metric for the
> > subscription backlog size.
> >
> > For more detail, please read PIP-235 issue page [1]
> >
> > It's my first PIP, I'm looking forward to hearing what you think and open
> > for any suggestions.
> >
> > [1]: https://github.com/apache/pulsar/issues/19112

Re: [DISCUSS] PIP-235: Add metric for subscription back size

2022-12-30 Thread Girish Sharma
How about adding a new `subscription` label to the same metric? It should
not break any existing usage of the metric for those only aggregating on
the topic or cluster label. The documentation can clearly mention that
this is an add-on on top of the existing metric.

Please pardon my mistake as I am new to this mailing list, but if adding
labels to existing metrics is not acceptable as a general guideline in the
Pulsar codebase, then this can go in its own new metric. It is just that
this new metric would then again need the same labels (cluster,
broker, topic etc) causing a lot of duplicate data.
In our in-house usage of pulsar, metrics finally flow into a central
platform and we have to be careful on the size (number of metrics and
labels) that we are ingesting into this platform.

Regards

On Fri, Dec 30, 2022 at 8:55 PM 萧 易客  wrote:

> Hi pulsar community,
>
> Motivation
> Now we have pulsar_storage_backlog_size for the topic backlog size; a user
> can create an alarm rule like pulsar_storage_backlog_size > THRESHOLD.
> Typically this alarm is going to notify the corresponding subscription
> owner, but it needs an extra process to identify the subscriptions whose
> backlog size exceeds the threshold. So we could add a new metric for the
> subscription backlog size.
>
> For more detail, please read PIP-235 issue page [1]
>
> It's my first PIP, I'm looking forward to hearing what you think and open
> for any suggestions.
>
> [1]: https://github.com/apache/pulsar/issues/19112
> 
> 
> 
> Yike Xiao
>


-- 
Girish Sharma


Too many emails - Is there a better way to control or manage emails from GitBox

2022-12-09 Thread Girish Sharma
Hello Pulsar community,

I recently joined this ML. I have been keenly following the RC, Voting and
PIP related email threads so far. I only have one question - is there a way
to disable the emails from GitBox about github discussions? Mainly for the
following reasons:

   1. The GitHub discussions emails from GitBox do not thread properly as
   the subject of the email contains unique text like "XYZ commented on thread
   "
   2. There is already a way to subscribe to GitHub discussions via the
   GitHub UI, so these emails are duplicates.

Regards
-- 
Girish Sharma