Re: [DISCUSS] Cherry-pick PIP-364: Introduce a new load balance algorithm AvgShedder.
+1 Thanks, Kai
[DISCUSS] PIP-366: Support to specify different config for Configuration and Local Metadata Store
Hi all, I pushed a new proposal to support specifying different configs for the Configuration Metadata Store and the Local Metadata Store. Please take a look and share your thoughts. Thanks! Link: https://github.com/apache/pulsar/pull/23033 Thanks, Kai
Re: [VOTE] PIP-364: Introduce a new load balance algorithm AvgShedder.
+1 (non-binding) Thanks Kai On 2024/06/26 03:00:02 thetumbled wrote: > Hi, Pulsar Community. > I would like to start the voting thread for PIP-364: Introduce a new load > balance algorithm AvgShedder. > Proposal PR: https://github.com/apache/pulsar/pull/22946 > Implementation PR: https://github.com/apache/pulsar/pull/22949 > > Thanks, > Wenzhi Feng(thetumbled).
[DISCUSS] Cherry-pick PIP-321 Introduce allowed-cluster at the namespace level
Hi all, I would like to start a discussion on cherry-picking PIP-321 into `branch-3.0` and `branch-3.3` - https://github.com/apache/pulsar/pull/22378 This PIP introduced `allowed-clusters` at the namespace level. It addresses the issue that, if replication is only enabled at the topic level, the replicator fails to connect when creating a producer. I want to cherry-pick it into versions `3.0.x` and `3.3.x`. The thread will remain open for 48 hours. If there are no objections, I will perform the cherry-pick. Thanks, Kai
Re: [VOTE] PIP-357: Correct the conf name in load balance module.
+1 non-binding Thank you, Kai On 2024/06/05 02:32:51 thetumbled wrote: > Hi, Pulsar Community. > I would like to start the voting thread for PIP-357: Correct the conf name > in load balance module. > Proposal PR: https://github.com/apache/pulsar/pull/22823 > Implementation PR: https://github.com/apache/pulsar/pull/22824 > > Thanks, > Wenzhi Feng(thetumbled).
Re: [VOTE] PIP-354: apply topK mechanism to ModularLoadManagerImpl
+1 (non-binding) Thanks, Kai
Re: [VOTE] PIP-335: Oxia metadata support
+1 (non-binding) Thanks, Kai
Re: [DISCUSS] PIP-335: Oxia metadata support
+1 Thanks, Kai
Re: [VOTE] PIP-315: Configurable max delay limit for delayed delivery
+1 (non-binding) Thanks, Kai Wang On 2023/11/15 03:59:46 Kevin Lu wrote: > Hi All, > > This thread is to start a vote for PIP-315. > > PIP: https://github.com/apache/pulsar/pull/21490 > Discussion thread: > https://lists.apache.org/thread/285nm08842or324rxc2zy83wxgqxtcjp > > Regards, > Kevin >
Re: [ANNOUNCE] Yubiao Feng as new PMC member in Apache Pulsar
Congrats! Thanks, Kai
Re: [DISCUSS] Cherry-pick PR-16059 to 2.10 to prevent infinite unloading
Hi dev, I pushed a PR https://github.com/apache/pulsar/pull/20822 to fix the infinite unloading. Please help review this PR. Thanks! Thanks, Kai On 2023/07/09 23:40:03 PengHui Li wrote: > Hi Heesung, > > For 2.10, I would like to suggest fixing the issue instead of cherry-picking > the PR. The problem that https://github.com/apache/pulsar/pull/388 had > resolved will happen again if `loadBalancerDistributeBundlesEvenlyEnabled` > is disabled. We should try to remove the configuration in the future > because users are difficult to decide whether to enable or disable it. Both > of them have problems, just different issues. > > > I think we also need to consider the namespace anit-affinity-group logic > too. > > +1, it should be fixed to avoid an infinite bundle unloading loop. > > Thanks, > Penghui > > On Sat, Jul 8, 2023 at 4:07 AM Heesung Sohn > wrote: > > > Hi dev, > > > > I think we also need to consider the namespace anit-affinity-group logic > > too. These logics seem to do similar things. > > > > https://pulsar.apache.org/docs/3.0.x/administration-load-balance/#distribute-anti-affinity-namespaces-across-failure-domains > > > > > > PengHui > > We got three biding votes here. Do you think we should proceed to > > cherry-pick the PR to 2.10, then? > > > > Thanks, > > Heesung > > > > > > > > > > > > On Sun, Jul 2, 2023 at 5:22 PM PengHui Li wrote: > > > > > > `removeMostServicingBrokersForNamespace ` is introduced by [1] to > > > solve the problem that when all bundles in a particular namespace > > > belong to 1 or few machines, customers owning that namespace will be > > > heavily impacted if that broker goes down. Of course, this PR caused > > > the infinite unloading issue and we need to fix it. > > > > > > Thanks for the context. > > > It looks like we can also try to fix the infinite unloading issue. 
> > > Now, the broker is unloading the bundles without checking the > > distribution > > > of the bundles under a namespace, but it will check when finding > > > a new owner. Is it possible to check the bundle distribution before > > > unloading the bundles to avoid infinite unloading? > > > > > > Regards, > > > Penghui > > > > > > > > > On Sun, Jul 2, 2023 at 3:28 PM Enrico Olivelli > > > wrote: > > > > > > > +1 > > > > > > > > Enrico > > > > > > > > Il Dom 2 Lug 2023, 06:19 Hang Chen ha scritto: > > > > > > > > > +1 for cherry-picking it to branch-2.10. We have a flag to control > > > > > whether to enable or disable it. > > > > > > > > > > `removeMostServicingBrokersForNamespace ` is introduced by [1] to > > > > > solve the problem that when all bundles in a particular namespace > > > > > belong to 1 or few machines, customers owning that namespace will be > > > > > heavily impacted if that broker goes down. Of course, this PR caused > > > > > the infinite unloading issue and we need to fix it. > > > > > > > > > > > I agree with making it false for the next major version release by > > > > > default. > > > > > We'd better remove the config in the next version instead of change > > > > > the default value to `false`, which will make Pulsar's configuration > > > > > keep increasing. > > > > > > > > > > Thanks, > > > > > Hang > > > > > > > > > > [1] https://github.com/apache/pulsar/pull/388 > > > > > > > > > > PengHui Li 于2023年7月1日周六 09:38写道: > > > > > > > > > > > > +1 for cherry-pick to branch-2.10 since users don't have a > > workaround > > > > > > for this issue, and the change is well-understand, low risk. > > > > > > > > > > > > I agree with making it false for the next major version release by > > > > > default. 
> > > > > > > > > > > > Thanks, > > > > > > Penghui > > > > > > > > > > > > On Sat, Jul 1, 2023 at 9:26 AM Heesung Sohn > > > > > > wrote: > > > > > > > > > > > > > Hi dev, > > > > > > > > > > > > > > I realized that `removeMostServicingBrokersForNamespace` func in > > > the > > > > > broker > > > > > > > selection logic can cause infinite unloading. > > > > > > > > > > > > > > Suppose an overloaded broker unloaded a bundle and only has the > > > > minimum > > > > > > > number of bundles(in that namespace) among brokers. In that case, > > > the > > > > > > > selection logic (`removeMostServicingBrokersForNamespace`) will > > > > filter > > > > > out > > > > > > > other brokers and always reassign the bundle to the previous > > > broker. > > > > > This > > > > > > > will cause infinite unloading(like a boomerang). > > > > > > > > > > > > > > To mitigate this issue, we need to cherry-pick this PR to disable > > > > this > > > > > > > logic by the config. > > > > > > > https://github.com/apache/pulsar/pull/16059 > > > > > > > > > > > > > > And we probably want to disable this > > > > > > > `removeMostServicingBrokersForNamespace` logic by default. > > > > > > > > > > > > > > Regards, > > > > > > > Heesung > > > > > > > > > > > > > > > > > > > > > >
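The boomerang effect described at the start of this thread can be reproduced with a toy model. The sketch below is not Pulsar's actual selection code; it assumes a simplified filter that keeps only the brokers currently owning the fewest bundles of the namespace, and the names `BoomerangDemo`, `leastServingBrokers` and `unloadAndReassign` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a simplified stand-in for the broker filtering discussed above.
final class BoomerangDemo {
    // Keep only the brokers that own the fewest bundles of the namespace.
    static List<String> leastServingBrokers(Map<String, Integer> bundlesPerBroker) {
        int min = Integer.MAX_VALUE;
        for (int count : bundlesPerBroker.values()) {
            min = Math.min(min, count);
        }
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : bundlesPerBroker.entrySet()) {
            if (e.getValue() == min) {
                candidates.add(e.getKey());
            }
        }
        return candidates;
    }

    // Unload one bundle from the overloaded broker, then reassign it using the
    // filter above. Returns the broker that ends up owning the bundle.
    static String unloadAndReassign(Map<String, Integer> bundlesPerBroker, String overloaded) {
        bundlesPerBroker.merge(overloaded, -1, Integer::sum);
        String newOwner = leastServingBrokers(bundlesPerBroker).get(0);
        bundlesPerBroker.merge(newOwner, 1, Integer::sum);
        return newOwner;
    }

    public static void main(String[] args) {
        Map<String, Integer> bundles = new LinkedHashMap<>();
        bundles.put("broker-a", 2); // overloaded by traffic, not by bundle count
        bundles.put("broker-b", 2);
        bundles.put("broker-c", 2);
        for (int round = 1; round <= 3; round++) {
            String owner = unloadAndReassign(bundles, "broker-a");
            System.out.println("round " + round + ": bundle reassigned to " + owner);
        }
    }
}
```

In this model the unloading broker always becomes the only broker with the minimum bundle count, so every round hands the bundle straight back to it. Checking the bundle distribution before unloading, as suggested in the thread, would break that cycle.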
Re: [ANNOUNCE] Bo Cong as new PMC member in Apache Pulsar
Congratulations! Thanks, Kai On Jan 18, 2023 at 9:50 PM +0800, PengHui Li , wrote: > Hi all, > > The Apache Pulsar Project Management Committee (PMC) has invited Bo Cong > (https://github.com/congbobo184) as a member of the PMC and we are > pleased to announce that he has accepted. > > He is very active in the community in the past few years and made a lot of > great contributions > such as transactions and schemas. > > Welcome Bo Cong to the Apache Pulsar PMC. > > Best Regards, > Penghui on behalf of the Pulsar PMC
Re: [ANNOUNCE] New Committer: Baodi Shi
Congratulations! Thanks, Kai On Jan 18, 2023 at 9:36 PM +0800, dev@pulsar.apache.org, wrote: > > Congratulations !
Re: [ANNOUNCE] Yunze Xu as a new PMC member in Apache Pulsar
Congratulations! Yunze Thanks, Kai
Re: [DISCUSSION] Any idea about simplify the configuration file?
+1, we can provide a minimal configuration file to users. It only contains the required config and a few commonly used configs. The full configuration file can be named `broker.full.conf`, and it is used to provide a reference for users. Thanks, Kai On Dec 13, 2022 at 9:03 PM +0800, Yunze Xu , wrote: > For example, when running a standalone (without TLS enabled), only the > following configs are required: > > ```properties > brokerServicePort=6650 > webServicePort=8080 > allowLoopback=true > clusterName=standalone > managedLedgerDefaultEnsembleSize=1 > managedLedgerDefaultWriteQuorum=1 > managedLedgerDefaultAckQuorum=1 > ``` > > Actually only these two ports and `clusterName` are needed, other > configurations can be configured with a default values for standalone. > However, I found there are over 600 configurations in the > `standalone.conf`: > > ```bash > $ grep "^[^#]" conf/standalone.conf | wc -l > 629 > ``` > > Thanks, > Yunze Xu > > On Tue, Dec 13, 2022 at 8:53 PM Yunze Xu wrote: > > > > Hi all, > > > > As more people joined the development of Pulsar and more PIPs are > > opened, I found the configurations became very large. At the moment > > for commit 9917aac, there are 426 configuration items in broker.conf, > > which is too many. > > > > ```bash > > $ grep "^[^#]" conf/broker.conf | wc -l > > 426 > > ``` > > > > For beginners, finding the real useful configs from the `broker.conf` > > is hard. For developers, it's also bad. For example, the IDE code > > completion works significantly slower for a method of > > `ServiceConfiguration` than other classes. > > > > Let's look at Apache Kafka, there are only 17 configs in the server > > configuration file. > > > > ```bash > > kafka_2.13-3.3.1$ grep "^[^#]" config/server.properties | wc -l > > 17 > > ``` > > > > I think this difference makes Pulsar far more complicated to customize > > than Kafka, or than Pulsar should be. 
> > > > I have an idea that we can split `ServiceConfiguration` into different > > configuration classes. Some configs that are not commonly used should > > be moved into another configuration file. Just a brainstorm, does > > anyone else have better ideas? > > > > Thanks, > > Yunze
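One way to picture the "minimal file plus full reference file" idea is a small key/value overlay. The sketch below only uses `java.util.Properties` and is not how Pulsar loads `ServiceConfiguration`; the class name `MinimalConfDemo` and the split of keys between the two files are assumptions made for illustration, with the key names taken from the standalone example quoted above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustration only: a short user-facing file layered over a full set of defaults.
final class MinimalConfDemo {
    // Parse `text` in .properties syntax on top of the given defaults.
    static Properties load(String text, Properties defaults) {
        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for the full reference file: rarely changed keys with defaults.
        Properties defaults = load(
                "managedLedgerDefaultEnsembleSize=1\n"
                + "managedLedgerDefaultWriteQuorum=1\n"
                + "managedLedgerDefaultAckQuorum=1\n", new Properties());
        // Stand-in for the minimal file a user would actually edit.
        Properties conf = load(
                "brokerServicePort=6650\n"
                + "webServicePort=8080\n"
                + "clusterName=standalone\n", defaults);
        System.out.println("explicit entries: " + conf.size());
        System.out.println("clusterName = " + conf.getProperty("clusterName"));
        // Falls back to the defaults because the minimal file does not set it.
        System.out.println("ackQuorum = " + conf.getProperty("managedLedgerDefaultAckQuorum"));
    }
}
```

The user-visible file stays at three entries while lookups for any other key still resolve through the defaults.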
Re: [DISCUSS] PIP-221: Make TableView support read the non-persistent topic
Hi Joe, > I am not sure about the semantics of TableView on a non-persistent topic. > What happens if the client crashes? What is the base state for the table? If users use a non-persistent topic as the TableView topic, the TableView's data will be lost when the client crashes. The current use case is to use a non-persistent topic to store the load data used by the new load manager. That data requires neither a strong consistency guarantee nor persistence. Thanks, Kai On 2022/11/14 23:03:13 Joe F wrote: > I am not sure about the semantics of TableView on a non-persistent topic. > > Exactly how does this work? > > What happens if the client crashes? What is the base state for the table? > > What exactly can I expect as a user from this? > > Joe > > On Sun, Nov 13, 2022 at 8:57 PM Kai Wang wrote: > > > Hi, pulsar-dev community, > > > > Since the non-persistent topic support doesn't require API changes. I have > > pushed a PR to implement it, which has already been merged. > > > > See: https://github.com/apache/pulsar/pull/18375 > > > > And this PIP title has been changed to `Make TableView support TTL`. > > > > PIP link: https://github.com/apache/pulsar/issues/18229 > > > > Thanks, > > Kai > > > > On 2022/11/04 02:28:41 Kai Wang wrote: > > > Hi, pulsar-dev community, > > > > > > I’ve opened a PIP to discuss : PIP-221: Make TableView support read the > > non-persistent topic. > > > > > > PIP link: https://github.com/apache/pulsar/issues/18229 > > > > > > Thanks, > > > Kai > > > > > >
Re: [DISCUSS] PIP-221: Make TableView support read the non-persistent topic
Hi Michael, > What time is used to compute expiration? Is it the publish time or the > receive time? This TTL will be based on the message publish time. We can also make it configurable if users need that. > Also, are there cases that will reset a key's timer? To reset the timer for a key, users can publish a new message with the same key, since expiration is computed from the publish time. Thanks, Kai On 2022/11/14 16:43:06 Michael Marshall wrote: > > And this PIP title has been changed to `Make TableView support TTL`. > > What time is used to compute expiration? Is it the publish time or the > receive time? Also, are there cases that will reset a key's timer? > > Thanks, > Michael > > On Mon, Nov 14, 2022 at 2:40 AM Enrico Olivelli wrote: > > > > Il giorno lun 14 nov 2022 alle ore 05:57 Kai Wang ha > > scritto: > > > > > > Hi, pulsar-dev community, > > > > > > Since the non-persistent topic support doesn't require API changes. I > > > have pushed a PR to implement it, which has already been merged. > > > > > > See: https://github.com/apache/pulsar/pull/18375 > > > > Perfect > > > > Thanks > > Enrico > > > > > > > > And this PIP title has been changed to `Make TableView support TTL`. > > > > > > PIP link: https://github.com/apache/pulsar/issues/18229 > > > > > > Thanks, > > > Kai > > > > > > On 2022/11/04 02:28:41 Kai Wang wrote: > > > > Hi, pulsar-dev community, > > > > > > > > I’ve opened a PIP to discuss : PIP-221: Make TableView support read the > > > > non-persistent topic. > > > > > > > > PIP link: https://github.com/apache/pulsar/issues/18229 > > > > > > > > Thanks, > > > > Kai > > > > >
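The publish-time semantics in the reply above can be sketched with a small in-memory table. This is not the TableView API and not the PIP's implementation; `TtlTableSketch`, `onMessage` and `get` are hypothetical names, and timestamps are passed in explicitly so the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: entries expire a fixed time after the publish time
// of the latest message received for their key.
final class TtlTableSketch {
    private static final class Entry {
        final String value;
        final long publishTimeMs;

        Entry(String value, long publishTimeMs) {
            this.value = value;
            this.publishTimeMs = publishTimeMs;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMs;

    TtlTableSketch(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    // The latest message for a key replaces the previous one, which also
    // restarts the key's timer because expiry is measured from publish time.
    void onMessage(String key, String value, long publishTimeMs) {
        entries.put(key, new Entry(value, publishTimeMs));
    }

    // Return the value if it is still live at `nowMs`; otherwise drop it.
    String get(String key, long nowMs) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMs - entry.publishTimeMs >= ttlMs) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    public static void main(String[] args) {
        TtlTableSketch table = new TtlTableSketch(1000);
        table.onMessage("k", "v1", 0);
        table.onMessage("k", "v2", 900);          // same key: timer restarts at t=900
        System.out.println(table.get("k", 1500)); // 600 ms after the last publish: still live
        System.out.println(table.get("k", 1900)); // 1000 ms after the last publish: expired
    }
}
```

Using the receive time instead would only change which timestamp is stored in `onMessage`, which is why making the choice configurable is cheap.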
Re: [ANNOUNCE] New Committer: Lin Chen
Congrats! Thanks, Kai
Re: [ANNOUNCE] New Committer: Zili Chen
Congratulations! tison Thanks, Kai On Nov 10, 2022 at 8:16 AM +0800, dev@pulsar.apache.org, wrote: > > Congratulations! tison
Re: [DISCUSS] PIP-221: Make TableView support read the non-persistent topic
Hi, pulsar-dev community, Since the non-persistent topic support doesn't require API changes, I have pushed a PR to implement it, which has already been merged. See: https://github.com/apache/pulsar/pull/18375 The title of this PIP has been changed to `Make TableView support TTL`. PIP link: https://github.com/apache/pulsar/issues/18229 Thanks, Kai On 2022/11/04 02:28:41 Kai Wang wrote: > Hi, pulsar-dev community, > > I’ve opened a PIP to discuss : PIP-221: Make TableView support read the > non-persistent topic. > > PIP link: https://github.com/apache/pulsar/issues/18229 > > Thanks, > Kai >
[DISCUSS] PIP-221: Make TableView support read the non-persistent topic
Hi, pulsar-dev community, I’ve opened a PIP for discussion: PIP-221: Make TableView support read the non-persistent topic. PIP link: https://github.com/apache/pulsar/issues/18229 Thanks, Kai
Re: [VOTE] Pulsar Client C++ Release 3.0.0 Candidate 3
+1 (non-binding)

Environment: M1 macOS 12.6 and Ubuntu 20.04 x86_64

* verify checksum and signatures
* build from the source
* run producers and consumers

However, when I follow the `README.md` guide to start the standalone broker, I get an error:

```
➜ apache-pulsar-client-cpp-3.0.0 ./pulsar-test-service-start.sh
fatal: not a git repository (or any of the parent directories): .git
```

On 2022/10/21 21:29:36 Matteo Merli wrote: > This is the third release candidate for Apache Pulsar Client C++, version > 3.0.0. > > It fixes the following issues: > https://github.com/apache/pulsar-client-cpp/milestone/1?closed=1 > > *** Please download, test and vote on this release. This vote will stay open > for at least 72 hours *** > > Note that we are voting upon the source (tag), binaries are provided for > convenience. > > Source and binary files: > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-cpp-3.0.0-candidate-3/ > > SHA-512 checksums: > 4e68aed2018c40885124db8aa2c31303da63e2ecf9655807cd1eb2f8c802bff00abc322f08b2a58defc4ec089f9a74d567523307067cd254ef1d61f764fb0b3b > ./apache-pulsar-client-cpp-3.0.0.tar.gz > > The tag to be voted upon: > v3.0.0-candidate-3 (f70aa89d1ac0c012d0dc472e1c53462834dfb517) > https://github.com/apache/pulsar-client-cpp/releases/tag/v3.0.0-candidate-3 > > Pulsar's KEYS file containing PGP keys you use to sign the release: > https://dist.apache.org/repos/dist/dev/pulsar/KEYS > > Please download the source package, and follow the README to compile and test. > > -- > Matteo Merli > >
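For reference, the checksum part of the verification above can be done with the JDK alone. This is a minimal sketch, not the project's release tooling: the class name `Sha512Check` is hypothetical, and the file path and expected digest are taken from the command line rather than hard-coded. It does not cover the PGP signature check.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Minimal sketch: compute a SHA-512 digest and compare it with a published value.
final class Sha512Check {
    // Hex-encode the SHA-512 digest of `data` in lowercase.
    static String sha512Hex(byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: Sha512Check <file> <expected-sha512-hex>");
            return;
        }
        String actual = sha512Hex(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(actual.equalsIgnoreCase(args[1]) ? "checksum OK" : "checksum MISMATCH");
    }
}
```

It would be run with the downloaded `apache-pulsar-client-cpp-3.0.0.tar.gz` and the SHA-512 value from the vote email as the two arguments.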
Re: [DISCUSS] Release Pulsar Client C++ 3.0.0
+1 Thanks, Kai On Oct 7, 2022, 2:07 AM +0800, Matteo Merli , wrote: > We have moved the C++ client to its own separate repo > (https://github.com/apache/pulsar-client-cpp) as part of PIP-209. > > There are several new features and fixes in the main branch that it > would be good to get released, as well to get the new release process > all flushed out. > > Matteo > > > -- > Matteo Merli >
Re: [VOTE] PIP-209: Separate C++/Python clients to own repositories
+1 (non-binding) Thanks, Kai
Re: [ANNOUNCE] Jiwei Guo as a new PMC member in Pulsar
Congratulations! Thanks, Kai On 2022/08/18 11:24:01 PengHui Li wrote: > Hi, all > > I'm glad to announce that the Apache Pulsar PMC invited Jiwei Guo to join > the > PMC and he accepted. > > Please join in celebrating! > > Best, > Penghui >
Re: [Vote] PIP-192 New Pulsar Broker Load Balancer
+1 (non-binding) Thanks, Kai Heesung Sohn 于2022年8月2日周二 08:50写道: > Dear Pulsar Community, > > Please review and vote on this PIP. > > PIP link: https://github.com/apache/pulsar/issues/16691 > > Thank you, > -Heesung >
Re: [ANNOUNCE] Michael Marshall as a new PMC member in Pulsar
Congratulations Michael! Thanks, Kai
Re: CI - remove cmd line test retries?
+1 Thanks, Kai
Re: [VOTE] [PIP-182] Provide new load balance placement strategy implementation for ModularLoadManagerStrategy
+1 (non-binding) Thanks, Kai
[DISCUSS] [PIP-151] Use the system topic to store the bundle load data
Hi Pulsar community,

I created a PIP to use the system topic to store the bundle load data in the load manager.

The proposal can be found at: https://github.com/apache/pulsar/issues/15037

-

## Motivation

Currently, the Pulsar LoadManager uses ZooKeeper to store the bundle load data. When there are many bundles, this can put a lot of pressure on ZooKeeper.

## Goal

This PIP proposes storing the load manager's bundle load data in a system topic and using TableView to read it.

## Implementation

Since the bundle load data is stats data, it does not need a strong consistency guarantee. So we can use the system topic to store the load data and use TableView to read it.

### System topic client

Add a new SystemTopicClient called `LoadBalanceBundleDataSystemTopicClient`. The topic name is `persistent://pulsar/system/__load_balance_bundle_data`, the key is the bundle name, and the value is the BundleData.

```java
public class LoadBalanceBundleDataSystemTopicClient extends SystemTopicClientBase {
    // ...
}
```

Add a new event called `LoadBalanceBundleDataEvent`:

```java
@Data
public class LoadBalanceBundleDataEvent {
    // Bundle name, used as the message key.
    private String bundle;
    // Load stats of that bundle, used as the message value.
    private BundleData bundleData;
}
```

### ModularLoadManagerImpl

Add a TableView in `ModularLoadManagerImpl` to replace the bundle cache:

```java
private TableView bundlesTableView;
```

## Compatibility

This feature is both backward and forward compatible, since the bundle data is stats data.

-

Thanks,
Kai
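The read side of this proposal boils down to "replay the topic, keep the latest value per bundle". The sketch below models that with plain collections; it is not the TableView API or the proposed `LoadBalanceBundleDataSystemTopicClient`. `BundleTableViewSketch`, `LoadEvent`, the single `msgRateIn` field (a stand-in for the real BundleData) and the bundle names are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a keyed topic and the view materialized from it.
final class BundleTableViewSketch {
    // One record on the simulated topic: key = bundle name, value = a load figure.
    static final class LoadEvent {
        final String bundle;
        final double msgRateIn;

        LoadEvent(String bundle, double msgRateIn) {
            this.bundle = bundle;
            this.msgRateIn = msgRateIn;
        }
    }

    // Replay the topic from the beginning; a later event for the same bundle
    // overwrites the earlier one, so the view holds the latest value per key.
    static Map<String, Double> materialize(List<LoadEvent> topic) {
        Map<String, Double> view = new LinkedHashMap<>();
        for (LoadEvent event : topic) {
            view.put(event.bundle, event.msgRateIn);
        }
        return view;
    }

    public static void main(String[] args) {
        List<LoadEvent> topic = new ArrayList<>();
        topic.add(new LoadEvent("tenant/ns/0x00000000_0x80000000", 120.0));
        topic.add(new LoadEvent("tenant/ns/0x80000000_0xffffffff", 40.0));
        topic.add(new LoadEvent("tenant/ns/0x00000000_0x80000000", 95.0)); // newer report
        System.out.println(materialize(topic));
    }
}
```

Because only the latest value per key matters, a broker that restarts can rebuild its view by replaying the persistent topic, with no read from ZooKeeper.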
Re: [DISCUSS] Use the non-persistent topic to sync bundle load data
This proposal only covers using non-persistent topics to sync the bundle load data; the historical load data will still be written to ZooKeeper. I think we can draft another proposal for persisting the historical load data to the system topic. Thanks, Kai On Mar 18, 2022, 02:54 +0800, Joe F , wrote: > IIRC, there is a historical load profile for topic that feeds into > decisions by the load balancer. > > What happens during a cluster startup, with this new proposal? > > > > > On Thu, Mar 17, 2022 at 7:50 AM PengHui Li wrote: > > > > But which brokers will own that topic ? > > in a Pulsar cluster with a high level of isolation of tenants, we must > > ensure that: > > - at least one broker is allowed to own the topic > > - brokers dedicated to tenants do not own the topic > > With the current approach the data in on zookeeper, and this is shared > > among all the brokers > > > > We have "pulsar/system" namespace which can be used to maintain > > system topics. If users consider broker isolation, it's all transparent. > > > > Using a topic we also can shared the data among all brokers. > > Who want a data copy, only need to create a reader when starting. > > And we have introduced table view, which will make it easier to cache > > the load data, and perform the load cache update. > > > > > Another point: > > will users be allowed to produce/consume this topic ? how do we deal > > with permissions = > > > > Good point. We should avoid the user's producers/consumers, and only > > the super user can access the system topic. > > > > Thanks, > > Penghui > > > > On Thu, Mar 17, 2022 at 10:08 PM Enrico Olivelli > > wrote: > > > > > Il giorno gio 17 mar 2022 alle ore 02:42 PengHui Li > > > ha scritto: > > > > > > > > > we do not know > > > > anything about the availability of the owner of the topic. > > > > > > > > If the owner broker is not available, other brokers will take over. 
> > > > > > > > > We could make it simpler and when a broker wants to push its data, it > > > > looks > > > > up the REST address of the "leader broker" and then pushes the data to > > > it, > > > > I mean, without involving a "topic" > > > > > > > > Any broker may become the leader broker, in this case, the brokers need > > > to > > > > know all the addresses of the brokers in the cluster. With the topic > > > > approach, > > > > they only need to know the topic name. > > > > > > I thought about this a little more. > > > Using a non persistent topic makes sense. So I am closer to be > > > convinced about this move. > > > > > > But which brokers will own that topic ? > > > in a Pulsar cluster with a high level of isolation of tenants, we must > > > ensure that: > > > - at least one broker is allowed to own the topic > > > - brokers dedicated to tenants do not own the topic > > > With the current approach the data in on zookeeper, and this is shared > > > among all the brokers > > > > > > Another point: > > > will users be allowed to produce/consume this topic ? how do we deal > > > with permissions = > > > > > > > > > Enrico > > > > > > > > > > > Penghui > > > > > > > > On Thu, Mar 17, 2022 at 12:35 AM Enrico Olivelli > > > > wrote: > > > > > > > > > But in order to read from a topic you need a broker that is the owner > > > of > > > > > the owner of the special "temporary topic". > > > > > > > > > > While the metadata service (ZooKeeper) is already a central point and > > > it is > > > > > meant to be available (otherwise Pulsar doesn't work), we do not know > > > > > anything about the availability of the owner of the topic. > > > > > > > > > > Or do you mean to create a special topic that is always owned by the > > > > > "leader broker" ? 
> > > > > > > > > > We could make it simpler and when a broker wants to push its data, it > > > looks > > > > > up the REST address of the "leader broker" and then pushes the data > > to > > > it, > > > > > I mean, without involving a "topic". > > > > > > > > > > > > > > > Enri
[DISCUSS] Use the non-persistent topic to sync bundle load data
Hi Pulsar Community, Currently, the Pulsar LoadManager uses ZooKeeper to store the local broker data: the LoadReportUpdaterTask reports the local load data to ZooKeeper, and the leader broker collects the load data and stores it in ZooKeeper. When there are many brokers and bundles, this load data puts pressure on ZooKeeper. Since the load data does not need to be strongly consistent, we can use non-persistent topics to sync it, which also reduces our dependence on ZooKeeper. If this proposal is acceptable, I will draft a PIP. Any suggestions are appreciated. Thanks, Kai
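To make the consistency trade-off concrete, here is a toy model of best-effort load sync. It is not Pulsar's non-persistent topic implementation: `NonPersistentSyncSketch` is a hypothetical in-process stand-in where a published report reaches only the listeners attached at that moment and nothing is stored, and a single CPU usage figure stands in for the real load report.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Toy model: delivery to currently attached listeners only, no storage.
final class NonPersistentSyncSketch {
    private final List<BiConsumer<String, Double>> listeners = new ArrayList<>();

    void subscribe(BiConsumer<String, Double> listener) {
        listeners.add(listener);
    }

    // Deliver a broker's load report to whoever is listening right now.
    void publish(String broker, double cpuUsage) {
        for (BiConsumer<String, Double> listener : listeners) {
            listener.accept(broker, cpuUsage);
        }
    }

    public static void main(String[] args) {
        NonPersistentSyncSketch topic = new NonPersistentSyncSketch();

        Map<String, Double> earlyView = new HashMap<>();
        topic.subscribe(earlyView::put);
        topic.publish("broker-1", 0.40);

        // A broker that joins later has missed broker-1's first report.
        Map<String, Double> lateView = new HashMap<>();
        topic.subscribe(lateView::put);
        topic.publish("broker-2", 0.70);
        System.out.println("late view before next report: " + lateView.keySet());

        // The next periodic report from broker-1 closes the gap.
        topic.publish("broker-1", 0.45);
        System.out.println("views converged: " + lateView.equals(earlyView));
    }
}
```

A late subscriber sees stale or missing entries only until each broker's next report, which is acceptable for stats data but is also why the historical load profile raised in the replies needs separate handling.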