[DISCUSS] cherry-pick #21816 resolving the metrics missing issue for time-based backlog
Hi all,

I want to start a discussion about cherry-picking #21816 [0] to the release branches.

This PR added the metrics for the time-based backlog, which was introduced in 2.8.0 [1]. Until now, there have been no relevant metrics to assist users in daily monitoring work. This is a blocker for users who want to use the time-based backlog in production, because it is hard to add alerts and dashboards.

Since #21816 is not a bug fix, it hasn't been cherry-picked to the release branches. But now, I believe having it in the release branches is worth it.

The target branches:

- branch-3.2
- branch-3.0
- branch-2.11
- branch-2.10

[0] https://github.com/apache/pulsar/pull/21816
[1] https://github.com/apache/pulsar/pull/10093

I will keep the discussion open for at least 48 hours. If there are no objections, I will perform the cherry-picking.

Regards,
Penghui
Re: Suggestions on GitHub labels and issue templates
Thanks Kiryl, very good proposal.

> • (?) Probably it makes sense to enable and track website and docs issues in
> apache/pulsar-site repository. And add a good visible link to apache/pulsar
> README.md.

Yes, that would work too. Since the issue reporting for docs has been centralized to apache/pulsar in the past, I don't think that it's a great idea to move it back to pulsar-site, unless there's a compelling reason. Instead of moving the location of website and docs issues, we could improve the template for docs issues and add a template for website issues.

-Lari

On Mon, 18 Mar 2024 at 15:39, Kiryl Valkovich wrote:
>
> Comment with better formatting on GitHub:
> https://github.com/apache/pulsar/issues/22277#issuecomment-2002553745
>
> • Deprecate java label. Pulsar is written in Java and most PRs update
> Java code.
> • Instead of removing labels, deprecate them by renaming them to
> deprecated/. Probably pick another prefix that is alphabetically
> closer to the end of the alphabet to reduce noise.
> • Add go label automatically using labeler:
> https://github.com/apache/pulsar/blob/master/.github/labeler.yml
> go:
>   - changed-files:
>       - any-glob-to-any-file: '**/*.go'
>
> • Add component/* labels automatically based on the file path
> component/config:
>   - changed-files:
>       - any-glob-to-any-file: 'conf/**/*'
>       - any-glob-to-any-file: 'pulsar-config-validation/**/*'
> component/client:
>   - changed-files:
>       - any-glob-to-any-file: 'pulsar-client/**/*'
>       - any-glob-to-any-file: 'pulsar-client-*/**/*'
> ...
>
> • Rename bug label to type/bug for consistency. Keep the red color.
> • (?) Rename component/* => area/* for shorter names. The
> https://github.com/kubernetes/kubernetes/labels has such naming.
> • Rename doc-required label to type/doc. Relabel open issues and PRs with
> doc labels to the type/doc.
> • Deprecate all other doc-* labels. If it is needed for some kind of
> workflow, simply use the board project with ToDo -> In Progress -> Done
> states.
> • (?) Probably it makes sense to enable and track website and docs issues
> in apache/pulsar-site repository. And add a good visible link to
> apache/pulsar README.md.
> • Deprecate the question label. Instead, move such issues to Discussions
> -> Q
> • Migrate issues with the enhancement either to type/feature label or
> Discussions. Add a new Suggest an idea issue template that redirects to the
> Discussions -> Ideas
> • (?) Rename PIP => type/PIP for consistency
> • Rename flaky-test => type/flaky-test for consistency
> • Deprecate lifecycle/stale label. Use Stale instead. Rename Stale =>
> stale for consistency.
> • Add the ability to pick an area/* label from the dropdown on issue
> creation.
> systemd/systemd and a few other projects use this action for that:
> https://github.com/redhat-plumbers-in-action/advanced-issue-labeler?tab=readme-ov-file#real-life-examples
>
>
> Best,
> Kiryl
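
Editorial sketch: the path-based labeling rules quoted above can be illustrated with a few lines of Python. This is not how the labeler action is implemented; it only mimics the "a label applies if any changed file matches any glob" idea. The globs are copied from the proposal, while `glob_to_regex`, `labels_for`, and the simplified glob semantics (`**/` matches zero or more directories, `*` stays within one path segment) are assumptions made for illustration.

```python
import re

# Globs copied from the quoted proposal; everything else here is hypothetical.
RULES = {
    "go": ["**/*.go"],
    "component/config": ["conf/**/*", "pulsar-config-validation/**/*"],
    "component/client": ["pulsar-client/**/*", "pulsar-client-*/**/*"],
}


def glob_to_regex(glob):
    """Translate a simplified glob into a compiled regex."""
    parts = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")  # zero or more directories
            i += 3
        elif glob[i] == "*":
            parts.append(r"[^/]*")  # anything inside one path segment
            i += 1
        else:
            parts.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(parts))


def labels_for(changed_files):
    """Return the sorted labels whose globs match at least one changed file."""
    matched = set()
    for label, globs in RULES.items():
        patterns = [glob_to_regex(g) for g in globs]
        if any(p.fullmatch(path) for p in patterns for path in changed_files):
            matched.add(label)
    return sorted(matched)
```

For instance, under these assumed semantics a PR touching `conf/broker.conf` and a `.go` file would receive both the `component/config` and `go` labels, while a PR touching only `README.md` would receive none.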
Re: [DISCUSS] Release Pulsar C++ Client 3.5.1 and upgrade the verify process
+1

Thank you for pushing this discussion.

We can modify the release process: we'll require the release manager to attach the PRs for the Python and Node.js upgrades when initiating a candidate vote, and ensure that their CI passes. Once the C++ client release is successful, we can remove the candidate and then push for the merge.

Thanks,
Baodi Shi

On Mar 25, 2024 at 18:29:23, Yunze Xu wrote:

> Hi all,
>
> Recently I found a regression [1] for the C++ client 3.5.0 (thanks to
> the reminder from @shibd). So I will push a fix and then release the
> C++ client 3.5.1.
>
> However, this is not the 1st time that a regression was introduced,
> see [2] for example. So I suggest when verifying the C++ client, we
> can verify the Python and Node.js clients by upgrading the
> dependencies as well. See the updated release process in [3].
>
> [1] https://github.com/apache/pulsar-client-cpp/issues/420
> [2] https://lists.apache.org/thread/rjolgrlp4x1lmfj678k3hjco80kcb73c
> [3]
> https://github.com/apache/pulsar-client-cpp/wiki/Verify-the-candidate-release-in-your-local-env#verify-the-3rd-party-projects-that-depend-on-pulsar-c-client
>
> Thanks,
> Yunze
Re: [DISCUSS] Broken builds and CI Failures in Maintenance Branches; improving maintenance strategy to address root causes
Hi, Lari

Thanks for driving the discussion. I agree that cherry-picking is a pain, especially when we need to maintain old branches for a long time.

Frankly, my first impression is that targeting bug fixes at branch-3.0 while features and improvements go to the master branch will put more burden on contributors and committers. They might merge changes to the wrong branches for a while because they need time to build muscle memory. Of course, we can use CI to check the labels and the target branch, so it will not be a blocker.

I agree that the merge branch solution will resolve the ordering and coordination issues arising from the cherry-pick solution. Coordination means how to decide whether a PR should be cherry-picked (Yunze pointed this out to me).

I have a few questions about the merge branch solution.

- It looks like we will eventually employ both the merge branch and cherry-pick solutions after we have 4.0, because at that time the target branch for bug fixes is branch-4.0, and we still have an 18-month overlap.
- With the existing cherry-picking solution, if we can't cherry-pick a commit due to too many conflicts, we usually create a separate PR for the release branch directly. How do we handle this case with the merge branch solution? If I understand correctly, we can also push separate PRs to the new branches and always apply the new branches' version when handling merge conflicts from this commit?
- Is it possible to cherry-pick commits from master to the LTS branch? The reason for asking is that a PR might be recognized as an improvement, but someone later finds it should be contained in the LTS version. For example, https://github.com/apache/pulsar/pull/21739. Maybe there are other solutions to handle this case, e.g., pushing a PR directly, because we might get many more conflicts at that time.
- Do we need to wait for the PRs that are targeted at branch-3.0 to be merged before cutting branch-4.0? If there are many comments on an existing PR, we don't want to ask the author to create a new one targeting branch-4.0 to continue the review. Usually, we cut branches to prepare a release for at least 3 weeks. It sounds like a challenge because we will only allow regression fixes to branch-4.0 during that time. We need to find a solution for it.
- Does the committer performing the branch merging need to resolve all the conflicts? I mean, if we have 20 commits that need to be merged, maybe only one of them is urgent to merge to the new branch for a patch release. With the cherry-pick solution, you can cherry-pick only that commit and create the patch release. I think we must merge all the commits with the merge branch solution. Maybe I'm wrong.

I would support the merge branch solution, and we also need documentation to clarify the items to note. If I understand correctly, we can also go back to the current solution if we find something is not working, right? Cherry-picking is very flexible even if merges happen between branches. It is at least worth trying.

Regards,
Penghui

On Wed, Mar 20, 2024 at 9:38 PM Yunze Xu wrote:

> > However, in async work, people should have more patience to read and
> > write.
>
> I mean, it would be better to have something like "TL; DR". Anyway,
> I'd like to apply this change since the next feature release (3.3.0).
>
> Thanks,
> Yunze
>
> On Tue, Mar 19, 2024 at 12:10 AM Lari Hotari wrote:
> >
> > Thanks for the comments, Yunze.
> >
> > On 2024/03/18 05:48:39 Yunze Xu wrote:
> > > I'm afraid many people don't have patience to read all the contents.
> >
> > I agree. However, in async work, people should have more patience to
> > read and write. Synchronous meetings aren't a good solution either. The
> > lack of patience could be caused by lack of interest. There's not a large
> > group of people in our community that are interested in improving the
> > maintenance strategy and also committed to invest their time and effort in
> > these activities. I hope more people sign up to this type of efforts and
> > show their interest and commitment in improving Apache Pulsar.
> >
> > > Here is my summary in short (please correct me if I'm wrong):
> > > - For bug fixes, the target branch should be branch-3.0. Once the PR
> > > is merged into branch-3.0, checkout the branch-3.x and run `git merge
> > > branch-3.0` and resolve the conflicts
> >
> > I didn't describe the details of how this is handled. It is different in
> > practice.
> >
> > > - For features, the target branch should be branch-3.x
> >
> > New features would continue to go to master (or "main" if we decide to
> > rename it). Bugs would be fixed in the branch where the feature containing
> > the bug was introduced if it is missing from the LTS branch.
> >
> > > Since we introduced the LTS concept, I agree that we should make
> > > branch-3.0 as the default branch. Cherry-picking is a disaster when
> > > cherry-picks happen in the wrong order.
> >
> > Yes.
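
Editorial sketch: the idea of letting CI check the labels against the target branch could look roughly like the Python below. It is a simplification under stated assumptions, not an agreed policy: the label names `type/bug` and `type/feature` are borrowed from the separate labels thread, the branch names are defaults picked for illustration, and the function `check_target_branch` is hypothetical. It also ignores the case mentioned in the thread where a bug is fixed in the branch that introduced the feature.

```python
def check_target_branch(labels, target_branch,
                        lts_branch="branch-3.0", dev_branch="master"):
    """Return a list of problems; an empty list means the PR passes the check.

    Assumed rule of thumb: bug fixes target the LTS branch,
    features target the development branch.
    """
    problems = []
    is_bug = "type/bug" in labels
    is_feature = "type/feature" in labels
    if is_bug and is_feature:
        problems.append("PR is labeled as both a bug fix and a feature")
    elif is_bug and target_branch != lts_branch:
        problems.append(
            f"bug fixes should target {lts_branch}, not {target_branch}")
    elif is_feature and target_branch != dev_branch:
        problems.append(
            f"features should target {dev_branch}, not {target_branch}")
    return problems
```

A CI job could fail the build when the returned list is non-empty and print the messages, which gives contributors immediate feedback while they build the "muscle memory" mentioned above.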
[DISCUSS] Release Pulsar C++ Client 3.5.1 and upgrade the verify process
Hi all,

Recently I found a regression [1] in the C++ client 3.5.0 (thanks to the reminder from @shibd). So I will push a fix and then release the C++ client 3.5.1.

However, this is not the first time that a regression was introduced; see [2] for example. So I suggest that when verifying the C++ client, we also verify the Python and Node.js clients by upgrading their dependencies. See the updated release process in [3].

[1] https://github.com/apache/pulsar-client-cpp/issues/420
[2] https://lists.apache.org/thread/rjolgrlp4x1lmfj678k3hjco80kcb73c
[3] https://github.com/apache/pulsar-client-cpp/wiki/Verify-the-candidate-release-in-your-local-env#verify-the-3rd-party-projects-that-depend-on-pulsar-c-client

Thanks,
Yunze
Re: [VOTE] Pulsar Client Python Release 3.5.0 Candidate 2
I'm canceling this release because of the regression found in https://github.com/apache/pulsar-client-cpp/issues/420. I will prepare the fix and start the release of the C++ client 3.5.1. Then I will continue with candidate 3.

Thanks,
Yunze

On Mon, Mar 25, 2024 at 3:48 PM PengHui Li wrote:
>
> +1 (binding)
>
> - Checked the signature
> - Installed the wheel on macOS with Python 3.12
> - Run the consume and produce examples
>
> Regards,
> Penghui
>
> On Fri, Mar 22, 2024 at 11:55 PM Yunze Xu wrote:
>
> > This is the 2nd release candidate for Apache Pulsar Client Python,
> > version 3.5.0.
> >
> > It fixes the following issues:
> > https://github.com/apache/pulsar-client-python/milestone/6?closed=1
> >
> > *** Please download, test and vote on this release. This vote will
> > stay open for at least 72 hours ***
> >
> > Python wheels:
> > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-python-3.5.0-candidate-2/
> >
> > The supported python versions are 3.8, 3.9, 3.10, 3.11 and 3.12. The
> > supported platforms and architectures are:
> > - Windows x86_64 (windows/)
> > - glibc-based Linux x86_64 (linux-glibc-x86_64/)
> > - glibc-based Linux arm64 (linux-glibc-arm64/)
> > - musl-based Linux x86_64 (linux-musl-x86_64/)
> > - musl-based Linux arm64 (linux-musl-arm64/)
> > - macOS universal 2 (macos/)
> >
> > You can download the wheel (the `.whl` file) according to your own OS
> > and Python version
> > and install the wheel:
> > - Windows: `py -m pip install *.whl --force-reinstall`
> > - Linux or macOS: `python3 -m pip install *.whl --force-reinstall`
> >
> > The tag to be voted upon: v3.5.0-candidate-2
> > (730c2d7dea60ff632688463662a6101cacb98c22)
> > https://github.com/apache/pulsar-client-python/releases/tag/v3.5.0-candidate-2
> >
> > Pulsar's KEYS file containing PGP keys you use to sign the release:
> > https://downloads.apache.org/pulsar/KEYS
> >
> > Please download the Python wheels and follow the README to test.
Re: [VOTE] PIP-345: Optimize finding message by timestamp
Hi, Jiuming

Yes, "ManagedLedger#getEarliestMessagePublishTimeInBacklog" is not a good one, and it should be the only method in the ManagedLedger that has a publish time concept. I think we mixed the concepts in https://github.com/apache/pulsar/pull/12523, which is bad. It would be better to start a proposal to deprecate this method and change the existing implementation.

> For finding message by timestamp, we can introduce `sparse index` to Pulsar, after add entries complete, add a index to `ManagedLedgerIndex` and store the index to ML. What do you think?

Yes, we can have different options. If users do not have too much data in one ledger (and it is configurable), it should be fine. We can just build the index based on the ledger's timestamp (the ledger close time). By default, it should be good for many use cases. Since we have the ManagedLedgerIndex abstraction, users can also develop their own implementations for extreme performance requirements. This keeps the Pulsar core clear and simple while still working for the most common cases.

Regards,
Penghui

On Mon, Mar 25, 2024 at 5:47 PM 太上玄元道君 wrote:

> Hi Penghui,
>
> Thanks for your feedback!
>
> I'm not sure about this either, since publishTimestamp is a Messaging layer
> concept, and ML as a Persistence layer should not be aware about this.
>
> But in ML, I'd noticed some methods searching message by
> PublishTimestamp(say,
> ManagedLedgerImpl#getEarliestMessagePublishTimeInBacklog),
> so that's why I want to add publishTimestamp to ML.
>
> Introduce secondary index to ML is a good idea, since RocketMQ has a `Hash
> index`, and Kafka has a `Sparse index`.
>
> For finding message by timestamp, we can introduce `sparse index` to
> Pulsar, after add entries complete, add a index to `ManagedLedgerIndex` and
> store the index to ML. What do you think?
>
> Thanks,
> Tao Jiuming
>
> On Mon, Mar 25, 2024 at 15:17, PengHui Li wrote:
>
> > Hi, Jiuming
> >
> > I'm sorry for not getting back to you sooner.
> >
> > First, I support the motivation to optimize this case because it could be a
> > significant blocker for users who want infinite data retention, which is a
> > BIG differentiator with Apache Kafka. And, I really saw the cases with high
> > publish throughput, and one ledger could even hold 1M entries, 100M new
> > entries published to a topic.
> >
> > Then, I try to check the details of the existing implementation. I think
> > the tricky part is the publish time is not the concept of the
> > ManagedLedger. I saw the changes that you proposed will add publish time to
> > the ManagedLedger module, which doesn't look good to me. Because it will
> > couple the Pulsar concept with the ManagedLedger concept.
> >
> > Essentially, the publish time could be a secondary index of the
> > ManagedLedger. My opinion is to have a general ManagedLedgerIndex abstract,
> > and the Pulsar broker can create any index it wants. Since the broker
> > creates the index, the broker can control the index's behavior. Then, the
> > ManagedLedger can provide an API to search the entry with a
> > ManagedLedgerIndex. With this option, we don't need to add the publish
> > time concept to ManagedLedger directly.
> >
> > In this case, if the broker tries to search the entry with a predicate and
> > index. The managed ledger will search from the index first. Of course, if
> > the relevant entry cannot be found in the index, just fall back to the
> > "optimized full scan".
> >
> > Regards,
> > Penghui
> >
> > On Mon, Mar 25, 2024 at 11:51 AM 太上玄元道君 wrote:
> >
> > > bump
> > >
> > > On Wed, Mar 20, 2024 at 16:23, 太上玄元道君 wrote:
> > >
> > > > bump
> > > >
> > > > On Tue, Mar 19, 2024 at 19:35, 太上玄元道君 wrote:
> > > >
> > > >> Hi Pulsar community,
> > > >>
> > > >> This thread is to start a vote for PIP-345: Optimize finding message
> > > >> by timestamp
> > > >>
> > > >> PIP: https://github.com/apache/pulsar/pull/22234
> > > >> Discuss thread:
> > > >> https://lists.apache.org/thread/5owc9os6wmy52zxbv07qo2jrfjm17hd2
> > > >>
> > > >> Thanks,
> > > >> Tao Jiuming
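
Editorial sketch: the "index based on the ledger close time" option from the reply above can be illustrated in Python with a binary search over close times. This is not Pulsar code. The function name `first_candidate_ledger` and the data layout are hypothetical, and the sketch assumes that publish times never exceed the close time of the ledger that holds the message, which may not hold exactly because publish time is set on the client side.

```python
from bisect import bisect_left


def first_candidate_ledger(closed_ledgers, open_ledger_id, timestamp_ms):
    """Pick the ledger where a search for `timestamp_ms` should start.

    closed_ledgers: list of (ledger_id, close_time_ms), sorted by close time.
    open_ledger_id: id of the ledger that is still being written.
    Returns the first closed ledger whose close time is >= timestamp_ms,
    or the open ledger if every closed ledger was closed before that time.
    """
    close_times = [close_time for _, close_time in closed_ledgers]
    pos = bisect_left(close_times, timestamp_ms)
    if pos < len(closed_ledgers):
        return closed_ledgers[pos][0]
    return open_ledger_id
```

The lookup only narrows the search to one ledger; entries inside that ledger would still have to be scanned, which is why the reply notes that this works well when a single ledger does not hold too much data.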
Re: [VOTE] PIP-345: Optimize finding message by timestamp
Hi Penghui,

Thanks for your feedback!

I'm not sure about this either, since publishTimestamp is a messaging-layer concept, and ML, as a persistence layer, should not be aware of it.

But in ML, I noticed some methods that search for messages by publish timestamp (say, ManagedLedgerImpl#getEarliestMessagePublishTimeInBacklog), so that's why I want to add publishTimestamp to ML.

Introducing a secondary index to ML is a good idea, since RocketMQ has a `Hash index`, and Kafka has a `Sparse index`.

For finding a message by timestamp, we can introduce a `sparse index` to Pulsar: after adding entries completes, add an index to `ManagedLedgerIndex` and store the index in ML. What do you think?

Thanks,
Tao Jiuming

On Mon, Mar 25, 2024 at 15:17, PengHui Li wrote:

> Hi, Jiuming
>
> I'm sorry for not getting back to you sooner.
>
> First, I support the motivation to optimize this case because it could be a
> significant blocker for users who want infinite data retention, which is a
> BIG differentiator with Apache Kafka. And, I really saw the cases with high
> publish throughput, and one ledger could even hold 1M entries, 100M new
> entries published to a topic.
>
> Then, I try to check the details of the existing implementation. I think
> the tricky part is the publish time is not the concept of the
> ManagedLedger. I saw the changes that you proposed will add publish time to
> the ManagedLedger module, which doesn't look good to me. Because it will
> couple the Pulsar concept with the ManagedLedger concept.
>
> Essentially, the publish time could be a secondary index of the
> ManagedLedger. My opinion is to have a general ManagedLedgerIndex abstract,
> and the Pulsar broker can create any index it wants. Since the broker
> creates the index, the broker can control the index's behavior. Then, the
> ManagedLedger can provide an API to search the entry with a
> ManagedLedgerIndex. With this option, we don't need to add the publish
> time concept to ManagedLedger directly.
>
> In this case, if the broker tries to search the entry with a predicate and
> index. The managed ledger will search from the index first. Of course, if
> the relevant entry cannot be found in the index, just fall back to the
> "optimized full scan".
>
> Regards,
> Penghui
>
> On Mon, Mar 25, 2024 at 11:51 AM 太上玄元道君 wrote:
>
> > bump
> >
> > On Wed, Mar 20, 2024 at 16:23, 太上玄元道君 wrote:
> >
> > > bump
> > >
> > > On Tue, Mar 19, 2024 at 19:35, 太上玄元道君 wrote:
> > >
> > >> Hi Pulsar community,
> > >>
> > >> This thread is to start a vote for PIP-345: Optimize finding message
> > >> by timestamp
> > >>
> > >> PIP: https://github.com/apache/pulsar/pull/22234
> > >> Discuss thread:
> > >> https://lists.apache.org/thread/5owc9os6wmy52zxbv07qo2jrfjm17hd2
> > >>
> > >> Thanks,
> > >> Tao Jiuming
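
Editorial sketch: the sparse index idea in the message above can be illustrated in Python. It is not the PIP-345 design and not Pulsar code. The class `SparseTimeIndex`, its methods, and the `interval` parameter are hypothetical, and the sketch assumes publish times are non-decreasing in entry order. One index point is recorded every `interval` entries, and a lookup returns a safe entry id to start scanning from.

```python
from bisect import bisect_left


class SparseTimeIndex:
    """Keeps one (publish_time, entry_id) point for every `interval` entries."""

    def __init__(self, interval):
        self.interval = interval
        self.timestamps = []  # publish times of the indexed entries
        self.entry_ids = []   # entry ids of the indexed entries
        self._count = 0

    def on_entry_added(self, entry_id, publish_time_ms):
        # Called after an entry has been added; index only every N-th entry.
        if self._count % self.interval == 0:
            self.timestamps.append(publish_time_ms)
            self.entry_ids.append(entry_id)
        self._count += 1

    def scan_start(self, timestamp_ms):
        """Entry id from which to scan for the first entry at or after
        `timestamp_ms`; None if nothing has been indexed yet."""
        if not self.entry_ids:
            return None
        # Last index point strictly before the timestamp. Everything before
        # that point is older than the target, so it is safe to skip.
        pos = bisect_left(self.timestamps, timestamp_ms) - 1
        return self.entry_ids[max(pos, 0)]
```

With an interval of N, at most roughly 2N entries need to be scanned after the lookup in this sketch, at the cost of storing one index point per N entries; choosing N is the usual trade-off between index size and scan length.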
Re: [VOTE] PIP-344: Correct the behavior of the public API pulsarClient.getPartitionsForTopic(topicName)
Hi, Yubiao

It would be better to list the names of the 3 binding voters.

Thanks,
Penghui

On Mon, Mar 25, 2024 at 4:58 PM Yubiao Feng wrote:

> Close the vote with 3(binding).
>
> Thanks
> Yubiao Feng
>
> On Sat, Mar 16, 2024 at 6:28 AM Yubiao Feng wrote:
>
> > Hi All
> >
> > This thread is to start a vote for PIP-344.
> >
> > PIP: https://github.com/apache/pulsar/pull/22182
> > Discussion thread:
> > https://lists.apache.org/thread/z693blcxoqk0mj0rzyt1k7nvy72j18t5
> >
> > Thanks
> > Yubiao Feng
Re: [VOTE] PIP-344: Correct the behavior of the public API pulsarClient.getPartitionsForTopic(topicName)
Closing the vote with 3 (binding).

Thanks
Yubiao Feng

On Sat, Mar 16, 2024 at 6:28 AM Yubiao Feng wrote:

> Hi All
>
> This thread is to start a vote for PIP-344.
>
> PIP: https://github.com/apache/pulsar/pull/22182
> Discussion thread:
> https://lists.apache.org/thread/z693blcxoqk0mj0rzyt1k7nvy72j18t5
>
> Thanks
> Yubiao Feng
Re: [RESULT] [VOTE] PIP-342: Support OpenTelemetry metrics in Pulsar client
Sorry, I forgot to submit my PR review before. I left just some minor comments about the names. Please take a look.

Regards,
Penghui

On Fri, Mar 22, 2024 at 11:21 PM Matteo Merli wrote:

> Closing this vote with 4 binding and 4 non-binding +1s
>
> Binding +1s:
> * Lari
> * Mattison
> * PengHui
> * Matteo
>
> Non-Binding +1s:
> * Dao Jun
> * Apurva
> * Asaf
> * Zixuan
>
> Thanks,
> Matteo
>
> --
> Matteo Merli
>
> On Thu, Mar 14, 2024 at 11:54 PM Zixuan Liu wrote:
>
> > +1 (non-binding)
> >
> > Thanks,
> > Zixuan
> >
> > On Fri, Mar 15, 2024 at 09:47, PengHui Li wrote:
> >
> > > +1 (binding)
> > >
> > > Regards,
> > > Penghui
> > >
> > > On Fri, Mar 15, 2024 at 2:32 AM Asaf Mesika wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Thu, Mar 14, 2024 at 8:29 PM Apurva Telang <apurvatelan...@gmail.com> wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > On Thu, Mar 14, 2024 at 2:12 AM mattison chao <mattisonc...@gmail.com> wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > Best,
> > > > > > Mattison
> > > > > > On Mar 14, 2024 at 15:55 +0800, Lari Hotari wrote:
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > -Lari
> > > > > > >
> > > > > > > On Thu, 14 Mar 2024 at 03:45, Matteo Merli <matteo.me...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > PIP: https://github.com/apache/pulsar/pull/22178
> > > > > > > >
> > > > > > > > WIP PR: https://github.com/apache/pulsar/pull/22179
> > > > > > > >
> > > > > > > > --
> > > > > > > > Matteo Merli
> > > > >
> > > > > --
> > > > > Best regards,
> > > > > Apurva Telang.
Re: [VOTE] Pulsar Client Python Release 3.5.0 Candidate 2
+1 (binding)

- Checked the signature
- Installed the wheel on macOS with Python 3.12
- Ran the consume and produce examples

Regards,
Penghui

On Fri, Mar 22, 2024 at 11:55 PM Yunze Xu wrote:

> This is the 2nd release candidate for Apache Pulsar Client Python,
> version 3.5.0.
>
> It fixes the following issues:
> https://github.com/apache/pulsar-client-python/milestone/6?closed=1
>
> *** Please download, test and vote on this release. This vote will
> stay open for at least 72 hours ***
>
> Python wheels:
> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-python-3.5.0-candidate-2/
>
> The supported python versions are 3.8, 3.9, 3.10, 3.11 and 3.12. The
> supported platforms and architectures are:
> - Windows x86_64 (windows/)
> - glibc-based Linux x86_64 (linux-glibc-x86_64/)
> - glibc-based Linux arm64 (linux-glibc-arm64/)
> - musl-based Linux x86_64 (linux-musl-x86_64/)
> - musl-based Linux arm64 (linux-musl-arm64/)
> - macOS universal 2 (macos/)
>
> You can download the wheel (the `.whl` file) according to your own OS
> and Python version
> and install the wheel:
> - Windows: `py -m pip install *.whl --force-reinstall`
> - Linux or macOS: `python3 -m pip install *.whl --force-reinstall`
>
> The tag to be voted upon: v3.5.0-candidate-2
> (730c2d7dea60ff632688463662a6101cacb98c22)
> https://github.com/apache/pulsar-client-python/releases/tag/v3.5.0-candidate-2
>
> Pulsar's KEYS file containing PGP keys you use to sign the release:
> https://downloads.apache.org/pulsar/KEYS
>
> Please download the Python wheels and follow the README to test.
Re: Suggestions on GitHub labels and issue templates
The label updates:

- Removed the `java` label. We only have a few legacy PRs labeled with `java`.
- Changed `component/*` to `area/*`
- Deprecated the `question` label
- Changed `PIP` to `type/PIP`
- Changed `flaky-test` to `type/flaky-test`

On Mon, Mar 25, 2024 at 3:17 PM PengHui Li wrote:

> Yes, the PR is welcome.
>
> Best,
> Penghui
>
> On Mon, Mar 25, 2024 at 3:08 PM Kiryl Valkovich wrote:
>
>> Hi PengHui,
>> Sure. If the PR is welcome here, I’ll submit it in a few days.
>>
>>
>> Best,
>> Kiryl
>>
>> > On Mar 25, 2024, at 6:07 AM, PengHui Li wrote:
>> >
>> > Hi Kiryl,
>> >
>> > Thanks for your suggestions, and they are looking good to me
>> > I'll follow your suggestions on renaming or deprecating the labels.
>> >
>> > For the label automation, do you want to push a PR to add it?
>> >
>> > Regards,
>> > Penghui
>> >
>> >
>> > On Mon, Mar 18, 2024 at 9:39 PM Kiryl Valkovich wrote:
>> >
>> >> Comment with better formatting on GitHub:
>> >> https://github.com/apache/pulsar/issues/22277#issuecomment-2002553745
>> >>
>> >>    • Deprecate java label. Pulsar is written in Java and most PRs update
>> >> Java code.
>> >>    • Instead of removing labels, deprecate them by renaming them to
>> >> deprecated/. Probably pick another prefix that is
>> >> alphabetically closer to the end of the alphabet to reduce noise.
>> >>    • Add go label automatically using labeler:
>> >> https://github.com/apache/pulsar/blob/master/.github/labeler.yml
>> >> go:
>> >>   - changed-files:
>> >>       - any-glob-to-any-file: '**/*.go'
>> >>
>> >>    • Add component/* labels automatically based on the file path
>> >> component/config:
>> >>   - changed-files:
>> >>       - any-glob-to-any-file: 'conf/**/*'
>> >>       - any-glob-to-any-file: 'pulsar-config-validation/**/*'
>> >> component/client:
>> >>   - changed-files:
>> >>       - any-glob-to-any-file: 'pulsar-client/**/*'
>> >>       - any-glob-to-any-file: 'pulsar-client-*/**/*'
>> >> ...
>> >>
>> >>    • Rename bug label to type/bug for consistency. Keep the red color.
>> >>    • (?) Rename component/* => area/* for shorter names. The
>> >> https://github.com/kubernetes/kubernetes/labels has such naming.
>> >>    • Rename doc-required label to type/doc. Relabel open issues and PRs
>> >> with doc labels to the type/doc.
>> >>    • Deprecate all other doc-* labels. If it is needed for some kind of
>> >> workflow, simply use the board project with ToDo -> In Progress -> Done
>> >> states.
>> >>    • (?) Probably it makes sense to enable and track website and docs
>> >> issues in apache/pulsar-site repository. And add a good visible link to
>> >> apache/pulsar README.md.
>> >>    • Deprecate the question label. Instead, move such issues to
>> >> Discussions -> Q
>> >>    • Migrate issues with the enhancement either to type/feature label or
>> >> Discussions. Add a new Suggest an idea issue template that redirects to
>> >> the Discussions -> Ideas
>> >>    • (?) Rename PIP => type/PIP for consistency
>> >>    • Rename flaky-test => type/flaky-test for consistency
>> >>    • Deprecate lifecycle/stale label. Use Stale instead. Rename Stale =>
>> >> stale for consistency.
>> >>    • Add the ability to pick an area/* label from the dropdown on issue
>> >> creation.
>> >> systemd/systemd and a few other projects use this action for that:
>> >> https://github.com/redhat-plumbers-in-action/advanced-issue-labeler?tab=readme-ov-file#real-life-examples
>> >>
>> >>
>> >> Best,
>> >> Kiryl
Re: Suggestions on GitHub labels and issue templates
Yes, the PR is welcome. Best, Penghui On Mon, Mar 25, 2024 at 3:08 PM Kiryl Valkovich wrote: > Hi PengHui, > Sure. If the PR is welcome here, I’ll submit it in a few days. > > > Best, > Kiryl > > > On Mar 25, 2024, at 6:07 AM, PengHui Li wrote: > > > > Hi Kiryl, > > > > Thanks for your suggestions, and they are looking good to me > > I'll follow your suggestions on renaming or deprecating the labels. > > > > For the label automation, do you want to push a PR to add it? > > > > Regards, > > Penghui > > > > > > On Mon, Mar 18, 2024 at 9:39 PM Kiryl Valkovich > > wrote: > > > >> Comment with better formatting on GitHub: > >> https://github.com/apache/pulsar/issues/22277#issuecomment-2002553745 > >> > >>• Deprecate java label. Pulsar is written in Java and most PRs update > >> Java code. > >>• Instead of removing labels, deprecate them by renaming them to > >> deprecated/. Probably pick another prefix that is > >> alphabetically closer to the end of the alphabet to reduce noise. > >>• Add go label automatically using labeler: > >> https://github.com/apache/pulsar/blob/master/.github/labeler.yml > >> go: > >> - changed-files: > >> - any-glob-to-any-file: '**/*.go' > >> > >>• Add component/* labels automatically based on the file path > >> component/config: > >> - changed-files: > >> - any-glob-to-any-file: 'conf/**/*' > >> - any-glob-to-any-file: 'pulsar-config-validation/**/*' > >> component/client: > >> - changed-files: > >> - any-glob-to-any-file: 'pulsar-client/**/*' > >> - any-glob-to-any-file: 'pulsar-client-*/**/*' > >> ... > >> > >>• Rename bug label to type/bug for consistency. Keep the red color. > >>• (?) Rename component/* => area/* for shorter names. The > >> https://github.com/kubernetes/kubernetes/labels has such naming. > >>• Rename doc-required label to type/doc. Relabel open issues and PRs > >> with doc labels to the type/doc. > >>• Deprecate all other doc-* labels. 
If it is needed for some kind of > >> workflow, simply use the board project with ToDo -> In Progress -> Done > >> states. > >>• (?) Probably it makes sense to enable and track website and docs > >> issues in apache/pulsar-site repository. And add a good visible link to > >> apache/pulsar README.md. > >>• Deprecate the question label. Instead, move such issues to > >> Discussions -> Q > >>• Migrate issues with the enhancement either to type/feature label or > >> Discussions. Add a new Suggest an idea issue template that redirects to > the > >> Discussions -> Ideas > >>• (?) Rename PIP => type/PIP for consistency > >>• Rename flaky-test => type/flaky-test to consistency > >>• Deprecate lifecycle/stale label. Use Stale instead. Rename Stale => > >> stale for consistency. > >>• Add the ability to pick an area/* label from the dropdown on issue > >> creation. > >> systemd/systemd and a few other projects use this action for that: > >> > https://github.com/redhat-plumbers-in-action/advanced-issue-labeler?tab=readme-ov-file#real-life-examples > >> > >> > >> Best, > >> Kiryl > >> > >> > >
Re: [VOTE] PIP-345: Optimize finding message by timestamp
Hi, Jiuming

I'm sorry for not getting back to you sooner.

First, I support the motivation to optimize this case because it could be a significant blocker for users who want infinite data retention, which is a BIG differentiator from Apache Kafka. And I have really seen cases with high publish throughput where one ledger could even hold 1M entries, with 100M new entries published to a topic.

Then, I tried to check the details of the existing implementation. I think the tricky part is that publish time is not a concept of the ManagedLedger. I saw that the changes you proposed will add publish time to the ManagedLedger module, which doesn't look good to me, because it will couple the Pulsar concept with the ManagedLedger concept.

Essentially, the publish time could be a secondary index of the ManagedLedger. My opinion is to have a general ManagedLedgerIndex abstraction, and the Pulsar broker can create any index it wants. Since the broker creates the index, the broker can control the index's behavior. Then, the ManagedLedger can provide an API to search for an entry with a ManagedLedgerIndex. With this option, we don't need to add the publish time concept to ManagedLedger directly.

In this case, if the broker tries to search for an entry with a predicate and an index, the managed ledger will search the index first. Of course, if the relevant entry cannot be found in the index, it just falls back to the "optimized full scan".

Regards,
Penghui

On Mon, Mar 25, 2024 at 11:51 AM 太上玄元道君 wrote:

> bump
>
> On Wed, Mar 20, 2024 at 16:23, 太上玄元道君 wrote:
>
> > bump
> >
> > On Tue, Mar 19, 2024 at 19:35, 太上玄元道君 wrote:
> >
> >> Hi Pulsar community,
> >>
> >> This thread is to start a vote for PIP-345: Optimize finding message by
> >> timestamp
> >>
> >> PIP: https://github.com/apache/pulsar/pull/22234
> >> Discuss thread:
> >> https://lists.apache.org/thread/5owc9os6wmy52zxbv07qo2jrfjm17hd2
> >>
> >> Thanks,
> >> Tao Jiuming
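
Editorial sketch: the "search the index first, then fall back to a scan" flow described above can be illustrated in Python. This is not the ManagedLedger API and not a proposed interface. `EntryIndex`, `DictIndex`, and `find_entry` are hypothetical names, entries are modeled as simple `(publish_time_ms, payload)` tuples, and a plain linear scan stands in for the "optimized full scan".

```python
from typing import Callable, Optional, Protocol, Sequence, Tuple

Entry = Tuple[int, str]  # (publish_time_ms, payload); stand-in for a real entry


class EntryIndex(Protocol):
    """Hypothetical index abstraction: maps a search key to an entry position."""

    def lookup(self, key: int) -> Optional[int]:
        ...


class DictIndex:
    """Toy index built and owned by the caller (the broker in the proposal)."""

    def __init__(self) -> None:
        self._positions = {}

    def add(self, key: int, position: int) -> None:
        # Keep the first position seen for a key.
        self._positions.setdefault(key, position)

    def lookup(self, key: int) -> Optional[int]:
        return self._positions.get(key)


def find_entry(entries: Sequence[Entry], key: int,
               key_of: Callable[[Entry], int],
               index: Optional[EntryIndex] = None) -> Optional[int]:
    """Return the position of the first entry whose key equals `key`, or None."""
    if index is not None:
        position = index.lookup(key)
        if position is not None:
            return position  # answered by the index
    # Fallback when the index has no answer: scan the entries.
    for position, entry in enumerate(entries):
        if key_of(entry) == key:
            return position
    return None
```

The storage side only depends on the small `lookup` contract, so the notion of publish time lives entirely in whoever builds the index, which is the decoupling the message argues for.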
Re: Suggestions on GitHub labels and issue templates
Hi PengHui, Sure. If the PR is welcome here, I’ll submit it in a few days. Best, Kiryl > On Mar 25, 2024, at 6:07 AM, PengHui Li wrote: > > Hi Kiryl, > > Thanks for your suggestions, and they are looking good to me > I'll follow your suggestions on renaming or deprecating the labels. > > For the label automation, do you want to push a PR to add it? > > Regards, > Penghui > > > On Mon, Mar 18, 2024 at 9:39 PM Kiryl Valkovich > wrote: > >> Comment with better formatting on GitHub: >> https://github.com/apache/pulsar/issues/22277#issuecomment-2002553745 >> >>• Deprecate java label. Pulsar is written in Java and most PRs update >> Java code. >>• Instead of removing labels, deprecate them by renaming them to >> deprecated/. Probably pick another prefix that is >> alphabetically closer to the end of the alphabet to reduce noise. >>• Add go label automatically using labeler: >> https://github.com/apache/pulsar/blob/master/.github/labeler.yml >> go: >> - changed-files: >> - any-glob-to-any-file: '**/*.go' >> >>• Add component/* labels automatically based on the file path >> component/config: >> - changed-files: >> - any-glob-to-any-file: 'conf/**/*' >> - any-glob-to-any-file: 'pulsar-config-validation/**/*' >> component/client: >> - changed-files: >> - any-glob-to-any-file: 'pulsar-client/**/*' >> - any-glob-to-any-file: 'pulsar-client-*/**/*' >> ... >> >>• Rename bug label to type/bug for consistency. Keep the red color. >>• (?) Rename component/* => area/* for shorter names. The >> https://github.com/kubernetes/kubernetes/labels has such naming. >>• Rename doc-required label to type/doc. Relabel open issues and PRs >> with doc labels to the type/doc. >>• Deprecate all other doc-* labels. If it is needed for some kind of >> workflow, simply use the board project with ToDo -> In Progress -> Done >> states. >>• (?) Probably it makes sense to enable and track website and docs >> issues in apache/pulsar-site repository. 
And add a good visible link to >> apache/pulsar README.md. >>• Deprecate the question label. Instead, move such issues to >> Discussions -> Q >>• Migrate issues with the enhancement either to type/feature label or >> Discussions. Add a new Suggest an idea issue template that redirects to the >> Discussions -> Ideas >>• (?) Rename PIP => type/PIP for consistency >>• Rename flaky-test => type/flaky-test to consistency >>• Deprecate lifecycle/stale label. Use Stale instead. Rename Stale => >> stale for consistency. >>• Add the ability to pick an area/* label from the dropdown on issue >> creation. >> systemd/systemd and a few other projects use this action for that: >> https://github.com/redhat-plumbers-in-action/advanced-issue-labeler?tab=readme-ov-file#real-life-examples >> >> >> Best, >> Kiryl >> >>