Re: [DISCUSS] PIP-254: Support configuring client version at SDK level

2023-03-08 Thread Zike Yang
Hi Yunze,

> I have changed this proposal to just add a config to `ClientBuilder`.

I propose to add a field named `clientVersionSuffix` rather than the
`clientVersion`. As I said before:
https://lists.apache.org/thread/g0128l85fkcmw4821188mjjznxbo4lhd

This is helpful for debugging, especially for the Node.js client,
where users can compile the underlying C++ client on their own. This
way, we can know exactly which underlying C++ client version the user
is running.
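
As a rough illustration of the suffix idea, here is a minimal sketch. It
is not the actual client code: `ClientVersionSketch`, `BASE_VERSION` and
the "-" separator are assumptions made for this example. Only the
"Pulsar-Java-v%s" format comes from this thread.

```java
// Hypothetical sketch, not the actual Pulsar client code.
final class ClientVersionSketch {
    // Stand-in for PulsarVersion.getVersion(); the value is made up.
    static final String BASE_VERSION = "2.11.0";

    /** Builds the reported version string, appending an optional suffix. */
    static String clientVersion(String suffix) {
        String base = String.format("Pulsar-Java-v%s", BASE_VERSION);
        if (suffix == null || suffix.isEmpty()) {
            // No suffix configured: report the underlying version unchanged.
            return base;
        }
        // The "-" separator is an assumption of this sketch.
        return base + "-" + suffix;
    }

    public static void main(String[] args) {
        System.out.println(clientVersion(null));
        System.out.println(clientVersion("my-wrapper-1.0"));
    }
}
```

With a suffix, the underlying client version stays visible and the
wrapper only adds its own identifier after it.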

Thanks,
Zike Yang

On Wed, Mar 8, 2023 at 5:17 PM Yunze Xu  wrote:
>
> Hi,
>
> I have changed this proposal to just add a config to `ClientBuilder`.
> And here is the demo implementation:
> https://github.com/BewareMyPower/pulsar/pull/21/files
>
> PTAL again.
>
> Thanks,
> Yunze
>
> On Sat, Mar 4, 2023 at 10:39 PM Yunze Xu  wrote:
> >
> > Hi Enrico,
> >
> > Thanks for your suggestion. It makes sense to me. I will think again
> > and modify this proposal.
> >
> > Hi Tison,
> >
> > I mentioned the C++ client because the initial motivation is to solve
> > the issue for the Python client and Node.js client. But after thinking
> > for a while, I believe it's more general for clients of other
> > languages, including Java. And this proposal is only for the Java
> > client.
> >
> > Thanks,
> > Yunze
> >
> > On Sat, Mar 4, 2023 at 1:42 PM tison  wrote:
> > >
> > > I agree with Enrico that it's better to have a config option.
> > >
> > > Also, we cannot simply replace the PulsarVersion call with the
> > > DynamicPulsarVersion call because the client version string is now
> > > constructed as:
> > >
> > > String.format("Pulsar-Java-v%s", PulsarVersion.getVersion())
> > >
> > > It's a config of client version string, not pulsar version.
> > >
> > > Moreover, in your proposal, you mention the case of client c++ at first,
> > > but don't talk about it later. Is the scope of this proposal in the Java
> > > client only?
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > > On Sat, Mar 4, 2023 at 06:38, Enrico Olivelli wrote:
> > >
> > > > Yunze,
> > > >
> > > > > On Fri, Mar 3, 2023 at 12:31, Yunze Xu wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Based on the previous discussion [1], I created a proposal to support
> > > > > configuring client version at SDK level:
> > > > > https://github.com/apache/pulsar/issues/19705
> > > > >
> > > > > I've added more explanations in the motivation part, let's use this
> > > > > PIP as a subsequent discussion of [1].
> > > > >
> > > > > BTW, there is a PR [2] in the pulsar-client-cpp repo because the
> > > > > motivation is more meaningful for the C++ client.
> > > > >
> > > >
> > > > > I understand this problem well. We have it for the cited clients, but I
> > > > > also see the same issue for other libraries based on the Java client, like
> > > > > the official Apache Pulsar Reactive client.
> > > > >
> > > > > I also see this problem in Starlight for JMS, which is a JMS client for
> > > > > Pulsar that is based on the Java client.
> > > > >
> > > > > While I agree on the problem and on the solution, I think that a static
> > > > > field is not enough. We have some problems:
> > > > >
> > > > > 1) there may be multiple usages of the Java client in the same JVM, and
> > > > > you want each client to report its version correctly
> > > > >
> > > > > 2) we would need to use the Java Security Manager, or some other mechanism,
> > > > > to prevent malicious code from modifying or overriding the version.
> > > > >
> > > > > I believe that in the case of the Java client it is easier to add a
> > > > > configuration entry to the Pulsar Client Configuration. That would
> > > > > become a field in the JavaClient. So each instance can declare its version,
> > > > > and malicious code won't be able to easily tweak the version (because it
> > > > > won't be a simple static method call).
> > > >
> > > > Enrico
> > > >
> > > >
> > > >
> > > > > [1] https://lists.apache.org/thread/n59k537fhthjnzkfxtc2p4zk4l0cv3mp
> > > > > [2] https://github.com/apache/pulsar-client-cpp/pull/208
> > > > >
> > > > > Thanks!
> > > > >
> > > >


Re: [VOTE] Pulsar Release 2.10.4 Candidate 1

2023-03-08 Thread Xiangying Meng
Thanks for the reminder.
I will make a release later.

Thanks.
Xiangying


On Thu, Mar 9, 2023 at 2:17 PM Haiting Jiang  wrote:

> Seems like we should include this PR in this release.
> https://github.com/apache/pulsar/pull/19754
>
> See https://lists.apache.org/thread/odofmj9h8ln6blozhgkgmx0mbyll45dp
>
> Thanks,
> Haiting
>
> On Thu, Mar 9, 2023 at 2:07 PM 丛搏  wrote:
> >
> > +1 (binding)
> >
> > os: mac 12.3.1, Intel
> > java: OpenJDK 17.0.1
> >
> > - Checked the signature
> > - Checked LICENSE
> > - Start standalone
> > - Publish and consume messages
> > - Verified Function and State Function
> > - Verified Cassandra connector
> > - Build from the source package (maven 3.6.1, openJDK 11.0.12)
> > - Run a simple transaction performance check
> >
> > Thanks,
> > Bo
> >
> > On Wed, Mar 8, 2023 at 15:00, guo jiwei wrote:
> > >
> > > +1 (binding)
> > >
> > > - Build from the source package
> > > - Checked the signature
> > > - Publish and consume messages
> > > - Verified Function and State Function
> > > - Verified Cassandra connector
> > >
> > > Regards
> > > Jiwei Guo (Tboy)
> > >
> > > > On Tue, Mar 7, 2023 at 2:22 PM Xiangying Meng wrote:
> > > > >
> > > > > Please ignore the previous email. This commit did not break CI.
> > > > > Instead, a very coincidental thing happened.
> > > > > 1. There may be problems with the maven server at that time. The three PRs
> > > > > mentioned at that time could not download the correct jar package, and the
> > > > > retry was invalid.
> > > > > 2. A flaky test `recoverLongTimeAfterMultipleWriteErrors` failed multiple
> > > > > times in a row.
> > > >
> > > > So I mistakenly thought it was caused by the last unverified commit.
> > > > So the RC is correct, please help verify it and vote.
> > > >
> > > > Thanks
> > > > Xiangying
> > > >
> > > > > On Sun, Mar 5, 2023 at 9:40 PM Xiangying Meng wrote:
> > > > >
> > > > > > Hi, community,
> > > > > >
> > > > > > Sorry to tell everyone that we may need to abort the release
> > > > > > 2.10.4-candidate-1 because some CI can not be passed after #19674 [0] is
> > > > > > cherry-picked.
> > > > > > I will be sure to carry out the release process again as soon as it is
> > > > > > resolved.
> > > > >
> > > > > Sincerely
> > > > > Xiangying
> > > > > [0] https://github.com/apache/pulsar/pull/19674
> > > > >
> > > > >
> > > > > > On Sat, Mar 4, 2023 at 12:06 PM Xiangying Meng <xiangy...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > >> This is the third release candidate for Apache Pulsar, version 2.10.4.
> > > > > >>
> > > > > >> This release contains 99 commits by 34 contributors.
> > > > > >> https://github.com/apache/pulsar/compare/v2.10.3...v2.10.4-candidate-1
> > > > > >>
> > > > > >> *** Please download, test, and vote on this release. This vote will stay
> > > > > >> open for at least 72 hours ***
> > > > > >>
> > > > > >> Note that we are voting upon the source (tag), binaries are provided for
> > > > > >> convenience.
> > > > > >>
> > > > > >> Source and binary files:
> > > > > >> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.10.4-candidate-1/
> > > > > >>
> > > > > >> SHA-512 checksums:
> > > > > >> 8cae74a5b586ab2378c2b2737c59507180af4b8efab4a99bc0dae233096036f5b18ab94255bea03e416d8d21958bedf684c8d4bd3982f458a547d3e1efa0f19f
> > > > > >>  apache-pulsar-2.10.4-bin.tar.gz
> > > > > >> 74e16c61ff6ae9e2a51e7ae24981598c71dabbff09c820bff9303c031882e1f15d029d06b6b5b6e4cc9a02b8957a102338ce09173c8744a59e5bd848b48b1d2a
> > > > > >>  apache-pulsar-2.10.4-src.tar.gz
> > > > > >>
> > > > > >> Maven staging repo:
> > > > > >> https://repository.apache.org/content/repositories/orgapachepulsar-1210/
> > > > > >>
> > > > > >> The tag to be voted upon:
> > > > > >> v2.10.4-candidate-1 (d1aebd3e4c9503406845fb2e746a289e88e00fb2)
> > > > > >> https://github.com/apache/pulsar/releases/tag/v2.10.4-candidate-1
> > > > > >>
> > > > > >> Pulsar's KEYS file containing PGP keys you use to sign the release:
> > > > > >> https://downloads.apache.org/pulsar/KEYS
> > > > >>
> > > > >> Docker images:
> > > > >>
> > > > >> 
> > > > >>
> > > > >>
> > > > > >> https://hub.docker.com/layers/xiangyingmeng/pulsar/2.10.4/images/sha256-144d0380592a7e0578772eb2fa51da7cad70f1d5f8a2b46189669b15f0e6b4b6?context=repo
> > > > >>
> > > > >> 
> > > > >>
> > > > >>
> > > > > >> https://hub.docker.com/layers/xiangyingmeng/pulsar-all/2.10.4/images/sha256-bcf03c05be93ced24991afbcca13f4a4b5f183d9a7b877ae84e992e16ca599ee?context=repo
> > > > >>
> > > > >> Please download the source package, and follow the README to build
> > > > >> and run the Pulsar standalone service.
> > > > >>
> > > > >
>


Re: [VOTE] Pulsar Client Python Release 3.1.0 Candidate 4

2023-03-08 Thread Zike Yang
Hi Yunze

The crash issue still exists in Python 3.7. Here is the log:
```
^CTraceback (most recent call last):
  File "/Users/aaronrobert/codebase/pulsar-client-python/examples/consumer.py", line 32, in <module>
    msg = consumer.receive()
  File "/Users/aaronrobert/.pyenv/versions/3.7.16/lib/python3.7/site-packages/pulsar/__init__.py", line 1243, in receive
    msg = self._consumer.receive()
_pulsar.Interrupted: Pulsar error: ResultInterrupted
2023-03-09 12:18:14.326 WARN  [0x110900600] ConsumerImpl:126 | [persistent://public/default/my-topic, my-subscription, 0] Destroyed consumer which was not properly closed
2023-03-09 12:18:14.326 INFO  [0x110900600] ConsumerImpl:134 | [persistent://public/default/my-topic, my-subscription, 0] Closed consumer for race condition: 0
libc++abi: terminating with uncaught exception of type std::__1::bad_weak_ptr: bad_weak_ptr
[1]52874 abort  python37 ~/codebase/pulsar-client-python/examples/consumer.py
```

This issue also exists in Python 3.10.8, but it worked fine after I
upgraded to the latest Python 3.10 release, 3.10.10.
However, Python 3.7.16, which is the latest release of Python 3.7,
is still not working.

I'm not sure whether it's a Python issue that only some Python versions
have fixed. Could you take a look again?

Thanks,
Zike Yang

On Wed, Mar 8, 2023 at 5:51 PM Yunze Xu  wrote:
>
> This is the 4th release candidate for Apache Pulsar Client Python,
> version 3.1.0.
>
> It fixes the following issues:
> https://github.com/apache/pulsar-client-python/milestone/2?closed=1
>
> *** Please download, test and vote on this release. This vote will
> stay open for at least 72 hours ***
>
> Python wheels:
> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-python-3.1.0-candidate-4/
>
> The supported python versions are 3.7, 3.8, 3.9, 3.10 and 3.11. The
> supported platforms and architectures are:
> - Windows x86_64 (windows/)
> - glibc-based Linux x86_64 (linux-glibc-x86_64/)
> - glibc-based Linux arm64 (linux-glibc-arm64/)
> - musl-based Linux x86_64 (linux-musl-x86_64/)
> - musl-based Linux arm64 (linux-musl-arm64/)
> - macOS universal 2 (macos/)
>
> You can download the wheel (the `.whl` file) according to your own OS
> and Python version
> and install the wheel:
> - Windows: `py -m pip install *.whl --force-reinstall`
> - Linux or macOS: `python3 -m pip install *.whl --force-reinstall`
>
> The tag to be voted upon: v3.1.0-candidate-4
> (b883f42aa4287d46423b85f7af77f604cacf2a7e)
> https://github.com/apache/pulsar-client-python/releases/tag/v3.1.0-candidate-4
>
> Pulsar's KEYS file containing PGP keys you use to sign the release:
> https://downloads.apache.org/pulsar/KEYS
>
> Please download the Python wheels and follow the README to test.


Re: [Discuss] PIP-248: Add backlog eviction metric

2023-03-08 Thread 太上玄元道君
> backlogQuotaLimitSize
> should be `backlogQuotaSizeBytes`

> backlogQuotaLimitTime
> should be `backlogQuotaTimeSeconds`

> So you need to rename the metric.
> "pulsar_storage_backlog_quota_count" -->
> `pulsar_storage_backlog_eviction_count`

> the topic's existing subscription.
> "subscription" --> "subscription*s*"

> Number of backlog quota happends.
> Number of times backlog evictions happened due to exceeding backlog quota
> (either time or size).

Accepted. If no further changes are needed, I'll start the vote next week.
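
To make the agreed semantics concrete, here is a minimal sketch of an
eviction counter plus the size-quota usage calculation mentioned earlier
in the thread. It is only an illustration: `BacklogEvictionSketch` and its
methods are hypothetical names, not Pulsar code, and treating a
non-positive limit as "no quota" is an assumption of this sketch.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; class and method names are illustrative, not Pulsar's.
final class BacklogEvictionSketch {
    // Would back a counter such as pulsar_storage_backlog_eviction_count.
    private final LongAdder evictionCount = new LongAdder();

    /** Call once each time a size- or time-based backlog eviction is triggered. */
    void recordEviction() {
        evictionCount.increment();
    }

    /** Number of evictions recorded so far. */
    long evictionCount() {
        return evictionCount.sum();
    }

    /** Share of the size quota in use, in percent (backlog size / quota limit). */
    static double quotaUsagePercent(long backlogSizeBytes, long quotaLimitBytes) {
        if (quotaLimitBytes <= 0) {
            // Assumption: a non-positive limit means no size quota is set.
            return 0.0;
        }
        return 100.0 * backlogSizeBytes / quotaLimitBytes;
    }

    public static void main(String[] args) {
        BacklogEvictionSketch stats = new BacklogEvictionSketch();
        stats.recordEviction();
        System.out.println(stats.evictionCount());
        System.out.println(quotaUsagePercent(50, 200));
    }
}
```

A `LongAdder` is used here because the counter is written far more often
than it is read, which is the typical pattern for a scraped metric.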

Thanks,
Tao Jiuming


On Tue, Mar 7, 2023 at 00:02, Asaf Mesika wrote:

> >
> > Pulsar has a feature called backlog quota (place link).
>
> You need to place a link :)
>
> Expose pulsar_storage_backlog_quota_count in the topic level
>
> You already have "pulsar_storage_backlog_size", so what do you need this
> metric for?
>
> backlogQuotaLimitSize
>
> should be `backlogQuotaSizeBytes`
>
> backlogQuotaLimitTime
>
> should be `backlogQuotaTimeSeconds`
>
> What about goal no.4? Expose oldest unacknowledged message subscription
> name?
>
> IMO, metrics are like API - perhaps indicate the change there as well
>
> Record the event when dropBacklogForSizeLimit
> > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L121>
> > or dropBacklogForTimeLimit
> > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BacklogQuotaManager.java#L194>
> > is going to be invoked.
>
>
> Oh, now I get it.
> So you need to rename the metric.
> "pulsar_storage_backlog_quota_count" -->
> `pulsar_storage_backlog_eviction_count`
>
>
> > the topic's existing subscription.
>
> "subscription" --> "subscription*s*"
>
> Number of backlog quota happends.
>
> Number of times backlog evictions happened due to exceeding backlog quota
> (either time or size).
>
>
> >    1. Find the backlog subscriptions
> >    After receiving the alarm, users could request
> >    Topics#getStats(topicName, true/false, true, true)
> >    <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139>
> >    to get the topic stats, and find which subscriptions are in backlog.
> >    Pulsar exposed backlogSize and earliestMsgPublishTimeInBacklog in the
> >    subscription level, and we will expose backlogQuotaLimitSize and
> >    backlogQuotaLimitTime in the topic level, so users could find which
> >    subscriptions are in backlog easily.
> >
> > I wrote how it should be done IMO in a previous email.
>
>
> On Mon, Mar 6, 2023 at 1:20 PM 太上玄元道君  wrote:
>
> > Hi Asaf,
> > I've updated the PIP, PTAL
> >
> > Thanks,
> > Tao Jiuming
> >
> > On Sun, Mar 5, 2023 at 21:00, Asaf Mesika wrote:
> >
> > > On Thu, Mar 2, 2023 at 12:57 PM 太上玄元道君  wrote:
> > >
> > > > > I  think you should fix this explanation:
> > > >
> > > > > Thanks! I would like to copy the context you provide to the PIP motivation,
> > > > > your description is more detailed, so developers don't have to go through
> > > > > the code.
> > > >
> > >
> > > Sure
> > >
> > >
> > > >
> > > > > > Today the quota is checked periodically, right? So that's how the operator
> > > > > > knows the cost in terms of I/O is limited.
> > > > > > Now you are adding one additional I/O per collection, every 1 min by
> > > > > > default. That's a lot perhaps. How long is the check interval today?
> > > >
> > > > Actually, I don't want to introduce additional costs, I thought we
> > > > could cache its result, so that it won't introduce additional costs.
> > > > It may be that I did not make it clear in the PIP and caused this
> > > > misunderstanding, sorry.
> > > >
> > >
> > > > Ok, just to verify: You plan to modify the code that runs periodically the
> > > > backlog quota check, so the result will be cached there? This way when you
> > > > pull that information from that code every 1min to expose it as a metric it
> > > > will have 0 I/O cost?
> > >
> > >
> > >
> > > >
> > > > > > The user today can calculate quota used for size based limit, since there
> > > > > > are two metrics that are exposed today on a topic level:
> > > > > > "pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size".
> > > > > > You can just divide the two to get a percentage.
> > > > > > For the time-based limit, the only metric exposed today is quota itself,
> > > > > > "pulsar_storage_backlog_quota_limit_time".
> > > >
> > > > > I only noticed `pulsar_storage_backlog_size` but missed
> > > > > `pulsar_storage_backlog_quota_limit` and
> > > > > `pulsar_storage_backlog_quota_limit_time`. Many thanks for your reminder.
> > > > >
> > > > >
> > > > > So, in this condition, we already have the following topic-level metrics:
> > > > > `pulsar_storage_backlog_size`: The total backlog size of the topics of this
> > > > > topic owned by this broker (in bytes).
> > > > 

Re: [DISCUSS] new idea: reverse reading a topic

2023-03-08 Thread Haiting Jiang
Hi Kannar,

+1 to finding the position first and then reading as normal, as mentioned
by Yong and Michael.

Another problem with reading in reverse is that it would defeat all the
read-ahead techniques in the storage layer and result in very poor
performance.

> This would work but it will need something to store every messages read
> to reverse them before answer which can be heavy in RAM usages.

Finding the position doesn't require reading the bodies of all the messages.
The ledger metadata, and maybe some message headers in the
last ledger, should be enough.
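
To illustrate, here is a minimal sketch of locating the position N entries
before a known position using only per-ledger entry counts. It is a
simplified model, not ManagedLedger code: `LedgerInfo`, `Position` and
`positionBefore` are hypothetical names, and it treats one entry as one
message, so batched entries would still need a look at the entry headers.

```java
import java.util.List;

// Hypothetical sketch: simplified stand-ins, not the ManagedLedger or BookKeeper types.
final class SeekBackSketch {
    record LedgerInfo(long ledgerId, long entries) {}
    record Position(long ledgerId, long entryId) {}

    /**
     * Returns the position n entries before `from`, clamped to the first entry
     * of the topic. Only per-ledger entry counts are consulted.
     */
    static Position positionBefore(List<LedgerInfo> ledgers, Position from, long n) {
        int idx = -1;
        for (int i = 0; i < ledgers.size(); i++) {
            if (ledgers.get(i).ledgerId() == from.ledgerId()) {
                idx = i;
                break;
            }
        }
        if (idx < 0) {
            throw new IllegalArgumentException("unknown ledger " + from.ledgerId());
        }
        long entryId = from.entryId();
        long remaining = n;
        // Walk back one ledger at a time while the target lies before this ledger.
        while (remaining > entryId && idx > 0) {
            remaining -= entryId + 1;                  // step past entry 0 of this ledger
            idx--;
            entryId = ledgers.get(idx).entries() - 1;  // last entry of the previous ledger
        }
        if (remaining > entryId) {
            // Fewer than n entries exist before `from`: clamp to the start.
            return new Position(ledgers.get(0).ledgerId(), 0);
        }
        return new Position(ledgers.get(idx).ledgerId(), entryId - remaining);
    }

    public static void main(String[] args) {
        List<LedgerInfo> ledgers = List.of(
                new LedgerInfo(10, 5), new LedgerInfo(11, 3), new LedgerInfo(12, 4));
        System.out.println(positionBefore(ledgers, new Position(12, 1), 2));
    }
}
```

Once the start position is known, the N messages can be read forward from
it as usual, so the storage read-ahead keeps working.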

Thanks,
Haiting

On Thu, Mar 9, 2023 at 11:09 AM Zike Yang  wrote:
>
> > Have you looked at the seek implementation to see if it would be
> > feasible to extend the implementation and add a method to "seekBefore"
> > a message id in the way you described?
>
> I think it's not very feasible for this case. Seeking before can lead
> to consumer reconnection, which can cause significant performance
> issues and overhead.
>
>
> Zike Yang
>
> On Thu, Mar 9, 2023 at 10:22 AM Yong Zhang  wrote:
> >
> > Kannar,
> >
> > Why not find the stop position first, then read the message
> > until a given position?
> > Does the stop position change dynamically? You only know
> > it once you meet it?
> >
> > Yong
> >
> >
> >
> > On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
> >  wrote:
> >
> > > Hi Michael,
> > >
> > > This would work but it will need something to store every messages read
> > > to reverse them before answer which can be heavy in RAM usages. The key
> > > point of the future is to read message by message from a MessageId to
> > > past with stop read possible conditions.
> > >
> > > Best,
> > >
> > > Kannar
> > >
> > > On 3/7/23 22:10, Michael Marshall wrote:
> > > >> The goal is to start from a known MessageId and read the N message
> > > >> before this MessageId.
> > > > Have you looked at the seek implementation to see if it would be
> > > > feasible to extend the implementation and add a method to "seekBefore"
> > > > a message id in the way you described? I haven't considered all of the
> > > > implications, but if the main goal is to move the cursor, I think the
> > > > solution should be about moving the cursor, not about reading a topic
> > > > in reverse.
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > > > On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL wrote:
> > > >> Hi Yong,
> > > >>
> > > >> The goal is to start from a known MessageId and read the N message
> > > >> before this MessageId.
> > > >>
> > > >> Best,
> > > >>
> > > >> Kannar
> > > >>
> > > >>
> > > >> On 3/7/23 01:53, Yong Zhang wrote:
> > > >>> Hi Kannar,
> > > >>>
> > > >>> Just interested in what exactly your case.
> > > >>>
> > > > >>> Why do you need to read messages in a reversed order? What is your case?
> > > >>>
> > > >>> Best,
> > > >>> Yong
> > > >>>
> > > > >>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL wrote:
> > > > >>>
> > > > >>>> Hi,
> > > > >>>>
> > > > >>>> I'm wondering if it is possible to introduce a new feature on Pulsar
> > > > >>>> which will enable users to read a topic from a defined MessageId to
> > > > >>>> previous messages until the beginning of the topic.
> > > > >>>>
> > > > >>>> I tried to use Pulsar SQL but it requires so much RAM even for little
> > > > >>>> queries (due to Presto design).
> > > > >>>>
> > > > >>>> Currently, every read in Pulsar is expected to be going forward. So it
> > > > >>>> might be a bit tricky to prevent every weird behavior by introducing the
> > > > >>>> feature.
> > > > >>>>
> > > > >>>> I'm currently trying to make an MVP/POC by introducing a readReverse
> > > > >>>> field in the CommandSubscribe that is used by ReaderAPI and currently
> > > > >>>> looking to create a getFirstMessageId() on ManagedLedger
> > > > >>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> > > > >>>> startPosition < endPosition sanity checks in BookKeeper locally
> > > > >>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
> > > > >>>>
> > > > >>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
> > > > >>>> the ReaderAPI.
> > > > >>>>
> > > > >>>> I'm not familiar with these internals such as NonDurableCursor,
> > > > >>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
> > > > >>>>
> > > > >>>> So I'm wondering if someone could help/guide me or even directly handle
> > > > >>>> the subject (or the discussion).
> > > > >>>>
> > > > >>>> Regards,
> > > > >>>>
> > > > >>>> Kannar
> > > 
> > > 
> > > 
> > > 
> > >


Re: [VOTE] Pulsar Release 2.10.4 Candidate 1

2023-03-08 Thread Haiting Jiang
Seems like we should include this PR in this release.
https://github.com/apache/pulsar/pull/19754

See https://lists.apache.org/thread/odofmj9h8ln6blozhgkgmx0mbyll45dp

Thanks,
Haiting

On Thu, Mar 9, 2023 at 2:07 PM 丛搏  wrote:
>
> +1 (binding)
>
> os: mac 12.3.1, Intel
> java: OpenJDK 17.0.1
>
> - Checked the signature
> - Checked LICENSE
> - Start standalone
> - Publish and consume messages
> - Verified Function and State Function
> - Verified Cassandra connector
> - Build from the source package (maven 3.6.1, openJDK 11.0.12)
> - Run a simple transaction performance check
>
> Thanks,
> Bo
>
> On Wed, Mar 8, 2023 at 15:00, guo jiwei wrote:
> >
> > +1 (binding)
> >
> > - Build from the source package
> > - Checked the signature
> > - Publish and consume messages
> > - Verified Function and State Function
> > - Verified Cassandra connector
> >
> > Regards
> > Jiwei Guo (Tboy)
> >
> > On Tue, Mar 7, 2023 at 2:22 PM Xiangying Meng  wrote:
> > >
> > > Please ignore the previous email. This commit did not break CI.
> > > Instead, a very coincidental thing happened.
> > > 1. There may be problems with the maven server at that time. The three PRs
> > > mentioned at that time could not download the correct jar package, and the
> > > retry was invalid.
> > > 2. A flaky test `recoverLongTimeAfterMultipleWriteErrors` failed multiple
> > > times in a row.
> > >
> > > So I mistakenly thought it was caused by the last unverified commit.
> > > So the RC is correct, please help verify it and vote.
> > >
> > > Thanks
> > > Xiangying
> > >
> > > On Sun, Mar 5, 2023 at 9:40 PM Xiangying Meng wrote:
> > >
> > > > Hi, community,
> > > >
> > > > Sorry to tell everyone that we may need to abort the release
> > > > 2.10.4-candidate-1 because some CI can not be passed after #19674 [0] is
> > > > cherry-picked.
> > > > I will be sure to carry out the release process again as soon as it is
> > > > resolved.
> > > >
> > > > Sincerely
> > > > Xiangying
> > > > [0] https://github.com/apache/pulsar/pull/19674
> > > >
> > > >
> > > > On Sat, Mar 4, 2023 at 12:06 PM Xiangying Meng 
> > > > wrote:
> > > >
> > > >> This is the third release candidate for Apache Pulsar, version 2.10.4.
> > > >>
> > > >> This release contains 99 commits by 34 contributors.
> > > >> https://github.com/apache/pulsar/compare/v2.10.3...v2.10.4-candidate-1
> > > >>
> > > > >> *** Please download, test, and vote on this release. This vote will stay
> > > > >> open for at least 72 hours ***
> > > > >>
> > > > >> Note that we are voting upon the source (tag), binaries are provided for
> > > > >> convenience.
> > > >>
> > > >> Source and binary files:
> > > >> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.10.4-candidate-1/
> > > >>
> > > >> SHA-512 checksums:
> > > >> 8cae74a5b586ab2378c2b2737c59507180af4b8efab4a99bc0dae233096036f5b18ab94255bea03e416d8d21958bedf684c8d4bd3982f458a547d3e1efa0f19f
> > > >>  apache-pulsar-2.10.4-bin.tar.gz
> > > >> 74e16c61ff6ae9e2a51e7ae24981598c71dabbff09c820bff9303c031882e1f15d029d06b6b5b6e4cc9a02b8957a102338ce09173c8744a59e5bd848b48b1d2a
> > > >>  apache-pulsar-2.10.4-src.tar.gz
> > > >>
> > > >> Maven staging repo:
> > > >> https://repository.apache.org/content/repositories/orgapachepulsar-1210/
> > > >>
> > > >> The tag to be voted upon:
> > > >> v2.10.4-candidate-1 (d1aebd3e4c9503406845fb2e746a289e88e00fb2)
> > > >> https://github.com/apache/pulsar/releases/tag/v2.10.4-candidate-1
> > > >>
> > > >> Pulsar's KEYS file containing PGP keys you use to sign the release:
> > > >> https://downloads.apache.org/pulsar/KEYS
> > > >>
> > > >> Docker images:
> > > >>
> > > >> 
> > > >>
> > > >> https://hub.docker.com/layers/xiangyingmeng/pulsar/2.10.4/images/sha256-144d0380592a7e0578772eb2fa51da7cad70f1d5f8a2b46189669b15f0e6b4b6?context=repo
> > > >>
> > > >> 
> > > >>
> > > >> https://hub.docker.com/layers/xiangyingmeng/pulsar-all/2.10.4/images/sha256-bcf03c05be93ced24991afbcca13f4a4b5f183d9a7b877ae84e992e16ca599ee?context=repo
> > > >>
> > > >> Please download the source package, and follow the README to build
> > > >> and run the Pulsar standalone service.
> > > >>
> > > >


Re: [VOTE] Pulsar Release 2.10.4 Candidate 1

2023-03-08 Thread 丛搏
+1 (binding)

os: mac 12.3.1, Intel
java: OpenJDK 17.0.1

- Checked the signature
- Checked LICENSE
- Start standalone
- Publish and consume messages
- Verified Function and State Function
- Verified Cassandra connector
- Build from the source package (maven 3.6.1, openJDK 11.0.12)
- Run a simple transaction performance check

Thanks,
Bo

On Wed, Mar 8, 2023 at 15:00, guo jiwei wrote:
>
> +1 (binding)
>
> - Build from the source package
> - Checked the signature
> - Publish and consume messages
> - Verified Function and State Function
> - Verified Cassandra connector
>
> Regards
> Jiwei Guo (Tboy)
>
> On Tue, Mar 7, 2023 at 2:22 PM Xiangying Meng  wrote:
> >
> > Please ignore the previous email. This commit did not break CI.
> > Instead, a very coincidental thing happened.
> > 1. There may be problems with the maven server at that time. The three PRs
> > mentioned at that time could not download the correct jar package, and the
> > retry was invalid.
> > 2. A flaky test `recoverLongTimeAfterMultipleWriteErrors` failed multiple
> > times in a row.
> >
> > So I mistakenly thought it was caused by the last unverified commit.
> > So the RC is correct, please help verify it and vote.
> >
> > Thanks
> > Xiangying
> >
> > On Sun, Mar 5, 2023 at 9:40 PM Xiangying Meng  wrote:
> >
> > > Hi, community,
> > >
> > > Sorry to tell everyone that we may need to abort the release
> > > 2.10.4-candidate-1 because some CI can not be passed after #19674 [0] is
> > > cherry-picked.
> > > I will be sure to carry out the release process again as soon as it is
> > > resolved.
> > >
> > > Sincerely
> > > Xiangying
> > > [0] https://github.com/apache/pulsar/pull/19674
> > >
> > >
> > > On Sat, Mar 4, 2023 at 12:06 PM Xiangying Meng 
> > > wrote:
> > >
> > >> This is the third release candidate for Apache Pulsar, version 2.10.4.
> > >>
> > >> This release contains 99 commits by 34 contributors.
> > >> https://github.com/apache/pulsar/compare/v2.10.3...v2.10.4-candidate-1
> > >>
> > >> *** Please download, test, and vote on this release. This vote will stay
> > >> open
> > >> for at least 72 hours ***
> > >>
> > >> Note that we are voting upon the source (tag), binaries are provided for
> > >> convenience.
> > >>
> > >> Source and binary files:
> > >> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.10.4-candidate-1/
> > >>
> > >> SHA-512 checksums:
> > >> 8cae74a5b586ab2378c2b2737c59507180af4b8efab4a99bc0dae233096036f5b18ab94255bea03e416d8d21958bedf684c8d4bd3982f458a547d3e1efa0f19f
> > >>  apache-pulsar-2.10.4-bin.tar.gz
> > >> 74e16c61ff6ae9e2a51e7ae24981598c71dabbff09c820bff9303c031882e1f15d029d06b6b5b6e4cc9a02b8957a102338ce09173c8744a59e5bd848b48b1d2a
> > >>  apache-pulsar-2.10.4-src.tar.gz
> > >>
> > >> Maven staging repo:
> > >> https://repository.apache.org/content/repositories/orgapachepulsar-1210/
> > >>
> > >> The tag to be voted upon:
> > >> v2.10.4-candidate-1 (d1aebd3e4c9503406845fb2e746a289e88e00fb2)
> > >> https://github.com/apache/pulsar/releases/tag/v2.10.4-candidate-1
> > >>
> > >> Pulsar's KEYS file containing PGP keys you use to sign the release:
> > >> https://downloads.apache.org/pulsar/KEYS
> > >>
> > >> Docker images:
> > >>
> > >> 
> > >>
> > >> https://hub.docker.com/layers/xiangyingmeng/pulsar/2.10.4/images/sha256-144d0380592a7e0578772eb2fa51da7cad70f1d5f8a2b46189669b15f0e6b4b6?context=repo
> > >>
> > >> 
> > >>
> > >> https://hub.docker.com/layers/xiangyingmeng/pulsar-all/2.10.4/images/sha256-bcf03c05be93ced24991afbcca13f4a4b5f183d9a7b877ae84e992e16ca599ee?context=repo
> > >>
> > >> Please download the source package, and follow the README to build
> > >> and run the Pulsar standalone service.
> > >>
> > >


Re: [DISCUSS] Critical problem report - session notification thread deadlock

2023-03-08 Thread PengHui Li
Thanks, Qiang,

And for the upcoming patch releases

2.11.1,
2.10.4,
2.9.5

Please ship the fix in these releases. Although it is not a fix for a
breaking change, it fixes a critical issue introduced in previous releases.

Thanks,
Penghui

On Thu, Mar 9, 2023 at 11:44 AM  wrote:

> Hi, All
>
> We found a critical problem that can leave part of the Pulsar cluster in a
> “deaf” state: the broker can't receive ZooKeeper session notifications to
> revalidate namespace bundle ownership and leader election. That means
> one topic may end up with two owner brokers, leader election may break, etc.
>
> Blast radius:
> This problem was introduced by
> https://github.com/apache/pulsar/pull/17401, so the following releases may
> be affected:
> 2.8.5
> 2.9.4
> 2.10.3
> 2.11.0
>
> Workaround:
>
> Restart broker
>
> The fix is over here: https://github.com/apache/pulsar/pull/19754
>
> You can avoid upgrading to the versions above and wait for further
> progress. I will continue to push for a solution to this problem.
>
> Please correct me if I got something wrong. thanks!
>
>
> Best,
> Mattison
>


Re: [DISCUSS] PIP-246: Improved PROTOBUF_NATIVE schema compatibility checks without using avro-protobuf

2023-03-08 Thread 丛搏
Hi SiNan:

From my point of view, it is just a plug-in. I don't think it is
necessary to add configuration for the plugin.
It adds no value, and it would make things harder for users.


On Wed, Mar 8, 2023 at 15:54, SiNan Liu wrote:
>
> Hi, bo.
>
> 1. I understand what you say, to develop a new
> `ProtobufNativeAdvancedSchemaCompatibilityCheck`, rather than changing
> existing `ProtobufNativeSchemaCompatibilityCheck`. But I found a few small
> problems:
>
> (1)ProtobufNativeAdvancedSchemaCompatibilityCheck and
> ProtobufNativeSchemaCompatibilityCheck schemaType is PROTOBUF_NATIVE. It
> looks like both checkers are PROTOBUF not using AVRO-PROTOBUF's "native"
> implementation, which leads to some problems or "unreasonable" and gives me
> some extended thinking and questions.
>
`CompatibilityCheck` is only a plugin.
`ProtobufNativeSchemaCompatibilityCheck` may sooner or later be retired.
When `ProtobufNativeAdvancedSchemaCompatibilityCheck` is
stable, we can make it the default checker.

It is just a plug-in; users can change it at will, as long as they make sure
it is used correctly.
> (2) In broker.conf
>
> `schemaRegistryCompatibilityCheckers`: if
> ProtobufNativeSchemaCompatibilityCheck and
> ProtobufNativeAdvancedSchemaCompatibilityCheck are both set, they are going to
> overwrite each other, because this is a map:
>
> https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/SchemaRegistryService.java#L36-L44
>
> ```java
> // Checkers are keyed by schema type, so a second checker for the same
> // type replaces the first one.
> Map<SchemaType, SchemaCompatibilityCheck> checkers = new HashMap<>();
> for (String className : checkerClasses) {
>     SchemaCompatibilityCheck schemaCompatibilityCheck =
>             Reflections.createInstance(className,
>                     SchemaCompatibilityCheck.class,
>                     Thread.currentThread().getContextClassLoader());
>     checkers.put(schemaCompatibilityCheck.getSchemaType(),
>             schemaCompatibilityCheck);
> }
> ```
>
> Is this a big problem or a small one? Is it possible or unnecessary? Maybe
> we can write in the documentation that only one of the two protobufNative
> checkers can be chosen? Why are there two checkers for different
> implementations of the same schemaType? Why not let the checker create
> different validators, so we do not have to change
> schemaRegistryCompatibilityCheckers?

Users can only use one of them, not both; using both would add complexity for users.
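
To make the overwrite concern concrete, here is a minimal, self-contained sketch. It is not Pulsar code: `SchemaType`, `Check`, and `NamedCheck` are simplified stand-ins invented for illustration, and only the put-by-schema-type loop mirrors the snippet quoted above. It assumes both checkers report `PROTOBUF_NATIVE`, as described in the thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CheckerOverwriteDemo {
    // Hypothetical stand-in for Pulsar's SchemaType enum.
    enum SchemaType { AVRO, PROTOBUF_NATIVE }

    // Hypothetical, simplified stand-in for SchemaCompatibilityCheck.
    interface Check {
        SchemaType getSchemaType();
    }

    static final class NamedCheck implements Check {
        final String name;
        final SchemaType type;

        NamedCheck(String name, SchemaType type) {
            this.name = name;
            this.type = type;
        }

        @Override
        public SchemaType getSchemaType() {
            return type;
        }
    }

    // Same shape as the quoted loop: one map slot per schema type.
    static Map<SchemaType, Check> buildCheckers(List<Check> configured) {
        Map<SchemaType, Check> checkers = new HashMap<>();
        for (Check check : configured) {
            // put() replaces any checker already stored for this type.
            checkers.put(check.getSchemaType(), check);
        }
        return checkers;
    }

    public static void main(String[] args) {
        Map<SchemaType, Check> checkers = buildCheckers(Arrays.asList(
                new NamedCheck("ProtobufNativeSchemaCompatibilityCheck",
                        SchemaType.PROTOBUF_NATIVE),
                new NamedCheck("ProtobufNativeAdvancedSchemaCompatibilityCheck",
                        SchemaType.PROTOBUF_NATIVE)));
        NamedCheck active = (NamedCheck) checkers.get(SchemaType.PROTOBUF_NATIVE);
        System.out.println(checkers.size() + " checker kept: " + active.name);
    }
}
```

In this sketch the input is an ordered list, so the later entry wins. If the configured class names are held in an unordered collection, which checker survives would depend on iteration order, which is one more reason to configure only one checker per schema type.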

>
> (3) After the update to ProtobufNativeAdvancedSchemaCompatibilityCheck:
> existing topics were previously checked only on the name of the root
> message, not on the content of the protobuf.
>
> What if the user wants both checkers?
>
> If the broker is set to ProtobufNativeAdvancedSchemaCompatibilityCheck,
> does that affect topics with existing schemas?
>
> For example, older topics would still use the old checker, while newer
> topics or certain older topics would use the new advanced checker.
>
When `ProtobufNativeAdvancedSchemaCompatibilityCheck` is stable,
users will not choose `ProtobufNativeSchemaCompatibilityCheck`,
because it is not a complete checker.
> (4) So should we have one checker per schemaType? The protobufNative
> checker could then have as many different implementation classes as
> needed. As for the class name configuration in the PIP, let's see if it can
> be set at the topic level. In the current PIP design I just load this
> parameter into the checker when the broker is started and the checkers map
> is set up. If I want to support the topic level, can I do that in a
> separate follow-up PR, or should I complete it here?
>
> Or add a schemaType called PROTOBUF_NATIVE_ADVANCE corresponding to
> ProtobufNativeAdvancedSchemaCompatibilityCheck? (That seems to be more
> trouble.)
>
> Sorry, I cannot use the computer and network at the company, so I am
> replying from my mobile phone and the format may be a bit messy. Please
> understand.
>
> Thanks,
>
> sinan
>
>
> On Tue, Mar 7, 2023 at 11:39 PM, 丛搏 wrote:
>
> > On Tue, Mar 7, 2023 at 13:22, SiNan Liu wrote:
> > >
> > > Great to see your comment, bo!
> > >
> > > 1. The first way. The protobuf website has a description of the rules,
> > > but no plans to implement them.
> > > https://protobuf.dev/programming-guides/proto/#updating
> >
> > https://groups.google.com/g/protobuf
> > maybe ask here
> >
> > >
> > > 2. I think this PIP can be divided into two parts.
> > > (1) Add a flag (`ValidatorClassName`) and load it into
> > > `ProtobufNativeSchemaCompatibilityCheck` when the broker starts.
> > > ValidatorClassName is empty by default, and the implementation continues
> > > as before, with no change for the user.
> >
> > `ProtobufNativeSchemaCompatibilityCheck` is a plugin in `broker.conf`
> > ```
> >
> > schemaRegistryCompatibilityCheckers=org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck
> > ```
> > I do not recommend that we directly modify this plugin and continue to
> > add configuration items, which will cause trouble for users.
> > We have a lot of configs and it's getting very unwieldy.
> > in my opinion, we 

[DISCUSS] Critical problem report - session notification thread deadlock

2023-03-08 Thread mattisonchao
Hi, All

We found a critical problem that can cause the Pulsar cluster to enter a
partially “deaf” state: the broker can’t receive ZooKeeper session notifications
to revalidate namespace bundle ownership and leader election. As a result, one
topic may end up with two owner brokers, leader election may break, and so on.

Blast radius:
This problem was introduced by
https://github.com/apache/pulsar/pull/17401, so the following releases may be affected:

2.8.5
2.9.4
2.10.3
2.11.0

Workaround:

Restart broker

The fix is over here: https://github.com/apache/pulsar/pull/19754

We can avoid upgrading to the above versions and wait for the latest progress. I
will continue to push for a solution to this problem.

Please correct me if I got something wrong. Thanks!


Best,
Mattison


Re: [DISCUSS] new idea: reverse reading a topic

2023-03-08 Thread Zike Yang
> Have you looked at the seek implementation to see if it would be
> feasible to extend the implementation and add a method to "seekBefore"
> a message id in the way you described?

I think it's not very feasible for this case. Seeking before can lead
to consumer reconnection, which can cause significant performance
issues and overhead.


Zike Yang

On Thu, Mar 9, 2023 at 10:22 AM Yong Zhang  wrote:
>
> Kannar,
>
> Why not find the stop position first, then read messages
> up to that position?
> Does the stop position change dynamically, so that you only know
> it once you reach it?
>
> Yong
>
>
>
> On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
>  wrote:
>
> > Hi Michael,
> >
> > This would work, but it would need to store every message read in order
> > to reverse them before answering, which can be heavy on RAM. The key
> > point of the feature is to read message by message from a MessageId
> > toward the past, with optional stop conditions.
> >
> > Best,
> >
> > Kannar
> >
> > On 3/7/23 22:10, Michael Marshall wrote:
> > >> The goal is to start from a known MessageId and read the N message
> > >> before this MessageId.
> > > Have you looked at the seek implementation to see if it would be
> > > feasible to extend the implementation and add a method to "seekBefore"
> > > a message id in the way you described? I haven't considered all of the
> > > implications, but if the main goal is to move the cursor, I think the
> > > solution should be about moving the cursor, not about reading a topic
> > > in reverse.
> > >
> > > Thanks,
> > > Michael
> > >
> > > On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL 
> > wrote:
> > >> Hi Yong,
> > >>
> > >> The goal is to start from a known MessageId and read the N message
> > >> before this MessageId.
> > >>
> > >> Best,
> > >>
> > >> Kannar
> > >>
> > >>
> > >> On 3/7/23 01:53, Yong Zhang wrote:
> > >>> Hi Kannar,
> > >>>
> > >>> Just interested in what exactly your case.
> > >>>
> > >>> Why do you need to read messages in a reversed order? What is your
> > case?
> > >>>
> > >>> Best,
> > >>> Yong
> > >>>
> > >>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> I'm wondering if it is possible to introduce a new feature in Pulsar
> > >>>> that would let users read a topic from a defined MessageId back
> > >>>> through previous messages, until the beginning of the topic.
> > >>>>
> > >>>> I tried to use Pulsar SQL, but it requires a lot of RAM even for
> > >>>> small queries (due to the Presto design).
> > >>>>
> > >>>> Currently, every read in Pulsar is expected to go forward, so it
> > >>>> might be a bit tricky to avoid weird behavior when introducing
> > >>>> the feature.
> > >>>>
> > >>>> I'm currently trying to make an MVP/POC by introducing a readReverse
> > >>>> field in the CommandSubscribe that is used by ReaderAPI, and I am
> > >>>> looking to create a getFirstMessageId() on ManagedLedger
> > >>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed the
> > >>>> startPosition < endPosition sanity checks in BookKeeper locally
> > >>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
> > >>>>
> > >>>> We would definitely prefer a readPrevious() and
> > >>>> hasPreviousMessageAvailable() in the ReaderAPI.
> > >>>>
> > >>>> I'm not familiar with internals such as NonDurableCursor,
> > >>>> RangeEntryCache, and ManagedCursor, so it's a bit tricky.
> > >>>>
> > >>>> So I'm wondering if someone could help/guide me, or even take over
> > >>>> the subject (or the discussion) directly.
> > >>>>
> > >>>> Regards,
> > >>>>
> > >>>> Kannar
> > >>>>
> >


Re: [DISCUSS] new idea: reverse reading a topic

2023-03-08 Thread Yong Zhang
Kannar,

Why not find the stop position first, then read messages
up to that position?
Does the stop position change dynamically, so that you only know
it once you reach it?

Yong



On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
 wrote:

> Hi Michael,
>
> This would work, but it would need to store every message read in order
> to reverse them before answering, which can be heavy on RAM. The key
> point of the feature is to read message by message from a MessageId
> toward the past, with optional stop conditions.
>
> Best,
>
> Kannar
>
> On 3/7/23 22:10, Michael Marshall wrote:
> >> The goal is to start from a known MessageId and read the N message
> >> before this MessageId.
> > Have you looked at the seek implementation to see if it would be
> > feasible to extend the implementation and add a method to "seekBefore"
> > a message id in the way you described? I haven't considered all of the
> > implications, but if the main goal is to move the cursor, I think the
> > solution should be about moving the cursor, not about reading a topic
> > in reverse.
> >
> > Thanks,
> > Michael
> >
> > On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL 
> wrote:
> >> Hi Yong,
> >>
> >> The goal is to start from a known MessageId and read the N message
> >> before this MessageId.
> >>
> >> Best,
> >>
> >> Kannar
> >>
> >>
> >> On 3/7/23 01:53, Yong Zhang wrote:
> >>> Hi Kannar,
> >>>
> >>> Just interested in what exactly your case.
> >>>
> >>> Why do you need to read messages in a reversed order? What is your
> case?
> >>>
> >>> Best,
> >>> Yong
> >>>
> >>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I'm wondering if it is possible to introduce a new feature in Pulsar
> >>>> that would let users read a topic from a defined MessageId back
> >>>> through previous messages, until the beginning of the topic.
> >>>>
> >>>> I tried to use Pulsar SQL, but it requires a lot of RAM even for
> >>>> small queries (due to the Presto design).
> >>>>
> >>>> Currently, every read in Pulsar is expected to go forward, so it
> >>>> might be a bit tricky to avoid weird behavior when introducing
> >>>> the feature.
> >>>>
> >>>> I'm currently trying to make an MVP/POC by introducing a readReverse
> >>>> field in the CommandSubscribe that is used by ReaderAPI, and I am
> >>>> looking to create a getFirstMessageId() on ManagedLedger
> >>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed the
> >>>> startPosition < endPosition sanity checks in BookKeeper locally
> >>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
> >>>>
> >>>> We would definitely prefer a readPrevious() and
> >>>> hasPreviousMessageAvailable() in the ReaderAPI.
> >>>>
> >>>> I'm not familiar with internals such as NonDurableCursor,
> >>>> RangeEntryCache, and ManagedCursor, so it's a bit tricky.
> >>>>
> >>>> So I'm wondering if someone could help/guide me, or even take over
> >>>> the subject (or the discussion) directly.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Kannar
> >>>>
>


Re: [VOTE] Apache Pulsar Adapters Release 2.11.0 Candidate 3

2023-03-08 Thread Michael Marshall
+1 (binding)

I verified 1 signature and 1 checksum. One nit, the checksum has file
extension `.sha512sum` instead of the usual `.sha512`. I am assuming
this is fine.

While trying to build the project with `mvn clean install`, it failed
with the following error

[ERROR] Failed to execute goal on project
pulsar-kafka-compat-client-test: Could not resolve dependencies for
project org.apache.pulsar.tests:pulsar-kafka-compat-client-test:jar:2.11.0:
Failed to collect dependencies at
org.apache.pulsar.tests:integration:jar:tests:2.11.0: Failed to read
artifact descriptor for
org.apache.pulsar.tests:integration:jar:tests:2.11.0: Failure to find
org.apache.pulsar.tests:tests-parent:pom:2.11.0 in
https://repo1.maven.org/maven2 was cached in the local repository,
resolution will not be reattempted until the update interval of
central has elapsed or updates are forced -> [Help 1]

The relevant dependency
`org.apache.pulsar.tests:tests-parent:pom:2.11.0` is not published to
maven central, but it is an artifact of the apache/pulsar build. When
I build pulsar v2.11.0 locally, the pulsar-adapters build then passes.

I think we should consider publishing the relevant pom file to maven
central to make it easier for users to build this project.

The README indicates the necessity to build the apache/pulsar project
from source in the event that dependencies from pulsar are
unpublished. As such, I think we're okay to move forward with the
release.

Thanks,
Michael


On Wed, Mar 8, 2023 at 3:39 AM Nicolò Boschi  wrote:
>
> +1 (binding)
>
> - verified checksum/signatures
> - build and tests passing w jdk17 on Mac Intel
> Nicolò Boschi
>
>
> On Tue, Mar 7, 2023 at 09:57, Enrico Olivelli wrote:
>
> > Nice work Christophe, thanks for driving this release
> >
> > +1 (binding)
> >
> > - verified checksums/signatures
> > - built on Mac (M1) on JDK17, all tests passing
> >
> > We need more VOTES :-)
> >
> > Enrico
> >
> > On Thu, Mar 2, 2023 at 18:48, Christophe Bornet wrote:
> > >
> > > This is the release candidate 3 for Apache Pulsar Adapters, version
> > > 2.11.0.
> > >
> > > It fixes the following issues:
> > > https://github.com/apache/pulsar-adapters/milestone/4?closed=1
> > >
> > > *** Please download, test and vote on this release. This vote will
> > > stay open for at least 72 hours ***
> > >
> > > Note that we are voting upon the source (tag), binaries are provided
> > > for convenience.
> > >
> > > Source and binary files:
> > >
> > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-adapters-2.11.0-candidate-3/
> > >
> > > SHA-512 checksums:
> > >
> > 8272f786cb64caa34f8df0e07ea5ce7901e18c67ebf6f3c3cb6d887a7b0b44c6087279ddb5be8abbc62b56600a34060d556937ad9156266fd9d8c6314dc4dcd0
> > >  apache-pulsar-adapters-2.11.0-src.tar.gz
> > >
> > > Maven staging repo:
> > > https://repository.apache.org/content/repositories/orgapachepulsar-1208/
> > >
> > > The tag to be voted upon:
> > > v2.11.0-candidate-3 (3fbd9325920ff3eec3f32d539e1df81603e319f3)
> > >
> > https://github.com/apache/pulsar-adapters/releases/tag/v2.11.0-candidate-3
> > >
> > > Pulsar's KEYS file containing PGP keys we use to sign the release:
> > > https://dist.apache.org/repos/dist/dev/pulsar/KEYS
> > >
> > > Please download the source package, and follow the README to build the
> > > Pulsar Adapters code
> >


Re: [Discussion] Allowing configure if function consumer should skip to latest

2023-03-08 Thread Neng Lu
Hi everyone,

This discussion has been open for one week. If there are no objections or
concerns, I'll move forward and close the discussion with the conclusion that we
are good with the proposed change.

This would normally result in the PR being merged (it was already merged, so in
this case the merged change simply won't be reverted).

On 2023/03/07 03:45:02 Neng Lu wrote:
> Hi Penghui,
> 
> Thanks for your question.
> 
> One case is failure recovery for a windowing function.
> 
> A windowing function does not ack messages until its window is emitted. If the
> window function fails due to issues such as OOM and restarts, it has a
> massive backlog to catch up on. The function will then never be able to recover
> by itself, since the backlog keeps growing and it keeps running out of memory.
>
> Our user prefers an automatic way to recover, given that they are okay with
> skipping some backlog data (this is acceptable in IoT cases). Also, users
> may deploy hundreds of functions in their environment. Manually resetting the
> cursor is not scalable and is a heavy burden for the on-call person in such
> cases.
> 
> Hope the above use case can help provide some more context regarding the 
> change.
> 
> On 2023/03/03 03:51:35 PengHui Li wrote:
> > Hi Neng,
> > 
> > Thanks for raising up the discussion
> > 
> > > In certain failure cases, the function needs to skip all the content
> > between the last successfully acked message and the latest message in the
> > topic in order to skip the huge backlog and quick recovery.
> > 
> > Do you have some real cases that can help us to understand it
> > is necessary to introduce a new flag? Another possibility is users
> > can use pulsar admin to reset the cursor to the latest position,
> > Why will it not work for users? 
> > 
> > Regards,
> > Penghui
> > 
> > > On Mar 1, 2023, at 10:16, Neng Lu  wrote:
> > > 
> > > In certain failure cases, the function needs to skip all the content
> > > between the last successfully acked message and the latest message in the
> > > topic in order to skip the huge backlog and quick recovery.
> > 
> > 
> 


Re: [DISCUSS] new idea: reverse reading a topic

2023-03-08 Thread Alexandre DUVAL

Hi Michael,

This would work, but it would need to store every message read in order
to reverse them before answering, which can be heavy on RAM. The key
point of the feature is to read message by message from a MessageId
toward the past, with optional stop conditions.


Best,

Kannar

On 3/7/23 22:10, Michael Marshall wrote:
>> The goal is to start from a known MessageId and read the N messages
>> before this MessageId.
> Have you looked at the seek implementation to see if it would be
> feasible to extend the implementation and add a method to "seekBefore"
> a message id in the way you described? I haven't considered all of the
> implications, but if the main goal is to move the cursor, I think the
> solution should be about moving the cursor, not about reading a topic
> in reverse.
>
> Thanks,
> Michael
>
> On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL wrote:
>> Hi Yong,
>>
>> The goal is to start from a known MessageId and read the N messages
>> before this MessageId.
>>
>> Best,
>>
>> Kannar
>>
>>
>> On 3/7/23 01:53, Yong Zhang wrote:
>>> Hi Kannar,
>>>
>>> Just interested in what exactly your case is.
>>>
>>> Why do you need to read messages in reverse order? What is your case?
>>>
>>> Best,
>>> Yong
>>>
>>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm wondering if it is possible to introduce a new feature in Pulsar
>>>> that would let users read a topic from a defined MessageId back
>>>> through previous messages, until the beginning of the topic.
>>>>
>>>> I tried to use Pulsar SQL, but it requires a lot of RAM even for
>>>> small queries (due to the Presto design).
>>>>
>>>> Currently, every read in Pulsar is expected to go forward, so it
>>>> might be a bit tricky to avoid weird behavior when introducing
>>>> the feature.
>>>>
>>>> I'm currently trying to make an MVP/POC by introducing a readReverse
>>>> field in the CommandSubscribe that is used by ReaderAPI, and I am
>>>> looking to create a getFirstMessageId() on ManagedLedger
>>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed the
>>>> startPosition < endPosition sanity checks in BookKeeper locally
>>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
>>>>
>>>> We would definitely prefer a readPrevious() and
>>>> hasPreviousMessageAvailable() in the ReaderAPI.
>>>>
>>>> I'm not familiar with internals such as NonDurableCursor,
>>>> RangeEntryCache, and ManagedCursor, so it's a bit tricky.
>>>>
>>>> So I'm wondering if someone could help/guide me, or even take over
>>>> the subject (or the discussion) directly.
>>>>
>>>> Regards,
>>>>
>>>> Kannar






[VOTE] Pulsar Client Python Release 3.1.0 Candidate 4

2023-03-08 Thread Yunze Xu
This is the 4th release candidate for Apache Pulsar Client Python,
version 3.1.0.

It fixes the following issues:
https://github.com/apache/pulsar-client-python/milestone/2?closed=1

*** Please download, test and vote on this release. This vote will
stay open for at least 72 hours ***

Python wheels:
https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-python-3.1.0-candidate-4/

The supported python versions are 3.7, 3.8, 3.9, 3.10 and 3.11. The
supported platforms and architectures are:
- Windows x86_64 (windows/)
- glibc-based Linux x86_64 (linux-glibc-x86_64/)
- glibc-based Linux arm64 (linux-glibc-arm64/)
- musl-based Linux x86_64 (linux-musl-x86_64/)
- musl-based Linux arm64 (linux-musl-arm64/)
- macOS universal 2 (macos/)

You can download the wheel (the `.whl` file) according to your own OS
and Python version
and install the wheel:
- Windows: `py -m pip install *.whl --force-reinstall`
- Linux or macOS: `python3 -m pip install *.whl --force-reinstall`

The tag to be voted upon: v3.1.0-candidate-4
(b883f42aa4287d46423b85f7af77f604cacf2a7e)
https://github.com/apache/pulsar-client-python/releases/tag/v3.1.0-candidate-4

Pulsar's KEYS file containing PGP keys you use to sign the release:
https://downloads.apache.org/pulsar/KEYS

Please download the Python wheels and follow the README to test.


Re: [VOTE] Apache Pulsar Adapters Release 2.11.0 Candidate 3

2023-03-08 Thread Nicolò Boschi
+1 (binding)

- verified checksum/signatures
- build and tests passing w jdk17 on Mac Intel
Nicolò Boschi


On Tue, Mar 7, 2023 at 09:57, Enrico Olivelli wrote:

> Nice work Christophe, thanks for driving this release
>
> +1 (binding)
>
> - verified checksums/signatures
> - built on Mac (M1) on JDK17, all tests passing
>
> We need more VOTES :-)
>
> Enrico
>
> On Thu, Mar 2, 2023 at 18:48, Christophe Bornet wrote:
> >
> > This is the release candidate 3 for Apache Pulsar Adapters, version
> > 2.11.0.
> >
> > It fixes the following issues:
> > https://github.com/apache/pulsar-adapters/milestone/4?closed=1
> >
> > *** Please download, test and vote on this release. This vote will
> > stay open for at least 72 hours ***
> >
> > Note that we are voting upon the source (tag), binaries are provided
> > for convenience.
> >
> > Source and binary files:
> >
> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-adapters-2.11.0-candidate-3/
> >
> > SHA-512 checksums:
> >
> 8272f786cb64caa34f8df0e07ea5ce7901e18c67ebf6f3c3cb6d887a7b0b44c6087279ddb5be8abbc62b56600a34060d556937ad9156266fd9d8c6314dc4dcd0
> >  apache-pulsar-adapters-2.11.0-src.tar.gz
> >
> > Maven staging repo:
> > https://repository.apache.org/content/repositories/orgapachepulsar-1208/
> >
> > The tag to be voted upon:
> > v2.11.0-candidate-3 (3fbd9325920ff3eec3f32d539e1df81603e319f3)
> >
> https://github.com/apache/pulsar-adapters/releases/tag/v2.11.0-candidate-3
> >
> > Pulsar's KEYS file containing PGP keys we use to sign the release:
> > https://dist.apache.org/repos/dist/dev/pulsar/KEYS
> >
> > Please download the source package, and follow the README to build the
> > Pulsar Adapters code
>


Re: [DISCUSS] PIP-254: Support configuring client version at SDK level

2023-03-08 Thread Yunze Xu
Hi,

I have changed this proposal to just add a config to `ClientBuilder`.
And here is the demo implementation:
https://github.com/BewareMyPower/pulsar/pull/21/files

PTAL again.
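
For readers who do not want to open the demo PR, the idea of a per-instance setting can be sketched as follows. This is only an illustration under stated assumptions: `Builder`, `Client`, `clientVersion(...)`, and `LIBRARY_VERSION` are hypothetical names and not the actual `ClientBuilder` API; the default string format is the one quoted later in this thread.

```java
public class ClientVersionSketch {
    // Placeholder for the value PulsarVersion.getVersion() would return.
    static final String LIBRARY_VERSION = "2.11.0";

    // Hypothetical builder; not the real ClientBuilder interface.
    static final class Builder {
        private String clientVersion; // null means "use the default"

        Builder clientVersion(String clientVersion) {
            this.clientVersion = clientVersion;
            return this;
        }

        Client build() {
            String effective = clientVersion != null
                    ? clientVersion
                    : String.format("Pulsar-Java-v%s", LIBRARY_VERSION);
            return new Client(effective);
        }
    }

    // Each client instance keeps its own immutable version string.
    static final class Client {
        private final String clientVersion;

        Client(String clientVersion) {
            this.clientVersion = clientVersion;
        }

        String getClientVersion() {
            return clientVersion;
        }
    }

    public static void main(String[] args) {
        Client plain = new Builder().build();
        Client wrapped = new Builder().clientVersion("My-Wrapper-v1.0").build();
        System.out.println(plain.getClientVersion());
        System.out.println(wrapped.getClientVersion());
    }
}
```

Because the value lives in each instance rather than in a static field, two clients in the same JVM can report different versions, which is the concern raised earlier in the thread.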

Thanks,
Yunze

On Sat, Mar 4, 2023 at 10:39 PM Yunze Xu  wrote:
>
> Hi Enrico,
>
> Thanks for your suggestion. It makes sense to me. I will think again
> and modify this proposal.
>
> Hi Tison,
>
> I mentioned the C++ client because the initial motivation is to solve
> the issue for the Python client and Node.js client. But after thinking
> for a while, I believe it's more general for clients of other
> languages, including Java. And this proposal is only for the Java
> client.
>
> Thanks,
> Yunze
>
> On Sat, Mar 4, 2023 at 1:42 PM tison  wrote:
> >
> > I agree with Enrico that it's better to have a config option.
> >
> > Also, we cannot simply replace the PulsarVersion call with the
> > DynamicPulsarVersion call because the client version string is now
> > constructed as:
> >
> > String.format("Pulsar-Java-v%s", PulsarVersion.getVersion())
> >
> > It's a config of client version string, not pulsar version.
> >
> > Moreover, in your proposal, you mention the case of client c++ at first,
> > but don't talk about it later. Is the scope of this proposal in the Java
> > client only?
> >
> > Best,
> > tison.
> >
> >
> > On Sat, Mar 4, 2023 at 06:38, Enrico Olivelli wrote:
> >
> > > Yunze,
> > >
> > > On Fri, Mar 3, 2023 at 12:31, Yunze Xu wrote:
> > >
> > > > Hi all,
> > > >
> > > > Based on the previous discussion [1], I created a proposal to support
> > > > configuring client version at SDK level:
> > > > https://github.com/apache/pulsar/issues/19705
> > > >
> > > > I've added more explanations in the motivation part, let's use this
> > > > PIP as a subsequent discussion of [1].
> > > >
> > > > BTW, there is a PR [2] in the pulsar-client-cpp repo because the
> > > > motivation is more meaningful for the C++ client.
> > > >
> > >
> > > I understand this problem well. We have it for the cited clients, but I
> > > also see the same issue for other libraries based on the Java client, like
> > > the official Apache Pulsar Reactive client.
> > >
> > > I also see this problem in Starlight for JMS, a JMS client for
> > > Pulsar that is based on the Java client.
> > >
> > > While I agree on the problem and on the solution, I think that a static
> > > field is not enough. We have some problems:
> > >
> > > 1) there may be multiple usages of the Java client in the same JVM, and
> > > you want each client to report its version correctly
> > >
> > > 2) we would need to use the Java Security Manager, or some other
> > > mechanism, to prevent malicious code from overriding the version.
> > >
> > > I believe that in the case of the Java client it is easier to add a
> > > configuration entry to the Pulsar client configuration. That would become
> > > a field in the Java client, so each instance can declare its version, and
> > > malicious code won't be able to easily tweak the version (because it won't
> > > be a simple static method call).
> > >
> > > Enrico
> > >
> > >
> > >
> > > > [1] https://lists.apache.org/thread/n59k537fhthjnzkfxtc2p4zk4l0cv3mp
> > > > [2] https://github.com/apache/pulsar-client-cpp/pull/208
> > > >
> > > > Thanks!
> > > >
> > >


Re: [DISCUSS] PIP-247: Notifications for partitions update

2023-03-08 Thread houxiaoyu
Hi Michael,

> is there a reason that we couldn't also use this to improve PIP 145?

The protocol described in this PIP could also be used to improve PIP-145.
However, I don't think that is a good reason to use the regex subscription
watcher to implement the partition update watcher, given the other
reasons we mentioned above.

> Since we know we're using a TCP connection, is it possible to rely on
> pulsar's keep alive timeout (the broker and the client each have their
> own) to close a connection that isn't responsive?

I think it could still fail at the application layer; for example, the
partition update listener could fail unexpectedly. Currently, another task
is scheduled if the poll task encounters an error in the partition
auto-update timer task. [0]

> Regarding the connection, which connection should the client use to send
> the watch requests?

The `PartitionUpdateWatcher` will call `connectionHandler.grabCnx()` to
open a connection, which is analogous to `TopicListWatcher`. [1]

> do we plan on using metadata store notifications to trigger the callbacks
> that trigger notifications sent
> to the clients

Yes, we will just look up the metadataStore to fetch the partition count
and register a watcher on the metadataStore to trigger partition count
updates.

> One nit on the protobuf for CommandWatchPartitionUpdateSuccess:
>
>     repeated string topics = 3;
>     repeated uint32 partitions = 4;
>
> What do you think about using a repeated message that represents a
> pair of a topic and its partition count instead of using two lists?

Great. Using a repeated message looks better; I will update the protobuf.
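
As a rough illustration of why the paired form is easier to handle, here is a small Java sketch of the receiving side. The class `TopicPartitions` and the method `fromParallelLists` are hypothetical names used only for this example; the actual protobuf message name and fields are whatever the updated PIP defines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TopicPartitionPairs {
    // Hypothetical pair type standing in for one entry of the repeated message.
    static final class TopicPartitions {
        final String topic;
        final int partitions;

        TopicPartitions(String topic, int partitions) {
            this.topic = topic;
            this.partitions = partitions;
        }
    }

    // With two parallel lists, the receiver must check that they line up
    // before it can pair entries by index.
    static List<TopicPartitions> fromParallelLists(List<String> topics,
                                                   List<Integer> partitions) {
        if (topics.size() != partitions.size()) {
            throw new IllegalArgumentException(
                    "topics and partitions differ in length");
        }
        List<TopicPartitions> pairs = new ArrayList<>();
        for (int i = 0; i < topics.size(); i++) {
            pairs.add(new TopicPartitions(topics.get(i), partitions.get(i)));
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<TopicPartitions> pairs = fromParallelLists(
                Arrays.asList("topic-a", "topic-b"), Arrays.asList(4, 8));
        for (TopicPartitions p : pairs) {
            System.out.println(p.topic + " -> " + p.partitions);
        }
    }
}
```

With a repeated pair message, the length check and index bookkeeping disappear, because each topic travels together with its own partition count.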

> How will we handle the case where a watched topic does not exist?

1. When `PulsarClient` calls `create()` to create a producer or
`subscribe()` to create a consumer, the client first gets the
partitioned-topic metadata from the broker [2]. If the topic doesn't exist and
`isAllowAutoTopicCreation=true` on the broker, the partitioned-topic zk node
is auto-created with the default number of partitions.
2. After the client gets the partitioned-topic metadata successfully, a
`PartitionedProducerImpl` is created if `meta.partition >
0`. `PartitionUpdateWatcher` is initialized in the
`PartitionedProducerImpl` constructor. The `PartitionUpdateWatcher` sends a
command to the broker to register a watcher. If any topic in the topicList
doesn't exist, the broker sends an error to the client and the
`PartitionedProducerImpl` fails to start. `MultiTopicsConsumerImpl`
works in the same way.

> I want to touch on authorization. A role should have "lookup"
> permission to watch for updates on each partitioned topic that it
> watches. As a result, if we allow for a request to watch multiple
> topics, some might succeed while others fail. How do we handle partial
> success?

If authorization fails for any topic in the topicList, the broker will send an
error to the client. The following reasons support this behavior:
1. Before sending the command to register a partition update watcher, the
client should already have sent `CommandPartitionedTopicMetadata` and should
have the `lookup` permission [3] [4].
2. Currently, if subscribing to any topic fails, the consumer fails to start. [5]
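
The all-or-nothing behavior described above can be sketched like this. It is a simplified model, not broker code: `handleWatchRequest`, `firstUnauthorized`, and the string results are hypothetical, and the `canLookup` predicate stands in for whatever authorization check the broker actually performs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

public class WatchAuthorizationSketch {
    // Returns the first topic the role may not look up, if there is one.
    static Optional<String> firstUnauthorized(List<String> topics,
                                              Predicate<String> canLookup) {
        return topics.stream().filter(canLookup.negate()).findFirst();
    }

    // All-or-nothing: a single unauthorized topic fails the whole request,
    // so the client never has to deal with partial success.
    static String handleWatchRequest(List<String> topics,
                                     Predicate<String> canLookup) {
        Optional<String> denied = firstUnauthorized(topics, canLookup);
        if (denied.isPresent()) {
            return "ERROR: not authorized for " + denied.get();
        }
        return "SUCCESS: watching " + topics.size() + " topic(s)";
    }

    public static void main(String[] args) {
        Set<String> allowed = new HashSet<>(Arrays.asList("topic-a", "topic-b"));
        System.out.println(handleWatchRequest(
                Arrays.asList("topic-a", "topic-b"), allowed::contains));
        System.out.println(handleWatchRequest(
                Arrays.asList("topic-a", "topic-c"), allowed::contains));
    }
}
```

The trade-off of this choice is simplicity on the client side at the cost of rejecting requests that could have partly succeeded.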


[0]
https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/MultiTopicsConsumerImpl.java#L1453-L1461

[1]
https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/TopicListWatcher.java#L67-L81

[2]
https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/PulsarClientImpl.java#L365-L371

[3]
https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/MultiTopicsConsumerImpl.java#L903-L923

[4]
https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L558-L560

[5]
https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/MultiTopicsConsumerImpl.java#L171-L193

Thanks,
Xiaoyu Hou

On Tue, Mar 7, 2023 at 15:43, Michael Marshall wrote:

> Thanks for the context Xiaoyu Hou and Asaf. I appreciate the
> efficiencies that we can gain by creating a specific implementation
> for the partitioned topic use case. I agree that this new notification
> system makes sense based on Pulsar's current features, and I have some
> implementation questions.
>
> >- If the broker sends notification and it's lost due network issues,
> > you'll only know about it due to the client doing constant polling, using
> > its hash to minimize response.
>
> I see that we implemented an ack mechanism to get around this. I
> haven't looked closely, but is there a reason that we couldn't also
> use this to