Re: [DISCUSS] PIP-261: Restructure Getting Started section

2023-03-29 Thread Yu
Hi Asaf,

Thanks for your great initiative!

To make the learning path accurate for all roles (beginners, developers,
operators) and give them what they need in minimal viable docs, I would
suggest making changes to information architecture, like creating 3 guides
(subpages of https://pulsar.apache.org/docs) for 3 roles respectively.

I've explained in detail with examples here [1], PTAL. Thank you!

[1] https://github.com/apache/pulsar/issues/19912#issuecomment-1489677523

On Thu, Mar 30, 2023 at 1:04 AM Asaf Mesika  wrote:

> I have only one reviewer so far.
> I would appreciate two more PMC members' eyes on this.
>
> Thanks!
>
> Asaf
>
> > On 23 Mar 2023, at 17:48, Asaf Mesika  wrote:
> >
> > Hi,
> >
> > In light of PIP-98, I would like to present a sub-PIP to restructure the
> Getting Started Section.
> >
> > https://github.com/apache/pulsar/issues/19912
> >
> >
> > The goal of this PIP is to describe what we want the Table Of Contents of
> the Getting Started section to look like. Using the TOC we’ll be able to
> rebuild this section to make it very easy to get started with Pulsar.
> >
> >
> > Would love to get your feedback on it.
> >
> >
> > Thanks!
> >
> > Asaf
> >
> >
>
>


Re: [DISCUSS] Cherry-pick #15121 into branch-2.10 to solve the issue sasl authentication failure

2023-03-29 Thread Yubiao Feng
There is no objection, and I will cherry-pick #15121 into branch-2.10 today

Thanks
Yubiao Feng

On Tue, Mar 28, 2023 at 7:52 PM Yubiao Feng 
wrote:

> Hi community
>
> ### Summary
> The Admin client (`pulsar-admin`) and Java Client (PulsarAdmin) will throw
> an Unauthorized exception in both of these scenarios:
> - If there is more than one broker in a cluster (see Issue 1 below).
> - If authentication is enabled for both Pulsar-Proxy and Pulsar-Broker
> (see Issue 2 below).
>
> ```
> bin/pulsar-admin topics stats persistent://public/default/tp1
> 2023-03-28T07:30:58,453+ [main] INFO
> org.apache.pulsar.client.impl.auth.AuthenticationSasl - JAAS loginContext
> is: PulsarAdmin.
> 2023-03-28T07:30:58,583+ [main] INFO
> org.apache.pulsar.common.sasl.JAASCredentialsContainer - successfully
> logged in.
> 2023-03-28T07:30:58,587+ [pulsar-tgt-refresh-thread] INFO
> org.apache.pulsar.common.sasl.TGTRefreshThread - TGT refresh thread started.
> 2023-03-28T07:30:58,612+ [pulsar-tgt-refresh-thread] INFO
> org.apache.pulsar.common.sasl.TGTRefreshThread - Client principal is "
> pulsar-ad...@sn.io".
> 2023-03-28T07:30:58,613+ [pulsar-tgt-refresh-thread] INFO
> org.apache.pulsar.common.sasl.TGTRefreshThread - Server principal is
> "krbtgt/sn...@sn.io".
> 2023-03-28T07:30:58,617+ [pulsar-tgt-refresh-thread] INFO
> org.apache.pulsar.common.sasl.TGTRefreshThread - TGT valid starting at:
> Tue Mar 28 07:30:58 UTC 2023
> 2023-03-28T07:30:58,617+ [pulsar-tgt-refresh-thread] INFO
> org.apache.pulsar.common.sasl.TGTRefreshThread - TGT expires:
> Wed Mar 29 07:30:58 UTC 2023
> 2023-03-28T07:30:58,617+ [pulsar-tgt-refresh-thread] INFO
> org.apache.pulsar.common.sasl.TGTRefreshThread - TGT refresh sleeping
> until: Wed Mar 29 03:12:29 UTC 2023
> 2023-03-28T07:30:59,861+ [main] INFO
> org.apache.pulsar.client.impl.auth.PulsarSaslClient - Using
> JAAS/SASL/GSSAPI auth to connect to server Principal broker/pulsar03,
> HTTP 401 Unauthorized
> Reason: HTTP 401 Unauthorized
> ```
>
> And I want to cherry-pick https://github.com/apache/pulsar/pull/15121
> into branch-2.10 to fix it.
>
> ### Background
> When using Kerberos for authentication, Pulsar works like this:
> - client: init ticket
> - request to broker
> - broker identifies the client (the broker can confirm the ticket is valid
> via Kerberos)
> - broker sends a token (we call it sasl_role_token) to the client (at this
> moment, the session is successfully created)
> - from then on, the client is authenticated through sasl_role_token and
> does not use Kerberos anymore.
>
> The `sasl_role_token` is generated by this logic: `Sha512(saslRoleName,
> ${secret})`. We call the `secret` sasl_sign_secret.
> In version `2.10.x`, the variable `secret` is a random string initialized
> when the broker starts.
>
> ### Issue 1
> If a cluster includes two brokers and the topic `public/default/tp1` is
> owned by broker-0, we get an error when we send `pulsar-admin topics
> stats public/default/tp1` to broker-1.
>
> The whole process goes like this:
> - client succeeds in authentication and gets a token from broker-1
> - broker-1 tells the client to redirect to broker-0
> - client request to broker-0 carries the sasl_role_token generated by
> broker-1
> - broker-0 cannot decode the sasl_role_token, because its secret differs
> from broker-1's, and responds with 401
>
> ### Issue 2
> After authentication is enabled for both Pulsar-Proxy and Pulsar-Broker,
> the error occurs as follows:
> - client succeeds in authentication and gets a token from Pulsar Proxy
> - proxy forwards the request to broker
> - the broker cannot decode the `sasl_role_token`, because its secret
> differs from Pulsar Proxy's, and responds with 401
>
> ### Solutions
> There are two ways to solve this issue:
>
> Solution 1
> - The client saves different tokens for different servers (e.g.
> ["broker-0", "broker-1", "pulsar-proxy"]) so that no server receives a
> token issued by another server; this fixes Issue 1.
> - Proxy and Broker do not enable authentication simultaneously; this
> fixes Issue 2.
>
> Solution 2
> - Make `sasl_sign_secret` configurable. Users can configure this variable
> to the same value on every server, so that multiple servers can decode every
> `sasl_role_token`. PR #15121 does this.
>
> I'd prefer Solution 2 because it is already in the master branch, so I
> want to cherry-pick #15121 into branch-2.10.
>
> ### Forward Compatibility
> In PR #15121, the config `sasl_sign_secret` is a new item in config files.
> Since it is required, users will get a system error if they do not set it. To
> ensure forward compatibility, we can make this variable optional in
> branch-2.10.
>
>
> Thanks
> Yubiao Feng
>
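
Note: the two issues and Solution 2 described above can be illustrated with a minimal sketch. It is not Pulsar code. The `Server` class, `sign_role` helper, the `role:digest` token layout, and the way the role and secret are concatenated are all assumptions made for illustration; only the idea of hashing the role name together with a per-server secret using SHA-512 comes from the mail.

```python
import hashlib
import hmac
import secrets


def sign_role(role: str, secret: str) -> str:
    """Return a token of the form '<role>:<sha512 hex digest>' (assumed layout)."""
    digest = hashlib.sha512(f"{role}:{secret}".encode("utf-8")).hexdigest()
    return f"{role}:{digest}"


class Server:
    """Hypothetical stand-in for a broker or proxy that issues and checks tokens."""

    def __init__(self, name, sign_secret=None):
        self.name = name
        # Without a configured secret, fall back to a random one per process,
        # which is the 2.10.x behaviour described in the mail.
        self.secret = sign_secret if sign_secret is not None else secrets.token_hex(32)

    def issue_token(self, role):
        return sign_role(role, self.secret)

    def accepts(self, token):
        # Recompute the token with this server's own secret and compare.
        role, _, _ = token.rpartition(":")
        expected = sign_role(role, self.secret)
        return hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))


# Issue 1: each broker has its own random secret, so a token issued by
# broker-1 is rejected by broker-0 after the redirect.
random_0, random_1 = Server("broker-0"), Server("broker-1")
token = random_1.issue_token("pulsar-admin")
print(random_1.accepts(token), random_0.accepts(token))

# Solution 2: every server is configured with the same sign secret, so any
# server can verify a token issued by any other.
shared_secret = "same-value-on-every-server"
shared_0, shared_1 = Server("broker-0", shared_secret), Server("broker-1", shared_secret)
print(shared_0.accepts(shared_1.issue_token("pulsar-admin")))
```

The same reasoning applies to Issue 2 if one of the two servers is read as the proxy and the other as the broker.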


Re: [Python] Should we make the schema default compatible with Java client?

2023-03-29 Thread 丛搏
Hi, Yunze :

1. The changes may cause some compatibility issues.
How do we solve them? It may be a
breaking change.

2. Another question: if sorting is enabled by default,
is the sorting rule the same as in the Java and other clients?

Putting aside the above two problems, I think it is
good to be consistent with other clients.

Thanks,
Bo

On Wed, Mar 29, 2023 at 22:42, Eric Hare wrote:
>
> +1 - i think keeping the `_sorted_fields` and `_required` defaults consistent 
> between the clients is the way to go.
>
> > On Mar 29, 2023, at 7:09 AM, Yunze Xu  wrote:
> >
> > I found the Python client has two options to control the behavior:
> > 1. Set `_sorted_fields`. It's false by default in the Python client,
> > but it's true in the Java client. i.e. the Java client sorts all
> > fields by default.
> > 2. Set `_required`. It's false by default for all types in the Python
> > client, but it's only false for the string type in the Java client.
> >
> > i.e. given the following Java class:
> >
> > ```java
> > class User {
> > >     String name;
> > >     int age;
> > >     double score;
> > }
> > ```
> >
> > We have to give the following definition in Python:
> >
> > ```python
> > class User(Record):
> > >     _sorted_fields = True
> > >     name = String()
> > >     age = Integer(required=True)
> > >     score = Double(required=True)
> > ```
> >
> > I see https://github.com/apache/pulsar/pull/12232 adds the
> > `_sorted_fields` field and disables the field sort by default. It
> > breaks compatibility with the Java client.
> >
> > IMO, we should make `_sorted_fields` true by default and `_required`
> > true for all types other than `String` by default.
> >
> > Thanks,
> > Yunze
> >
> > On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu  wrote:
> >>
> >> Hi all,
> >>
> >> Recently I found the default generated schema definition in the Python
> >> client is different from the Java client, which leads to some
> >> unexpected behavior.
> >>
> >> For example, given the following class definition in Python:
> >>
> >> ```python
> >> class Data(Record):
> > >>     i = Integer()
> >> ```
> >>
> >> The type of `i` field is a union: "type": ["null", "int"]
> >>
> >> While given the following class definition in Java:
> >>
> >> ```java
> >> class Data {
> > >>     private final int i;
> > >>     /* ... */
> >> }
> >> ```
> >>
> >> The type of `i` field is an integer: "type": "int"
> >>
> >> It brings an issue that if a Python consumer subscribes to a topic
> >> with schema defined above, then a Java producer will fail to create
> >> because of the schema incompatibility.
> >>
> >> Currently, the workaround is to change the schema compatibility
> >> strategy to FORWARD.
> >>
> >> Should we change the way to generate schema definition in the Python
> >> client to be compatible with the Java client? It could bring breaking
> >> changes to old Python clients, but it could guarantee compatibility
> >> with the Java client.
> >>
> >> If not, we still have to introduce an extra configuration to make
> >> Python schema compatible with Java schema. But it requires code
> >> changes. e.g. here is a possible solution:
> >>
> >> ```python
> >> class Data(Record):
> > >>     # NOTE: Users might have to add this extra field to control how to
> > >>     # generate the schema
> > >>     __java_compatible = True
> > >>     i = Integer()
> >> ```
> >>
> >> Thanks,
> >> Yunze
>


Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread 太上玄元道君
Congrats!!

Thanks,
Tao Jiuming

> On Mar 29, 2023, at 23:51, Devin Bost wrote:
> 
> Congrats!
> 
> Devin G. Bost
> 
> 
> On Wed, Mar 29, 2023 at 6:38 AM ZhangJian He  wrote:
> 
>> Congratulations!
>> 
>> Thanks
>> ZhangJian He
>> 
>> 
>> On Wed, 29 Mar 2023 at 19:33, Haiting Jiang 
>> wrote:
>> 
>>> Congratulations!
>>> 
>>> 
>>> Haiting
>>> 
>>> On Wed, Mar 29, 2023 at 5:29 PM Cong Zhao  wrote:
 
>>>> Congrats! Qiang.
>>>>
>>>>
>>>> Thanks,
>>>> Cong Zhao
>>>>
>>>> On 2023/03/29 03:22:43 guo jiwei wrote:
>>>>> Dear Community,
>>>>>
>>>>> We are thrilled to announce that Qiang Zhao
>>>>> (https://github.com/mattisonchao) has been invited and has accepted the
>>>>> role of member of the Apache Pulsar Project Management Committee (PMC).
>>>>>
>>>>> Qiang has been a vital asset to our community, consistently
>>>>> demonstrating his dedication and active participation through
>>>>> significant contributions. In addition to his technical contributions,
>>>>> Qiang also plays an important role in reviewing pull requests and
>>>>> ensuring the overall quality of our project. We look forward to his
>>>>> continued contributions.
>>>>>
>>>>> On behalf of the Pulsar PMC, we extend a warm welcome and
>>>>> congratulations to Qiang Zhao.
>>>>>
>>>>> Best regards
>>>>> Jiwei
>>>>>
>>> 
>> 



Re: [DISCUSS] PIP-261: Restructure Getting Started section

2023-03-29 Thread Asaf Mesika
I have only one reviewer so far.
I would appreciate two more PMC members' eyes on this.

Thanks!

Asaf

> On 23 Mar 2023, at 17:48, Asaf Mesika  wrote:
> 
> Hi,
> 
> In light of PIP-98, I would like to present a sub-PIP to restructure the
> Getting Started Section.
> 
> https://github.com/apache/pulsar/issues/19912
> 
> 
> The goal of this PIP is to describe what we want the Table Of Contents of the
> Getting Started section to look like. Using the TOC we’ll be able to rebuild
> this section to make it very easy to get started with Pulsar.
> 
> 
> Would love to get your feedback on it.
> 
> 
> Thanks!
> 
> Asaf
> 
> 



Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread Devin Bost
Congrats!

Devin G. Bost


On Wed, Mar 29, 2023 at 6:38 AM ZhangJian He  wrote:

> Congratulations!
>
> Thanks
> ZhangJian He
>
>
> On Wed, 29 Mar 2023 at 19:33, Haiting Jiang 
> wrote:
>
> > Congratulations!
> >
> >
> > Haiting
> >
> > On Wed, Mar 29, 2023 at 5:29 PM Cong Zhao  wrote:
> > >
> > > Congrats! Qiang.
> > >
> > >
> > > Thanks,
> > > Cong Zhao
> > >
> > > On 2023/03/29 03:22:43 guo jiwei wrote:
> > > > Dear Community,
> > > >
> > > > We are thrilled to announce that Qiang Zhao
> > > > (https://github.com/mattisonchao) has been invited and has accepted
> > the
> > > > role of member of the Apache Pulsar Project Management Committee
> (PMC).
> > > >
> > > > Qiang has been a vital asset to our community, consistently
> > > > demonstrating his dedication and active participation through
> > > > significant contributions. In addition to his technical
> contributions,
> > > > Qiang also plays an important role in reviewing pull requests and
> > > > ensuring the overall quality of our project. We look forward to his
> > > > continued contributions.
> > > >
> > > > On behalf of the Pulsar PMC, we extend a warm welcome and
> > > > congratulations to Qiang Zhao.
> > > >
> > > > Best regards
> > > > Jiwei
> > > >
> >
>


Re: [Python] Should we make the schema default compatible with Java client?

2023-03-29 Thread Eric Hare
+1 - i think keeping the `_sorted_fields` and `_required` defaults consistent 
between the clients is the way to go. 

> On Mar 29, 2023, at 7:09 AM, Yunze Xu  wrote:
> 
> I found the Python client has two options to control the behavior:
> 1. Set `_sorted_fields`. It's false by default in the Python client,
> but it's true in the Java client. i.e. the Java client sorts all
> fields by default.
> 2. Set `_required`. It's false by default for all types in the Python
> client, but it's only false for the string type in the Java client.
> 
> i.e. given the following Java class:
> 
> ```java
> class User {
>     String name;
>     int age;
>     double score;
> }
> ```
> 
> We have to give the following definition in Python:
> 
> ```python
> class User(Record):
>     _sorted_fields = True
>     name = String()
>     age = Integer(required=True)
>     score = Double(required=True)
> ```
> 
> I see https://github.com/apache/pulsar/pull/12232 adds the
> `_sorted_fields` field and disables the field sort by default. It
> breaks compatibility with the Java client.
> 
> IMO, we should make `_sorted_fields` true by default and `_required`
> true for all types other than `String` by default.
> 
> Thanks,
> Yunze
> 
> On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu  wrote:
>> 
>> Hi all,
>> 
>> Recently I found the default generated schema definition in the Python
>> client is different from the Java client, which leads to some
>> unexpected behavior.
>> 
>> For example, given the following class definition in Python:
>> 
>> ```python
>> class Data(Record):
>>     i = Integer()
>> ```
>> 
>> The type of `i` field is a union: "type": ["null", "int"]
>> 
>> While given the following class definition in Java:
>> 
>> ```java
>> class Data {
>>     private final int i;
>>     /* ... */
>> }
>> ```
>> 
>> The type of `i` field is an integer: "type": "int"
>> 
>> It brings an issue that if a Python consumer subscribes to a topic
>> with schema defined above, then a Java producer will fail to create
>> because of the schema incompatibility.
>> 
>> Currently, the workaround is to change the schema compatibility
>> strategy to FORWARD.
>> 
>> Should we change the way to generate schema definition in the Python
>> client to be compatible with the Java client? It could bring breaking
>> changes to old Python clients, but it could guarantee compatibility
>> with the Java client.
>> 
>> If not, we still have to introduce an extra configuration to make
>> Python schema compatible with Java schema. But it requires code
>> changes. e.g. here is a possible solution:
>> 
>> ```python
>> class Data(Record):
>>     # NOTE: Users might have to add this extra field to control how to
>>     # generate the schema
>>     __java_compatible = True
>>     i = Integer()
>> ```
>> 
>> Thanks,
>> Yunze



Re: [DISCUSS] PIP-260: Client consumer filter received messages

2023-03-29 Thread Yunze Xu
Thanks for your explanation. It now makes sense to me. So I suggest:
1. Document this use case in the PIP
2. Document in the API doc of this configuration what resetting the
cursor might lead to

Thanks,
Yunze
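
For reference, the client-side filtering discussed in this thread, and why it interacts badly with resetting the cursor, can be sketched roughly as follows. This is only an illustration, not the proposed implementation: `ReceivedMessageFilter` is a hypothetical name, and message IDs are modelled as plain `(ledger_id, entry_id)` tuples, which is a simplification.

```python
class ReceivedMessageFilter:
    """Hypothetical filter: drop any message whose ID is not greater than
    the last ID already handed to the application."""

    def __init__(self):
        self.last_received = None

    def accept(self, message_id):
        if self.last_received is not None and message_id <= self.last_received:
            return False  # duplicate or out-of-order redelivery
        self.last_received = message_id
        return True


# Message IDs modelled as (ledger_id, entry_id) tuples, an assumption of this sketch.
incoming = [(7, 0), (7, 1), (7, 1), (7, 0), (7, 2)]
message_filter = ReceivedMessageFilter()
delivered = [m for m in incoming if message_filter.accept(m)]
print(delivered)

# If the cursor is reset to an earlier position, the replayed messages are
# filtered out too, until the stream moves past the last received ID.
replayed = [(7, 0), (7, 1), (7, 2), (7, 3)]
after_reset = [m for m in replayed if message_filter.accept(m)]
print(after_reset)
```

The second half shows the documentation point above: with the option enabled, a reset to an earlier position delivers nothing until the consumer passes the last ID it had already received.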

On Wed, Mar 29, 2023 at 9:11 PM 丛搏  wrote:
>
> Hi, Yunze:
>
> > It's better to describe how it could bring the benefit to transaction
> > use cases, since now it's designed to be a configuration related to
> > the transaction.
> Sorry that I haven't explained in detail why the transaction needs it.
> Let's look at a simple example:
>
> ```
> Transaction txn = getTxn();
> int num = 0;
> MessageId messageId = null;
> while (num < 10) {
>     messageId = consumer.receive(5, TimeUnit.SECONDS).getMessageId();
>     producer.newMessage(txn).value(messageId.toString()).sendAsync();
>     num++;
> }
> // Ack within the transaction so that ack and produce commit atomically.
> consumer.acknowledgeCumulativeAsync(messageId, txn);
> txn.commit();
> ```
> This example shows a transaction making the ack and the produce of
> 10 messages atomic.
> If the messages we receive are duplicates, the messages we
> produce will also be duplicates. Therefore, we need to ensure that
> the messages we receive are not repeated and are ordered in
> failover and exclusive subscription modes. But the client consumer
> does not currently have this guarantee. And the guarantee must be exact;
> otherwise, it will break the exactly-once semantics.
>
>
> > With this proposal and the option enabled, all these cases will filter
> > the messages. That's why I think we have to consider the case for
> > resetting cursors because it makes things worse.
>
> Yes, this configuration may make the reset cursor more
> difficult to use, but without this configuration, it is difficult to guarantee
> the correctness of the transaction. Although we made the reset
> cursor worse, we ensured correctness.
>
> For transactions, we must first consider correctness, and only then
> which features to support (e.g. reset cursor).
>
> Thanks,
> Bo
> >
> > The three cases above do not involve transaction operations. So it
> > would be better to understand the benefit if you can show some typical
> > cases involved with transaction operations.
> >
> > Thanks,
> > Yunze
> >
> > On Wed, Mar 29, 2023 at 12:02 PM 丛搏  wrote:
> > >
> > > Hi, all :
> > >
> > > Thanks to everyone who discussed it.
> > >
> > > Our current concerns include the following aspects:
> > >
> > > 1. The filtering efficiency of the client consumer is not as
> > > good as doing something directly in startMessageId
> > > 2. It does not support reset cursor
> > >
> > > My previous PIP description proposed adding the configuration
> > > in consumerBuilder. The definition of that configuration is not
> > > clear, and it would cause great trouble to users.
> > >
> > > We can add a separate configuration that is only used for
> > > acks with transactions. Simple example:
> > >
> > > ```
> > > ConsumerBuilder 
> > > transactionConfiguration(ConsumerTransactionConfiguration);
> > >
> > > @Builder
> > > @Data
> > > @NoArgsConstructor
> > > @AllArgsConstructor
> > > @InterfaceAudience.Public
> > > @InterfaceStability.Stable
> > >
> > > public class ConsumerTransactionConfiguration {
> > > >     boolean isFilterReceivedMessagesEnabled = false;
> > > }
> > >
> > > ```
> > >
> > > If the startMessageId design can provide this feature,
> > > we can remove the configuration; and if there is already a closed-loop
> > > solution based on startMessageId, I agree to use startMessageId.
> > >
> > > As for the reset cursor, I think it is another problem,
> > > not related to this PIP.
> > >
> > > Thanks,
> > > Bo
> > >
> > > > On Fri, Mar 24, 2023 at 18:53, 丛搏 wrote:
> > > >
> > > > Hi, Michael:
> > > >
> > > > I thought about it carefully, and using 'startMessageId'
> > > > is indeed a good idea. But it is more complicated, we
> > > > need to ensure its absolute correctness, and take
> > > > performance into consideration. If you can come up
> > > >  with a closed-loop solution based on 'startMessageId',
> > > > I support you. If it can't take into account performance
> > > > and correctness, I think we will make a combination of
> > > > our two solutions. You are responsible for ensuring that
> > > > a certain degree of messages are not re-delivered, which
> > > >  reduces the overhead caused by the repeated delivery
> > > > of many messages. My design is responsible for
> > > > the final consistency.
> > > >
> > > > Thanks,
> > > > Bo
> > > >
> > > > > On Wed, Mar 22, 2023 at 14:22, Michael Marshall wrote:
> > > > >
> > > > > Because we already send the `startMessageId`, there is a chance where
> > > > > we might not even need to update the protocol for the
> > > > > CommandSubscribe. In light of that, I quickly put together a PR
> > > > > showing how that field might be used to inform the broker where to
> > > > > start the read position for the cursor.
> > > > >
> > > > > https://github.com/apache/pulsar/pull/19892
> > > > >
> > > > > The PR is not complete, but it does convey the general idea. I wrote
> > > > > additional 

Re: [DISCUSS] Change PIP template

2023-03-29 Thread Asaf Mesika
Bo, I need a review of the PR :)

On Wed, Mar 29, 2023 at 4:35 PM 丛搏  wrote:

> +1
>
> Good discussion!
>
> Thanks,
> Bo
>
> On Wed, Mar 29, 2023 at 20:11, Asaf Mesika wrote:
> >
> > So far only 1 PMC member has reviewed it.
> > Would any other PMC member like to review the new PIP template?
> >
> > On Wed, Mar 22, 2023 at 1:10 PM Asaf Mesika 
> wrote:
> >
> > > Can any other PMC member take a look at the new template PR?
> > > Ideally I would like to have 2-3 PMC member approval for this.
> > >
> > >
> > > On 17 Mar 2023, at 18:23, Michael Marshall 
> wrote:
> > >
> > > Thanks for this initiative, Asaf.
> > >
> > > As part of this process, I would like for us to add a security and a
> > > multi-tenancy section to the PIP template.
> > >
> > > As you suggest, the template conveys what the community values, and
> > > these two sections must always be considered when changing Pulsar in
> > > fundamental ways.
> > >
> > > (Thanks for already adding the security section to your template!)
> > >
> > > Thanks,
> > > Michael
> > >
> > > On Thu, Mar 16, 2023 at 2:58 AM Asaf Mesika 
> wrote:
> > >
> > >
> > > Here's the PR to remove the form and add a new issue template in
> Markdown
> > > containing the suggested structure and description for each section.
> > >
> > > https://github.com/apache/pulsar/pull/19832
> > >
> > >
> > > On Wed, Mar 1, 2023 at 3:43 PM Elliot West
> > >  wrote:
> > >
> > > +1 Asaf
> > >
> > > I'd also suggest that we encourage the submission of relevant diagrams.
> > > This is trivial to do with the GitHub markdown editor, but I suspect is
> > > often neglected because users do not know the feature exists.
> > >
> > > On Wed, 1 Mar 2023 at 13:22, Asaf Mesika 
> wrote:
> > >
> > > Ok.
> > >
> > > I'll draft a PR and link it here when I'm done. Thanks!
> > >
> > > On Tue, Feb 28, 2023 at 7:08 AM PengHui Li  wrote:
> > >
> > > +1
> > >
> > > Penghui
> > >
> > > On Mon, Feb 27, 2023 at 9:24 PM Asaf Mesika 
> > >
> > > wrote:
> > >
> > >
> > > Mails don't support things like markdown diagrams or images and are
> > > generally less easy to read.
> > > My proposal includes a required section called Links in which you need
> > > to fill in the discussion thread in DEV mailing list and vote thread.
> > >
> > >
> > > On Mon, Feb 27, 2023 at 3:08 PM Girish Sharma <scrapmachi...@gmail.com>
> > > wrote:
> > >
> > > Hi Asaf,
> > > I was referring to the PIP process, as a whole, as explained in
> > > https://github.com/apache/pulsar/blob/master/wiki/proposals/PIP.md
> > > Someone looking at the GitHub ticket would find an almost empty PIP GH
> > > issue while the same PIP has had many discussions over here in the ML.
> > > There is scope for improvement in the process where we either remove the
> > > first step to create the PIP over at GitHub and directly present the PIP
> > > in the first mail of the thread here, or we do all discussions in GH.
> > > Both the ML and GH are searchable and linkable for tracking purposes.
> > >
> > >
> > > Regards
> > >
> > > On Mon, Feb 27, 2023 at 6:23 PM Asaf Mesika wrote:
> > >
> > >
> > > On Sun, Feb 26, 2023 at 2:49 PM Girish Sharma <scrapmachi...@gmail.com>
> > > wrote:
> > >
> > > Good proposal Asaf.
> > > I've also wondered why the PIP creation and discussion process is so
> > > separated. The PIP discussion and voting starts off as a GitHub issue,
> > > but all of its discussion happens here on the mailing list. Is there
> > > scope of improvement in that process as well?
> > >
> > >
> > > Not sure I follow. Can you outline the problem exactly?
> > >
> > >
> > >
> > > Regards
> > >
> > > On Sun, Feb 26, 2023 at 6:16 PM tison wrote:
> > >
> > >
> > > Hi Asaf,
> > >
> > > I agree that, generally, a PIP is written as a whole and pasted as the
> > > body.
> > >
> > > So +1 for your proposal.
> > >
> > > Additionally, I'm thinking of moving the doc of procedure (wiki/PIP.md)
> > > to the contributions guide and use the new markdown template to
> > > supersede the wiki/PIP-template.md. Then we don't need to hold the wiki
> > > folder.
> > >
> > >
> > > It can be an extended version of your proposal, so let's keep on your
> > > proposal in this thread. Just for your reference.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > On Sun, Feb 26, 2023 at 19:18, Asaf Mesika wrote:
> > >
> > > Hi,
> > >
> > > I would like to suggest two changes I'd like to make to the PIP design
> > > template:
> > > 1. Remove the form - just have a markdown template 

Re: [Python] Should we make the schema default compatible with Java client?

2023-03-29 Thread Yunze Xu
I found the Python client has two options to control the behavior:
1. Set `_sorted_fields`. It's false by default in the Python client,
but it's true in the Java client. i.e. the Java client sorts all
fields by default.
2. Set `_required`. It's false by default for all types in the Python
client, but it's only false for the string type in the Java client.

i.e. given the following Java class:

```java
class User {
    String name;
    int age;
    double score;
}
```

We have to give the following definition in Python:

```python
class User(Record):
    _sorted_fields = True
    name = String()
    age = Integer(required=True)
    score = Double(required=True)
```

I see https://github.com/apache/pulsar/pull/12232 adds the
`_sorted_fields` field and disables the field sort by default. It
breaks compatibility with the Java client.

IMO, we should make `_sorted_fields` true by default and `_required`
true for all types other than `String` by default.
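
To make the effect of the two defaults concrete, here is a minimal sketch that builds a simplified Avro record definition by hand. It does not use the Pulsar client: `avro_record` is a hypothetical helper, sorting by field name is an assumption about the sort key, and real clients emit additional attributes that are omitted here.

```python
def avro_record(name, fields, sorted_fields):
    """Build a simplified Avro record schema.

    fields: list of (field_name, avro_type, required) tuples in declaration order.
    """
    ordered = sorted(fields, key=lambda f: f[0]) if sorted_fields else list(fields)
    return {
        "type": "record",
        "name": name,
        "fields": [
            # A non-required field becomes a union with "null".
            {"name": field_name, "type": avro_type if required else ["null", avro_type]}
            for field_name, avro_type, required in ordered
        ],
    }


# Current Python defaults: declaration order, nothing required.
python_defaults = avro_record(
    "User",
    [("name", "string", False), ("age", "int", False), ("score", "double", False)],
    sorted_fields=False,
)

# Java-like defaults as described above: sorted fields, only the string optional.
java_like = avro_record(
    "User",
    [("name", "string", False), ("age", "int", True), ("score", "double", True)],
    sorted_fields=True,
)

print([f["name"] for f in python_defaults["fields"]])
print([f["name"] for f in java_like["fields"]])
print(python_defaults == java_like)
```

The two definitions differ both in field order and in which fields are nullable unions, which is why the explicit `_sorted_fields = True` and `required=True` settings are needed today.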

Thanks,
Yunze

On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu  wrote:
>
> Hi all,
>
> Recently I found the default generated schema definition in the Python
> client is different from the Java client, which leads to some
> unexpected behavior.
>
> For example, given the following class definition in Python:
>
> ```python
> class Data(Record):
>     i = Integer()
> ```
>
> The type of `i` field is a union: "type": ["null", "int"]
>
> While given the following class definition in Java:
>
> ```java
> class Data {
>     private final int i;
>     /* ... */
> }
> ```
>
> The type of `i` field is an integer: "type": "int"
>
> It brings an issue that if a Python consumer subscribes to a topic
> with schema defined above, then a Java producer will fail to create
> because of the schema incompatibility.
>
> Currently, the workaround is to change the schema compatibility
> strategy to FORWARD.
>
> Should we change the way to generate schema definition in the Python
> client to be compatible with the Java client? It could bring breaking
> changes to old Python clients, but it could guarantee compatibility
> with the Java client.
>
> If not, we still have to introduce an extra configuration to make
> Python schema compatible with Java schema. But it requires code
> changes. e.g. here is a possible solution:
>
> ```python
> class Data(Record):
>     # NOTE: Users might have to add this extra field to control how to
>     # generate the schema
>     __java_compatible = True
>     i = Integer()
> ```
>
> Thanks,
> Yunze
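
The incompatibility in the quoted mail, and why switching the strategy to FORWARD works around it, can be sketched with a deliberately simplified compatibility check. This is not the Avro or Pulsar implementation: `can_read` and `is_compatible` are hypothetical helpers that only compare the branches of a single field type and ignore type promotion and defaults.

```python
def branches(avro_type):
    """Treat a plain type as a one-branch union."""
    return set(avro_type) if isinstance(avro_type, list) else {avro_type}


def can_read(reader_type, writer_type):
    """Simplified rule: the reader must accept every branch the writer may emit."""
    return branches(writer_type) <= branches(reader_type)


def is_compatible(strategy, existing_type, new_type):
    backward = can_read(new_type, existing_type)  # new schema reads old data
    forward = can_read(existing_type, new_type)   # old schema reads new data
    return {
        "BACKWARD": backward,
        "FORWARD": forward,
        "FULL": backward and forward,
    }[strategy]


python_type = ["null", "int"]  # registered first by the Python consumer
java_type = "int"              # then offered by the Java producer

for strategy in ("BACKWARD", "FORWARD", "FULL"):
    print(strategy, is_compatible(strategy, python_type, java_type))
```

Under this simplified rule the Java `"int"` schema cannot read data that may contain `null`, so any strategy that requires backward compatibility rejects it, while FORWARD only asks that the existing union can read the new `int` data.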


Re: [DISCUSS] Change PIP template

2023-03-29 Thread 丛搏
+1

Good discussion!

Thanks,
Bo

On Wed, Mar 29, 2023 at 20:11, Asaf Mesika wrote:
>
> So far only 1 PMC member has reviewed it.
> Would any other PMC member like to review the new PIP template?
>
> On Wed, Mar 22, 2023 at 1:10 PM Asaf Mesika  wrote:
>
> > Can any other PMC member take a look at the new template PR?
> > Ideally I would like to have 2-3 PMC member approval for this.
> >
> >
> > On 17 Mar 2023, at 18:23, Michael Marshall  wrote:
> >
> > Thanks for this initiative, Asaf.
> >
> > As part of this process, I would like for us to add a security and a
> > multi-tenancy section to the PIP template.
> >
> > As you suggest, the template conveys what the community values, and
> > these two sections must always be considered when changing Pulsar in
> > fundamental ways.
> >
> > (Thanks for already adding the security section to your template!)
> >
> > Thanks,
> > Michael
> >
> > On Thu, Mar 16, 2023 at 2:58 AM Asaf Mesika  wrote:
> >
> >
> > Here's the PR to remove the form and add a new issue template in Markdown
> > containing the suggested structure and description for each section.
> >
> > https://github.com/apache/pulsar/pull/19832
> >
> >
> > On Wed, Mar 1, 2023 at 3:43 PM Elliot West
> >  wrote:
> >
> > +1 Asaf
> >
> > I'd also suggest that we encourage the submission of relevant diagrams.
> > This is trivial to do with the GitHub markdown editor, but I suspect is
> > often neglected because users do not know the feature exists.
> >
> > On Wed, 1 Mar 2023 at 13:22, Asaf Mesika  wrote:
> >
> > Ok.
> >
> > I'll draft a PR and link it here when I'm done. Thanks!
> >
> > On Tue, Feb 28, 2023 at 7:08 AM PengHui Li  wrote:
> >
> > +1
> >
> > Penghui
> >
> > On Mon, Feb 27, 2023 at 9:24 PM Asaf Mesika 
> >
> > wrote:
> >
> >
> > Mails don't support things like markdown diagrams or images and are
> > generally less easy to read.
> > My proposal includes a required section called Links in which you
> >
> > need
> >
> > to
> >
> > fill in the discussion thread in DEV mailing list and vote thread.
> >
> >
> > On Mon, Feb 27, 2023 at 3:08 PM Girish Sharma <
> >
> > scrapmachi...@gmail.com
> >
> >
> > wrote:
> >
> > Hi Asaf,
> > I was referring to the PIP process, as a whole, as explained in
> > https://github.com/apache/pulsar/blob/master/wiki/proposals/PIP.md
> > Someone looking at GitHub ticket would find and almost empty PIP GH
> >
> > issue
> >
> > while the same PIP has had many discussions over here in the ML.
> > There is scope of improvement in the process where we either remove
> >
> > the
> >
> > first step to create the PIP over at GitHub and directly present
> >
> > the
> >
> > PIP
> >
> > in
> >
> > the first mail of the thread here, or we do all discussions in GH.
> > Both the ML and GH are searchable and linkable for tracking
> >
> > purposes.
> >
> >
> > Regards
> >
> > On Mon, Feb 27, 2023 at 6:23 PM Asaf Mesika  >
> >
> > wrote:
> >
> >
> > On Sun, Feb 26, 2023 at 2:49 PM Girish Sharma <
> >
> > scrapmachi...@gmail.com
> >
> >
> > wrote:
> >
> > Good proposal Asaf.
> > I've also wondered why the PIP creation and discussion process
> >
> > is
> >
> > so
> >
> > separated. The PIP discussion and voting starts off as a GitHub
> >
> > issue,
> >
> > but
> >
> > all of its discussion happens here on the mailing list. Is
> >
> > there
> >
> > scope
> >
> > of
> >
> > improvement in that process as well?
> >
> >
> > Not sure I follow. Can you outline the problem exactly?
> >
> >
> >
> > Regards
> >
> > On Sun, Feb 26, 2023 at 6:16 PM tison wrote:
> >
> > Hi Asaf,
> >
> > I agree that, generally, a PIP is written as a whole and pasted as the
> > body.
> >
> > So +1 for your proposal.
> >
> > Additionally, I'm thinking of moving the doc of procedure (wiki/PIP.md) to
> > the contributions guide and use the new markdown template to supersede the
> > wiki/PIP-template.md. Then we don't need to hold the wiki folder.
> >
> > It can be an extended version to your proposal, so let's keep on your
> > proposal in this thread. Just for your reference.
> >
> > Best,
> > tison.
> >
> >
> > On Sun, Feb 26, 2023 at 19:18, Asaf Mesika wrote:
> >
> > Hi,
> >
> > I would like to suggest two changes I'd like to make to the PIP design
> > template:
> > 1. Remove the form - just have a markdown template fill the issue body as
> > it is created.
> > 2. Change the PIP template structure
> >
> > == Removing the form
> >
> > Today, when you want to submit a PIP, you are required to fill out a form
> > with boxes composed of 3-4 lines length.
> > It's not good because:
> > * It broadcasts to the author: we want a very small PIP, something that
> > fits those small boxes.
> > * It makes the PIP look like a bug, where you fill out fields.
> > * It doesn't allow having H2

Re: [DISCUSS] PIP-260: Client consumer filter received messages

2023-03-29 Thread 丛搏
Hi, Yunze:

> It's better to describe how it could bring the benefit to transaction
> use cases, since now it's designed to be a configuration related to
> the transaction.
Sorry, I haven't explained in detail why the transaction needs it.
Let's look at a simple example:

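```
Transaction txn = getTxn(); // getTxn(): assumed helper returning an open transaction
int num = 0;
MessageId messageId = null;
while (num < 10) {
    // Consume one message and produce its ID within the same transaction.
    messageId = consumer.receive(5, TimeUnit.SECONDS).getMessageId();
    producer.newMessage(txn).value(messageId.toString()).sendAsync();
    num++;
}
// Acknowledge all 10 messages cumulatively as part of the transaction.
consumer.acknowledgeCumulativeAsync(messageId, txn);
txn.commit();
```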
This example shows a transaction that makes the ack and the produce of
10 messages atomic.
If the messages we receive are duplicates, the messages we
produce will also be duplicates. Therefore, we need to ensure that
the messages we receive are not repeated and are ordered in
failover and exclusive subscription modes. The client consumer
does not currently provide this guarantee. The guarantee must be exact;
otherwise, it breaks the exactly-once semantics.
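
To make the failure mode concrete, here is a minimal sketch in plain Python. It does not use the Pulsar client API; the `ReceivedFilter` class, the `consume_and_produce` helper, and the modelling of a message ID as a `(ledger_id, entry_id)` tuple are all assumptions made only for illustration. It shows how a redelivery after a reconnect turns into duplicate produced messages, and how dropping IDs at or before the last received one avoids that on an ordered (exclusive or failover) subscription.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical stand-in for a message ID: (ledger_id, entry_id).
# Tuples compare lexicographically, which matches ordered delivery here.
MessageId = Tuple[int, int]


class ReceivedFilter:
    """Drops any message at or before the newest message already handed out."""

    def __init__(self) -> None:
        self.last_received: Optional[MessageId] = None

    def accept(self, message_id: MessageId) -> bool:
        if self.last_received is not None and message_id <= self.last_received:
            return False  # already seen: a redelivered duplicate
        self.last_received = message_id
        return True


def consume_and_produce(delivered: Iterable[MessageId],
                        flt: Optional[ReceivedFilter] = None) -> List[str]:
    """Models the consume-then-produce loop: one output per accepted input."""
    produced = []
    for message_id in delivered:
        if flt is not None and not flt.accept(message_id):
            continue
        produced.append(f"out-for-{message_id}")
    return produced


# Entries 1 and 2 are delivered again, as could happen after a reconnect.
delivered = [(1, 0), (1, 1), (1, 2), (1, 1), (1, 2), (1, 3)]

unfiltered = consume_and_produce(delivered)
filtered = consume_and_produce(delivered, ReceivedFilter())

print(len(unfiltered), len(filtered))
```

Without the filter, every redelivered input yields another output, so the produced side contains duplicates; with the filter, each input is handled once.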


> With this proposal and the option enabled, all these cases will filter
> the messages. That's why I think we have to consider the case for
> resetting cursors because it makes things worse.

Yes, this configuration may make reset cursor more difficult to use,
but without it, it is difficult to guarantee the correctness of the
transaction. Although it makes reset cursor worse, it ensures
correctness.

For transactions, we must first consider correctness, and only then
which features to support (e.g. reset cursor).

Thanks,
Bo
>
> The three cases above do not involve transaction operations. So it
> would be better to understand the benefit if you can show some typical
> cases involved with transaction operations.
>
> Thanks,
> Yunze
>
> On Wed, Mar 29, 2023 at 12:02 PM 丛搏  wrote:
> >
> > Hi, all :
> >
> > Thanks to everyone who discussed it.
> >
> > Our current care points include the following aspects:
> >
> > 1. The filtering efficiency of the client consumer is not as
> > good as doing something directly in startMessageId
> > 2. Does not support reset cursor
> >
> > Because my previous PIP description is to add configuration
> > in consumerBuilder. The definition of this configuration is not
> > clear, and it will cause great trouble to users.
> >
> > We can add a separate configuration that is only used for
> > acks with transactions. Simple example:
> >
> > ```
> > ConsumerBuilder transactionConfiguration(ConsumerTransactionConfiguration);
> >
> > @Builder
> > @Data
> > @NoArgsConstructor
> > @AllArgsConstructor
> > @InterfaceAudience.Public
> > @InterfaceStability.Stable
> > public class ConsumerTransactionConfiguration {
> >     boolean isFilterReceivedMessagesEnabled = false;
> > }
> >
> > ```
> >
> > if the design of startMessageId can provide the feature,
> > we can remove the configuration, or currently has a startMessageId
> > closed loop solution, I agree to use startMessageId.
> >
> > As for the reset cursor, I think it is another problem,
> > not related to this PIP.
> >
> > Thanks,
> > Bo
> >
> > On Fri, Mar 24, 2023 at 18:53, 丛搏 wrote:
> > >
> > > Hi, Michael:
> > >
> > > I thought about it carefully, and using 'startMessageId'
> > > is indeed a good idea. But it is more complicated, we
> > > need to ensure its absolute correctness, and take
> > > performance into consideration. If you can come up
> > >  with a closed-loop solution based on 'startMessageId',
> > > I support you. If it can't take into account performance
> > > and correctness, I think we will make a combination of
> > > our two solutions. You are responsible for ensuring that
> > > a certain degree of messages are not re-delivered, which
> > >  reduces the overhead caused by the repeated delivery
> > > of many messages. My design is responsible for
> > > the final consistency.
> > >
> > > Thanks,
> > > Bo
> > >
> > > On Wed, Mar 22, 2023 at 14:22, Michael Marshall wrote:
> > > >
> > > > Because we already send the `startMessageId`, there is a chance where
> > > > we might not even need to update the protocol for the
> > > > CommandSubscribe. In light of that, I quickly put together a PR
> > > > showing how that field might be used to inform the broker where to
> > > > start the read position for the cursor.
> > > >
> > > > https://github.com/apache/pulsar/pull/19892
> > > >
> > > > The PR is not complete, but it does convey the general idea. I wrote
> > > > additional details in the draft's description.
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > > On Tue, Mar 21, 2023 at 11:31 PM Michael Marshall 
> > > >  wrote:
> > > > >
> > > > > I am not following your objections to the protocol solution. It might
> > > > > be more productive if I provided a draft PR with a sample
> > > > > implementation. I'm not sure that I'll have time, but I'll try to put
> > > > > something together this week.
> > > > >
> > > > > > At least it will simplify the process of using cumulative ack 

A Message from the Board to PMC members

2023-03-29 Thread Rich Bowen
Dear Apache Project Management Committee (PMC) members,

The Board wants to take just a moment of your time to communicate a few
things that seem to have been forgotten by a number of PMC members,
across the Foundation, over the past few years.  Please note that this
is being sent to all projects - yours has not been singled out.

The Project Management Committee (PMC) as a whole[1] is tasked with the
oversight, health, and sustainability of the project. The PMC members
are responsible collectively, and individually, for ensuring that the
project operates in a way that is in line with ASF philosophy, and in a
way that serves the developers and users of the project.

The PMC Chair is not the project leader, in any sense. It is the person
who files board reports and makes sure they are delivered on time. It
is the secretary for the project, and the project’s ambassador to the
Board of Directors. The VP title is given as an artifact of US
corporate law, and not because the PMC Chair has any special powers. If
you are treating your PMC Chair as the project lead, or granting them
any other special powers or privileges, you need to be aware that
that’s not the intent of the Chair role. The Chair is a PMC member peer
with a few extra duties.

Every PMC member has an equal voice in deliberations. Each has one
vote. Each has veto power. Every vote weighs the same. It is not only
your right, but it is your obligation, to use that vote for the good of
the project and its users, not to appease the Chair, your employer, or
any other voice in the project. 

Every PMC member can, and should, nominate new committers, and new PMC
members. This is not the sole domain of the PMC Chair. This might be
your most important responsibility to the project, as succession
planning is the path to sustainability.

Every PMC member can, and should, respond when the Board sends email to
your private list. You should not wait for the PMC Chair to respond.
The Board views the entire PMC as responsible for the project, not just
one member.

Every PMC member should be subscribed to the private@ mailing list. If
you are not, then you are neglecting your duty of oversight. If you no
longer wish to be responsible for oversight of the project, you should
resign your PMC seat, not merely drop off of the private@ list and
ignore it. You can determine which PMC members are not subscribed to
your private list by looking at your PMC roster at
https://whimsy.apache.org/roster/committee/  Names with an asterisk (*)
next to them are not subscribed to the list. We encourage you to take a
moment to contact them with this information.

Thank you for your attention to these matters, and thank you for
keeping our projects healthy.

Rich, for The Board of Directors

[1] https://apache.org/foundation/how-it-works.html#pmc-members



Re: [DISCUSS] Change PIP template

2023-03-29 Thread Asaf Mesika
So far only one PMC member has reviewed it.
Would any other PMC member like to review the new PIP template?

On Wed, Mar 22, 2023 at 1:10 PM Asaf Mesika  wrote:

> Can any other PMC member take a look at the new template PR?
> Ideally I would like to have approval from 2-3 PMC members for this.
>
>
> On 17 Mar 2023, at 18:23, Michael Marshall  wrote:
>
> Thanks for this initiative, Asaf.
>
> As part of this process, I would like for us to add a security and a
> multi-tenancy section to the PIP template.
>
> As you suggest, the template conveys what the community values, and
> these two sections must always be considered when changing Pulsar in
> fundamental ways.
>
> (Thanks for already adding the security section to your template!)
>
> Thanks,
> Michael
>
> On Thu, Mar 16, 2023 at 2:58 AM Asaf Mesika  wrote:
>
>
> Here's the PR to remove the form and add a new issue template in Markdown
> containing the suggested structure and description for each section.
>
> https://github.com/apache/pulsar/pull/19832
>
>
> On Wed, Mar 1, 2023 at 3:43 PM Elliot West
>  wrote:
>
> +1 Asaf
>
> I'd also suggest that we encourage the submission of relevant diagrams.
> This is trivial to do with the GitHub markdown editor, but I suspect is
> often neglected because users do not know the feature exists.
>
> On Wed, 1 Mar 2023 at 13:22, Asaf Mesika  wrote:
>
> Ok.
>
> I'll draft a PR and link it here when I'm done. Thanks!
>
> On Tue, Feb 28, 2023 at 7:08 AM PengHui Li  wrote:
>
> +1
>
> Penghui
>
> On Mon, Feb 27, 2023 at 9:24 PM Asaf Mesika 
>
> wrote:
>
>
> Mails don't support things like markdown diagrams or images and are
> generally less easy to read.
> My proposal includes a required section called Links in which you need to
> fill in the discussion thread in DEV mailing list and vote thread.
>
>
> On Mon, Feb 27, 2023 at 3:08 PM Girish Sharma <scrapmachi...@gmail.com>
> wrote:
>
> Hi Asaf,
> I was referring to the PIP process, as a whole, as explained in
> https://github.com/apache/pulsar/blob/master/wiki/proposals/PIP.md
> Someone looking at GitHub ticket would find an almost empty PIP GH issue
> while the same PIP has had many discussions over here in the ML.
> There is scope of improvement in the process where we either remove the
> first step to create the PIP over at GitHub and directly present the PIP in
> the first mail of the thread here, or we do all discussions in GH.
> Both the ML and GH are searchable and linkable for tracking purposes.
>
>
> Regards
>
> On Mon, Feb 27, 2023 at 6:23 PM Asaf Mesika wrote:
>
> On Sun, Feb 26, 2023 at 2:49 PM Girish Sharma <scrapmachi...@gmail.com>
> wrote:
>
> Good proposal Asaf.
> I've also wondered why the PIP creation and discussion process is so
> separated. The PIP discussion and voting starts off as a GitHub issue, but
> all of its discussion happens here on the mailing list. Is there scope of
> improvement in that process as well?
>
>
> Not sure I follow. Can you outline the problem exactly?
>
>
>
> Regards
>
> On Sun, Feb 26, 2023 at 6:16 PM tison wrote:
>
> Hi Asaf,
>
> I agree that, generally, a PIP is written as a whole and pasted as the
> body.
>
> So +1 for your proposal.
>
> Additionally, I'm thinking of moving the doc of procedure (wiki/PIP.md) to
> the contributions guide and use the new markdown template to supersede the
> wiki/PIP-template.md. Then we don't need to hold the wiki folder.
>
> It can be an extended version to your proposal, so let's keep on your
> proposal in this thread. Just for your reference.
>
> Best,
> tison.
>
>
> On Sun, Feb 26, 2023 at 19:18, Asaf Mesika wrote:
>
> Hi,
>
> I would like to suggest two changes I'd like to make to the PIP design
> template:
> 1. Remove the form - just have a markdown template fill the issue body as
> it is created.
> 2. Change the PIP template structure
>
> == Removing the form
>
> Today, when you want to submit a PIP, you are required to fill out a form
> with boxes composed of 3-4 lines length.
> It's not good because:
> * It broadcasts to the author: we want a very small PIP, something that
> fits those small boxes.
> * It makes the PIP look like a bug, where you fill out fields.
> * It doesn't allow having H2 headings, only H1 headings, thus limiting the
> structure.
>
> A PIP is a design essentially, something 1-3 pages long. Thus,
> people take the time to write it down. Preferably, they copy paste the body
> of the PIP issue, and use it to fill in sections.
>
> My suggestion is to define an issue template using only markdown, without a
> form.
>
> == Changing PIP Structure
>
> Today the structure of the PIP doc (pasted below), is missing a section and
> generally aims to jump directly into API changes /

Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread ZhangJian He
Congratulations!

Thanks
ZhangJian He


On Wed, 29 Mar 2023 at 19:33, Haiting Jiang  wrote:

> Congratulations!
>
>
> Haiting
>
> On Wed, Mar 29, 2023 at 5:29 PM Cong Zhao  wrote:
> >
> > Congrats! Qiang.
> >
> >
> > Thanks,
> > Cong Zhao
> >
> > On 2023/03/29 03:22:43 guo jiwei wrote:
> > > Dear Community,
> > >
> > > We are thrilled to announce that Qiang Zhao
> > > > (https://github.com/mattisonchao) has been invited and has accepted the
> > > > role of member of the Apache Pulsar Project Management Committee (PMC).
> > >
> > > Qiang has been a vital asset to our community, consistently
> > > demonstrating his dedication and active participation through
> > > significant contributions. In addition to his technical contributions,
> > > Qiang also plays an important role in reviewing pull requests and
> > > ensuring the overall quality of our project. We look forward to his
> > > continued contributions.
> > >
> > > On behalf of the Pulsar PMC, we extend a warm welcome and
> > > congratulations to Qiang Zhao.
> > >
> > > Best regards
> > > Jiwei
> > >
>


Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread Haiting Jiang
Congratulations!


Haiting

On Wed, Mar 29, 2023 at 5:29 PM Cong Zhao  wrote:
>
> Congrats! Qiang.
>
>
> Thanks,
> Cong Zhao
>
> On 2023/03/29 03:22:43 guo jiwei wrote:
> > Dear Community,
> >
> > We are thrilled to announce that Qiang Zhao
> > (https://github.com/mattisonchao) has been invited and has accepted the
> > role of member of the Apache Pulsar Project Management Committee (PMC).
> >
> > Qiang has been a vital asset to our community, consistently
> > demonstrating his dedication and active participation through
> > significant contributions. In addition to his technical contributions,
> > Qiang also plays an important role in reviewing pull requests and
> > ensuring the overall quality of our project. We look forward to his
> > continued contributions.
> >
> > On behalf of the Pulsar PMC, we extend a warm welcome and
> > congratulations to Qiang Zhao.
> >
> > Best regards
> > Jiwei
> >


Re: Unstable codecov action

2023-03-29 Thread Lari Hotari
Update regarding Codecov improvements for apache/pulsar CI:
- Fixed an issue with JaCoCo coverage data not getting stored in files:
  https://github.com/apache/pulsar/pull/19947
  This seems to have been a broader issue, since the reported total code
  coverage increased to about 72.8% with this fix. Example:
  https://app.codecov.io/gh/apache/pulsar/pull/19947/tree .

A workaround for the Codecov upload issue is in progress. More details are in
the comment
https://github.com/apache/pulsar/issues/19952#issuecomment-1487997039 .
This is waiting for ASF Infra to resolve
https://issues.apache.org/jira/browse/INFRA-24399 .

After this, I believe that Codecov will be reasonably stable in our CI.
Individuals will need to add a Codecov upload token for builds in personal
forks. I'll add instructions for that while resolving #19952.

-Lari

On 2023/03/24 09:51:51 Lari Hotari wrote:
> Thanks for sharing the pain. That's the first step in improving something 
> that is painful.
> 
> For the flaky tests GitHub Actions workflow pulsar-ci-flaky.yaml, the Codecov 
> upload should be a separate job in the workflow so that the upload could be 
> retried separately without running all tests. This type of approach is 
> already used in the main GitHub Actions workflow, "Pulsar CI".
> Contributions are welcome!
> 
> We could also consider disabling codecov for pull request builds until 
> someone who cares about test code coverage metrics picks up the work. 
> 
> Code coverage is the first metric that most will ask about tests. It's not
> the only metric that matters, but it is something that helps understand what
> parts of the code aren't even run in our tests. It will also help plan
> improvements to tests.
> 
> Codecov upload fails very frequently with errors such as 
> https://github.com/codecov/codecov-action/issues/837 and 
> https://github.com/codecov/codecov-action/issues/598
> One possible resolution is 
> https://community.codecov.com/t/upload-issues-unable-to-locate-build-via-github-actions-api/3954
>  .
> It's possible to make the codecov upload more stable by providing a token. 
> This should be done for the master branch build so that the baseline code 
> coverage metrics would succeed. For pull requests, the solution is to make 
> the codecov upload retryable also in pulsar-ci-flaky.yaml. In addition, it 
> could be made optional for builds in own forks.
> 
> We should find a way as a development community to get code coverage metrics 
> solution working. It is valuable even if an individual developer doesn't care 
> about it at the moment.
> We need more Pulsar contributors who care about the quality aspects of our
> code base to step up. Any volunteers?
> 
> -Lari
> 
> On 2023/03/21 10:50:17 tison wrote:
> > For example
> > https://github.com/apache/pulsar/actions/runs/4454158774/jobs/7867745340?pr=19842
> > 
> > I'm wondering if anyone cares about the report and if it helps you during
> > the coding or reviewing process? Right now it generates quite a bit of
> > noise, and I just ignore the report it gives ;-)
> > 
> > For the issue itself, it seems some artifacts don't retain properly.
> > 
> > Best,
> > tison.
> > 
> 


Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread Cong Zhao
Congrats! Qiang.


Thanks,
Cong Zhao

On 2023/03/29 03:22:43 guo jiwei wrote:
> Dear Community,
> 
> We are thrilled to announce that Qiang Zhao
> (https://github.com/mattisonchao) has been invited and has accepted the
> role of member of the Apache Pulsar Project Management Committee (PMC).
> 
> Qiang has been a vital asset to our community, consistently
> demonstrating his dedication and active participation through
> significant contributions. In addition to his technical contributions,
> Qiang also plays an important role in reviewing pull requests and
> ensuring the overall quality of our project. We look forward to his
> continued contributions.
> 
> On behalf of the Pulsar PMC, we extend a warm welcome and
> congratulations to Qiang Zhao.
> 
> Best regards
> Jiwei
> 


Re: [VOTE] PIP-254: Support configuring client version with a description suffix

2023-03-29 Thread Enrico Olivelli
+1 (binding)

Enrico

On Wed, Mar 29, 2023 at 09:16, 丛搏 wrote:
>
> +1 (binding)
>
> Thanks,
> Bo
>
> On Mon, Mar 27, 2023 at 17:49, Lin Lin wrote:
> >
> > +1
> >
> > Thanks,
> > Lin Lin
> >
> > On 2023/03/15 07:54:20 Yunze Xu wrote:
> > > Hi all,
> > >
> > > This thread is to start the vote for PIP-254.
> > >
> > > Discussion thread:
> > > https://lists.apache.org/thread/65cf7w76tt23sbsjnr8rpfxqf1nt9s9l
> > >
> > > PIP link: https://github.com/apache/pulsar/issues/19705
> > >
> > > Thanks,
> > > Yunze
> > >


Re: [DISCUSS] PIP-260: Client consumer filter received messages

2023-03-29 Thread Yunze Xu
It would be better to describe how this benefits transaction use cases,
since it is now designed as a configuration related to the
transaction.

I've thought about some cases:
1. A consumer received N messages, then the cursor was reset to the earliest.
2. A consumer received N messages and acknowledged all of them, then
the cursor was reset to the earliest.
2.1 The acknowledgment is flushed
2.2 The acknowledgment is not flushed

Without this proposal, only 2.2 will filter all these messages because
the MessageIds are cached in
`PersistentAcknowledgmentsGroupingTracker#pendingIndividualAcks`. It's
an existing bug and I agree that we can discuss how to solve this in
another proposal. (e.g. distinguish the normal network issue and
cursor reset)

With this proposal and the option enabled, all these cases will filter
the messages. That's why I think we have to consider the case for
resetting cursors because it makes things worse.

The three cases above do not involve transaction operations. It would
be easier to understand the benefit if you could show some typical
cases that involve transaction operations.

Thanks,
Yunze

On Wed, Mar 29, 2023 at 12:02 PM 丛搏  wrote:
>
> Hi, all :
>
> Thanks to everyone who discussed it.
>
> Our current care points include the following aspects:
>
> 1. The filtering efficiency of the client consumer is not as
> good as doing something directly in startMessageId
> 2. Does not support reset cursor
>
> Because my previous PIP description is to add configuration
> in consumerBuilder. The definition of this configuration is not
> clear, and it will cause great trouble to users.
>
> We can add a separate configuration that is only used for
> acks with transactions. Simple example:
>
> ```
> ConsumerBuilder transactionConfiguration(ConsumerTransactionConfiguration);
>
> @Builder
> @Data
> @NoArgsConstructor
> @AllArgsConstructor
> @InterfaceAudience.Public
> @InterfaceStability.Stable
> public class ConsumerTransactionConfiguration {
>     boolean isFilterReceivedMessagesEnabled = false;
> }
>
> ```
>
> if the design of startMessageId can provide the feature,
> we can remove the configuration, or currently has a startMessageId
> closed loop solution, I agree to use startMessageId.
>
> As for the reset cursor, I think it is another problem,
> not related to this PIP.
>
> Thanks,
> Bo
>
> On Fri, Mar 24, 2023 at 18:53, 丛搏 wrote:
> >
> > Hi, Michael:
> >
> > I thought about it carefully, and using 'startMessageId'
> > is indeed a good idea. But it is more complicated, we
> > need to ensure its absolute correctness, and take
> > performance into consideration. If you can come up
> >  with a closed-loop solution based on 'startMessageId',
> > I support you. If it can't take into account performance
> > and correctness, I think we will make a combination of
> > our two solutions. You are responsible for ensuring that
> > a certain degree of messages are not re-delivered, which
> >  reduces the overhead caused by the repeated delivery
> > of many messages. My design is responsible for
> > the final consistency.
> >
> > Thanks,
> > Bo
> >
> > On Wed, Mar 22, 2023 at 14:22, Michael Marshall wrote:
> > >
> > > Because we already send the `startMessageId`, there is a chance where
> > > we might not even need to update the protocol for the
> > > CommandSubscribe. In light of that, I quickly put together a PR
> > > showing how that field might be used to inform the broker where to
> > > start the read position for the cursor.
> > >
> > > https://github.com/apache/pulsar/pull/19892
> > >
> > > The PR is not complete, but it does convey the general idea. I wrote
> > > additional details in the draft's description.
> > >
> > > Thanks,
> > > Michael
> > >
> > > On Tue, Mar 21, 2023 at 11:31 PM Michael Marshall  
> > > wrote:
> > > >
> > > > I am not following your objections to the protocol solution. It might
> > > > be more productive if I provided a draft PR with a sample
> > > > implementation. I'm not sure that I'll have time, but I'll try to put
> > > > something together this week.
> > > >
> > > > > At least it will simplify the process of using cumulative ack with the
> > > > > transaction.
> > > >
> > > > Is this the underlying motivation for the PIP?
> > > >
> > > > From my perspective, the PIP is seeking to decrease duplicate messages
> > > > experienced due to disconnections from the broker.
> > > >
> > > > > The problem of the resetting cursor can be optimized in the future
> > > >
> > > > Why should we push off solving this problem? It seems fundamental to
> > > > this PIP and should not be ignored. At the very least, I think we need
> > > > to have an idea of what the future solution would be before we defer
> > > > its implementation.
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > >
> > > > On Tue, Mar 21, 2023 at 10:52 PM 丛搏  wrote:
> > > > >
> > > > > Hi, Michael
> > > > > > In this case, the consumer does not have the source of truth for the
> > > > > > readPosition. It would leave the new 

Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread Yunze Xu
Congratulations!

Thanks,
Yunze

On Wed, Mar 29, 2023 at 3:52 PM Lari Hotari  wrote:
>
> Congrats, Qiang Zhao! Well deserved!
>
> -Lari
>
> On 2023/03/29 03:22:43 guo jiwei wrote:
> > Dear Community,
> >
> > We are thrilled to announce that Qiang Zhao
> > (https://github.com/mattisonchao) has been invited and has accepted the
> > role of member of the Apache Pulsar Project Management Committee (PMC).
> >
> > Qiang has been a vital asset to our community, consistently
> > demonstrating his dedication and active participation through
> > significant contributions. In addition to his technical contributions,
> > Qiang also plays an important role in reviewing pull requests and
> > ensuring the overall quality of our project. We look forward to his
> > continued contributions.
> >
> > On behalf of the Pulsar PMC, we extend a warm welcome and
> > congratulations to Qiang Zhao.
> >
> > Best regards
> > Jiwei
> >


[Python] Should we make the schema default compatible with Java client?

2023-03-29 Thread Yunze Xu
Hi all,

Recently I found the default generated schema definition in the Python
client is different from the Java client, which leads to some
unexpected behavior.

For example, given the following class definition in Python:

```python
from pulsar.schema import Record, Integer

class Data(Record):
    i = Integer()
```

The type of `i` field is a union: "type": ["null", "int"]

While given the following class definition in Java:

```java
class Data {
    private final int i;
    /* ... */
}
```

The type of `i` field is an integer: "type": "int"

This causes a problem: if a Python consumer subscribes to a topic
with the schema defined above, a Java producer will then fail to be
created because of the schema incompatibility.
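
To make the difference concrete, here is a small sketch that writes the two field definitions side by side as plain Avro-style dictionaries. It uses only the standard library; the `avro_record` and `differing_fields` helpers are hypothetical names for this illustration, and the record definitions are simplified (a real generated schema may carry more attributes, such as a namespace, that are omitted here).

```python
import json


def avro_record(name, fields):
    """Build a simplified Avro record schema as a plain dict."""
    return {"type": "record", "name": name, "fields": fields}


# Python client default for `i = Integer()`: a nullable union.
python_style = avro_record("Data", [{"name": "i", "type": ["null", "int"]}])

# Java client for a primitive `int i`: a plain int.
java_style = avro_record("Data", [{"name": "i", "type": "int"}])


def field_types(schema):
    """Map each field name to its declared type."""
    return {f["name"]: f["type"] for f in schema["fields"]}


def differing_fields(a, b):
    """Return the sorted names of fields whose declared types differ."""
    types_a, types_b = field_types(a), field_types(b)
    names = types_a.keys() | types_b.keys()
    return sorted(n for n in names if types_a.get(n) != types_b.get(n))


print(json.dumps(python_style))
print(json.dumps(java_style))
print(differing_fields(python_style, java_style))
```

The only difference is the declared type of `i`, which is what the broker's compatibility check ends up comparing.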

Currently, the workaround is to change the schema compatibility
strategy to FORWARD.

Should we change the way to generate schema definition in the Python
client to be compatible with the Java client? It could bring breaking
changes to old Python clients, but it could guarantee compatibility
with the Java client.

If not, we still have to introduce an extra configuration to make the
Python schema compatible with the Java schema, and that requires code
changes from users. For example, here is a possible solution:

```python
class Data(Record):
    # NOTE: Users might have to add this extra field to control how to
    # generate the schema
    __java_compatible = True
    i = Integer()
```

Thanks,
Yunze


Re: [DISCUSS] forbid user to upload `BYTES` schema

2023-03-29 Thread Yufan Sheng
Hi SiNan,

In the Flink world, we don't always rely on the schema information
provided by Pulsar or other connector systems. A Flink application has
its own (de)serialization schema logic, which treats the messages only
in a binary format like a byte array.

In flink-connector-pulsar, we only use the schema when the users want
to do some evolution check. Otherwise, we only send messages with the
BYTES schema.

On Tue, Mar 28, 2023 at 10:06 AM SiNan Liu  wrote:
>
> Hi yufan.
> Can you describe a bit the usage scenario of byte schema in
> flink-connector-pulsar?
>
>
> Thanks,
> sinan
>
> On Tue, Mar 28, 2023 at 9:53 AM, Yufan Sheng wrote:
>
> > As the flink-connector-pulsar developer, I don't want to disable the
> > BYTES schema upload. In my opinion, using BYTES schema means the users
> > want to bypass the schema check and handle the schema validation by
> > themselves.
> >
> > On Tue, Mar 28, 2023 at 8:58 AM SiNan Liu  wrote:
> > >
> > > Hi, everyone.
> > > When a user uploads bytes schema. We can warn the user and skip uploading
> > > bytes schema.
> > > Also check to see if the topic has a schema other than bytes.
> > > 1. If yes, warn the user that it is not necessary to upload bytes schema.
> > > You can subscribe to a topic using bytes schema.
> > > 2. If there is no schema, warn the user that the topic does not have a
> > > schema. The default is bytes schema, and there is no need to upload it.
> > > Rather than simply throwing an exception rejecting the upload bytes
> > > schema.
> > >
> > >
> > > Thanks,
> > > sinan
> > >
> > >
> > > On Tue, Mar 28, 2023 at 1:15 AM, Christophe Bornet wrote:
> > >
> > > > This change broke the Flink SQL Pulsar connector:
> > > > https://github.com/streamnative/flink/issues/270
> > > > So I propose to revert it.
> > > >
> > > > On Fri, Dec 9, 2022 at 11:57, labuladong wrote:
> > > > >
> > > > > Hi pulsar community,
> > > > >
> > > > >
> > > > > I'd like to discuss the behavior of schema uploading, for more context
> > > > > see https://github.com/apache/pulsar/issues/18825
> > > > >
> > > > >
> > > > > I think that forbidding users to upload `BYTES` schema is a recommended
> > > > > way to solve this issue. But this may change the existing behavior, so do
> > > > > you have any suggestion about this issue?
> > > > >
> > > > > Thanks,
> > > > > Donglai
> > > >
> >


Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread Lari Hotari
Congrats, Qiang Zhao! Well deserved!

-Lari

On 2023/03/29 03:22:43 guo jiwei wrote:
> Dear Community,
> 
> We are thrilled to announce that Qiang Zhao
> (https://github.com/mattisonchao) has been invited and has accepted the
> role of member of the Apache Pulsar Project Management Committee (PMC).
> 
> Qiang has been a vital asset to our community, consistently
> demonstrating his dedication and active participation through
> significant contributions. In addition to his technical contributions,
> Qiang also plays an important role in reviewing pull requests and
> ensuring the overall quality of our project. We look forward to his
> continued contributions.
> 
> On behalf of the Pulsar PMC, we extend a warm welcome and
> congratulations to Qiang Zhao.
> 
> Best regards
> Jiwei
> 


Re: [VOTE] PIP-254: Support configuring client version with a description suffix

2023-03-29 Thread 丛搏
+1 (binding)

Thanks,
Bo

On Mon, Mar 27, 2023 at 17:49, Lin Lin wrote:
>
> +1
>
> Thanks,
> Lin Lin
>
> On 2023/03/15 07:54:20 Yunze Xu wrote:
> > Hi all,
> >
> > This thread is to start the vote for PIP-254.
> >
> > Discussion thread:
> > https://lists.apache.org/thread/65cf7w76tt23sbsjnr8rpfxqf1nt9s9l
> >
> > PIP link: https://github.com/apache/pulsar/issues/19705
> >
> > Thanks,
> > Yunze
> >


Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread Nicolò Boschi
Congrats, well deserved!

Nicolò Boschi


On Wed, Mar 29, 2023 at 09:09, Zike Yang wrote:

> Congratulations! Qiang Zhao
>
> Best,
> Zike Yang
>
> On Wed, Mar 29, 2023 at 3:04 PM houxiaoyu  wrote:
> >
> > Congratulations !
> >
> > Hou Xiaoyu
> >
> > On Wed, Mar 29, 2023 at 15:03, Enrico Olivelli wrote:
> >
> > > Congratulations !
> > >
> > > Well deserved !
> > >
> > > Enrico
> > >
> > > On Wed, Mar 29, 2023 at 06:17, Xiangying Meng wrote:
> > > >
> > > >  Congrats! Qiang.
> > > >
> > > > Sincerely,
> > > > Xiangying
> > > >
> > > > On Wed, Mar 29, 2023 at 11:51 AM Yubiao Feng wrote:
> > > >
> > > > > Congrats! Qiang.
> > > > >
> > > > > Thanks
> > > > > Yubiao
> > > > >
> > > > > On Wed, Mar 29, 2023 at 11:22 AM guo jiwei wrote:
> > > > >
> > > > > > Dear Community,
> > > > > >
> > > > > > We are thrilled to announce that Qiang Zhao
> > > > > > (https://github.com/mattisonchao) has been invited and has accepted the
> > > > > > role of member of the Apache Pulsar Project Management Committee (PMC).
> > > > > >
> > > > > > Qiang has been a vital asset to our community, consistently
> > > > > > demonstrating his dedication and active participation through
> > > > > > significant contributions. In addition to his technical contributions,
> > > > > > Qiang also plays an important role in reviewing pull requests and
> > > > > > ensuring the overall quality of our project. We look forward to his
> > > > > > continued contributions.
> > > > > >
> > > > > > On behalf of the Pulsar PMC, we extend a warm welcome and
> > > > > > congratulations to Qiang Zhao.
> > > > > >
> > > > > > Best regards
> > > > > > Jiwei
> > > > > >
> > > > >
> > >
>


Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread Zike Yang
Congratulations! Qiang Zhao

Best,
Zike Yang

On Wed, Mar 29, 2023 at 3:04 PM houxiaoyu  wrote:
>
> Congratulations !
>
> Hou Xiaoyu
>
> On Wed, Mar 29, 2023 at 15:03, Enrico Olivelli wrote:
>
> > Congratulations !
> >
> > Well deserved !
> >
> > Enrico
> >
> > On Wed, Mar 29, 2023 at 06:17, Xiangying Meng wrote:
> > >
> > >  Congrats! Qiang.
> > >
> > > Sincerely,
> > > Xiangying
> > >
> > > On Wed, Mar 29, 2023 at 11:51 AM Yubiao Feng wrote:
> > >
> > > > Congrats! Qiang.
> > > >
> > > > Thanks
> > > > Yubiao
> > > >
> > > > On Wed, Mar 29, 2023 at 11:22 AM guo jiwei wrote:
> > > >
> > > > > Dear Community,
> > > > >
> > > > > We are thrilled to announce that Qiang Zhao
> > > > > (https://github.com/mattisonchao) has been invited and has accepted the
> > > > > role of member of the Apache Pulsar Project Management Committee (PMC).
> > > > >
> > > > > Qiang has been a vital asset to our community, consistently
> > > > > demonstrating his dedication and active participation through
> > > > > significant contributions. In addition to his technical contributions,
> > > > > Qiang also plays an important role in reviewing pull requests and
> > > > > ensuring the overall quality of our project. We look forward to his
> > > > > continued contributions.
> > > > >
> > > > > On behalf of the Pulsar PMC, we extend a warm welcome and
> > > > > congratulations to Qiang Zhao.
> > > > >
> > > > > Best regards
> > > > > Jiwei
> > > > >
> > > >
> >


Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread houxiaoyu
Congratulations !

Hou Xiaoyu

On Wed, Mar 29, 2023 at 15:03, Enrico Olivelli wrote:

> Congratulations !
>
> Well deserved !
>
> Enrico
>
> On Wed, Mar 29, 2023 at 06:17, Xiangying Meng wrote:
> >
> >  Congrats! Qiang.
> >
> > Sincerely,
> > Xiangying
> >
> > On Wed, Mar 29, 2023 at 11:51 AM Yubiao Feng wrote:
> >
> > > Congrats! Qiang.
> > >
> > > Thanks
> > > Yubiao
> > >
> > > On Wed, Mar 29, 2023 at 11:22 AM guo jiwei wrote:
> > >
> > > > Dear Community,
> > > >
> > > > We are thrilled to announce that Qiang Zhao
> > > > (https://github.com/mattisonchao) has been invited and has accepted the
> > > > role of member of the Apache Pulsar Project Management Committee (PMC).
> > > >
> > > > Qiang has been a vital asset to our community, consistently
> > > > demonstrating his dedication and active participation through
> > > > significant contributions. In addition to his technical contributions,
> > > > Qiang also plays an important role in reviewing pull requests and
> > > > ensuring the overall quality of our project. We look forward to his
> > > > continued contributions.
> > > >
> > > > On behalf of the Pulsar PMC, we extend a warm welcome and
> > > > congratulations to Qiang Zhao.
> > > >
> > > > Best regards
> > > > Jiwei
> > > >
> > >
>


Re: [ANNOUNCE] Qiang Zhao as new PMC member in Apache Pulsar

2023-03-29 Thread Enrico Olivelli
Congratulations !

Well deserved !

Enrico

On Wed, Mar 29, 2023 at 06:17, Xiangying Meng wrote:
>
>  Congrats! Qiang.
>
> Sincerely,
> Xiangying
>
> On Wed, Mar 29, 2023 at 11:51 AM Yubiao Feng wrote:
>
> > Congrats! Qiang.
> >
> > Thanks
> > Yubiao
> >
> > On Wed, Mar 29, 2023 at 11:22 AM guo jiwei  wrote:
> >
> > > Dear Community,
> > >
> > > We are thrilled to announce that Qiang Zhao
> > > (https://github.com/mattisonchao) has been invited and has accepted the
> > > role of member of the Apache Pulsar Project Management Committee (PMC).
> > >
> > > Qiang has been a vital asset to our community, consistently
> > > demonstrating his dedication and active participation through
> > > significant contributions. In addition to his technical contributions,
> > > Qiang also plays an important role in reviewing pull requests and
> > > ensuring the overall quality of our project. We look forward to his
> > > continued contributions.
> > >
> > > On behalf of the Pulsar PMC, we extend a warm welcome and
> > > congratulations to Qiang Zhao.
> > >
> > > Best regards
> > > Jiwei
> > >
> >