Re: [PIP-78] Reduce redundant producers from partitioned producer

2021-12-01 Thread Yuri Mizushima
Do you have any comments?
If there are no comments by Dec. 7, I will close the discussion and rebase the 
PR commit to current master.

Regards,

-- 
Yuri Mizushima
yumiz...@yahoo-corp.jp
 

On 2021/11/16 15:46, "Yuri Mizushima"  wrote:

Dear Pulsar community,

I have created a new PR 
https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fpulsar%2Fpull%2F12401data=04%7C01%7Cyumizush%40yahoo-corp.jp%7Cd1cf3f01dc4b48994a0008d9a8cc7154%7Ca208d369cd4e4f87b11998eaf31df2c3%7C1%7C0%7C637726419802266126%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=Sn57dCx1iItNEMArKWAXxE9frCbmQClw%2Fts0nmMkwyw%3Dreserved=0
 for stats aggregation,
but I didn't discuss about the wire protocol change. I hope we will discuss 
it here.

Currently, partitioned producer can't aggregate by any key such as cnx, 
producerId, producerName, and so on.
I think we need to add any aggregation system.
Therefore, added new aggregation policy as producerName (with client side 
implementation).

New protocol field partial_producer_supported is not used for stats 
aggregation. It is used for backward compatibility.

https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fpulsar%2Fpull%2F12401%2Ffiles%23diff-f29399fed32e0916cf28452ba71078a3ae5ed77bbaef9f52a925165d8ee66cfdR489data=04%7C01%7Cyumizush%40yahoo-corp.jp%7Cd1cf3f01dc4b48994a0008d9a8cc7154%7Ca208d369cd4e4f87b11998eaf31df2c3%7C1%7C0%7C637726419802266126%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=QswqgZHSeHJArmgAdBx1wPrm6rX4l4lTf39hs57g6NI%3Dreserved=0

In my understanding, if introduce new stats aggregation key to client side,
need a way to determine whether the feature is enabled at client side.
For example, whether the producer has specific field or metadata,
the version (e.g. protocol version) is greater than threshold, etc.

Of course, if we can introduce aggregation feature without adding any new 
key or implementations from client side,
we can support the feature not only new client but also old one.

I'm looking forward to your opinions or suggestions to this PR.

Regards,
--
Yuri Mizushima
yumiz...@yahoo-corp.jp


On 2021/05/11 14:26, "Yuri Mizushima"  wrote:


Dear Pulsar Community,

> I will submit the next PR about PartitionedTopicStats later.
I submitted the next PR for this PIP. If you have any suggestions, please 
comment to this PR.

https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fpulsar%2Fpull%2F10534data=04%7C01%7Cyumizush%40yahoo-corp.jp%7Cd1cf3f01dc4b48994a0008d9a8cc7154%7Ca208d369cd4e4f87b11998eaf31df2c3%7C1%7C0%7C637726419802266126%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=cHqzaY1H4N%2FaQEemy5ZRQSBAiXa3FDXDuFLyoJ4dRtA%3Dreserved=0

Regards,

--
Yuri Mizushima
yumiz...@yahoo-corp.jp


"Yuri Mizushima"  wrote:

Dear Pulsar Community,

I submitted the PR for this PIP.

https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fpulsar%2Fpull%2F10279data=04%7C01%7Cyumizush%40yahoo-corp.jp%7Cd1cf3f01dc4b48994a0008d9a8cc7154%7Ca208d369cd4e4f87b11998eaf31df2c3%7C1%7C0%7C637726419802266126%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=AzmPY98arKmn7UM1%2F6FJMwOKOavPVC5xVG%2Br5FBJlH4%3Dreserved=0

This is a part of implementations.
I will submit the next PR about PartitionedTopicStats later.

Regards,
--
Yuri Mizushima
yumiz...@yahoo-corp.jp


"Yuri Mizushima"  wrote:

Sijie,

After sending previous mail, I watched meeting recording and 
understand about authn/authz issue.
Therefore, I updated the PIP document.

https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fpulsar%2Fwiki%2FPIP-79%253A-Reduce-redundant-producers-from-partitioned-producerdata=04%7C01%7Cyumizush%40yahoo-corp.jp%7Cd1cf3f01dc4b48994a0008d9a8cc7154%7Ca208d369cd4e4f87b11998eaf31df2c3%7C1%7C0%7C637726419802266126%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=lZrLE4EhYfy7xUNmfUraIchYOkH%2B0HQaeVcKGaTBWLw%3Dreserved=0

Regards,
--
Yuri Mizushima
yumiz...@yahoo-corp.jp


"Yuri Mizushima"  wrote:

Sijie,

> If the lazy-loading approach sounds attractive to you and you 
like it,
> maybe the next step is to update the PIP, what do you think?

I think so too. I will update the PIP after discussing the 
authn/authz issue.

Regards,
--
Yuri Mizushima
  

[Vote] PIP 104: Add new consumer type: TableView

2021-12-01 Thread Neng Lu
Hi Pulsar Community,

I would like to start a VOTE on the Pulsar TableView consumer (PIP 104).

The issue for PIP 104 is here: https://github.com/apache/pulsar/issues/12356
And the prototype implementation is here:
https://github.com/apache/pulsar/pull/12838


Please VOTE within 72 hours.

Best Regards,
Neng Lu


Re: [DISCUSS] Pulsar Protocol For Client Timeouts and Creating Producers

2021-12-01 Thread Michael Marshall
Following up here, I am still in need of reviews for PR [0]. It
introduces an important clarification to the Pulsar Protocol Spec.
Please take a look, if you are able.

[0] - https://github.com/apache/pulsar/pull/12948

Thanks!
Michael

On Tue, Nov 23, 2021 at 1:10 PM Michael Marshall  wrote:
>
> I created a PR to update the protocol's documentation:
> https://github.com/apache/pulsar/pull/12948. Please take a look, if
> you're able.
>
> Once this PR is accepted/merged, I will follow up with an update to
> the Java client.
>
> - Michael
>
> On Thu, Nov 18, 2021 at 1:29 PM Michael Marshall  
> wrote:
> >
> > I view this as an edge case of the Pulsar Protocol that requires
> > clarification. Once we clarify the spec, we can update the clients to
> > conform to the spec. I don't think we need to give end users control
> > over this small part of the protocol.
> >
> > After reviewing the design a bit more, I think we should update the
> > Java client to send the `CloseProducer` command, as well as update the
> > spec to recommend this implementation. While the `ServerCnx` class
> > "works" both ways, the current Java client implementation leads to
> > unnecessary calls, unnecessary errors, and a longer time to recovery.
> >
> > Specifically, if the client fails to send a `CloseProducer` command,
> > it ends up getting into a sequence of retries where each new
> > `Producer` command receives an immediate `ErrorResponse` because the
> > `ServerCnx` already has a pending producer. By sending a
> > `CloseProducer` command, the client gives the broker permission to
> > stop keeping track of the original create producer request. It also
> > means that if the topic eventually loads, the broker will respond to
> > the right request id with a `ProducerSuccessResponse` command.
> >
> > I will follow up with an update to the client and the protocol spec,
> > unless there are any objections.
> >
> > Thanks,
> > Michael
> >
> > On Thu, Nov 18, 2021 at 12:25 PM Neng Lu  wrote:
> > >
> > > How about making the behavior when timeout configurable? And by default, 
> > > it will send the `CloseProducer` cmd?
> > >
> > > On 2021/11/17 05:52:21 Michael Marshall wrote:
> > > > Hi All,
> > > >
> > > > I noticed that the `ServerCnxTest#testCreateProducerTimeout` test
> > > > indicates that a producer will send a `CloserProducer` command when
> > > > producer creation times out for the client.
> > > >
> > > > The Java client does not follow this protocol. When the producer
> > > > creation times out, it just schedules another attempt to create the
> > > > producer.
> > > >
> > > > The C++ client does follow this protocol and is annotated with the
> > > > following comment:
> > > >
> > > > > // Creating the producer has timed out. We need to ensure the broker 
> > > > > closes the producer
> > > > > // in case it was indeed created, otherwise it might prevent new 
> > > > > create producer operation,
> > > > > // since we are not closing the connection
> > > >
> > > > I don't see anything in our official protocol spec indicating the
> > > > "right" behavior. Given the test coverage, it seems like the initial
> > > > design was to expect a `CloseProducer` command. However, I also see that
> > > > the broker's `ServerCnx` class will reply to a redundant `Producer`
> > > > command with a `ProducerSuccess` command, as long as the producer
> > > > is already created.
> > > >
> > > > Should I submit a PR to update the Java client to send a
> > > > `CloseProducer` command when a `Producer` command times out?
> > > >
> > > > Thanks,
> > > > Michael
> > > >


Re: [DISCUSS] Mitigate possibility of out of order messages

2021-12-01 Thread Michael Marshall
Following up on this, the PR to fix the Java Client [0] is still open
and needs reviews. Please take a look, if you're able.

[0] - https://github.com/apache/pulsar/pull/12779

Thanks!
Michael
On Fri, Nov 12, 2021 at 5:35 PM Michael Marshall  wrote:
>
> Hi Pulsar Community,
>
> I discovered a race condition in Pulsar’s Java Client ProducerImpl
> that can lead to messages persisted out-of-order for a single producer
> sending to a non-partitioned topic. I can reproduce this issue, and I
> verified the order by adding sequence ids to the message payload
> before calling `producer.send`. I opened a PR to fix the race [0] and
> another to improve the broker’s behavior [1].
>
> At a high level, the ProducerImpl can get into a corrupt state if it
> switches connections too quickly. In this corrupt state, the producer
> can send messages before, during, and after the producer is registered
> to the broker. Because the broker ignores messages until a producer is
> created for the ServerCnx, some of the early messages are ignored and,
> once the producer is created, some later ones are persisted.
>
> In PR [1], I propose that when a broker gets an unexpected message
> (Send command), it should close the connection to protect against
> clients that are not following the protocol instead of simply ignoring
> unexpected messages. The protocol already states that clients are to
> register producers and then start sending messages [2]. It does not
> state what happens if a client does not follow this part of the
> protocol.
>
> One tradeoff for this implementation is that when the broker initiates
> closing a producer, there is a chance that the whole connection will
> get closed if the producer has messages in flight. I think this is a
> reasonable tradeoff to ensure that clients not following the protocol
> are not able to persist messages out-of-order.
>
> From my perspective, this is the simplest solution that will ensure
> message order is preserved. Alternatively, we could come up with logic
> to try to handle messages sent to "recently" closed producers, but
> that would greatly increase the complexity for this edge case. Note
> that it is not sufficient to reply to each message with a SendError
> because the producer may have already sent later messages and those
> could be persisted if the producer is concurrently being created. Note
> also that when the Java Client producer receives a generic SendError,
> it reacts by closing the connection in most cases.
>
> I include more detail in each of the PRs. I look forward to your feedback.
>
> Thanks,
> Michael
>
> [0] - https://github.com/apache/pulsar/pull/12779
> [1] - https://github.com/apache/pulsar/pull/12780
> [2] - https://pulsar.apache.org/docs/en/develop-binary-protocol/#producer