>> If it contains namespace policy replication, there are some policies
that don't need to be replicated to another cluster
Yes, local policies don't need to be replicated to other clusters; only
global policies that are shared across multiple clusters will be
replicated, such as tenant/namespace identity creation, ACLs, replication
clusters, etc.
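
For context, here is a rough illustration of what such a change event
could carry. The field names below are only an assumption based on this
discussion, not the final schema defined in the PIP:

// Illustrative sketch of a metadata change event; the authoritative
// schema lives in PIP-136. Only globally shared policies are carried;
// local policies (e.g. rate limiters) are intentionally left out.
public class MetadataChangeEvent {
    public enum ResourceType { TENANTS, NAMESPACES }
    public enum EventType { CREATED, MODIFIED, DELETED }

    private ResourceType resourceType;
    private EventType eventType;
    private String resourceName;   // e.g. "my-tenant" or "my-tenant/my-ns"
    private String sourceCluster;  // cluster that published the change
    private long updatedTime;      // lets receivers drop stale/duplicate events
    private byte[] data;           // serialized global policies: identity
                                   // creation info, ACLs, replication clusters

    // getters/setters omitted for brevity
}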

>> The new partitioned topic also needs to be replicated to the remote
cluster?
Yes.

The topic that will be used to share policies across clusters is
configurable and can be named anything. However, we should keep it as a
separate topic because it requires a unique schema and special handling to
synchronize policies across the clusters.
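
As a minimal sketch of that special handling, assuming an illustrative
topic name under the pulsar/system namespace and using the standard Java
client API (the real logic would live inside the broker, and the
deserialization/conflict check is only outlined in comments):

import org.apache.pulsar.client.api.*;

public class MetadataEventListener {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        // Failover subscription: only one consumer (broker) is active at a
        // time; if that broker is removed, another consumer takes over and
        // keeps watching and applying metadata updates.
        Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://pulsar/system/metadata-sync-events")
                .subscriptionName("metadata-sync")
                .subscriptionType(SubscriptionType.Failover)
                .subscribe();

        while (true) {
            Message<byte[]> msg = consumer.receive();
            try {
                // Deserialize the MetadataChangeEvent, compare its updated
                // time with the locally stored lastUpdatedTime, and apply it
                // only if it is newer; stale or duplicate events are simply
                // acknowledged and dropped.
                consumer.acknowledge(msg);
            } catch (Exception e) {
                consumer.negativeAcknowledge(msg);
            }
        }
    }
}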

Thanks,
Rajan

On Fri, Mar 18, 2022 at 9:12 PM PengHui Li <peng...@apache.org> wrote:

> Hi Rajan,
>
> Thanks for the great proposal.
>
> Will all the namespace policies be replicated to the remote cluster?
> I noticed the PIP title mentions policies, but it looks like no namespace
> policies are defined in the `MetadataChangeEvent`. If it contains
> namespace policy replication, there are some policies that don't need to
> be replicated to another cluster, for example, the rate limiter and the
> max producers/consumers limits.
> In
> https://github.com/apache/pulsar/wiki/PIP-92%3A-Topic-policy-across-multiple-clusters,
> a --global option was introduced to provide the ability to apply a policy
> globally or locally.
>
> The new partitioned topic also needs to be replicated to the remote
> cluster?
>
> Currently, we already have a PulsarEvent struct to define the Pulsar
> system events; it looks like we could use a unified event definition
> based on PulsarEvent.
>
> Others look good to me.
>
> Regards,
> Penghui
>
>
>
> On Sat, Mar 19, 2022 at 1:32 AM Joe F <joefranc...@gmail.com> wrote:
>
> > +1
> >
> > On Thu, Mar 17, 2022 at 12:07 PM Rajan Dhabalia <rdhaba...@apache.org>
> > wrote:
> >
> > > Hi,
> > >
> > > I would like to start VOTE on PIP-136:
> > > https://github.com/apache/pulsar/issues/13728
> > >
> > > Thanks,
> > > Rajan
> > >
> > > On Tue, Feb 8, 2022 at 4:58 PM Rajan Dhabalia <dhabalia...@gmail.com>
> > > wrote:
> > >
> > > >
> > > > >> How do we designate the host broker? Is it manual? How does it
> > > > work when the host broker is removed from the cluster?
> > > > No, it will not be manual. As I explained earlier, the broker that
> > > > has the failover consumer consuming remote events will be the
> > > > publisher for metadata updates. If that broker is removed, a new
> > > > failover consumer/broker will be selected for the same role.
> > > >
> > > > >> I look forward to seeing more about this design for conflict
> > > > resolution.
> > > > Sure, I have updated the PIP to handle such race conditions:
> > > > https://github.com/apache/pulsar/issues/13728
> > > >
> > > >
> > > > >> (1) scenarios where the Pulsar cluster operators and tenant admins
> > > > are different entities and tenants can be malicious, or more probably,
> > > > write bad code that will produce malicious outcomes.
> > > > I agree, Pulsar should have a provision to prevent scenarios where
> > > > changes from one tenant in a cluster can impact other clusters. This
> > > > PIP assumes the tenant/admin will be the same at both ends, but that
> > > > may not be true in all cases. We can add an enhancement later, or we
> > > > can create a separate PIP to start a discussion on possible solutions.
> > > >
> > > > Thanks,
> > > > Rajan
> > > >
> > > > On Thu, Feb 3, 2022 at 9:59 AM Joe F <joefranc...@gmail.com> wrote:
> > > >
> > > >> > On my first reading, it wasn't clear if there was only one topic
> > > >> required for this feature. I now see that the topic is not tied to a
> > > >> specific tenant or namespace. As such, we can avoid complicated
> > > >> authorization questions by putting the required event topic(s) into a
> > > >> "system" tenant and namespace
> > > >>
> > > >> We should consider complicated questions. We can say why we chose not
> > > >> to address them, or why they do not apply, for a particular situation.
> > > >>
> > > >> Many namespace policies are administered by tenants. As such, any
> > > >> tenant can load this topic. Is it possible for one abusive tenant to
> > > >> make your system topic dysfunctional?
> > > >>
> > > >> Pulsar committers should think about
> > > >> (1) scenarios where the Pulsar cluster operators and tenant admins
> > > >> are different entities and tenants can be malicious, or more probably,
> > > >> write bad code that will produce malicious outcomes.
> > > >> (2) whether the changes introduce additional SPOFs into the cluster.
> > > >>
> > > >> I don't think this PIP has those issues, but as a matter of practice,
> > > >> I would like to see backend/system PIPs consider these questions and
> > > >> explicitly state the conclusions with rationale.
> > > >>
> > > >> Joe
> > > >>
> > > >>
> > > >> On Wed, Feb 2, 2022 at 9:27 PM Michael Marshall <mmarsh...@apache.org>
> > > >> wrote:
> > > >>
> > > >> > Thanks for your responses.
> > > >> >
> > > >> > > I don't see a need for protobuf for this particular use case
> > > >> >
> > > >> > If no one else feels strongly on this point, I am good with using
> > > >> > a POJO.
> > > >> >
> > > >> > > It doesn't matter if it's a system topic or not, because it's
> > > >> > > configurable and the admin of the system can decide and configure
> > > >> > > it according to the required persistence policy.
> > > >> >
> > > >> > On my first reading, it wasn't clear if there was only one topic
> > > >> > required for this feature. I now see that the topic is not tied to
> > > >> > a specific tenant or namespace. As such, we can avoid complicated
> > > >> > authorization questions by putting the required event topic(s) into
> > > >> > a "system" tenant and namespace, by default. The `pulsar/system`
> > > >> > tenant and namespace seem appropriate to me.
> > > >> >
> > > >> > > I would keep the system topic
> > > >> > > separate because this topic serves a specific purpose with
> > > >> > > specific schema, replication policy and retention policy.
> > > >> >
> > > >> > I think we need a more formal definition for system topics. This
> > > >> > topic is exactly the kind of topic I would call a system topic: its
> > > >> > intended producers and consumers are Pulsar components. However,
> > > >> > because this feature can live on a topic in a system namespace, we
> > > >> > can avoid the classification discussion for this PIP.
> > > >> >
> > > >> > > The source region will have a broker which will create a failover
> > > >> > > consumer on that topic, and the broker with the active consumer
> > > >> > > will watch the metadata changes and publish them to the event
> > > >> > > topic.
> > > >> >
> > > >> > How do we designate the host broker? Is it manual? How does it
> > > >> > work when the host broker is removed from the cluster?
> > > >> >
> > > >> > If we collocate the active consumer with the broker hosting the
> > > >> > event topic, can we skip creating the failover consumer?
> > > >> >
> > > >> > > The PIP briefly talks about it, but I will update it with more
> > > >> > > explanation.
> > > >> >
> > > >> > I look forward to seeing more about this design for conflict
> > > >> > resolution.
> > > >> >
> > > >> > Thanks,
> > > >> > Michael
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Tue, Feb 1, 2022 at 3:01 AM Rajan Dhabalia <dhabalia...@gmail.com>
> > > >> > wrote:
> > > >> > >
> > > >> > > Please find my response inline.
> > > >> > >
> > > >> > > On Mon, Jan 31, 2022 at 9:17 PM Michael Marshall <mmarsh...@apache.org>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > I think this is a very appropriate direction to take Pulsar's
> > > >> > > > geo-replication. Your proposal is essentially to make the
> > > >> > > > inter-cluster configuration event driven. This increases fault
> > > >> > > > tolerance and better decouples clusters.
> > > >> > > >
> > > >> > > > Thank you for your detailed proposal. After reading through it,
> > > >> > > > I have some questions :)
> > > >> > > >
> > > >> > > > 1. What do you think about using protobuf to define the event
> > > >> > > > protocol? I know we already have a topic policy event stream
> > > >> > > > defined with Java POJOs, but since this feature is specifically
> > > >> > > > designed for egressing cloud providers, ensuring compact data
> > > >> > > > transfer would keep egress costs down. Additionally, protobuf
> > > >> > > > can help make it clear that the schema is strict, should evolve
> > > >> > > > thoughtfully, and should be designed to work between clusters of
> > > >> > > > different versions.
> > > >> > > >
> > > >> > >
> > > >> > > >>> I don't see a need for protobuf for this particular use case,
> > > >> > > for two reasons: (a) policy changes don't generate huge traffic
> > > >> > > (it could be around 1 rps), and (b) it doesn't need performance
> > > >> > > optimization. It is similar to storing a policy as text instead of
> > > >> > > protobuf, which doesn't impact footprint size or performance due
> > > >> > > to the limited number of update operations and relatively low
> > > >> > > complexity. I agree that protobuf could be another option, but in
> > > >> > > this case it's not needed. Also, a POJO can support schema and
> > > >> > > versioning as well.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > >
> > > >> > > > 2. In your view, which tenant/namespace will host
> > > >> > > > `metadataSyncEventTopic`? Will there be several of these topics
> > > >> > > > or is it just hosted in a system tenant/namespace? This question
> > > >> > > > gets back to my questions about system topics on this mailing
> > > >> > > > list last week [0]. I view this topic as a system topic, so we'd
> > > >> > > > need to make sure that it has the right authorization rules and
> > > >> > > > that it won't be affected by calls like "clearNamespaceBacklog".
> > > >> > >
> > > >> > >
> > > >> > > >> It doesn't matter if it's a system topic or not, because it's
> > > >> > > configurable and the admin of the system can decide and configure
> > > >> > > it according to the required persistence policy. I would keep the
> > > >> > > system topic separate because this topic serves a specific purpose
> > > >> > > with a specific schema, replication policy, and retention policy.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > >
> > > >> > > > 3. Which broker will host the metadata update publisher? I
> > > >> > > > assume we want the producer to be collocated with the bundle
> > > >> > > > that hosts the event topic. How will this be coordinated?
> > > >> > > >
> > > >> > > >> It's already explained in the PIP, in the section "Event
> > > >> > > publisher and handler".
> > > >> > > >> Every isolated cluster deployed on a separate cloud platform
> > > >> > > will have a source region and will be part of the replicated
> > > >> > > clusters for the event topic. The source region will have a broker
> > > >> > > which will create a failover consumer on that topic, and the
> > > >> > > broker with the active consumer will watch the metadata changes
> > > >> > > and publish them to the event topic.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > >
> > > >> > > > 4. Why isn't a topic a `ResourceType`? Is this because the
> > > >> > > > topic-level policies already have this feature? If so, is there
> > > >> > > > a way to integrate this feature with the existing topic policy
> > > >> > > > feature?
> > > >> > > >
> > > >> > > >> Yes, ResourceType can be extended to a topic as well.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > >
> > > >> > > > 5. By decentralizing the metadata store, it looks like there is
> > > >> > > > a chance for conflicts due to concurrent updates. How do we
> > > >> > > > handle those conflicts?
> > > >> > > >
> > > >> > > >> The PIP briefly talks about it, but I will update it with more
> > > >> > > explanation. MetadataChangeEvent contains the source cluster and
> > > >> > > the updated time. Also, the Tenant/Namespace resources will contain
> > > >> > > lastUpdatedTime, which will help the destination clusters handle
> > > >> > > stale/duplicate events and race conditions. In addition,
> > > >> > > snapshot-sync, an additional task, helps all clusters eventually
> > > >> > > become synced with each other.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > > I'll also note that I previously proposed a system event topic
> > > >> > > > here [1] and it was proposed again here [2]. Those features were
> > > >> > > > for different use cases, but ultimately looked very similar. In
> > > >> > > > my view, a stream of system events is a very natural feature to
> > > >> > > > expect in a streaming technology. I wonder if there is a way to
> > > >> > > > generalize this feature to fulfill local cluster consumers and
> > > >> > > > geo-replication consumers. Even if this PIP only implements the
> > > >> > > > geo-replication portion of the feature, it'd be good to design
> > > >> > > > it in an extensible fashion.
> > > >> > > >
> > > >> > > >> I think answer (2) addresses this concern as well.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > > Thanks,
> > > >> > > > Michael
> > > >> > > >
> > > >> > > > [0] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17
> > > >> > > > [1] https://lists.apache.org/thread/h4cbvwjdomktsq2jo66x5qpvhdrqk871
> > > >> > > > [2] https://lists.apache.org/thread/0xkg0gpsobp0dbgb6tp9xq097lpm65bx
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > On Sun, Jan 30, 2022 at 10:33 PM Rajan Dhabalia
> > > >> > > > <rdhaba...@apache.org> wrote:
> > > >> > > > >
> > > >> > > > > Hi,
> > > >> > > > >
> > > >> > > > > I would like to start a discussion about PIP-136: Sync
> > > >> > > > > Pulsar policies across multiple clouds.
> > > >> > > > >
> > > >> > > > > PIP documentation: https://github.com/apache/pulsar/issues/13728
> > > >> > > > >
> > > >> > > > > *Motivation*
> > > >> > > > > Apache Pulsar is a cloud-native, distributed messaging
> > > >> > > > > framework which natively provides geo-replication. Many
> > > >> > > > > organizations deploy Pulsar instances on-prem and on multiple
> > > >> > > > > different cloud providers, and at the same time they would
> > > >> > > > > like to enable replication between multiple clusters deployed
> > > >> > > > > in different cloud providers. Pulsar already provides various
> > > >> > > > > proxy options (Pulsar proxy / enterprise proxy solutions on
> > > >> > > > > SNI) to fulfill security requirements when brokers are
> > > >> > > > > deployed in different security zones connected with each
> > > >> > > > > other. However, sometimes it's not possible to share the
> > > >> > > > > metadata store (global ZooKeeper) between Pulsar clusters
> > > >> > > > > deployed on separate cloud provider platforms, and
> > > >> > > > > synchronizing configuration metadata (policies) can be a
> > > >> > > > > critical path to share tenant/namespace/topic policies
> > > >> > > > > between clusters and administer Pulsar policies uniformly
> > > >> > > > > across all clusters. Therefore, we need a mechanism to sync
> > > >> > > > > configuration metadata between clusters deployed on
> > > >> > > > > different cloud platforms.
> > > >> > > > >
> > > >> > > > > *Sync Pulsar policies across multiple clouds*
> > > >> > > > > https://github.com/apache/pulsar/issues/13728
> > > >> > > > > Prototype GitHub link:
> > > >> > > > > https://github.com/rdhabalia/pulsar/commit/e59803b942918076ce6376b50b35ca827a49bcf6
> > > >> > > > >
> > > >> > > > > Thanks,
> > > >> > > > > Rajan
> > > >> > > >
> > > >> >
> > > >>
> > > >
> > >
> >
>
