Hi Girish,

> But the reason why you've raised this PIP is to bring down the actual
> replication semantics at a topic level. Yes, namespace level still exists
> as per your PIP as well, but is basically left only to be a "default in
> case topic level is missing".

I'm afraid there's some misunderstanding here. According to the Pulsar
website, replication can actually be enabled at the topic level.

>You can enable geo-replication at namespace or topic level. [1]

So this proposal is not what introduces topic-level replication
semantics. Before it, topic-level replication already existed but was
constrained by the namespace-level replication policy, and that caused
two problems. If replication was not configured at the namespace level,
topic-level replication had no effect either. Moreover, users got no
indication that the replication had failed.

> Yes, namespace level still exists as per your PIP as well, but is
> basically left only to be a "default in case topic level is missing".

This matches Pulsar's current behavior and is not something introduced by
this proposal.
What this proposal introduces is an `allowed-clusters` configuration at
the namespace level.
As the website states, you can enable replication at either the namespace
or the topic level.
But if you enable replication only at the topic level, the replication
configuration does not take effect without this proposal.

Before this proposal: even though the topic policy can be updated
successfully, topic1 cannot be created in cluster2.
```
Namespace policy {replication clusters -> local cluster(cluster1)}, topic1
policy {replication clusters {cluster1, cluster2}}
```
After this proposal: you can set allowed clusters at the namespace level,
which specifies the clusters where topics under this namespace are allowed.
Then, the topic-level replication would also be effective, as described on
the Pulsar website.
```
Namespace policy {replication clusters -> local cluster(cluster1), allowed
clusters -> {cluster1, cluster2, cluster3}}, topic1 policy {replication
clusters {cluster1, cluster2}}
```
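
To make the difference concrete, here is a toy model of the two examples
above. It is only a sketch for discussion, not Pulsar's code: the function
names are made up, and I am assuming that a topic-level cluster outside the
permitted set is silently dropped, and that allowed clusters default to the
namespace replication clusters when unset.

```python
def effective_clusters_before(ns_replication, topic_replication=None):
    """Toy model of the behavior before the proposal (hypothetical helper).

    Topic-level clusters only take effect if they also appear in the
    namespace-level replication clusters.
    """
    if topic_replication is None:
        # No topic policy: the namespace policy is the default.
        return set(ns_replication)
    return set(topic_replication) & set(ns_replication)


def effective_clusters_after(ns_replication, ns_allowed=None,
                             topic_replication=None):
    """Toy model of the behavior with namespace-level allowed clusters."""
    # Assumption: allowed clusters default to the replication clusters,
    # so existing namespaces keep their current behavior.
    allowed = set(ns_allowed) if ns_allowed is not None else set(ns_replication)
    if topic_replication is None:
        return set(ns_replication)
    return set(topic_replication) & allowed


# First example: cluster2 is silently dropped for topic1.
print(sorted(effective_clusters_before(
    {"cluster1"}, {"cluster1", "cluster2"})))
# ['cluster1']

# Second example: topic1 is replicated to cluster1 and cluster2.
print(sorted(effective_clusters_after(
    {"cluster1"}, {"cluster1", "cluster2", "cluster3"},
    {"cluster1", "cluster2"})))
# ['cluster1', 'cluster2']
```

In this model, a namespace that never sets allowed clusters gives the same
result from both functions, which is the compatibility point made earlier
in this thread.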

[1]
https://pulsar.apache.org/docs/3.1.x/administration-geo/#enable-geo-replication


On Wed, Dec 6, 2023 at 7:57 PM Xiangying Meng <xiangy...@apache.org> wrote:

> Hi Girish,
>
> Thank you for your explanation. Because Joe's email referenced the current
> implementation of Pulsar, I misunderstood him to be saying that this
> current implementation is not good.
>
> A possible use case is where there is one or a small number of topics in
> the namespace that store important messages, which need to be replicated to
> other clusters. Meanwhile, other topics only need to store data in the
> local cluster.
>
> For example, only topic1 needs replication, while topic2 to topic100 do
> not. According to the current implementation, we need to set replication
> clusters at the namespace level (e.g. cluster1 and cluster2), and then set
> the topic-level replication clusters (cluster1) for topic2 to topic100 to
> exclude them. It's hard to say that this is a good design.
>
> Best regards.
>
> On Wed, Dec 6, 2023 at 12:49 PM Joe F <joefranc...@gmail.com> wrote:
>
>> Girish,
>>
>> Thank you for making my point much better than I did ..
>>
>> -Joe
>>
>> On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma <scrapmachi...@gmail.com>
>> wrote:
>>
>> > Hello Xiangying,
>> >
>> > I believe what Joe here is referring to as "application design" is not
>> the
>> > design of pulsar or namespace level replication but the design of your
>> > application and the dependency that you've put on topic level
>> replication.
>> >
>> > In general, I am aligned with Joe from an application design
>> standpoint. A
>> > namespace is supposed to represent a single application use case, topic
>> > level override of replication clusters helps in cases where there are a
>> few
>> > exceptional topics which do not need replication in all of the namespace
>> > clusters. This helps in saving network bandwidth, storage, CPU, RAM etc
>> >
>> > But the reason why you've raised this PIP is to bring down the actual
>> > replication semantics at a topic level. Yes, namespace level still
>> exists
>> > as per your PIP as well, but is basically left only to be a "default in
>> > case topic level is missing".
>> > This brings me to a very basic question - What's the use case that you
>> are
>> > trying to solve that needs these changes? Because, then what's stopping
>> us
>> > from bringing every construct that's at a namespace level (bundling,
>> > hardware affinity, etc) down to a topic level?
>> >
>> > Regards
>> >
>> > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng <xiangy...@apache.org>
>> > wrote:
>> >
>> > > Hi Joe,
>> > >
>> > > You're correct. The initial design of the replication policy leaves
>> room
>> > > for improvement. To address this, we aim to refine the cluster
>> settings
>> > at
>> > > the namespace level in a way that won't impact the existing system.
>> The
>> > > replication clusters should solely be used to establish full mesh
>> > > replication for that specific namespace, without having any other
>> > > definitions or functionalities.
>> > >
>> > > BR,
>> > > Xiangying
>> > >
>> > >
>> > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <joefranc...@gmail.com> wrote:
>> > >
>> > > > >if users want to change the replication policy for
>> > > > topic-n and do not change the replication policy of other topics,
>> they
>> > > need
>> > > > to change all the topic policy under this namespace.
>> > > >
>> > > > This PIP unfortunately  flows from  attempting to solve bad
>> application
>> > > > design
>> > > >
>> > > > A namespace is supposed to represent an application, and the
>> namespace
>> > > > policy is an umbrella for a similar set of policies  that applies to
>> > all
>> > > > topics.  The exceptions would be if a topic had a need for a
>> deficit,
>> > The
>> > > > case of one topic in the namespace sticking out of the namespace
>> policy
>> > > > umbrella is bad  application design in my opinion
>> > > >
>> > > > -Joe.
>> > > >
>> > > >
>> > > >
>> > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng <xiangy...@apache.org
>> >
>> > > > wrote:
>> > > >
>> > > > > Hi Rajan and Girish,
>> > > > > Thanks for your reply. About the question you mentioned, there is
>> > some
>> > > > > information I want to share with you.
>> > > > > >If anyone wants to setup different replication clusters then
>> either
>> > > > > >those topics can be created under different namespaces or
>> defined at
>> > > > topic
>> > > > > >level policy.
>> > > > >
>> > > > > >And users can anyway go and update the namespace's cluster list
>> to
>> > add
>> > > > the
>> > > > > >missing cluster.
>> > > > > Because the replication clusters also mean the clusters where the
>> > topic
>> > > > can
>> > > > > be created or loaded, the topic-level replication clusters can
>> only
>> > be
>> > > > the
>> > > > > subset of namespace-level replication clusters.
>> > > > > Just as Girish mentioned, the users can update the namespace's
>> > cluster
>> > > > list
>> > > > > to add the missing cluster.
>> > > > > But there is a problem because the replication clusters as the
>> > > namespace
>> > > > > level will create a full mesh replication for that namespace
>> across
>> > the
>> > > > > clusters defined in
>> > > > > replication-clusters if users want to change the replication
>> policy
>> > for
>> > > > > topic-n and do not change the replication policy of other topics,
>> > they
>> > > > need
>> > > > > to change all the topic policy under this namespace.
>> > > > >
>> > > > > > Pulsar is being used by many legacy systems and changing its
>> > > > > >semantics for specific usecases without considering consequences
>> are
>> > > > > >creating a lot of pain and incompatibility problems for other
>> > existing
>> > > > > >systems and let's avoid doing it as we are struggling with such
>> > > changes
>> > > > > and
>> > > > > >breaking compatibility or changing semantics are just not
>> > acceptable.
>> > > > >
>> > > > > This proposal will not introduce an incompatibility problem,
>> because
>> > > the
>> > > > > default value of the namespace policy of allowed-clusters and
>> > > > > topic-policy-synchronized-clusters are the replication-clusters.
>> > > > >
>> > > > > >Allowed clusters defined at tenant level
>> > > > > >will restrict tenants to create namespaces in regions/clusters
>> where
>> > > > they
>> > > > > >are not allowed.
>> > > > > >As Rajan also mentioned, allowed-clusters field has a different
>> > > > > meaning/purpose.
>> > > > >
>> > > > > Allowed clusters defined at the tenant level will restrict tenants
>> > from
>> > > > > creating namespaces in regions/clusters where they are not
>> allowed.
>> > > > > Similarly, the allowed clusters defined at the namespace level
>> will
>> > > > > restrict the namespace from creating topics in regions/clusters
>> where
>> > > > they
>> > > > > are not allowed.
>> > > > > What's wrong with this?
>> > > > >
>> > > > > Regards,
>> > > > > Xiangying
>> > > > >
>> > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma <
>> > scrapmachi...@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Xiangying,
>> > > > > >
>> > > > > > Shouldn't the solution to the issue mentioned in #21564 [0]
>> mostly
>> > be
>> > > > > > around validating that topic level replication clusters are
>> subset
>> > of
>> > > > > > namespace level replication clusters?
>> > > > > > It would be a completely compatible change as even today the
>> case
>> > > > where a
>> > > > > > topic has a cluster not defined in namespaces's
>> > replication-clusters
>> > > > > > doesn't really work.
>> > > > > > And users can anyway go and update the namespace's cluster list
>> to
>> > > add
>> > > > > the
>> > > > > > missing cluster.
>> > > > > >
>> > > > > > As Rajan also mentioned, allowed-clusters field has a different
>> > > > > > meaning/purpose.
>> > > > > > Regards
>> > > > > >
>> > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng <
>> > > xiangy...@apache.org>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi, Pulsar Community
>> > > > > > >
>> > > > > > > I drafted a proposal to make the configuration of clusters at
>> the
>> > > > > > namespace
>> > > > > > > level clearer. This helps solve the problem of geo-replication
>> > not
>> > > > > > working
>> > > > > > > correctly at the topic level.
>> > > > > > >
>> > > > > > > https://github.com/apache/pulsar/pull/21648
>> > > > > > >
>> > > > > > > I'm looking forward to hearing from you.
>> > > > > > >
>> > > > > > > BR
>> > > > > > > Xiangying
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Girish Sharma
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> >
>> > --
>> > Girish Sharma
>> >
>>
>
