Hi dev,

I pushed a PR https://github.com/apache/pulsar/pull/20822 to fix the infinite 
unloading. Please help review this PR. Thanks!

Thanks,
Kai

On 2023/07/09 23:40:03 PengHui Li wrote:
> Hi Heesung,
> 
> For 2.10, I would like to suggest fixing the issue instead of cherry-picking
> the PR. The problem that https://github.com/apache/pulsar/pull/388 had
> resolved will happen again if `loadBalancerDistributeBundlesEvenlyEnabled`
> is disabled. We should try to remove the configuration in the future
> because users are difficult to decide whether to enable or disable it. Both
> of them have problems, just different issues.
> 
> > I think we also need to consider the namespace anit-affinity-group logic
> too.
> 
> +1, it should be fixed to avoid an infinite bundle unloading loop.
> 
> Thanks,
> Penghui
> 
> On Sat, Jul 8, 2023 at 4:07 AM Heesung Sohn
> <heesung.s...@streamnative.io.invalid> wrote:
> 
> > Hi dev,
> >
> > I think we also need to consider the namespace anit-affinity-group logic
> > too. These logics seem to do similar things.
> >
> > https://pulsar.apache.org/docs/3.0.x/administration-load-balance/#distribute-anti-affinity-namespaces-across-failure-domains
> >
> >
> > PengHui
> > We got three biding votes here. Do you think we should proceed to
> > cherry-pick the PR to 2.10, then?
> >
> > Thanks,
> > Heesung
> >
> >
> >
> >
> >
> > On Sun, Jul 2, 2023 at 5:22 PM PengHui Li <peng...@apache.org> wrote:
> >
> > > > `removeMostServicingBrokersForNamespace ` is introduced by [1] to
> > > solve the problem that when all bundles in a particular namespace
> > > belong to 1 or few machines, customers owning that namespace will be
> > > heavily impacted if that broker goes down. Of course, this PR caused
> > > the infinite unloading issue and we need to fix it.
> > >
> > > Thanks for the context.
> > > It looks like we can also try to fix the infinite unloading issue.
> > > Now, the broker is unloading the bundles without checking the
> > distribution
> > > of the bundles under a namespace, but it will check when finding
> > > a new owner. Is it possible to check the bundle distribution before
> > > unloading the bundles to avoid infinite unloading?
> > >
> > > Regards,
> > > Penghui
> > >
> > >
> > > On Sun, Jul 2, 2023 at 3:28 PM Enrico Olivelli <eolive...@gmail.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > Enrico
> > > >
> > > > Il Dom 2 Lug 2023, 06:19 Hang Chen <chenh...@apache.org> ha scritto:
> > > >
> > > > > +1 for cherry-picking it to branch-2.10. We have a flag to control
> > > > > whether to enable or disable it.
> > > > >
> > > > > `removeMostServicingBrokersForNamespace ` is introduced by [1] to
> > > > > solve the problem that when all bundles in a particular namespace
> > > > > belong to 1 or few machines, customers owning that namespace will be
> > > > > heavily impacted if that broker goes down. Of course, this PR caused
> > > > > the infinite unloading issue and we need to fix it.
> > > > >
> > > > > > I agree with making it false for the next major version release by
> > > > > default.
> > > > > We'd better remove the config in the next version instead of change
> > > > > the default value to `false`, which will make Pulsar's configuration
> > > > > keep increasing.
> > > > >
> > > > > Thanks,
> > > > > Hang
> > > > >
> > > > > [1] https://github.com/apache/pulsar/pull/388
> > > > >
> > > > > PengHui Li <peng...@apache.org> 于2023年7月1日周六 09:38写道:
> > > > > >
> > > > > > +1 for cherry-pick to branch-2.10 since users don't have a
> > workaround
> > > > > > for this issue, and the change is well-understand, low risk.
> > > > > >
> > > > > > I agree with making it false for the next major version release by
> > > > > default.
> > > > > >
> > > > > > Thanks,
> > > > > > Penghui
> > > > > >
> > > > > > On Sat, Jul 1, 2023 at 9:26 AM Heesung Sohn
> > > > > > <heesung.s...@streamnative.io.invalid> wrote:
> > > > > >
> > > > > > > Hi dev,
> > > > > > >
> > > > > > > I realized that `removeMostServicingBrokersForNamespace` func in
> > > the
> > > > > broker
> > > > > > > selection logic can cause infinite unloading.
> > > > > > >
> > > > > > > Suppose an overloaded broker unloaded a bundle and only has the
> > > > minimum
> > > > > > > number of bundles(in that namespace) among brokers. In that case,
> > > the
> > > > > > > selection logic (`removeMostServicingBrokersForNamespace`) will
> > > > filter
> > > > > out
> > > > > > > other brokers and always reassign the bundle to the previous
> > > broker.
> > > > > This
> > > > > > > will cause infinite unloading(like a boomerang).
> > > > > > >
> > > > > > > To mitigate this issue, we need to cherry-pick this PR to disable
> > > > this
> > > > > > > logic by the config.
> > > > > > > https://github.com/apache/pulsar/pull/16059
> > > > > > >
> > > > > > > And we probably want to disable this
> > > > > > > `removeMostServicingBrokersForNamespace` logic by default.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Heesung
> > > > > > >
> > > > >
> > > >
> > >
> >
> 

Reply via email to