Re: [DISCUSSION] Deprecation of obsolete rebalancing functionality

Alexei Scherbakov Thu, 13 Feb 2020 03:37:11 -0800

1. Yes

2. This is right but doesn't sound like a bug. The rebalancing will be
finished before releasing syncFut and partitions will contain all necessary
data (but are still in moving state).


3. No, the local node doesn't wait for rebalancing on all grid nodes.
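
To illustrate point 1, here is a minimal sketch of how SYNC cache groups
could be scheduled ahead of ASYNC ones without a per-cache rebalanceOrder
property. This is not Ignite code: SyncFirstOrderSketch, CacheGroup and
rebalanceOrder() are hypothetical names, and only the mode names mirror
CacheRebalanceMode.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; these types are assumptions, not Apache Ignite classes. */
final class SyncFirstOrderSketch {
    /** Mirrors the value names of CacheRebalanceMode. */
    enum RebalanceMode { SYNC, ASYNC, NONE }

    /** Simplified stand-in for a cache group: just a name and a rebalance mode. */
    static final class CacheGroup {
        final String name;
        final RebalanceMode mode;

        CacheGroup(String name, RebalanceMode mode) {
            this.name = name;
            this.mode = mode;
        }
    }

    /** Returns the groups to rebalance: NONE is skipped, SYNC comes before ASYNC. */
    static List<CacheGroup> rebalanceOrder(List<CacheGroup> groups) {
        List<CacheGroup> res = new ArrayList<>();

        for (CacheGroup g : groups) {
            if (g.mode != RebalanceMode.NONE)
                res.add(g);
        }

        // Boolean keys sort false before true, so SYNC groups move to the front.
        // List.sort is stable, so the configured order is kept within each mode.
        res.sort(Comparator.comparing((CacheGroup g) -> g.mode != RebalanceMode.SYNC));

        return res;
    }
}
```

The mode alone decides the order here, which is the point made in the
quoted thread: no separate property is needed to put SYNC caches first.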

Actually, I think SYNC mode should be dropped as well. Instead we must
provide a convenient public API to wait for a "stable" topology.
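
As a rough idea of what such an API could look like, here is a minimal
sketch of a future that completes only when no local partition is left in
the MOVING state. Everything in it is an assumption made for illustration:
StableTopologySketch, Tracker, markOwning() and stableTopology() do not
exist in Ignite, and the two-state partition model is a simplification.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names exist in Apache Ignite. */
final class StableTopologySketch {
    /** Simplified partition states, named after the states mentioned in the thread. */
    enum PartitionState { MOVING, OWNING }

    /** Tracks local partition states and exposes a future for "no partition is MOVING". */
    static final class Tracker {
        private final Map<Integer, PartitionState> states = new ConcurrentHashMap<>();
        private final CompletableFuture<Void> stableFut = new CompletableFuture<>();

        /** All partitions start in MOVING, as on a node that has just begun rebalancing. */
        Tracker(int partitions) {
            for (int p = 0; p < partitions; p++)
                states.put(p, PartitionState.MOVING);

            if (partitions == 0)
                stableFut.complete(null);
        }

        /** Marks a known partition as owned and completes the future once none is MOVING. */
        void markOwning(int partition) {
            if (states.replace(partition, PartitionState.OWNING) == null)
                throw new IllegalArgumentException("Unknown partition: " + partition);

            if (!states.containsValue(PartitionState.MOVING))
                stableFut.complete(null);
        }

        /** Future a caller can wait on instead of relying on SYNC rebalance mode. */
        CompletableFuture<Void> stableTopology() {
            return stableFut;
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker(2);

        tracker.markOwning(0);
        System.out.println(tracker.stableTopology().isDone()); // one partition is still MOVING

        tracker.markOwning(1);
        System.out.println(tracker.stableTopology().isDone()); // every partition is OWNING
    }
}
```

A caller would block on stableTopology().join() or chain a callback. The
difference from syncFut as described in point 2 is that the future here
completes on the state change to OWNING, not on the end of data transfer.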


On Thu, 13 Feb 2020 at 14:09, Maxim Muzafarov <mmu...@apache.org> wrote:

> Pavel,
>
> It's still a big question regarding SYNC rebalance mode. Here are my
> thoughts.
>
> 1. Yes, we must rebalance such caches prior to ASYNC ones (if the
> rebalanceOrder configuration is removed).
>
> 2. When persistence is enabled and WAL is disabled (on the first
> rebalance start), I think we should finish syncFuture only on
> checkpoint, at the point where we enable the WAL for the cache group
> and simultaneously own all MOVING partitions. But currently, I've seen
> that syncFuture finishes when there are no remaining partitions left
> [1].
> Is that correct? It seems like a bug.
>
> 3. In my understanding, a new local node can start only when ALL SYNC
> cache groups have been fully rebalanced on ALL nodes, right? But how
> about late affinity assignment here? It seems that SYNC caches will be
> rebalanced locally on the node, the node will start, but other nodes
> still think this node is not operational (late affinity assignment has
> not occurred yet).
>
>
> [1]
> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionDemander.java#L1561
>
> On Thu, 13 Feb 2020 at 12:57, Pavel Pereslegin <xxt...@gmail.com> wrote:
> >
> > > +1 to deprecate rebalanceOrder and remove related functionality,
> > > I meant "rework related functionality", not "remove".
> >
> > On Thu, 13 Feb 2020 at 12:47, Pavel Pereslegin <xxt...@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > +1 to deprecate rebalanceOrder and remove related functionality,
> > > should we create a separate ticket for this?
> > >
> > > Btw, as I understand, SYNC mode is only useful for in-memory caches,
> > > because when persistence is enabled (and WAL is disabled during
> > > rebalancing), even "ignite-sys-cache" owns partitions only after all
> > > cache groups are rebalanced. Thus, even utility cache is still
> > > inoperable after node startup when persistence is enabled. Do we
> > > > really need to wait for SYNC caches when a node starts with
> > > > persistence enabled, or should we enable WAL for SYNC caches?
> > >
> > > > On Thu, 13 Feb 2020 at 11:13, Ivan Rakov <ivan.glu...@gmail.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > +1 from me for rebalance delay deprecation.
> > > > > I can imagine only one actual case for this option: prevent excessive
> > > > > load on the cluster in case of temporary short-term topology changes
> > > > > (e.g. a node is stopped for a while and then brought back).
> > > > > Now it's handled by baseline auto adjustment in a much more correct
> > > > > way: partitions are not reassigned within a maintenance interval
> > > > > (unlike with the rebalance delay).
> > > > > I also don't think that the ability to configure rebalance delay per
> > > > > cache is crucial.
> > > >
> > > > > rebalanceOrder is also useless, agreed.
> > > > +1
> > > > > Except for one case: we may want to rebalance caches with
> > > > > CacheRebalanceMode.SYNC first. But anyway, this behavior doesn't
> > > > > require a separate property to be enabled.
> > > >
> > > > On Wed, Feb 12, 2020 at 4:54 PM Alexei Scherbakov <
> > > > alexey.scherbak...@gmail.com> wrote:
> > > >
> > > > > Maxim,
> > > > >
> > > > > > rebalanceDelay was introduced before BLT appeared in the product, to
> > > > > > solve scenarios which are now solved by BLT.
> > > > > >
> > > > > > To me, it's pointless to keep it in the product now that BLT has been
> > > > > > introduced.
> > > > > >
> > > > > > I do not think delaying rebalancing per cache group has any meaning. I
> > > > > > cannot imagine any reason for it.
> > > > >
> > > > > rebalanceOrder is also useless, agreed.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > On Wed, 12 Feb 2020 at 16:19, Maxim Muzafarov <mmu...@apache.org> wrote:
> > > > >
> > > > > > Alexey,
> > > > > >
> > > > > > > Why do you think delaying historical rebalance (on BLT node join)
> > > > > > > for particular cache groups is not a real-world use case? Probably
> > > > > > > the same topic may be started on the user list to collect more use
> > > > > > > cases from real users.
> > > > > >
> > > > > > In general, I support reducing the number of available rebalance
> > > > > > configuration parameters, but we should do it really carefully.
> > > > > > > I can also propose the rebalanceOrder param for removal.
> > > > > >
> > > > > > On Wed, 12 Feb 2020 at 15:50, Alexei Scherbakov
> > > > > > <alexey.scherbak...@gmail.com> wrote:
> > > > > > >
> > > > > > > Maxim,
> > > > > > >
> > > > > > > > In general rebalanceDelay is used to delay/disable rebalance when the
> > > > > > > > topology is changed.
> > > > > > > > Right now we have BLT to avoid unnecessary rebalancing when the
> > > > > > > > topology is changed.
> > > > > > > > If a node leaves the cluster topology, no rebalancing happens until
> > > > > > > > the node is explicitly removed from the baseline topology.
> > > > > > > >
> > > > > > > > I would like to know real-world scenarios which cannot be covered by
> > > > > > > > BLT configuration.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > On Wed, 12 Feb 2020 at 15:16, Maxim Muzafarov <mmu...@apache.org> wrote:
> > > > > > >
> > > > > > > > Alexey,
> > > > > > > >
> > > > > > > > > > All scenarios where rebalanceDelay has meaning are handled by
> > > > > > > > > > baseline topology now.
> > > > > > > > >
> > > > > > > > > Can you, please, provide more details here, e.g. the whole list of
> > > > > > > > > scenarios where rebalanceDelay is used and how these are handled by
> > > > > > > > > baseline topology?
> > > > > > > > >
> > > > > > > > > Actually, I doubt that it covers exactly all the cases, because
> > > > > > > > > rebalanceDelay is a "per cache group" property, whereas "baseline"
> > > > > > > > > is meaningful for the whole topology.
> > > > > > > >
> > > > > > > > On Wed, 12 Feb 2020 at 12:58, Alexei Scherbakov
> > > > > > > > <alexey.scherbak...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > I meant baseline topology.
> > > > > > > > >
> > > > > > > > > > On Wed, 12 Feb 2020 at 12:41, Alexei Scherbakov <
> > > > > > > > > > alexey.scherbak...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > V.Pyatkov
> > > > > > > > > >
> > > > > > > > > > > Doesn't rebalance topology solve it?
> > > > > > > > > >
> > > > > > > > > > > On Wed, 12 Feb 2020 at 12:31, V.Pyatkov <vldpyat...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > >> Hi,
> > > > > > > > > >>
> > > > > > > > > > >> I am sure we can reduce this ability, but not remove it completely.
> > > > > > > > > > >> We can use the rebalance delay to disable rebalancing until it is
> > > > > > > > > > >> manually triggered.
> > > > > > > > > > >>
> > > > > > > > > > >> CacheConfiguration#setRebalanceDelay(-1)
> > > > > > > > > > >>
> > > > > > > > > > >> It may be helpful for clusters that cannot allow a performance drop
> > > > > > > > > > >> from rebalancing at any time.
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> --
> > > > > > > > > > >> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > > Alexei Scherbakov
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Best regards,
> > > > > > > > > Alexei Scherbakov
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Alexei Scherbakov
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best regards,
> > > > > Alexei Scherbakov
> > > > >
>


-- 

Best regards,
Alexei Scherbakov
