Re: [DISCUSSION] Deprecation of obsolete rebalancing functionality

Alexei Scherbakov Thu, 13 Feb 2020 03:41:11 -0800

But in combination with BLT it will work as intended - no rebalancing under
the hood.


Thu, 13 Feb 2020 at 14:39, Alexei Scherbakov <
[email protected]>:

> Of course, stable topology will be just a hint.
>
> Any node can leave at any moment.
>
> Thu, 13 Feb 2020 at 14:35, Alexei Scherbakov <
> [email protected]>:
>
>> 1. Yes
>>
>> 2. This is right but doesn't sound like a bug. The rebalancing will be
>> finished before releasing syncFut and partitions will contain all necessary
>> data (but are still in moving state).
>>
>> 3. No, the local node doesn't wait for rebalancing on all grid nodes.
>>
>> Actually, I think SYNC mode should be dropped as well. Instead we must
>> provide a convenient public API to wait for a "stable" topology.
>>
>>
>> Thu, 13 Feb 2020 at 14:09, Maxim Muzafarov <[email protected]>:
>>
>>> Pavel,
>>>
>>> It's still a big question regarding SYNC rebalance mode. Here are my
>>> thoughts.
>>>
>>> 1. Yes, we must rebalance such caches before ASYNC ones (if the
>>> rebalanceOrder configuration is removed).
>>>
>>> 2. When persistence is enabled and WAL is disabled (on the first
>>> rebalance start), I think we should complete syncFuture only on
>>> checkpoint, at the point where we enable the WAL state for the cache
>>> group and simultaneously own all MOVING partitions. But currently, I've seen
>>> that syncFuture finishes when there are no remaining partitions left
>>> [1].
>>> Is it correct? Seems like a bug.
>>>
>>> 3. In my understanding, a new local node can start only when ALL SYNC
>>> cache groups have been fully rebalanced on ALL nodes, right? But how
>>> about late affinity assignment here? It seems that SYNC caches will be
>>> rebalanced locally on the node, the node will start, but other nodes
>>> still think this node is not operational (late affinity assignment has
>>> not occurred yet).
>>>
>>>
>>> [1]
>>> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionDemander.java#L1561
>>>
>>> On Thu, 13 Feb 2020 at 12:57, Pavel Pereslegin <[email protected]> wrote:
>>> >
>>> > > +1 to deprecate rebalanceOrder and remove related functionality,
>>> > I meant "rework related functionality", not "remove".
>>> >
>>> > Thu, 13 Feb 2020 at 12:47, Pavel Pereslegin <[email protected]>:
>>> > >
>>> > > Hello,
>>> > >
>>> > > +1 to deprecate rebalanceOrder and remove related functionality,
>>> > > should we create a separate ticket for this?
>>> > >
>>> > > Btw, as I understand, SYNC mode is only useful for in-memory caches,
>>> > > because when persistence is enabled (and WAL is disabled during
>>> > > rebalancing), even "ignite-sys-cache" owns partitions only after all
>>> > > cache groups are rebalanced. Thus, even utility cache is still
>>> > > inoperable after node startup when persistence is enabled. Do we
>>> > > really need to wait for SYNC caches when a node starts with persistence
>>> > > enabled, or should we enable WAL for SYNC caches?
>>> > >
>>> > > Thu, 13 Feb 2020 at 11:13, Ivan Rakov <[email protected]>:
>>> > > >
>>> > > > Hello,
>>> > > >
>>> > > > +1 from me for rebalance delay deprecation.
>>> > > > I can imagine only one actual use case for this option: preventing
>>> > > > excessive load on the cluster during temporary, short-term topology
>>> > > > changes (e.g. a node is stopped for a while and then brought back).
>>> > > > Now it's handled by baseline auto-adjustment in a much more correct
>>> > > > way: partitions are not reassigned within a maintenance interval
>>> > > > (unlike with the rebalance delay).
>>> > > > I also don't think that the ability to configure the rebalance delay
>>> > > > per cache is crucial.
>>> > > >
>>> > > > > rebalanceOrder is also useless, agreed.
>>> > > > +1
>>> > > > Except for one case: we may want to rebalance caches with
>>> > > > CacheRebalanceMode.SYNC first. But anyway, this behavior doesn't
>>> > > > require a separate property to be enabled.
>>> > > >
>>> > > > On Wed, Feb 12, 2020 at 4:54 PM Alexei Scherbakov <
>>> > > > [email protected]> wrote:
>>> > > >
>>> > > > > Maxim,
>>> > > > >
>>> > > > > rebalanceDelay was introduced before BLT appeared in the product,
>>> > > > > to solve scenarios which are now solved by BLT.
>>> > > > >
>>> > > > > To me, it's pointless to keep it in the product now that BLT has
>>> > > > > been introduced.
>>> > > > >
>>> > > > > I do not think delaying rebalancing per cache group has any meaning.
>>> > > > > I cannot imagine any reason for it.
>>> > > > >
>>> > > > > rebalanceOrder is also useless, agreed.
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > Wed, 12 Feb 2020 at 16:19, Maxim Muzafarov <[email protected]>:
>>> > > > >
>>> > > > > > Alexey,
>>> > > > > >
>>> > > > > > Why do you think delaying historical rebalance (on BLT node join)
>>> > > > > > for particular cache groups is not a real-world use case? Perhaps
>>> > > > > > the same topic could be started on the user list to collect more
>>> > > > > > use cases from real users.
>>> > > > > >
>>> > > > > > In general, I support reducing the number of available rebalance
>>> > > > > > configuration parameters, but we should do it really carefully.
>>> > > > > > I can also propose the rebalanceOrder param for removal.
>>> > > > > >
>>> > > > > > On Wed, 12 Feb 2020 at 15:50, Alexei Scherbakov
>>> > > > > > <[email protected]> wrote:
>>> > > > > > >
>>> > > > > > > Maxim,
>>> > > > > > >
>>> > > > > > > In general, rebalanceDelay is used to delay/disable rebalancing
>>> > > > > > > when the topology changes.
>>> > > > > > > Right now we have BLT to avoid unnecessary rebalancing when the
>>> > > > > > > topology changes.
>>> > > > > > > If a node leaves the cluster topology, no rebalancing happens
>>> > > > > > > until the node is explicitly removed from the baseline topology.
>>> > > > > > >
>>> > > > > > > I would like to know the real-world scenarios which cannot be
>>> > > > > > > covered by BLT configuration.
>>> > > > > > >
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > Wed, 12 Feb 2020 at 15:16, Maxim Muzafarov <[email protected]>:
>>> > > > > > >
>>> > > > > > > > Alexey,
>>> > > > > > > >
>>> > > > > > > > > All scenarios where rebalanceDelay has meaning are handled
>>> > > > > > > > > by baseline topology now.
>>> > > > > > > >
>>> > > > > > > > Can you please provide more details here, e.g. the whole list of
>>> > > > > > > > scenarios where rebalanceDelay is used and how these are handled
>>> > > > > > > > by baseline topology?
>>> > > > > > > >
>>> > > > > > > > Actually, I doubt that it covers all the cases, because
>>> > > > > > > > rebalanceDelay is a per-cache-group property, whereas the
>>> > > > > > > > baseline applies to the whole topology.
>>> > > > > > > >
>>> > > > > > > > On Wed, 12 Feb 2020 at 12:58, Alexei Scherbakov
>>> > > > > > > > <[email protected]> wrote:
>>> > > > > > > > >
>>> > > > > > > > > I meant baseline topology.
>>> > > > > > > > >
>>> > > > > > > > > Wed, 12 Feb 2020 at 12:41, Alexei Scherbakov <
>>> > > > > > > > > [email protected]>:
>>> > > > > > > > >
>>> > > > > > > > > >
>>> > > > > > > > > > V.Pyatkov
>>> > > > > > > > > >
>>> > > > > > > > > > Doesn't rebalance topology solve it?
>>> > > > > > > > > >
>>> > > > > > > > > > Wed, 12 Feb 2020 at 12:31, V.Pyatkov <[email protected]>:
>>> > > > > > > > > >
>>> > > > > > > > > >> Hi,
>>> > > > > > > > > >>
>>> > > > > > > > > >> I am sure we can reduce this ability, but not completely.
>>> > > > > > > > > >> We can use the rebalance delay to disable rebalancing until
>>> > > > > > > > > >> it is manually triggered.
>>> > > > > > > > > >>
>>> > > > > > > > > >> CacheConfiguration#setRebalanceDelay(-1)
>>> > > > > > > > > >>
>>> > > > > > > > > >> It may be helpful for clusters that cannot allow a
>>> > > > > > > > > >> performance drop from rebalancing at any time.
>>> > > > > > > > > >>
>>> > > > > > > > > >>
>>> > > > > > > > > >>
>>> > > > > > > > > >> --
>>> > > > > > > > > >> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>>> > > > > > > > > >>
>>> > > > > > > > > >
>>> > > > > > > > > >
>>> > > > > > > > > > --
>>> > > > > > > > > >
>>> > > > > > > > > > Best regards,
>>> > > > > > > > > > Alexei Scherbakov
>>> > > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > > --
>>> > > > > > > > >
>>> > > > > > > > > Best regards,
>>> > > > > > > > > Alexei Scherbakov
>>> > > > > > > >
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > --
>>> > > > > > >
>>> > > > > > > Best regards,
>>> > > > > > > Alexei Scherbakov
>>> > > > > >
>>> > > > >
>>> > > > >
>>> > > > > --
>>> > > > >
>>> > > > > Best regards,
>>> > > > > Alexei Scherbakov
>>> > > > >
>>>
>>
>>
>> --
>>
>> Best regards,
>> Alexei Scherbakov
>>
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>


-- 

Best regards,
Alexei Scherbakov
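
One point raised in the thread is that caches with CacheRebalanceMode.SYNC
could simply be rebalanced first, without a separate rebalanceOrder property.
The following is only a minimal sketch of that idea in plain Java, not Ignite
code: the Mode enum, the Group class and rebalanceSequence() are hypothetical
stand-ins invented for illustration, and the real demander logic may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RebalanceOrderSketch {
    /** Hypothetical stand-in for Ignite's CacheRebalanceMode. */
    public enum Mode { SYNC, ASYNC, NONE }

    /** Hypothetical cache group descriptor: a name plus its rebalance mode. */
    public static final class Group {
        final String name;
        final Mode mode;

        public Group(String name, Mode mode) {
            this.name = name;
            this.mode = mode;
        }
    }

    /** Returns group names in rebalance order: SYNC first, then ASYNC; NONE is skipped. */
    public static List<String> rebalanceSequence(List<Group> groups) {
        List<Group> active = new ArrayList<>();
        for (Group g : groups) {
            if (g.mode != Mode.NONE)
                active.add(g);
        }

        // List.sort is stable, so groups with the same mode keep their configured order.
        active.sort(Comparator.comparingInt((Group g) -> g.mode == Mode.SYNC ? 0 : 1));

        List<String> names = new ArrayList<>();
        for (Group g : active)
            names.add(g.name);
        return names;
    }

    public static void main(String[] args) {
        List<Group> groups = Arrays.asList(
            new Group("orders", Mode.ASYNC),
            new Group("ignite-sys-cache", Mode.SYNC),
            new Group("audit", Mode.NONE),
            new Group("users", Mode.ASYNC));

        System.out.println(rebalanceSequence(groups)); // [ignite-sys-cache, orders, users]
    }
}
```

The ordering here is derived from the mode alone, which is the sense in which
no extra per-cache property would be needed.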
