Re: IEP-4, Phase 2. Using BL(A)T for in-memory caches.

Dmitriy Setrakyan Fri, 04 May 2018 12:02:11 -0700

I do not like the name "current" on the methods. I think we should just
remove it, e.g. currentAffinityTopology() -> affinityTopology()


D.

On Fri, May 4, 2018 at 6:17 AM, Eduard Shangareev <
eduard.shangar...@gmail.com> wrote:

> Igniters,
>
> With Vladimir's help, we analyzed the approaches taken by other solutions
> and decided to simplify our affinity topology auto-adjusting.
>
> It should be enough to be able to turn auto-adjusting on/off (a flag) and,
> if it is on, to set 2 timeouts:
> - a soft timeout, which is used if no other nodes join or exit in the meantime;
> - a hard timeout, which is tracked from the first discovery event; once it is
> reached, the affinity topology is changed immediately.
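For illustration, here is a minimal sketch of how the two timeouts could interact. It assumes that each join/exit restarts the soft timeout while the hard timeout keeps counting from the first pending event. `AutoAdjustTimer` and its methods are hypothetical names, not Ignite classes, and timestamps are passed in explicitly so the logic stays deterministic.

```java
/** Hypothetical helper (not an Ignite class): decides when auto-adjust should fire. */
final class AutoAdjustTimer {
    private final long softTimeoutMs;
    private final long hardTimeoutMs;

    /** Timestamp of the first pending discovery event, or -1 if none is pending. */
    private long firstEventTs = -1;

    /** Timestamp of the latest pending discovery event, or -1 if none is pending. */
    private long lastEventTs = -1;

    AutoAdjustTimer(long softTimeoutMs, long hardTimeoutMs) {
        if (softTimeoutMs < 0 || hardTimeoutMs < 0)
            throw new IllegalArgumentException("Timeouts must be non-negative");

        this.softTimeoutMs = softTimeoutMs;
        this.hardTimeoutMs = hardTimeoutMs;
    }

    /** Records a node join/exit seen at {@code nowMs} (assumed non-negative). */
    void onDiscoveryEvent(long nowMs) {
        if (firstEventTs < 0)
            firstEventTs = nowMs; // The hard timeout counts from here.

        lastEventTs = nowMs; // The soft timeout restarts on every event.
    }

    /** Returns the time at which the adjustment is due, or -1 if nothing is pending. */
    long deadline() {
        if (firstEventTs < 0)
            return -1;

        return Math.min(lastEventTs + softTimeoutMs, firstEventTs + hardTimeoutMs);
    }

    /** Returns true if an adjustment is pending and its deadline has passed. */
    boolean shouldAdjust(long nowMs) {
        return firstEventTs >= 0 && nowMs >= deadline();
    }

    /** Clears the pending state after the affinity topology was changed. */
    void onAdjusted() {
        firstEventTs = -1;
        lastEventTs = -1;
    }
}
```

With a 1000 ms soft and a 3000 ms hard timeout, a steady stream of events arriving every 900 ms would keep postponing the soft deadline, but the change would still happen 3000 ms after the first event.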
>
> All other strategies could be implemented through the API (setAffinityTopology)
> and metrics tracked by the user's monitoring tools.
>
> So, I suggest the following API changes:
>
> org.apache.ignite.IgniteCluster
>
> *Deprecate*:
> Collection<BaselineNode> currentBaselineTopology();
> void setBaselineTopology(Collection<? extends BaselineNode> baselineTop);
> void setBaselineTopology(long topVer);
>
> *Replace them with*
> Collection<BaselineNode> currentAffinityTopology();
> void setAffinityTopology(Collection<? extends BaselineNode> affinityTop);
> void setAffinityTopology(long topVer);
>
> *Add*
> isAffinityTopologyAutoAdjustEnabled()
> setAffinityTopologyAutoAdjustEnabled(boolean enabled);
>
> org.apache.ignite.configuration.IgniteConfiguration
>
> *Add*
> IgniteConfiguration setAffinityTopologyAutoAdjustEnabled(boolean enabled);
> IgniteConfiguration setAffinityTopologyAutoAdjustTimeout(long timeoutInMs);
> IgniteConfiguration setAffinityTopologyAutoAdjustMaxTimeout(long timeoutInMs);
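A rough sketch of how the proposed configuration setters might be used. Since these methods are only a proposal in this thread, the example uses a hypothetical stand-in class, `AutoAdjustSettings`, that mirrors the three setter names; it is not `IgniteConfiguration`. Reading `Timeout` as the soft timeout and `MaxTimeout` as the hard one is an assumption.

```java
/** Hypothetical stand-in mirroring the proposed setters; not IgniteConfiguration. */
final class AutoAdjustSettings {
    private boolean enabled;
    private long timeoutMs;
    private long maxTimeoutMs;

    /** Turns auto-adjusting on or off. */
    AutoAdjustSettings setAffinityTopologyAutoAdjustEnabled(boolean enabled) {
        this.enabled = enabled;
        return this;
    }

    /** Assumed to be the soft timeout, in milliseconds. */
    AutoAdjustSettings setAffinityTopologyAutoAdjustTimeout(long timeoutInMs) {
        if (timeoutInMs < 0)
            throw new IllegalArgumentException("Timeout must be non-negative");

        this.timeoutMs = timeoutInMs;
        return this;
    }

    /** Assumed to be the hard timeout, in milliseconds. */
    AutoAdjustSettings setAffinityTopologyAutoAdjustMaxTimeout(long timeoutInMs) {
        if (timeoutInMs < 0)
            throw new IllegalArgumentException("Timeout must be non-negative");

        this.maxTimeoutMs = timeoutInMs;
        return this;
    }

    boolean isEnabled() { return enabled; }
    long timeoutMs() { return timeoutMs; }
    long maxTimeoutMs() { return maxTimeoutMs; }
}

final class AutoAdjustSettingsExample {
    public static void main(String[] args) {
        // Fluent chaining, in the style of the proposed IgniteConfiguration setters.
        AutoAdjustSettings cfg = new AutoAdjustSettings()
            .setAffinityTopologyAutoAdjustEnabled(true)
            .setAffinityTopologyAutoAdjustTimeout(5_000)
            .setAffinityTopologyAutoAdjustMaxTimeout(60_000);

        System.out.println(cfg.isEnabled() + " " + cfg.timeoutMs() + " " + cfg.maxTimeoutMs());
    }
}
```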
>
>
> An open question: could we rename BaselineNode to AffinityNode, or duplicate
> it as AffinityNode?
>
>
>
>
>
>
> On Fri, Apr 27, 2018 at 6:56 PM, Ivan Rakov <ivan.glu...@gmail.com> wrote:
>
> > Eduard,
> >
> > +1 to your proposed API for configuring Affinity Topology change
> > policies.
> > Obviously we should use "auto" as the default behavior. I believe automatic
> > rebalancing is expected and more convenient for the majority of users.
> >
> > Best Regards,
> > Ivan Rakov
> >
> >
> > On 26.04.2018 19:27, Eduard Shangareev wrote:
> >
> >> Igniters,
> >>
> >> Ok, I want to share my thoughts about "affinity topology (AT) changing
> >> policies".
> >>
> >>
> >> There would be three major options:
> >> -auto;
> >> -manual;
> >> -custom.
> >>
> >> 1. Automatic change.
> >> A user could set timeouts for:
> >> a. change AT on any topology change after some timeout
> >> (setATChangeTimeout in seconds);
> >> b. change AT on node left after some timeout
> >> (setATChangeOnNodeLeftTimeout in seconds);
> >> c. change AT on node join after some timeout
> >> (setATChangeOnNodeJoinTimeout in seconds).
> >>
> >> b and c are more specific, so they would override a.
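The override rule could be sketched as follows, assuming that an unset timeout is modeled as an empty `OptionalLong`. `AtChangeTimeouts` and `Event` are hypothetical names used only for illustration.

```java
import java.util.OptionalLong;

/** Hypothetical holder for the three timeouts (a, b, c); not an Ignite class. */
final class AtChangeTimeouts {
    enum Event { NODE_JOINED, NODE_LEFT }

    private final OptionalLong anyChangeSec;  // a. setATChangeTimeout
    private final OptionalLong nodeLeftSec;   // b. setATChangeOnNodeLeftTimeout
    private final OptionalLong nodeJoinSec;   // c. setATChangeOnNodeJoinTimeout

    AtChangeTimeouts(OptionalLong anyChangeSec, OptionalLong nodeLeftSec, OptionalLong nodeJoinSec) {
        this.anyChangeSec = anyChangeSec;
        this.nodeLeftSec = nodeLeftSec;
        this.nodeJoinSec = nodeJoinSec;
    }

    /** Returns the specific timeout (b or c) if it is set, otherwise the general one (a). */
    OptionalLong effectiveTimeout(Event evt) {
        OptionalLong specific = evt == Event.NODE_LEFT ? nodeLeftSec : nodeJoinSec;

        return specific.isPresent() ? specific : anyChangeSec;
    }
}
```

For example, with a = 30 and b = 5 and c unset, a node leaving would use 5 seconds and a node joining would fall back to 30 seconds.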
> >>
> >> Also, I want to introduce a mechanism for merging AT changes, which would
> >> be turned on by default.
> >> In other words, when the timeout is reached we would change AT to the
> >> current topology, not to the one that existed when the timeout was scheduled.
> >>
> >> 2. Manual change.
> >>
> >> Current behavior. A user changes AT manually via console tools or the web
> >> console.
> >>
> >> 3. Custom.
> >>
> >> We would give the option to plug in a custom implementation of the change
> >> policy (class name in config).
> >> We should pass it the following parameters:
> >> - current topology (collection of cluster nodes);
> >> - current AT (affinity topology);
> >> - map of GroupId to minimal alive backup number;
> >> - list of configuration (1.a, 1.b, 1.c);
> >> - scheduler.
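One possible shape for such a policy, sketched only from the parameter list above. Every name and type here is an assumption: nodes are reduced to `String` IDs, the timeouts are passed as a plain map, and the return value (an optional new AT) is invented for the example. None of this is an agreed or existing Ignite interface.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ScheduledExecutorService;

/** Hypothetical policy contract; names and types are illustrative only. */
interface AffinityTopologyChangePolicy {
    /**
     * Called when the cluster topology changes.
     *
     * @return the new affinity topology, or empty to leave the current one unchanged.
     */
    Optional<Set<String>> onTopologyChanged(
        Collection<String> currentTopology,             // current topology (node IDs)
        Collection<String> currentAffinityTopology,     // current AT (node IDs)
        Map<Integer, Integer> minAliveBackupsByGroupId, // GroupId -> minimal alive backup number
        Map<String, Long> timeoutsSec,                  // configured timeouts (1.a, 1.b, 1.c)
        ScheduledExecutorService scheduler);            // for delayed changes
}

/** Example policy: adopt the current topology at once whenever it differs from the AT. */
final class ImmediatePolicy implements AffinityTopologyChangePolicy {
    @Override public Optional<Set<String>> onTopologyChanged(
        Collection<String> currentTopology,
        Collection<String> currentAffinityTopology,
        Map<Integer, Integer> minAliveBackupsByGroupId,
        Map<String, Long> timeoutsSec,
        ScheduledExecutorService scheduler
    ) {
        Set<String> target = new HashSet<>(currentTopology);

        // Compare as sets so that node ordering does not trigger a change.
        if (target.equals(new HashSet<>(currentAffinityTopology)))
            return Optional.empty();

        return Optional.of(target);
    }
}
```

A timeout-based policy would instead use the scheduler to postpone the decision, and a manual policy would always return empty.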
> >>
> >> In addition to these configurations, I propose an orthogonal option.
> >> 4. Emergency affinity topology change.
> >> It would change AT even if the MANUAL option is set, if the backup factor
> >> of at least one cache group drops to or below a chosen value (0 by default).
> >> So, if after a node left only primary partitions (without backups) remained
> >> for some cache group, we would change AT immediately.
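The emergency condition amounts to a simple threshold check. The sketch below assumes the per-group data is available as a map of group ID to minimal alive backup number; `EmergencyCheck` is a hypothetical name, not an Ignite class.

```java
import java.util.Map;

/** Hypothetical helper (not an Ignite class) for the emergency AT change condition. */
final class EmergencyCheck {
    private EmergencyCheck() {
        // Static utility, no instances.
    }

    /**
     * @param minAliveBackupsByGroupId cache group ID -> minimal number of alive backups.
     * @param threshold chosen backup factor (0 by default in the proposal).
     * @return true if at least one cache group is at or below the threshold.
     */
    static boolean isEmergency(Map<Integer, Integer> minAliveBackupsByGroupId, int threshold) {
        for (int backups : minAliveBackupsByGroupId.values()) {
            if (backups <= threshold)
                return true;
        }

        return false;
    }
}
```

With the default threshold of 0, a group that is left with primaries only triggers the change, while a cluster where every group still has at least one backup does not.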
> >>
> >>
> >> Thank you for your attention.
> >>
> >>
> >> On Thu, Apr 26, 2018 at 6:57 PM, Eduard Shangareev <
> >> eduard.shangar...@gmail.com> wrote:
> >>
> >> Dmitriy,
> >>>
> >>> I also think that we should think about 2.6 as the target.
> >>>
> >>>
> >>> On Thu, Apr 26, 2018 at 3:27 PM, Alexey Goncharuk <
> >>> alexey.goncha...@gmail.com> wrote:
> >>>
> >>> Dmitriy,
> >>>>
> >>>> I doubt we will be able to fit this in 2.5 given that we did not even
> >>>> agree on the policy interface. Forcing in-memory caches to use baseline
> >>>> topology will be an easy technical fix; however, we will need to update
> >>>> and probably fix lots of failover tests and add new ones.
> >>>>
> >>>> I think it makes sense to target this change to 2.6.
> >>>>
> >>>> 2018-04-25 22:25 GMT+03:00 Ilya Lantukh <ilant...@gridgain.com>:
> >>>>
> >>>> Eduard,
> >>>>>
> >>>>> I'm not sure I understand what you mean by "policy". Is it an interface
> >>>>> that will have a few default implementations, with users able to
> >>>>> create their own? If so, could you please write an example of such an
> >>>>> interface (as you see it) and describe how and when its methods will be
> >>>>> invoked?
> >>>>
> >>>>> On Wed, Apr 25, 2018 at 10:10 PM, Eduard Shangareev <
> >>>>> eduard.shangar...@gmail.com> wrote:
> >>>>>
> >>>>> Igniters,
> >>>>>> I have described the issue with the current approach in the "New
> >>>>>> definition for affinity node (issues with baseline)" topic [1].
> >>>>>>
> >>>>>> Now we have 2 different affinity topologies (one for in-memory, another
> >>>>>> for persistent caches).
> >>>>>>
> >>>>>> It causes problems:
> >>>>>> - we lose (in general) co-location between different caches;
> >>>>>> - we can't avoid PME when a non-BLAT node joins the cluster;
> >>>>>> - implementation should consider 2 different approaches to affinity
> >>>>>> calculation.
> >>>>>>
> >>>>>> So, I suggest unifying behavior of in-memory and persistent caches.
> >>>>>> They should all use BLAT.
> >>>>>>
> >>>>>> Their behaviors were different because we couldn't guarantee the safety
> >>>>>> of in-memory data.
> >>>>>> It should be fixed by a new mechanism, the BLAT change policy, which was
> >>>>>> already discussed in "Triggering rebalancing on timeout or manually if
> >>>>>> the baseline topology is not reassembled" [2].
> >>>>>>
> >>>>>> And we should have a default policy which is similar to the current one
> >>>>>> (add nodes, remove nodes automatically, but after some reasonable delay
> >>>>>> [seconds]).
> >>>>>>
> >>>>>> After this change, we could stop using the terms 'BLAT', 'Baseline' and
> >>>>>> so on, because there would be no alternative: there would be only one
> >>>>>> possible Affinity Topology.
> >>>>>>
> >>>>>>
> >>>>>> [1]
> >>>>>> http://apache-ignite-developers.2346864.n4.nabble.com/New-definition-for-affinity-node-issues-with-baseline-td29868.html
> >>>>>> [2]
> >>>>>> http://apache-ignite-developers.2346864.n4.nabble.com/Triggering-rebalancing-on-timeout-or-manually-if-the-baseline-topology-is-not-reassembled-td29299.html#none
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Best regards,
> >>>>> Ilya
> >>>>>
> >>>>>
> >>>
> >
>
