I do not like the name "current" on the methods. I think we should just remove it, e.g. currentAffinityTopology() -> affinityTopology()
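
To illustrate the naming, here is a rough sketch of how the accessors would read without the "current" prefix. It uses stand-in types: `Node`, `AffinityTopologyApi` and `InMemoryAffinityTopology` are hypothetical names for this example only, not the real `BaselineNode` or `IgniteCluster`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for org.apache.ignite.cluster.BaselineNode.
interface Node {
    Object consistentId();
}

// Sketch of the proposed methods with the "current" prefix dropped.
interface AffinityTopologyApi {
    Collection<Node> affinityTopology();

    void setAffinityTopology(Collection<? extends Node> affinityTop);

    boolean isAffinityTopologyAutoAdjustEnabled();

    void setAffinityTopologyAutoAdjustEnabled(boolean enabled);
}

// Toy in-memory implementation, only here to show the call sites.
class InMemoryAffinityTopology implements AffinityTopologyApi {
    private List<Node> top = new ArrayList<>();
    private boolean autoAdjust;

    @Override public Collection<Node> affinityTopology() {
        return Collections.unmodifiableList(top);
    }

    @Override public void setAffinityTopology(Collection<? extends Node> affinityTop) {
        top = new ArrayList<>(affinityTop); // defensive copy
    }

    @Override public boolean isAffinityTopologyAutoAdjustEnabled() {
        return autoAdjust;
    }

    @Override public void setAffinityTopologyAutoAdjustEnabled(boolean enabled) {
        autoAdjust = enabled;
    }
}
```

With this shape the getter and setter pair up as `affinityTopology()` / `setAffinityTopology(...)`.
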
D.

On Fri, May 4, 2018 at 6:17 AM, Eduard Shangareev <eduard.shangar...@gmail.com> wrote:

> Igniters,
>
> With Vladimir's help, we analyzed another solution's approaches
> and decided to simplify our affinity topology auto-adjusting.
>
> It should be enough to be able to turn auto-adjusting on/off (flag) and set
> 2 timeouts that apply while it is on:
> - a soft timeout, which would be used if there were no other node joins/exits;
> - a hard timeout, which we would track from the first discovery event; once it
>   is reached, we would change the affinity topology immediately.
>
> All other strategies could be implemented through the API (setAffinityTopology)
> and metrics tracking by the user's monitoring tools.
>
> So, I suggest the following API changes:
>
> org.apache.ignite.IgniteCluster
>
> *Deprecate*:
> Collection<BaselineNode> currentBaselineTopology();
> void setBaselineTopology(Collection<? extends BaselineNode> baselineTop);
> void setBaselineTopology(long topVer);
>
> *Replace them with*
> Collection<BaselineNode> currentAffinityTopology();
> void setAffinityTopology(Collection<? extends BaselineNode> affinityTop);
> void setAffinityTopology(long topVer);
>
> *Add*
> isAffinityTopologyAutoAdjustEnabled()
> setAffinityTopologyAutoAdjustEnabled(boolean enabled);
>
> org.apache.ignite.configuration.IgniteConfiguration
>
> *Add*
> IgniteConfiguration setAffinityTopologyAutoAdjustEnabled(boolean enabled);
> IgniteConfiguration setAffinityTopologyAutoAdjustTimeout(long timeoutInMs);
> IgniteConfiguration setAffinityTopologyAutoAdjustMaxTimeout(long timeoutInMs);
>
>
> An open question: could we rename BaselineNode to AffinityNode, or duplicate
> it as AffinityNode?
>
>
> On Fri, Apr 27, 2018 at 6:56 PM, Ivan Rakov <ivan.glu...@gmail.com> wrote:
>
> > Eduard,
> >
> > +1 to your proposed API for configuring Affinity Topology change policies.
> > Obviously we should use "auto" as the default behavior. I believe automatic
> > rebalancing is expected and more convenient for the majority of users.
> >
> > Best Regards,
> > Ivan Rakov
> >
> >
> > On 26.04.2018 19:27, Eduard Shangareev wrote:
> >
> >> Igniters,
> >>
> >> Ok, I want to share my thoughts about "affinity topology (AT) changing
> >> policies".
> >>
> >>
> >> There would be three major options:
> >> - auto;
> >> - manual;
> >> - custom.
> >>
> >> 1. Automatic change.
> >> A user could set timeouts for:
> >> a. change AT on any topology change after some timeout (setATChangeTimeout
> >> in seconds);
> >> b. change AT on node left after some timeout (setATChangeOnNodeLeftTimeout
> >> in seconds);
> >> c. change AT on node join after some timeout (setATChangeOnNodeJoinTimeout
> >> in seconds).
> >>
> >> b and c are more specific, so they would override a.
> >>
> >> Also, I want to introduce a mechanism of merging AT changes, which would be
> >> turned on by default.
> >> In other words, when the timeout is reached, we would change AT to the
> >> current topology, not to the one that existed when the timeout was scheduled.
> >>
> >> 2. Manual change.
> >>
> >> Current behavior. A user changes AT himself via console tools or the web
> >> console.
> >>
> >> 3. Custom.
> >>
> >> We would give the option to set a custom implementation of the changing
> >> policy (class name in config).
> >> We should pass as incoming parameters:
> >> - current topology (collection of cluster nodes);
> >> - current AT (affinity topology);
> >> - map of GroupId to minimal alive backup number;
> >> - list of configuration settings (1.a, 1.b, 1.c);
> >> - scheduler.
> >>
> >> In addition to these configurations, I propose an orthogonal option.
> >> 4. Emergency affinity topology change.
> >> It would change AT even if the MANUAL option is set, as soon as at least one
> >> cache group's backup factor becomes less than or equal to the chosen one
> >> (by default 0).
> >> So, if after a node left only the primary partition (without backups)
> >> remained for some cache group, we would change AT immediately.
> >>
> >>
> >> Thank you for your attention.
> >>
> >>
> >> On Thu, Apr 26, 2018 at 6:57 PM, Eduard Shangareev <
> >> eduard.shangar...@gmail.com> wrote:
> >>
> >> Dmitriy,
> >>>
> >>> I also think that we should think about 2.6 as the target.
> >>>
> >>>
> >>> On Thu, Apr 26, 2018 at 3:27 PM, Alexey Goncharuk <
> >>> alexey.goncha...@gmail.com> wrote:
> >>>
> >>> Dmitriy,
> >>>>
> >>>> I doubt we will be able to fit this in 2.5 given that we did not even agree
> >>>> on the policy interface. Forcing in-memory caches to use baseline topology
> >>>> will be an easy technical fix; however, we will need to update and probably
> >>>> fix lots of failover tests, and add new ones.
> >>>>
> >>>> I think it makes sense to target this change to 2.6.
> >>>>
> >>>> 2018-04-25 22:25 GMT+03:00 Ilya Lantukh <ilant...@gridgain.com>:
> >>>>
> >>>> Eduard,
> >>>>>
> >>>>> I'm not sure I understand what you mean by "policy". Is it an interface
> >>>>> that will have a few default implementations and the user will be able to
> >>>>> create his own one? If so, could you please write an example of such an
> >>>>> interface (how you see it) and how and when its methods will be invoked?
> >>>>>
> >>>>> On Wed, Apr 25, 2018 at 10:10 PM, Eduard Shangareev <
> >>>>> eduard.shangar...@gmail.com> wrote:
> >>>>>
> >>>>> Igniters,
> >>>>>>
> >>>>>> I have described the issue with the current approach in the "New definition
> >>>>>> for affinity node (issues with baseline)" topic [1].
> >>>>>>
> >>>>>> Now we have 2 different affinity topologies (one for in-memory, another for
> >>>>>> persistent caches).
> >>>>>>
> >>>>>> It causes problems:
> >>>>>> - we lose (in general) co-location between different caches;
> >>>>>> - we can't avoid PME when a non-BLAT node joins the cluster;
> >>>>>> - the implementation has to consider 2 different approaches to affinity
> >>>>>> calculation.
> >>>>>>
> >>>>>> So, I suggest unifying the behavior of in-memory and persistent caches.
> >>>>>> They should all use BLAT.
> >>>>>>
> >>>>>> Their behaviors were different because we couldn't guarantee the safety of
> >>>>>> in-memory data.
> >>>>>> It should be fixed by a new mechanism of BLAT changing policy which was
> >>>>>> already discussed there - "Triggering rebalancing on timeout or manually if
> >>>>>> the baseline topology is not reassembled" [2].
> >>>>>>
> >>>>>> And we should have a default policy which is similar to the current one
> >>>>>> (add nodes, remove nodes automatically but after some reasonable delay
> >>>>>> [seconds]).
> >>>>>>
> >>>>>> After this change, we could stop using the terms 'BLAT', 'Baseline' and so on,
> >>>>>> because there would not be an alternative. So, there would be only one
> >>>>>> possible Affinity Topology.
> >>>>>>
> >>>>>>
> >>>>>> [1]
> >>>>>> http://apache-ignite-developers.2346864.n4.nabble.com/New-definition-for-affinity-node-issues-with-baseline-td29868.html
> >>>>>> [2]
> >>>>>> http://apache-ignite-developers.2346864.n4.nabble.com/Triggering-rebalancing-on-timeout-or-manually-if-the-baseline-topology-is-not-reassembled-td29299.html#none
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Best regards,
> >>>>> Ilya
> >>>>>
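
For reference, the soft/hard timeout rule from Eduard's May 4 proposal quoted above could be sketched roughly as follows. This is only an illustration under assumptions: `AutoAdjustTimer` and its methods are hypothetical names, not part of Ignite, and the mapping of the soft timeout to `setAffinityTopologyAutoAdjustTimeout` and the hard timeout to `setAffinityTopologyAutoAdjustMaxTimeout` is my reading of the proposal. Timestamps are passed in explicitly so the logic stays free of threads and clocks.

```java
// Hypothetical helper (not an Ignite class): decides when a pending
// affinity topology adjustment should fire.
final class AutoAdjustTimer {
    private final long softTimeoutMs; // restarted by every discovery event
    private final long hardTimeoutMs; // counted from the first pending event

    private long firstEvtTs = -1; // -1 means "no pending events"
    private long lastEvtTs = -1;

    AutoAdjustTimer(long softTimeoutMs, long hardTimeoutMs) {
        this.softTimeoutMs = softTimeoutMs;
        this.hardTimeoutMs = hardTimeoutMs;
    }

    /** Records a node join/exit observed at the given time. */
    void onDiscoveryEvent(long nowMs) {
        if (firstEvtTs < 0)
            firstEvtTs = nowMs;

        lastEvtTs = nowMs;
    }

    /** Returns the time at which the adjustment fires, or -1 if none is pending. */
    long deadline() {
        if (firstEvtTs < 0)
            return -1;

        // Whichever comes first: a quiet period after the last event (soft),
        // or the cap measured from the first event (hard).
        return Math.min(lastEvtTs + softTimeoutMs, firstEvtTs + hardTimeoutMs);
    }

    boolean shouldAdjust(long nowMs) {
        long d = deadline();

        return d >= 0 && nowMs >= d;
    }

    /** Call after the affinity topology was set to the current topology. */
    void onAdjusted() {
        firstEvtTs = -1;
        lastEvtTs = -1;
    }
}
```

When `shouldAdjust` returns true, the caller would set the affinity topology to whatever the topology is at that moment and then call `onAdjusted()`. With a 10 s soft and 30 s hard timeout, events arriving every 8 s keep postponing the soft deadline, so the hard timeout fires the change 30 s after the first event.
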