Ah, sorry for that. PME = Partition Map Exchange. It is described along with late affinity assignment in the article you referenced earlier [1].
[1] https://cwiki.apache.org/confluence/display/IGNITE/%28Partition+Map%29+Exchange+-+under+the+hood#id-(PartitionMap)Exchange-under

On Tue, Apr 2, 2019 at 20:22 Koitoer <koit...@gmail.com> wrote:
>
> Sorry, but what exactly is the PME?
>
> On Mon, Apr 1, 2019 at 1:55 AM Павлухин Иван <vololo...@gmail.com> wrote:
>>
>> Hi,
>>
>> Sorry for the late answer. The observed result seems expected to me. I
>> suppose the following:
>> 1. EVT_CACHE_REBALANCE_STOPPED is fired when a particular node has loaded
>> all partitions which it will be responsible for.
>> 2. All nodes in the cluster must become aware that the partition
>> assignment was changed. So, PME will happen to make all nodes aware of
>> the new assignment.
>> 3. Once PME completes, all nodes will consistently treat the newly joined
>> node as primary for the corresponding set of partitions.
>>
>> Do not hesitate to write back if you feel that something is going wrong.
>>
>> On Tue, Mar 19, 2019 at 19:30 Koitoer <koit...@gmail.com> wrote:
>> >
>> > Hello Igniters
>> >
>> > The version of Ignite that we are using is 2.7.0. I'm adding the events
>> > that I want to hear via the IgniteConfiguration using
>> > `setIncludeEventTypes`.
>> > Then I register the listener using
>> > ignite.events().localListen(listenerPredicate, eventTypes);
>> >
>> > EVT_CACHE_REBALANCE_STARTED,
>> > EVT_CACHE_REBALANCE_STOPPED,
>> > EVT_CACHE_REBALANCE_PART_LOADED,
>> > EVT_CACHE_REBALANCE_PART_UNLOADED,
>> > EVT_CACHE_REBALANCE_PART_DATA_LOST
>> >
>> > Once I receive any of the events above, I use
>> > `ignite.affinity(cacheName.name())` to retrieve the Affinity object, on
>> > which I call the `primaryPartitions` or `allPartitions` method using
>> > the ClusterNode instance that represents `this` node.
>> >
>> > Once I hear the rebalance process stop event, I create a thread in charge
>> > of checking the partition assignment as follows.
>> >
>> > new Thread(() -> {
>> >     for (int attempt = 0; attempt <= attempts; attempt++) {
>> >         log.info("event=partitionAssignmentRetryLogic attempt={}, before={}, now={}",
>> >             attempt, assignedPartitions, affinity.primaryPartitions(clusterNode));
>> >
>> >         try {
>> >             if (affinity.primaryPartitions(clusterNode).length != 0) {
>> >                 log.info("event=partitionAssignmentRetryLogicSuccess");
>> >             }
>> >             TimeUnit.SECONDS.sleep(delay);
>> >         } catch (Exception e) {
>> >             log.error("event=ErrorOnTimerWait message={}", e.getMessage(), e);
>> >         }
>> >     }
>> > }).start();
>> >
>> >
>> > After a couple of attempts (some seconds), `primaryPartitions`
>> > returns the correct set of partitions assigned to the node. I will check
>> > AffinityAssignment to try to do this in a cleaner way, as you
>> > suggest.
>> >
>> >
>> > On Fri, Mar 15, 2019 at 12:11 PM Павлухин Иван <vololo...@gmail.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> What Ignite version do you use?
>> >> How do you register your listener?
>> >> On what object do you call primaryPartitions/allPartitions?
>> >>
>> >> It is true that Ignite uses late affinity assignment. It means
>> >> that for each topology change (node enter or node leave) the partition
>> >> assignment changes twice. The first time, temporary backups are created,
>> >> which should be rebalanced from other nodes (EVT_CACHE_REBALANCE_STARTED
>> >> takes place here). The second time, redundant partition replicas should be
>> >> marked as unusable (and unloaded after that)
>> >> (EVT_CACHE_REBALANCE_STOPPED). It is also useful to understand that
>> >> the Affinity interface calculates the partition distribution using the
>> >> affinity function, and such a distribution might differ from the real
>> >> partition assignment. It does differ while rebalance is in progress. See
>> >> the AffinityAssignment interface.
>> >>
>> >> On Wed, Mar 13, 2019 at 21:59 Koitoer <koit...@gmail.com> wrote:
>> >> >
>> >> > Hi All.
>> >> >
>> >> > I'm trying to follow the rebalance events of my Ignite cluster so I'm
>> >> > able to track which partitions are assigned to each node at any point
>> >> > in time. I am listening to the `EVT_CACHE_REBALANCE_STARTED` and
>> >> > `EVT_CACHE_REBALANCE_STOPPED`
>> >> > events from Ignite and that is working well, except in the case where one
>> >> > node crashes and another takes its place.
>> >> >
>> >> > My cluster is 5 nodes.
>> >> > Ex. Node 1 has, let's say, 100 partitions. After I kill this node, the
>> >> > partitions that were assigned to it get rebalanced across the entire
>> >> > cluster. I'm able to track that with the STOPPED event, and
>> >> > checking the affinity function on each node using the
>> >> > `primaryPartitions` method gives me that. If I add all those numbers I
>> >> > get 1024 partitions, which is what I expected.
>> >> >
>> >> > However, when a new node replaces the previous one, I see a rebalance
>> >> > process occur, and some of the partitions
>> >> > `disappear` from the already existing nodes (which is expected as well,
>> >> > as the new node will take some partitions from them). But when the STOPPED
>> >> > event is received by this new node and I call `primaryPartitions`,
>> >> > it returns an empty list, while the `allPartitions`
>> >> > method gives me a list (I think at this point it is primary +
>> >> > backups).
>> >> >
>> >> > If I let some time pass and execute the `primaryPartitions` method
>> >> > again, I am able to retrieve the partitions that I was expecting to see
>> >> > after the STOPPED event comes.
>> >> > I read here
>> >> > https://cwiki.apache.org/confluence/display/IGNITE/%28Partition+Map%29+Exchange+-+under+the+hood#id-(PartitionMap)Exchange-under
>> >> > the hood-LateAffinityAssignment that it could be a late assignment,
>> >> > that after the cache rebalance the new node needs to bring all the
>> >> > entries to fill out the cache and after that, `primaryPartitions`
>> >> > will return something.
>> >> > It would be great to know if this is actually what is happening.
>> >> >
>> >> > My question is whether there is any kind of event that I should listen to so I
>> >> > can be aware that this process (if this is what is happening) has already
>> >> > finished. I would like to say, "After you bring this node into the
>> >> > cluster the partitions assigned to that node are the following: XXX,
>> >> > XXX".
>> >> >
>> >> > Also, I'm aware of the event `EVT_CACHE_REBALANCE_PART_LOADED` but I'm
>> >> > seeing a ton of them and at this point, I would be able to know when
>> >> > the last one arrives and say that those are now my primary partitions.
>> >> >
>> >> > Thanks in advance.
>> >>
>> >>
>> >>
>> >> --
>> >> Best regards,
>> >> Ivan Pavlukhin
>> >
>> >
>> >
>> > --
>> > koitoer ....
>>
>>
>>
>> --
>> Best regards,
>> Ivan Pavlukhin
>
>
>
> --
> koitoer ....

--
Best regards,
Ivan Pavlukhin
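
P.S. For anyone finding this thread later: the retry thread quoted above keeps looping even after `primaryPartitions` becomes non-empty. Below is a minimal sketch of the same polling idea that stops at the first non-empty result. It is written against a plain `Supplier<int[]>` so it runs without a cluster. `PrimaryPartitionsPoller` and `awaitNonEmpty` are hypothetical names used only for illustration, not Ignite API. In the setup from this thread the supplier would presumably be `() -> affinity.primaryPartitions(clusterNode)`. Polling is only a workaround for the gap between EVT_CACHE_REBALANCE_STOPPED and the end of PME, not an official way to detect the final assignment.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Ignite): polls until a partition list is non-empty. */
public final class PrimaryPartitionsPoller {
    private PrimaryPartitionsPoller() {
    }

    /**
     * Calls the supplier up to maxAttempts times, sleeping delayMillis between
     * calls, and returns the first non-empty array. Returns an empty array if
     * every attempt came back empty or the waiting thread was interrupted.
     */
    public static int[] awaitNonEmpty(Supplier<int[]> partitions, int maxAttempts, long delayMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int[] current = partitions.get();
            if (current.length != 0) {
                return current; // assignment is visible, stop polling
            }
            if (attempt < maxAttempts - 1) {
                try {
                    TimeUnit.MILLISECONDS.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return new int[0];
                }
            }
        }
        return new int[0];
    }
}
```

With an Ignite `Affinity` instance at hand, a call might look like `PrimaryPartitionsPoller.awaitNonEmpty(() -> affinity.primaryPartitions(clusterNode), 10, 1000L)`. An empty result then means the node still had no primary partitions after about nine seconds.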