Re: Primary partitions return zero partitions before rebalance.
Ah, sorry for that. PME = Partition Map Exchange. It is described, along with late affinity assignment, in the article you referenced earlier [1].

[1] https://cwiki.apache.org/confluence/display/IGNITE/%28Partition+Map%29+Exchange+-+under+the+hood#id-(PartitionMap)Exchange-under

On Tue, Apr 2, 2019 at 20:22, Koitoer wrote:
> Sorry, but what exactly is PME?
Re: Primary partitions return zero partitions before rebalance.
Sorry, but what exactly is PME?

On Mon, Apr 1, 2019 at 1:55 AM Павлухин Иван wrote:
> 2. All nodes in the cluster must become aware that the partition
> assignment has changed. So, a PME will happen to make all nodes aware of
> the new assignment.
Re: Primary partitions return zero partitions before rebalance.
Hi,

Sorry for the late answer. The observed result seems expected to me. I suppose the following happens:
1. EVT_CACHE_REBALANCE_STOPPED is fired when a particular node has loaded all the partitions it will be responsible for.
2. All nodes in the cluster must become aware that the partition assignment has changed. So, a PME will happen to make all nodes aware of the new assignment.
3. Once the PME completes, all nodes will consistently treat the newly joined node as primary for the corresponding set of partitions.

Do not hesitate to write back if you feel that something is going wrong.

On Tue, Mar 19, 2019 at 19:30, Koitoer wrote:
> After a couple of attempts (a few seconds), `primaryPartitions` returns
> the correct set of partitions assigned to the node.
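If the ordering described in this message holds (STOPPED event first, PME afterwards, primaries visible only once the PME completes), then a check made right at the STOPPED event has to be retried. Below is a minimal sketch of such a retry, not a built-in Ignite facility. The class name `PrimaryPartitionWaiter` and the method `awaitNonEmpty` are hypothetical; the probe is a plain `Supplier<int[]>` so the sketch does not depend on Ignite itself.

```java
import java.util.function.Supplier;

/** Polls a probe until it reports at least one primary partition. Hypothetical helper. */
final class PrimaryPartitionWaiter {
    private PrimaryPartitionWaiter() {}

    /**
     * Calls the probe up to maxAttempts times, sleeping delayMillis between calls.
     * Returns the first non-empty result, or an empty array if every attempt
     * came back empty or the waiting thread was interrupted.
     */
    static int[] awaitNonEmpty(Supplier<int[]> probe, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int[] parts = probe.get();
            if (parts.length != 0) {
                return parts; // the new assignment is visible on this node
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    break;
                }
            }
        }
        return new int[0];
    }
}
```

In the setup from this thread the probe would presumably be `() -> affinity.primaryPartitions(clusterNode)`, called from a separate thread started on EVT_CACHE_REBALANCE_STOPPED so that the event listener itself is not blocked.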
Re: Primary partitions return zero partitions before rebalance.
Hello Igniters,

The version of Ignite that we are using is 2.7.0. I enable the events that I want to receive via `setIncludeEventTypes` on the IgniteConfiguration, and then register the listener using ignite.events().localListen(listenerPredicate, eventTypes);

    EVT_CACHE_REBALANCE_STARTED,
    EVT_CACHE_REBALANCE_STOPPED,
    EVT_CACHE_REBALANCE_PART_LOADED,
    EVT_CACHE_REBALANCE_PART_UNLOADED,
    EVT_CACHE_REBALANCE_PART_DATA_LOST

Once I receive any of the events above, I use `ignite.affinity(cacheName.name())` to retrieve the Affinity instance, on which I call the `primaryPartitions` or `allPartitions` method with the ClusterNode instance that represents `this` node.

Once I receive the rebalance stopped event, I create a thread in charge of checking the partition assignment as follows.

    // Requires: import java.util.concurrent.TimeUnit;
    new Thread(() -> {
        for (int attempt = 0; attempt <= attempts; attempt++) {
            // Query once per attempt so the logged and the checked value agree.
            int[] now = affinity.primaryPartitions(clusterNode);
            log.info("event=partitionAssignmentRetryLogic attempt={}, before={}, now={}",
                attempt, assignedPartitions, now);

            try {
                if (now.length != 0) {
                    log.info("event=partitionAssignmentRetryLogicSuccess");
                    return; // primaries are visible, stop retrying
                }
                TimeUnit.SECONDS.sleep(delay);
            } catch (Exception e) {
                log.error("event=ErrorOnTimerWait message={}", e.getMessage(), e);
            }
        }
    }).start();

After a couple of attempts (a few seconds), `primaryPartitions` returns the correct set of partitions assigned to the node. I will check AffinityAssignment to try to do this in a cleaner way, as you suggest.

On Fri, Mar 15, 2019 at 12:11 PM Павлухин Иван wrote:
> What Ignite version do you use?
> How do you register your listener?
> On what object do you call primaryPartitions/allPartitions?
Re: Primary partitions return zero partitions before rebalance.
Hi,

What Ignite version do you use?
How do you register your listener?
On what object do you call primaryPartitions/allPartitions?

It is true that Ignite uses late affinity assignment. It means that for each topology change (a node joining or leaving) the partition assignment changes twice. The first time, temporary backups are created, which should be rebalanced from other nodes (EVT_CACHE_REBALANCE_STARTED takes place here). The second time, redundant partition replicas are marked as unusable and unloaded after that (EVT_CACHE_REBALANCE_STOPPED). It is also useful to understand that the Affinity interface calculates the partition distribution using the affinity function, and this distribution might differ from the real partition assignment. It does differ while a rebalance is in progress. See the AffinityAssignment interface.

On Wed, Mar 13, 2019 at 21:59, Koitoer wrote:
> My question is whether there is any kind of event that I should listen to
> so that I can be aware that this process (if this is what is happening)
> has finished.

--
Best regards,
Ivan Pavlukhin
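The two-step change described in this message can be illustrated with a toy model. This is only a sketch under strong assumptions (a single joining node, one primary per partition, no backups configured). The class `LateAffinityToy` and all of its methods are hypothetical and do not correspond to Ignite classes or to how Ignite implements late affinity assignment. It only shows why a joining node can hold partition copies while still being primary for none of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a two-step primary switch. Hypothetical names, not Ignite code. */
final class LateAffinityToy {
    private final String[] target;    // primary each partition should end up with
    private final String[] effective; // primary each partition is served by right now

    LateAffinityToy(int partitions, String initialOwner) {
        target = new String[partitions];
        effective = new String[partitions];
        Arrays.fill(target, initialOwner);
        Arrays.fill(effective, initialOwner);
    }

    /** First change: the target moves to the joining node, the effective primary stays. */
    void join(String node, int... movedPartitions) {
        for (int p : movedPartitions) {
            target[p] = node;
        }
    }

    /** Second change: data is loaded, so the effective primary follows the target. */
    void completeSwitch() {
        System.arraycopy(target, 0, effective, 0, target.length);
    }

    /** Partitions for which the node is the effective primary. */
    List<Integer> primaries(String node) {
        List<Integer> out = new ArrayList<>();
        for (int p = 0; p < effective.length; p++) {
            if (effective[p].equals(node)) {
                out.add(p);
            }
        }
        return out;
    }

    /** Partitions the node holds a copy of, as target or as effective primary. */
    List<Integer> copies(String node) {
        List<Integer> out = new ArrayList<>();
        for (int p = 0; p < effective.length; p++) {
            if (effective[p].equals(node) || target[p].equals(node)) {
                out.add(p);
            }
        }
        return out;
    }
}
```

Between `join` and `completeSwitch` the joining node shows up in `copies` but not in `primaries`, which resembles the observation reported in the thread: `allPartitions` returns a list while `primaryPartitions` is still empty.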
Primary partitions return zero partitions before rebalance.
Hi All.

I'm trying to follow the rebalance events of my Ignite cluster so that I can track which partitions are assigned to each node at any point in time. I am listening to the `EVT_CACHE_REBALANCE_STARTED` and `EVT_CACHE_REBALANCE_STOPPED` events from Ignite, and that is working well, except in the case where one node crashes and another takes its place.

My cluster has 5 nodes.
Ex. Node 1 has, let's say, 100 partitions. After I kill this node, the partitions that were assigned to it get rebalanced across the entire cluster. I am able to track that with the STOPPED event: checking the affinity function on each node using the `primaryPartitions` method gives me the new assignment, and if I add up all those numbers I get 1024 partitions, which is what I expected.

However, when a new node replaces the previous one, I see a rebalance process occur, and some of the partitions `disappear` from the already existing nodes (which is expected, as the new node will take some partitions from them). But when the new node receives the STOPPED event and I call `primaryPartitions`, it returns an empty list, whereas `allPartitions` gives me a list (I think at this point it is primary + backups).

If I let some time pass and call `primaryPartitions` again, I am able to retrieve the partitions that I was expecting to see after the STOPPED event arrives. I read here https://cwiki.apache.org/confluence/display/IGNITE/%28Partition+Map%29+Exchange+-+under+the+hood#id-(PartitionMap)Exchange-under the hood-LateAffinityAssignment that it could be late assignment: after the cache rebalance the new node needs to bring in all the entries to fill the cache, and only after that will `primaryPartitions` return something. It would be great to know if this is actually what is happening.

My question is whether there is any kind of event that I should listen to so that I can be aware that this process (if this is what is happening) has finished. I would like to say, "After you bring this node into the cluster, the partitions assigned to that node are the following: XXX, XXX".

Also, I'm aware of the event `EVT_CACHE_REBALANCE_PART_LOADED`, but I'm seeing a ton of them, and at this point I would like to be able to know when the last one arrives so I can say which are now my primary partitions.

Thanks in advance.
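For the goal stated above (reporting which partitions a node ends up with after a topology change), one option is to compare two snapshots of the node's primary partitions. The following is a minimal sketch, assuming the snapshots are plain `int[]` arrays such as those returned by `primaryPartitions`. The class `PartitionDiff` is hypothetical and is not part of Ignite.

```java
import java.util.Set;
import java.util.TreeSet;

/** Difference between two snapshots of a node's primary partitions. Hypothetical helper. */
final class PartitionDiff {
    final Set<Integer> gained = new TreeSet<>(); // present now, absent before
    final Set<Integer> lost = new TreeSet<>();   // present before, absent now

    private PartitionDiff() {}

    static PartitionDiff between(int[] before, int[] after) {
        Set<Integer> old = new TreeSet<>();
        for (int p : before) {
            old.add(p);
        }
        Set<Integer> cur = new TreeSet<>();
        for (int p : after) {
            cur.add(p);
        }

        PartitionDiff diff = new PartitionDiff();
        diff.gained.addAll(cur);
        diff.gained.removeAll(old);
        diff.lost.addAll(old);
        diff.lost.removeAll(cur);
        return diff;
    }

    @Override
    public String toString() {
        return "gained=" + gained + ", lost=" + lost;
    }
}
```

Taking one snapshot when the STARTED event arrives and another once `primaryPartitions` has settled would give the "partitions assigned to that node are the following" message. When the assignment counts as settled is still the open question of this thread.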