Hello! I tried to summarize some ideas on how to make ATOMIC caches stay consistent in case of topology changes.
Imagine partitioned ATOMIC cache with 2 backups configured (primary + 2 backup copies of the partition). One of possible scenarios to get to primary and backups divergence is the following - update initiating node sends update operation to primary node, primary node propagates update to 1 of 2 backups and then dies. If initiating node crashes as well (and therefore cannot retry the operation) then in current implementation system comes to a situation when 2 copies of the partition present in cluster may be different. Note that both situations possible - new primary contains the latest update and backup does not and vice versa. New backup will be elected according to configured affinity and will rebalance the partition from random owner, but copies may not be consistent due to described above. This problem does not affect TRANSACTIONAL caches as 2PC protocol deals with scenarios of the kind very well. Here is the link to IEP - https://cwiki.apache.org/confluence/display/IGNITE/IEP-12+Make+ATOMIC+Caches+Consistent+Again Sam, Alex G, Vladimir, please share your thoughts. --Yakov