Okay, I drove back to work to check this out. It turns out all my troubles were caused by an inconsistent equals/hashCode implementation. The key in the cache was an abstract base class with multiple overrides (which in itself I believe is one of those grayish areas wrt equals/hashCode). I just changed the key type to String and it all worked perfectly. I'll find the root cause on Monday :)
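
For illustration only, here is a minimal sketch of how this kind of inconsistency can make equal keys show up as separate entries. The class names (BaseKey, GoodKey, BrokenKey) and the revision field are hypothetical, not the real key classes, and a plain java.util.HashMap stands in for the cache. It does not claim to reproduce what Ignite does internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class KeyContractSketch {

    /** Hypothetical abstract key: equality is defined on the id only. */
    static abstract class BaseKey {
        final String id;

        BaseKey(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            // Any two BaseKey instances with the same id are considered equal.
            return o instanceof BaseKey && id.equals(((BaseKey) o).id);
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    /** Consistent subclass: inherits equals and hashCode unchanged. */
    static final class GoodKey extends BaseKey {
        GoodKey(String id) {
            super(id);
        }
    }

    /** Broken subclass: hashCode uses a field that equals ignores. */
    static final class BrokenKey extends BaseKey {
        final int revision;

        BrokenKey(String id, int revision) {
            super(id);
            this.revision = revision;
        }

        @Override
        public int hashCode() {
            // Violates the contract: equal keys may now hash differently.
            return Objects.hash(id, revision);
        }
    }

    /** Puts every key into a HashMap and reports how many entries result. */
    static int distinctEntries(Object... keys) {
        Map<Object, String> map = new HashMap<>();
        for (Object key : keys) {
            map.put(key, "value");
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Equal keys with equal hash codes collapse into one entry.
        System.out.println(distinctEntries(new GoodKey("a"), new GoodKey("a")));
        // Equal keys with different hash codes are stored as two entries.
        System.out.println(distinctEntries(new BrokenKey("a", 1), new BrokenKey("a", 2)));
        // String keys, as in the workaround above, behave consistently.
        System.out.println(distinctEntries("a", "a"));
    }
}
```

In this sketch the two BrokenKey instances are equal according to equals but end up as two map entries, while a store that compares keys by value alone would merge them into one. That is the same shape of mismatch as a backing store holding fewer entries than the in-memory maps.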
Kristian

2016-06-17 18:40 GMT+02:00 Kristian Rosenvold <krosenv...@apache.org>:
> Denis, you linked back to my own post :)
>
> I've left work for the weekend, but there is one piece of information
> that couldn't leave my head: The database backing of the cache always
> contains fewer nodes than either of the cluster members, even though
> there is no reported error.
>
> This would actually be consistent with an inconsistent equals/hashCode
> implementation on one of the cache keys where the upsert in the
> database normalizes 2 objects that appear to be different down to the
> same value. equals/hashCode is one of the scariest things around, and
> I'm supposed to be good at that stuff :)
>
> Is Ignite known to be particularly picky about this?
>
> Kristian
>
>
> 2016-06-17 14:58 GMT+02:00 Denis Magda <dma...@gridgain.com>:
>> Kristian,
>>
>> This topic looks similar to the following one [1]. Probably the issue is the
>> same so I would prefer to discuss this in one place if you don’t mind.
>>
>> [1]
>> http://apache-ignite-users.70518.x6.nabble.com/Replicated-cache-leaks-entries-on-1-6-and-1-7-SNAPSHOT-td5704.html
>>
>> --
>> Denis
>>
>> On Jun 17, 2016, at 3:41 PM, Dmitriy Setrakyan <dsetrak...@apache.org>
>> wrote:
>>
>> Kristian, it is likely an environment problem, rather than an Ignite problem.
>> Can you create a simple reproducer that starts 2 nodes in the same JVM and
>> proves that data is not replicated? If the problem is in Ignite, we will fix
>> it asap.
>>
>> On Thu, Jun 16, 2016 at 10:58 PM, Kristian Rosenvold <krosenv...@apache.org>
>> wrote:
>>>
>>> We're using a cache with CacheMode.REPLICATED.
>>>
>>> Using 2 nodes, I start each node sequentially and they both get the
>>> same number of elements in their caches (as expected so far).
>>>
>>> Almost immediately, the caches start to drift out of sync, all of the
>>> elements are simply not getting replicated. There is nothing in the
>>> log to indicate anything peculiar happening.
>>>
>>> Downgrading to 1.5 makes this problem go away.
>>>
>>> Any suggestions?
>>>
>>>
>>> Kristian