Igniters,

Sometimes, in real deployments, we face inconsistent state across the
topology.
This means that different nodes somehow hold different values for the same
key.
This is an extremely rare situation, but when you have thousands of
terabytes of data, it can be a real problem.

Apache Ignite provides a consistency guarantee: each affinity node should
contain the same value for the same key, at least eventually.
But this guarantee can be violated because of bugs; see IEP-31 [1] for
details.

So I created the issue [2] to handle such situations.
The main idea is to have a special cache.withConsistency() proxy that
allows checking for and fixing inconsistency on get operations.

I've created PR [3] with the following improvements (applied when the
cache.withConsistency() proxy is used):

- PESSIMISTIC && !READ_COMMITTED transactions
-- check values across the topology (under locks),
-- find the correct values according to the LWW (last write wins) strategy,
-- record a special event if a consistency violation is found (it contains
the inconsistent map <Node, <K,V>> and the latest values <K,V>),
-- enlist writes with the latest value for each inconsistent key, so it will
be written on tx.commit().

- OPTIMISTIC || READ_COMMITTED transactions
-- check values across the topology (not under locks, so false positives
are possible),
-- start a PESSIMISTIC && SERIALIZABLE transaction (in a separate thread)
for each possibly broken key and fix it on commit if necessary,
-- the original transaction performs a get-after-fix and can continue if the
fix does not conflict with its isolation level.
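
To make the two flows above more concrete, here is a minimal plain-Java
sketch. It is not the code from the PR and does not use any Ignite API: the
Copy class, the long version counter, the per-node map and the Lock are all
assumptions made for illustration. The latest() method shows the LWW choice,
repair() stands in for the writes that the PR enlists and applies on
tx.commit(), and checkThenRepair() mimics the unlocked check followed by a
re-check and fix under a lock (done inline here rather than in a separate
thread's transaction).

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class ReadRepairSketch {
    /** Hypothetical model of one node's copy of an entry: a value plus a write version. */
    static final class Copy {
        final String val;
        final long ver;

        Copy(String val, long ver) { this.val = val; this.ver = ver; }

        @Override public boolean equals(Object o) {
            return o instanceof Copy && ((Copy)o).ver == ver && Objects.equals(((Copy)o).val, val);
        }

        @Override public int hashCode() { return Objects.hash(val, ver); }
    }

    /** Last write wins: the copy with the highest version, or null if no node holds one. */
    static Copy latest(Map<String, Copy> byNode) {
        Copy best = null;
        for (Copy c : byNode.values())
            if (c != null && (best == null || c.ver > best.ver))
                best = c;
        return best;
    }

    /** True if at least two nodes hold different copies (a missing copy counts as a difference). */
    static boolean inconsistent(Map<String, Copy> byNode) {
        return new HashSet<>(byNode.values()).size() > 1;
    }

    /** Overwrites every node's copy with the latest one and returns it (applied immediately here). */
    static Copy repair(Map<String, Copy> byNode) {
        Copy winner = latest(byNode);
        byNode.replaceAll((node, old) -> winner);
        return winner;
    }

    /** Unlocked check first, then re-check and fix under the lock, then read again. */
    static Copy checkThenRepair(Map<String, Copy> byNode, Lock lock) {
        if (inconsistent(byNode)) { // without the lock this may be a false positive
            lock.lock();
            try {
                if (inconsistent(byNode)) // confirmed under the lock
                    repair(byNode);
            } finally {
                lock.unlock();
            }
        }
        return latest(byNode); // get-after-fix
    }

    public static void main(String[] args) {
        Map<String, Copy> byNode = new LinkedHashMap<>();
        byNode.put("nodeA", new Copy("v1", 1));
        byNode.put("nodeB", new Copy("v2", 2));
        byNode.put("nodeC", new Copy("v1", 1));

        System.out.println(inconsistent(byNode));                             // true
        System.out.println(checkThenRepair(byNode, new ReentrantLock()).val); // v2
        System.out.println(inconsistent(byNode));                             // false
    }
}
```

The sketch is single-threaded; a real implementation also has to decide what
"latest" means when versions are equal but values differ, which this model
does not address.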

Future plans
- Consistency guard (a special process that periodically checks that there
is no inconsistency).
- MVCC support.
- Atomic caches support.
- Thin client support.
- SQL support.
- Read-with-consistency before the write operation.
- Single-key read-with-consistency optimization; currently the collection
approach is used every time.
- Skip read-with-consistency for a key if it was read consistently a short
time ago.
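
As a rough illustration of the last item only, one possible shape for
"skip the check if the key was verified recently" is a per-key timestamp
with a time-to-live. Nothing like this exists in the PR; the class name,
the TTL parameter and the explicit invalidate() call are hypothetical, and
the current time is passed in to keep the example deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker of keys whose consistency was verified recently. */
final class RecentlyVerifiedKeys<K> {
    private final Map<K, Long> verifiedAt = new ConcurrentHashMap<>();
    private final long ttlMillis;

    RecentlyVerifiedKeys(long ttlMillis) { this.ttlMillis = ttlMillis; }

    /** Remember that the key was read consistently at the given time. */
    void markConsistent(K key, long nowMillis) { verifiedAt.put(key, nowMillis); }

    /** Forget the key, for example after a write to it. */
    void invalidate(K key) { verifiedAt.remove(key); }

    /** True if the key was never verified or the last verification has expired. */
    boolean needsCheck(K key, long nowMillis) {
        Long t = verifiedAt.get(key);
        return t == null || nowMillis - t >= ttlMillis;
    }

    public static void main(String[] args) {
        RecentlyVerifiedKeys<String> recent = new RecentlyVerifiedKeys<>(1_000);
        System.out.println(recent.needsCheck("k", 0));     // true
        recent.markConsistent("k", 0);
        System.out.println(recent.needsCheck("k", 500));   // false
        System.out.println(recent.needsCheck("k", 1_000)); // true
    }
}
```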

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-31+Consistency+check+and+fix
[2] https://issues.apache.org/jira/browse/IGNITE-10663
[3] https://github.com/apache/ignite/pull/5656
