Vladimir,

Thanks for a throughout overview and proposal.
 
> Also we could try employing tiered approach
> 1) Try to keep everything in-memory to minimize writes to blocks
> 2) Fallback to persistent lock data if certain threshold is reached.

What are the benefits of the backed-by-persistence approach in compare to the 
one based on tuples? Specifically:
- will the persistence approach work for both 3rd party and Ignite persistence?
- any performance impacts depending on a chosen method?
- what’s faster to implement?

—
Denis

> On Dec 13, 2017, at 2:10 AM, Vladimir Ozerov <voze...@gridgain.com> wrote:
> 
> Igniters,
> 
> As you probably we know we work actively on MVCC [1] and transactional SQL
> [2] features which could be treated as a single huge improvement. We face a
> number of challenges and one of them is locking.
> 
> At the moment information about all locks is kept in memory on per-entry
> basis (see GridCacheMvccManager). For every locked key we maintain current
> lock owner (XID) and the list of would-be-owner transactions. When
> transaction is about to lock an entry two scenarios are possible:
> 1) If entry is not locked we obtain the lock immediately
> 2) if entry is locked we add current transaction to the wait list and jumps
> to the next entry to be locked. Once the first entry is released by
> conflicting transaction, current transaction becomes an owner of the first
> entry and tries to promote itself for subsequent entries.
> 
> Once all required locks are obtained, response is sent to the caller.
> 
> This approach doesn't work well for transactional SQL - if we update
> millions of rows in a single transaction we will simply run out of memory.
> To mitigate the problem other database vendors keep information about locks
> inside the tuples. I propose to apply the similar design as follows:
> 
> 1) No per-entry lock information is stored in memory anymore.
> 2) The list of active transactions are maintained in memory still
> 3) When TX locks an entry, it sets special marker to the tuple [3]
> 4) When TX meets already locked entry, it enlists itself to wait queue of
> conflicting transaction and suspends
> 5) When first transaction releases conflicting lock, it notifies and wakes
> up suspended transactions, so they resume locking
> 6) Entry lock data is cleared on transaction commit
> 7) Entry lock data is not cleared on rollback or node restart; Instead, we
> will could use active transactions list to identify invalid locks and
> overwrite them as needed.
> 
> Also we could try employing tiered approach
> 1) Try to keep everything in-memory to minimize writes to blocks
> 2) Fallback to persistent lock data if certain threshold is reached.
> 
> Thoughts?
> 
> [1] https://issues.apache.org/jira/browse/IGNITE-3478
> [2] https://issues.apache.org/jira/browse/IGNITE-4191
> [3] Depends on final MVCC design - it could be per-tuple XID, undo vectors,
> per-block transaction lists, etc..
> 
> Vladimir.

Reply via email to