[ 
https://issues.apache.org/jira/browse/CASSANDRA-15367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023361#comment-17023361
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-15367 at 1/25/20 1:00 AM:
-----------------------------------------------------------------------------

bq. I don’t understand how that could be done without reintroducing the 
contention gc problem.

Simply by making it fast enough.  CASSANDRA-15511 manages to get the costs 
significantly lower than today's, even with 16 threads actively spinning on a 
dual-socket 24-core machine.  It also competes fairly well against itself with 
locking enabled (the lock variant does manage lower garbage, but the difference 
isn't dramatic overall).  

The spreadsheet I posted compares a number of possible approaches, settings and 
workloads.  

Of course, there would still be the potential for some wasted work that could 
be usefully spent elsewhere, but I think the gains are minimal.  There are also 
other options available to us for minimising contention besides a lock, which 
I've broached before (e.g. on failure to update, tag the write onto a 
linked-list and merge lazily on read, potentially attempting to apply the merge 
to the tree each time to reduce duplicated work).
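To make the linked-list idea concrete, here is a minimal sketch of the shape it might take.  All names here are hypothetical (this is not Cassandra code): on a failed update of the main tree, the write is tagged onto a lock-free list in O(1); a reader later drains the list and merges the pending writes.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of "tag the write onto a linked-list, merge lazily
// on read".  PendingList/Node are illustrative names, not Cassandra classes.
final class PendingList<T>
{
    private static final class Node<T>
    {
        final T value;
        final Node<T> next;
        Node(T value, Node<T> next) { this.value = value; this.next = next; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    // Called when the update of the tree fails under contention:
    // a single CAS appends the write, with no mutex and minimal garbage.
    void tag(T write)
    {
        Node<T> h;
        do { h = head.get(); }
        while (!head.compareAndSet(h, new Node<>(write, h)));
    }

    // Called on read: atomically take all pending writes, in insertion
    // order, so the reader can merge them with the tree (and possibly
    // re-apply them to the tree to reduce duplicated work).
    java.util.List<T> drain()
    {
        java.util.ArrayList<T> out = new java.util.ArrayList<>();
        for (Node<T> n = head.getAndSet(null); n != null; n = n.next)
            out.add(n.value);
        java.util.Collections.reverse(out);
        return out;
    }
}
```

The trade-off is the one described above: writers never block, but readers pay for the merge, and without re-applying the pending list to the tree the same merge work may be repeated across reads.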

bq. either merge these two classes, or make one control the other

It might well be possible to merge the management of these things, it's an 
interesting idea and something to consider.

bq. Rough example with lazy naming here

Thanks.  I'll have to dig into that next week; I can't properly reckon with it 
right now.



> Memtable memory allocations may deadlock
> ----------------------------------------
>
>                 Key: CASSANDRA-15367
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15367
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log, Local/Memtable
>            Reporter: Benedict Elliott Smith
>            Assignee: Benedict Elliott Smith
>            Priority: Normal
>             Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> * Under heavy contention, we guard modifications to a partition with a mutex, 
> for the lifetime of the memtable.
> * Memtables block for the completion of all {{OpOrder.Group}} started before 
> their flush began
> * Memtables permit operations from this cohort to fall-through to the 
> following Memtable, in order to guarantee a precise commitLogUpperBound
> * Memtable memory limits may be lifted for operations in the first cohort, 
> since they block flush (and hence block future memory allocation)
> With very unfortunate scheduling:
> * A contended partition may rapidly escalate to a mutex
> * The system may reach memory limits that prevent allocations for the new 
> Memtable’s cohort (C2) 
> * An operation from C2 may hold the mutex when this occurs
> * Operations from a prior Memtable’s cohort (C1), for a contended partition, 
> may fall-through to the next Memtable
> * The operations from C1 may execute after the above is encountered by those 
> from C2
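The cohort mechanism in the report above can be sketched, very loosely, as a counter-based group barrier.  This is a toy model, not Cassandra's actual {{OpOrder}} implementation, and all names are hypothetical: operations join the open group, and a flush closes the group and blocks until every operation that joined it has finished.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy sketch of an OpOrder-like group (hypothetical names, not Cassandra
// code).  The low bit of `pending` means "group still open to new ops";
// each in-flight operation adds 2.
final class Group
{
    private final AtomicInteger pending = new AtomicInteger(1);

    // An operation tries to join this group; if the group has been
    // closed by a flush, it must fall through to the next group.
    boolean tryJoin()
    {
        int c;
        do
        {
            c = pending.get();
            if ((c & 1) == 0) return false;   // closed: use the next group
        }
        while (!pending.compareAndSet(c, c + 2));
        return true;
    }

    // An operation signals completion; wake the flusher when the group
    // is both closed and empty.
    void finish()
    {
        if (pending.addAndGet(-2) == 0)
            synchronized (this) { notifyAll(); }
    }

    // Flush closes the group (clears the open bit), then blocks until
    // all operations that joined before the close have finished.
    void closeAndAwait()
    {
        if (pending.decrementAndGet() == 0) return;
        synchronized (this)
        {
            while (pending.get() != 0)
            {
                try { wait(); }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }
}
```

The deadlock in the report arises precisely because this wait is unconditional: if an operation in the new cohort holds the partition mutex while blocked on a memory allocation that only the pending flush can release, and a prior cohort's operation falls through behind it, the flush's `closeAndAwait` can never complete.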



