[ 
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069396#comment-14069396
 ] 

Benedict commented on CASSANDRA-7546:
-------------------------------------

My concern with the approach you've outlined is that we're barely a hair's 
breadth from a lock: as soon as we hit _any_ contention, we inflate to locking 
behaviour. That is good for large partitions but most likely bad for small 
ones, and more to the point it seems barely worth the complexity over just 
making it a lock in the first place. On further consideration, I think I would 
perhaps prefer to trigger this lock-inflation behaviour based on the size of 
the aborted changes, so that if the amount of work we've wasted exceeds some 
threshold, we decide it's high time all threads were stopped to let us finish. 
In this scenario we could flip a switch requiring all modifications to acquire 
the monitor once we first hit the threshold; I would be fine with this 
behaviour, and it would be simple. 
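A minimal sketch of that lock-inflation idea, assuming illustrative names (InflatableUpdater, WASTE_THRESHOLD, the cloneSize parameter) that are not Cassandra's actual API: spin with CAS as normal, account for the size of aborted clones, and flip a one-way switch to the monitor once the wasted work crosses a threshold.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical sketch only: CAS-spin until the cumulative size of aborted
// (wasted) clones exceeds a threshold, then inflate so writers serialize on
// the monitor. Stragglers that read 'inflated' as false still compete via
// CAS, so the locked path also uses compareAndSet for safety.
final class InflatableUpdater<T> {
    private static final long WASTE_THRESHOLD = 1 << 20; // 1 MiB wasted (assumed)

    private final AtomicReference<T> state;
    private final AtomicLong wastedBytes = new AtomicLong();
    private volatile boolean inflated = false;

    InflatableUpdater(T initial) { state = new AtomicReference<>(initial); }

    T update(UnaryOperator<T> transform, long cloneSize) {
        while (true) {
            if (inflated) {
                synchronized (this) {              // locked path after inflation
                    while (true) {
                        T current = state.get();
                        T next = transform.apply(current);
                        if (state.compareAndSet(current, next))
                            return next;
                    }
                }
            }
            T current = state.get();
            T next = transform.apply(current);     // clone/update off old state
            if (state.compareAndSet(current, next))
                return next;
            // CAS failed: the clone was wasted work; maybe inflate
            if (wastedBytes.addAndGet(cloneSize) > WASTE_THRESHOLD)
                inflated = true;                   // one-way switch to the monitor
        }
    }
}
```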

I do wonder how much of a problem this is in 2.1, though. I wonder if the 
largest problem with these racy modifications isn't actually the massive 
amount of memtable arena allocation they incur in 2.0 with all their 
transformation.apply() calls (which reallocate the mutation on the arena on 
every retry); that is most likely what causes the promotion failures, as the 
allocations cannot be collected. I wonder if we shouldn't simply backport the 
logic to allocate these only once, or at most twice (the first time we race). 
It seems much more likely to me that this is where the pain is being felt.
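The allocate-at-most-once shape could be sketched as below, with cell contents modeled as Strings and the arena allocation simulated by a passed-in function purely for illustration (none of these names are Cassandra's real internals): the allocation runs a single time before the spin, and CAS retries re-merge the same allocated value against the latest state rather than reallocating each iteration.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical sketch: hoist the (simulated) arena allocation out of the
// CAS retry loop so a contended partition wastes CPU on retries but not
// uncollectable arena memory.
final class AtomicPartition {
    private final AtomicReference<String> ref;

    AtomicPartition(String initial) { ref = new AtomicReference<>(initial); }

    void addAll(String insert, UnaryOperator<String> allocate) {
        String allocated = allocate.apply(insert); // one arena allocation, up front
        while (true) {
            String current = ref.get();
            String updated = current + allocated;  // cheap merge, no reallocation
            if (ref.compareAndSet(current, updated))
                return;
            // CAS failed: retry re-merges the same 'allocated' copy
        }
    }

    String get() { return ref.get(); }
}
```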

> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_alt.txt, 
> suggestion1.txt, suggestion1_21.txt
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, 
> then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some 
> fairly staggering memory growth (the more cores on your machine, the worse 
> it gets).
> Whilst many usage patterns don't do highly concurrent updates to the same 
> partition, hinting today does, and in this case wild (order(s) of magnitude 
> more than expected) memory allocation rates can be seen (especially when the 
> updates being hinted are small updates to different partitions, which can 
> happen very fast on their own) - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation 
> whilst not slowing down the very common un-contended case.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
