[ 
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072469#comment-14072469
 ] 

Benedict commented on CASSANDRA-7546:
-------------------------------------

bq. it is the low 64 bits of a monotonic number

That's pretty pedantic, since with nanos that stretches to 600 years before 
overflow!

Either way, I'm not sure whether I made this clear, but we should be offsetting 
this number from the memtable creation time so we can safely stick within 32 
bits. I suggest we use the top bit being set as the indicator that we've hit 
contention, so we naturally avoid problematic overflow (although really an 
overflow would just result in our optimisation not running properly, so that 
would also be fine).
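
To make the packing concrete, here is a minimal sketch of one way it could look. 
It is an illustration only: the class name, the method names and the choice of 
a coarse tick (nanos shifted right by 20 bits, roughly a millisecond) are all 
assumptions, not anything in the attached patches.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: names and tick granularity are assumptions, not Cassandra's API.
final class ContentionTracker {
    static final int CONTENDED = 0x80000000; // top bit: "we have hit contention"
    static final int TIME_MASK = 0x7FFFFFFF; // low 31 bits: ticks since creation

    private final long createdAtNanos;       // stand-in for the memtable creation time
    private final AtomicInteger state = new AtomicInteger(0);

    ContentionTracker(long createdAtNanos) {
        this.createdAtNanos = createdAtNanos;
    }

    // Offset from creation time, in coarse ticks (2^20 ns, about 1.05 ms each).
    // The value is clamped so it can never spill into the top bit.
    static int ticksSince(long createdAtNanos, long nowNanos) {
        long ticks = (nowNanos - createdAtNanos) >>> 20;
        return (int) Math.min(ticks, (long) TIME_MASK);
    }

    // Store the latest tick while preserving the contention flag.
    void record(long nowNanos) {
        int ticks = ticksSince(createdAtNanos, nowNanos);
        state.updateAndGet(s -> (s & CONTENDED) | ticks);
    }

    // Set the top bit; the stored tick value is left untouched.
    void markContended() {
        state.updateAndGet(s -> s | CONTENDED);
    }

    // With the top bit set, the int is negative.
    boolean isContended() {
        return state.get() < 0;
    }

    int lastTicks() {
        return state.get() & TIME_MASK;
    }
}
```

With this tick size, 31 bits cover about 26 days since creation before the 
clamp kicks in, and hitting the clamp only makes the stored time stale rather 
than flipping the contention bit by accident.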

bq.  how long you expect AtomicSorted/BTreeColumns to last

AtomicBTreeColumns is unlikely to live past 3.1. I would like to get rid of it 
in 3.0, but that is probably ambitious. So another year or so at bleeding edge; 
a few more years at various stages downstream no doubt. AtomicSortedColumns 
will be around as long as 2.0.x is, which is decided by the community really.

Either way, tuning this value is probably not super helpful, since the goal is 
simply to avoid lots of wasted memory allocations. We can simply define a 
sensible, slightly cautious criterion for this, and that should be sufficient: 
if we are slightly overly cautious, the end result is only a small number 
of partitions seeing slightly reduced write throughput. It is not a huge 
deal either way. It's only really likely to have a measurable impact at all on 
very highly contended partitions, on which any sane value will likely yield a 
very similar improvement.

> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 
> 7546.20_alt.txt, suggestion1.txt, suggestion1_21.txt
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, 
> then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some 
> fairly staggering memory growth (the more cores on your machine, the worse it 
> gets).
> Whilst many usage patterns don't do highly concurrent updates to the same 
> partition, hinting today does, and in this case wild (order(s) of magnitude 
> more than expected) memory allocation rates can be seen (especially when the 
> updates being hinted are small updates to different partitions which can 
> happen very fast on their own) - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation 
> whilst not slowing down the very common un-contended case.
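
For readers unfamiliar with the pattern described in the issue, the following is 
a minimal sketch of a read, clone/update, CAS loop. It is not the code from 
AtomicSortedColumns: the class, the int[] state and the clone counter are 
hypothetical and exist only to show where the wasted allocation comes from.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a copy-on-write CAS loop; not Cassandra's actual code.
final class CowPartition {
    private final AtomicReference<int[]> state = new AtomicReference<>(new int[0]);

    // Counts every clone allocated, including those thrown away after a failed CAS.
    final AtomicLong clones = new AtomicLong();

    void add(int value) {
        while (true) {
            int[] current = state.get();                                // read
            int[] updated = Arrays.copyOf(current, current.length + 1); // clone
            updated[current.length] = value;                            // update
            clones.incrementAndGet();
            if (state.compareAndSet(current, updated)) {                // CAS
                return;
            }
            // CAS lost the race: 'updated' is now garbage and the loop allocates again.
        }
    }

    int size() {
        return state.get().length;
    }
}
```

With a single writer, clones equals size. When many threads call add on the 
same instance, every failed CAS discards a full copy, so clones can exceed size 
by a wide margin, which is the allocation growth the issue describes.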



--
This message was sent by Atlassian JIRA
(v6.2#6252)
