[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141764#comment-14141764 ]

graham sanderson commented on CASSANDRA-7546:
---------------------------------------------

Thanks - I updated, and have run 1/16/256/1024 partitions against both my
baseline 2.1.1 and a 2.1.1 patched with 7546.21_v1.txt, using heap_buffers
and with all 5 nodes up.

Things look promising so far. I still need to run with a node down (I assume
I take it out of the seeds list), and also with native_objects/native_buffers.
This is something I can do in parallel with other work, but it will still take
some time.

Random cassandra-stress question: generally the threadCount at which it stops
seems to be the one after it has started overloading the system. Maybe that is
what is wanted for the final results, but the latency of that final run is
generally not representative of the previous one or two thread counts, which
were doing about the same number of ops/second (hence why it stopped). Not
sure what the thinking is on that; I'm sure it has come up before.
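
To illustrate, here is a rough sketch of my mental model of that auto-ramp
loop (hypothetical names and a simulated throughput curve, not the actual
cassandra-stress source):

{code:java}
// Sketch of how I understand the auto threadCount ramp (hypothetical; not
// the real stress code). It keeps increasing the thread count until
// throughput stops improving; the run that trips the stop condition is
// already past saturation, yet it is the last run executed, so its latency
// is what ends up looking like the "final" number.
public final class StressRampSketch
{
    // Simulated throughput curve that saturates around a few hundred threads
    static double runAt(int threads)
    {
        return 100_000.0 * threads / (threads + 64);
    }

    public static void main(String[] args)
    {
        double best = 0;
        for (int threads = 4; threads <= 1024; threads *= 2)
        {
            double ops = runAt(threads);
            System.out.printf("threads=%d ops/s=%.0f%n", threads, ops);
            if (ops < best * 1.10) // <10% gain: we are past saturation...
                break;             // ...but this overloaded run was still measured last
            best = ops;
        }
    }
}
{code}

The point being that the break is only taken after the saturated run has
already executed, so its (overloaded) latency numbers are the last ones
reported.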

> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>             Fix For: 2.1.1
>
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 
> 7546.20_4.txt, 7546.20_5.txt, 7546.20_6.txt, 7546.20_7.txt, 7546.20_7b.txt, 
> 7546.20_alt.txt, 7546.20_async.txt, 7546.21_v1.txt, hint_spikes.png, 
> suggestion1.txt, suggestion1_21.txt, young_gen_gc.png
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, 
> then CAS the state of the partition.
> Under heavy contention for updates to a single partition this can cause some 
> fairly staggering memory growth (the more cores on your machine, the worse 
> it gets).
> Whilst many usage patterns don't do highly concurrent updates to the same 
> partition, hinting today does, and in this case wild memory allocation rates 
> (order(s) of magnitude more than expected) can be seen - especially when the 
> updates being hinted are small updates to different partitions, which can 
> happen very fast on their own. See CASSANDRA-7545.
> It would be best to eliminate/reduce/limit the spinning memory allocation 
> whilst not slowing down the very common un-contended case.
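
For anyone skimming, a minimal model of the clone-then-CAS spin loop the
quoted description refers to (types and names hypothetical; the real logic
lives in AtomicSortedColumns.addAllWithSizeDelta):

{code:java}
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of the pattern: partition state is an immutable
// snapshot in an AtomicReference; each writer clones, merges, then CASes.
// Only the CAS winner's clone survives; every loser's clone is instant
// garbage, so the allocation rate scales with the number of spinning cores.
public final class ContendedPartition
{
    // Immutable stand-in for the real AtomicSortedColumns holder
    static final class Snapshot
    {
        final NavigableMap<String, String> cells;
        Snapshot(NavigableMap<String, String> cells) { this.cells = cells; }
    }

    private final AtomicReference<Snapshot> ref =
            new AtomicReference<>(new Snapshot(new TreeMap<>()));

    public void addColumn(String name, String value)
    {
        while (true)
        {
            Snapshot current = ref.get();
            // The whole snapshot is cloned on EVERY attempt -- this is the
            // allocation inside the spin loop that this ticket is about
            NavigableMap<String, String> copy = new TreeMap<>(current.cells);
            copy.put(name, value);
            if (ref.compareAndSet(current, new Snapshot(copy)))
                return; // winner: our clone becomes the new state
            // loser: the clone we just built is discarded; spin and re-clone
        }
    }
}
{code}

Under low contention the CAS almost always succeeds on the first try, which
is why the un-contended path stays cheap; the waste only shows up when many
cores hammer the same partition, as hinting does.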



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
