[ 
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140087#comment-14140087
 ] 

Benedict commented on CASSANDRA-7546:
-------------------------------------

I meant to mention this earlier, in case you were worried about it: for 
simplicity and performance, we don't guarantee that we generate only as many 
partitions as the sample defines; we only guarantee that when sampling we follow 
that distribution (and so will ignore any overshoot we generated). Essentially 
any thread sampling the working set that hits _past the end of the set_ (i.e. 
either into an area not yet populated, or one that has been finished and not 
replaced) will asynchronously generate a new seed, write to it, and _then_ 
update the sample. This is because updating the sample is itself costly, and 
for workloads where the work is likely to be completed in one shot we don't 
want to incur that cost.

That said, it should be quite possible to decide up front whether the workload 
meets these characteristics and, if it doesn't (like this one), update the 
sample in advance.

There's also sort of an off-by-one error behind the 1025, though. We're not 
subtracting the minimum index from the generated sample index, so with a 
distribution of 1..1024 we never sample index 0, and our sample size ends up 
being 1025. I've pushed a fix for this.
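
To illustrate the off-by-one, here is a minimal Java sketch. It is not the 
actual stress tool code: the class and method names are hypothetical, and it 
assumes that each value drawn from the inclusive range min..max is used as an 
index into the sample.

```java
// Hypothetical illustration only; names do not come from the Cassandra code base.
final class SampleIndexDemo {

    // Buggy mapping: the drawn value is used as the index as-is.
    static int buggyIndex(long drawn, long min) {
        return (int) drawn;
    }

    // Fixed mapping: subtract the distribution's minimum so indexes start at 0.
    static int fixedIndex(long drawn, long min) {
        return (int) (drawn - min);
    }

    // Number of slots needed to hold every index the chosen mapping can produce.
    static int slotsNeeded(long min, long max, boolean subtractMin) {
        int highest = 0;
        for (long v = min; v <= max; v++) {
            int idx = subtractMin ? fixedIndex(v, min) : buggyIndex(v, min);
            highest = Math.max(highest, idx);
        }
        return highest + 1;
    }

    public static void main(String[] args) {
        System.out.println(slotsNeeded(1, 1024, false)); // prints 1025; index 0 is never hit
        System.out.println(slotsNeeded(1, 1024, true));  // prints 1024; indexes 0..1023
    }
}
```

With the minimum subtracted, a 1..1024 distribution covers indexes 0..1023 and 
the sample needs exactly 1024 slots.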

> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>             Fix For: 2.1.1
>
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 
> 7546.20_4.txt, 7546.20_5.txt, 7546.20_6.txt, 7546.20_7.txt, 7546.20_7b.txt, 
> 7546.20_alt.txt, 7546.20_async.txt, 7546.21_v1.txt, hint_spikes.png, 
> suggestion1.txt, suggestion1_21.txt, young_gen_gc.png
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, 
> then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some 
> fairly staggering memory growth (the more cores on your machine, the worse it 
> gets).
> Whilst many usage patterns don't do highly concurrent updates to the same 
> partition, hinting today does, and in this case wild (order(s) of magnitude 
> more than expected) memory allocation rates can be seen (especially when the 
> updates being hinted are small updates to different partitions which can 
> happen very fast on their own) - see CASSANDRA-7545.
> It would be best to eliminate/reduce/limit the spinning memory allocation 
> whilst not slowing down the very common un-contended case.
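
The read, clone/update, CAS pattern described in the issue can be sketched 
roughly as below. This is a simplified illustration, not the actual 
AtomicSortedColumns code: the class name, the use of a List as the partition 
state, and the attempt counter are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a partition whose state is swapped atomically.
final class CasLoopSketch {

    private final AtomicReference<List<Integer>> state =
            new AtomicReference<List<Integer>>(Collections.<Integer>emptyList());

    // Applies the update atomically and returns how many attempts it took.
    // Every attempt allocates a fresh copy; a failed CAS throws that copy away.
    int addAll(List<Integer> update) {
        int attempts = 0;
        while (true) {
            attempts++;
            List<Integer> current = state.get();                     // read
            List<Integer> modified = new ArrayList<Integer>(current); // clone (allocates)
            modified.addAll(update);                                  // update
            if (state.compareAndSet(current, Collections.unmodifiableList(modified))) {
                return attempts;                                      // CAS succeeded
            }
            // CAS failed: another thread won the race, so retry with a new copy.
        }
    }

    List<Integer> snapshot() {
        return state.get();
    }
}
```

In the uncontended case the loop body runs once. Under contention each losing 
thread discards its copy and allocates another, which is where the allocation 
rate described above comes from.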



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
