[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139982#comment-14139982 ]
graham sanderson commented on CASSANDRA-7546:
---------------------------------------------

Ok, cool, thanks - I've upgraded my 2.1.0 to 2.1.1... {{7cfd3ed}} for what it's worth. I merged {{7964+7926}} into that and updated my load machine with it. I switched to 40x40x40x40 clustering keys as suggested and changed the 10M entries in the command line args to 2560000 accordingly (it now runs successfully). The output is below.

Note I ended up with 1275 partitions (during the warmup I ended up with 1025, so there may also be an off-by-one bug somewhere, either in stress or in my config!)... I'm still not sure this is what we expect - each node has only seen about 3M mutations total (and I've run the stress test twice - once without the GC stuff working).

Anyway, let me know what you think - I won't be running more tests until tomorrow US time.

Another question - what do you usually do to get comparable results? Right now I have been blowing away the stresscql keyspace every time, to at least keep compaction out of the equation. Given the length of the cassandra-stress run, I'm not sure there is much to be gained by bouncing the cluster between runs, but you probably know better, having used it before.
{code}
Results:
op rate                   : 10595
partition rate            : 10595
row rate                  : 10595
latency mean              : 85.8
latency median            : 49.9
latency 95th percentile   : 360.0
latency 99th percentile   : 417.9
latency 99.9th percentile : 491.9
latency max               : 552.2
total gc count            : 3
total gc mb               : 19471
total gc time (s)         : 0
avg gc time(ms)           : 67
stdev gc time(ms)         : 5
Total operation time      : 00:00:40
Improvement over 609 threadCount: -1%

 id,            total ops, adj row/s,  op/s,  pk/s, row/s, mean,  med,   .95,   .99,  .999,   max, time,  stderr, gc: #, max ms, sum ms, sdv ms,    mb
  4 threadCount,     6939,        -0,   226,   226,   226, 17.6, 16.3,  40.3,  49.4,  51.1, 131.8, 30.6, 0.01464,     0,      0,      0,      0,     0
  8 threadCount,    11827,       385,   385,   385,   385, 20.7, 15.1,  47.5,  51.3,  82.1, 111.7, 30.7, 0.02511,     0,      0,      0,      0,     0
 16 threadCount,    19068,        -0,   612,   612,   612, 26.1, 28.8,  49.9,  60.6,  89.7, 172.1, 31.2, 0.01924,     0,      0,      0,      0,     0
 24 threadCount,    24441,        -0,   775,   775,   775, 30.9, 32.6,  52.1,  80.3,  88.3, 150.4, 31.5, 0.01508,     0,      0,      0,      0,     0
 36 threadCount,    36641,        -0,  1155,  1155,  1155, 31.1, 30.2,  59.0,  78.1,  89.7, 172.1, 31.7, 0.01127,     0,      0,      0,      0,     0
 54 threadCount,    55220,        -0,  1730,  1730,  1730, 31.1, 29.1,  54.5,  74.3,  84.3, 164.4, 31.9, 0.00883,     0,      0,      0,      0,     0
 81 threadCount,    83460,        -0,  2609,  2609,  2609, 31.0, 28.9,  51.2,  71.0,  79.2, 175.4, 32.0, 0.01678,     0,      0,      0,      0,     0
121 threadCount,   140705,        -0,  4402,  4402,  4402, 27.4, 25.8,  49.7,  53.2,  70.3, 302.8, 32.0, 0.01438,     2,    462,    462,     11, 12889
181 threadCount,   226213,        -0,  7116,  7116,  7116, 25.4, 24.2,  48.8,  51.8,  60.1, 279.0, 31.8, 0.01335,     1,    230,    230,      0,  6401
271 threadCount,   320658,        -0, 10089, 10089, 10089, 26.8, 25.0,  48.3,  50.1,  57.4, 297.0, 31.8, 0.01256,     2,    425,    425,     14, 12786
406 threadCount,   342451,        -0, 10609, 10609, 10609, 38.2, 40.3,  59.0,  77.5,  81.7, 142.4, 32.3, 0.00920,     0,      0,      0,      0,     0
609 threadCount,   381058,        -0, 10651, 10651, 10651, 57.0, 48.6, 171.5, 224.4, 248.4, 342.0, 35.8, 0.01234,     1,     66,     66,      0,  6520
913 threadCount,   432518,        -0, 10595, 10595, 10595, 85.8, 49.9, 360.0, 417.9, 491.9, 552.2, 40.8, 0.01471,     3,    200,    200,      5, 19471
END
{code}

> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>             Fix For: 2.1.1
>
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 7546.20_4.txt, 7546.20_5.txt, 7546.20_6.txt, 7546.20_7.txt, 7546.20_7b.txt, 7546.20_alt.txt, 7546.20_async.txt, 7546.21_v1.txt, hint_spikes.png, suggestion1.txt, suggestion1_21.txt, young_gen_gc.png
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some fairly staggering memory growth (the more cores on your machine, the worse it gets).
> Whilst many usage patterns don't do highly concurrent updates to the same partition, hinting today does, and in this case wild (order(s) of magnitude more than expected) memory allocation rates can be seen (especially when the updates being hinted are small updates to different partitions, which can happen very fast on their own) - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation whilst not slowing down the very common un-contended case.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
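For context, a minimal sketch of the read/clone/CAS pattern the description refers to. This is not the actual {{AtomicSortedColumns}} code; the class and method names below are illustrative. The point it demonstrates is that every *failed* CAS throws away a freshly allocated clone, so under heavy contention the allocation rate scales with the number of retries across all writer threads:

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative copy-on-write map updated via a CAS spin loop, in the
// style the issue describes (names are hypothetical, not Cassandra's).
class SpinningCopyOnWriteMap {
    private final AtomicReference<SortedMap<String, Long>> ref =
            new AtomicReference<>(Collections.emptySortedMap());

    /** Returns the size of the map after the update is published. */
    long put(String key, long value) {
        while (true) {
            SortedMap<String, Long> current = ref.get();
            // Clone-and-update: this whole allocation is wasted work
            // if the CAS below loses the race to another writer.
            TreeMap<String, Long> updated = new TreeMap<>(current);
            updated.put(key, value);
            if (ref.compareAndSet(current, updated)) {
                return updated.size();
            }
            // CAS failed: another thread published a new map first.
            // Spin and re-clone from the fresh state, discarding `updated`.
        }
    }

    SortedMap<String, Long> snapshot() {
        return ref.get();
    }
}
```

With one writer the loop runs once per update; with N contending writers, each round publishes one clone and discards up to N-1, which is why the garbage grows with core count as the description notes.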