[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134635#comment-14134635 ]
graham sanderson edited comment on CASSANDRA-7546 at 9/15/14 10:50 PM:
-----------------------------------------------------------------------

Finally getting back to this; I've been doing other things (this is slightly lower priority since we already have it in production), as well as repeatedly breaking myself physically, requiring orthopedic visits! I just realized that the c6a2c65a75ade build being voted on for 2.1.0, which I deployed, is not the same as the released 2.1.0, so I am now upgrading, since cassandra-stress changes snuck in.

Note that I plan to stress using 1024, 256, 16, and 1 partitions, first with all 5 nodes up and then with 4 nodes up and one down to test the effect of hinting (replication factor of 3 and cl=LOCAL_QUORUM), and with at least memtable_allocation_type = heap_buffers & off_heap_buffers. I want to do one cell insert per batch...

I'm upgrading in part because of the new visit/revisit stuff - I'm not 100% sure how to use them correctly; I'll keep playing, but you may answer before I have finished upgrading and tried this. My first attempt, on the original 2.1.0 revision, ended up with only one clustering key value per partition, which is not what I wanted (because it will make the trees small).

Sample YAML for 1024 partitions:
{code}
#
# This is an example YAML profile for cassandra-stress
#
# insert data
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1)
#
# read, using query simple1:
# cassandra-stress profile=/home/jake/stress1.yaml ops(simple1=1)
#
# mixed workload (90/10)
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1,simple1=9)

#
# Keyspace info
#
keyspace: stresscql

#
# The CQL for creating a keyspace (optional if it already exists)
#
keyspace_definition: |
  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

#
# Table info
#
table: testtable

#
# The CQL for creating a table you wish to stress (optional if it already exists)
#
table_definition: |
  CREATE TABLE testtable (
        p text,
        c text,
        v blob,
        PRIMARY KEY(p, c)
  ) WITH COMPACT STORAGE
    AND compaction = { 'class':'LeveledCompactionStrategy' }
    AND comment='TestTable'

#
# Optional meta information on the generated columns in the above table
# The min and max only apply to text and blob types
# The distribution field represents the total unique population
# distribution of that column across rows. Supported types are
#
#      EXP(min..max)                   An exponential distribution over the range [min..max]
#      EXTREME(min..max,shape)         An extreme value (Weibull) distribution over the range [min..max]
#      GAUSSIAN(min..max,stdvrng)      A gaussian/normal distribution, where mean=(min+max)/2, and stdev is (mean-min)/stdvrng
#      GAUSSIAN(min..max,mean,stdev)   A gaussian/normal distribution, with explicitly defined mean and stdev
#      UNIFORM(min..max)               A uniform distribution over the range [min, max]
#      FIXED(val)                      A fixed distribution, always returning the same value
#      Aliases: extr, gauss, normal, norm, weibull
#
#      If preceded by ~, the distribution is inverted
#
# Defaults for all columns are size: uniform(4..8), population: uniform(1..100B), cluster: fixed(1)
#
columnspec:
  - name: p
    size: fixed(16)
    population: uniform(1..1024)    # the range of unique values to select for the field (default is 100Billion)
  - name: c
    size: fixed(26)
#    cluster: uniform(1..100B)
  - name: v
    size: gaussian(50..250)

insert:
  partitions: fixed(1)    # number of unique partitions to update in a single operation
                          # if batchcount > 1, multiple batches will be used, but all partitions will
                          # occur in all batches (unless they finish early); only the row counts will vary
  batchtype: LOGGED       # type of batch to use
  visits: fixed(10M)      # not sure about this

queries:
   simple1: select * from testtable where p = ? and c = ? LIMIT 10
{code}

Command-line:
{code}
./cassandra-stress user profile=~/cqlstress-1024.yaml ops\(insert=1\) cl=LOCAL_QUORUM -node $NODES -mode native prepared cql3 | tee results/results-2.1.0-p1024-a.txt
{code}
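For clarity on the "one cell insert per batch" goal: with partitions: fixed(1) and batchtype: LOGGED, each stress op is roughly equivalent to a logged batch wrapping a single prepared insert. A minimal sketch with the DataStax Java driver (contact point and bound values are hypothetical placeholders, not cassandra-stress internals):
{code}
import java.nio.ByteBuffer;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class SingleCellBatchInsert {
    public static void main(String[] args) {
        // "127.0.0.1" is a placeholder; use one of $NODES in practice.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("stresscql")) {
            PreparedStatement insert =
                    session.prepare("INSERT INTO testtable (p, c, v) VALUES (?, ?, ?)");

            // One cell per batch: a LOGGED batch wrapping a single insert,
            // written at LOCAL_QUORUM as in the command line above.
            BatchStatement batch = new BatchStatement(BatchStatement.Type.LOGGED);
            batch.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
            batch.add(insert.bind("p1",   // stress would generate a fixed(16)-char key
                                  "c1",   // and a fixed(26)-char clustering value
                                  ByteBuffer.wrap(new byte[150]))); // blob sized ~gaussian(50..250)
            session.execute(batch);
        }
    }
}
{code}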
> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>             Fix For: 2.1.1
>
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 7546.20_4.txt, 7546.20_5.txt, 7546.20_6.txt, 7546.20_7.txt, 7546.20_7b.txt, 7546.20_alt.txt, 7546.20_async.txt, 7546.21_v1.txt, hint_spikes.png, suggestion1.txt, suggestion1_21.txt, young_gen_gc.png
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some fairly staggering memory growth (the more cores on your machine, the worse it gets).
> Whilst many usage patterns don't do highly concurrent updates to the same partition, hinting today does, and in this case wild (order(s) of magnitude more than expected) memory allocation rates can be seen, especially when the updates being hinted are small updates to different partitions, which can happen very fast on their own - see CASSANDRA-7545.
> It would be best to eliminate/reduce/limit the spinning memory allocation whilst not slowing down the very common un-contended case.
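For context on the description above: the contended path is a read/clone/CAS retry loop, so every lost race discards a freshly allocated copy of the partition state. A minimal sketch of that pattern (illustrative names only, not the actual AtomicSortedColumns code):
{code}
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the read/clone/CAS pattern described above; class and
// method names are hypothetical, not Cassandra's real implementation.
public class CasSpinSketch {
    private final AtomicReference<NavigableMap<String, byte[]>> state =
            new AtomicReference<NavigableMap<String, byte[]>>(new TreeMap<String, byte[]>());

    public void add(String name, byte[] value) {
        while (true) {
            NavigableMap<String, byte[]> current = state.get();
            // Every attempt allocates a full copy of the current state...
            NavigableMap<String, byte[]> updated = new TreeMap<>(current);
            updated.put(name, value);
            if (state.compareAndSet(current, updated))
                return;
            // ...and a lost race throws that copy away and spins again, so
            // the allocation rate grows with the number of contending
            // writers (worse with more cores), as the description notes.
        }
    }
}
{code}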