[
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137777#comment-14137777
]
graham sanderson commented on CASSANDRA-7546:
---------------------------------------------
OK, so I'm running the latest stress.jar on my load machine - given the number of
changes to stress in 2.1.1 (and, by the looks of things, the addition of remote
GC logging via cassandra-stress, which would be useful in this case), I guess
I'll upgrade the cluster as well.
Here is my current config (minus the comments) and the launch command... note
that there were some typos in our conversation above
{code}
keyspace: stresscql
keyspace_definition: |
  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
table: testtable
table_definition: |
  CREATE TABLE testtable (
      p text,
      c1 int, c2 int, c3 int,
      v blob,
      PRIMARY KEY(p, c1, c2, c3)
  ) WITH COMPACT STORAGE
    AND compaction = { 'class':'LeveledCompactionStrategy' }
    AND comment='TestTable'
columnspec:
  - name: p
    size: fixed(16)
  - name: c1
    cluster: fixed(100)
  - name: c2
    cluster: fixed(100)
  - name: c3
    cluster: fixed(1000) # note I made it slightly bigger since 10M is better than 1M for a max - 1M happens pretty quickly
  - name: v
    size: gaussian(50..250)
queries:
  simple1:
    cql: select * from testtable where p = ? and v = ? LIMIT 10
    fields: samerow
{code}
{code}
./cassandra-stress user profile=~/cqlstress-7546.yaml ops\(insert=1\) cl=LOCAL_QUORUM -node $NODES -mode native prepared cql3 -pop seq=1..10M -insert visits=fixed\(10M\) revisit=uniform\(1..1024\) | tee results/results-2.1.0-p1024-a.txt
{code}
As of right now, we're still (8 minutes later) at:
{code}
INFO 19:11:51 Using data-center name 'Austin' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
Connected to cluster: Austin Multi-Tenant Cassandra 1
INFO 19:11:51 New Cassandra host cassandra4.aus.vast.com/172.17.26.14:9042 added
Datatacenter: Austin; Host: cassandra4.aus.vast.com/172.17.26.14; Rack: 98.9
Datatacenter: Austin; Host: /172.17.26.15; Rack: 98.9
Datatacenter: Austin; Host: /172.17.26.13; Rack: 98.9
Datatacenter: Austin; Host: /172.17.26.12; Rack: 98.9
Datatacenter: Austin; Host: /172.17.26.11; Rack: 98.9
INFO 19:11:51 New Cassandra host /172.17.26.12:9042 added
INFO 19:11:51 New Cassandra host /172.17.26.11:9042 added
INFO 19:11:51 New Cassandra host /172.17.26.13:9042 added
INFO 19:11:51 New Cassandra host /172.17.26.15:9042 added
Created schema. Sleeping 5s for propagation.
Warming up insert with 250000 iterations...
Failed to connect over JMX; not collecting these stats
Generating batches with [1..1] partitions and [1..1] rows (of [10000000..10000000] total rows in the partitions)
{code}
The number of distinct partitions is currently 2365 and growing.
Is this what we expect? It doesn't seem like 250,000 iterations should have
exhausted any partitions.
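For what it's worth, here is a quick sanity check on the numbers from the profile above (my own arithmetic, not stress internals; the class and variable names are mine):

```java
// Back-of-envelope check: rows per partition is the product of the three
// clustering sizes in the columnspec, which matches the fixed(10M) visits
// if each visit writes one row ([1..1] rows per batch, per the log).
public class PartitionMath {
    public static void main(String[] args) {
        long c1 = 100, c2 = 100, c3 = 1000;     // cluster: fixed(...) values above
        long rowsPerPartition = c1 * c2 * c3;   // 10,000,000
        long warmupOps = 250_000;               // warmup iterations from the log

        // 250k single-row batches touch at most 2.5% of one partition's
        // 10M rows, so warmup alone cannot have exhausted any partition.
        System.out.println("rows/partition = " + rowsPerPartition);
        System.out.println("warmup fraction = " + (100.0 * warmupOps / rowsPerPartition) + "%");
    }
}
```

So if stress is exhausting partitions this early, it would have to be switching seeds for some other reason than a partition running out of rows.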
> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-7546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: graham sanderson
> Assignee: graham sanderson
> Fix For: 2.1.1
>
> Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt,
> 7546.20_4.txt, 7546.20_5.txt, 7546.20_6.txt, 7546.20_7.txt, 7546.20_7b.txt,
> 7546.20_alt.txt, 7546.20_async.txt, 7546.21_v1.txt, hint_spikes.png,
> suggestion1.txt, suggestion1_21.txt, young_gen_gc.png
>
>
> In order to preserve atomicity, this code attempts to read, clone/update,
> then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some
> fairly staggering memory growth (the more cores on your machine, the worse it
> gets).
> Whilst many usage patterns don't do highly concurrent updates to the same
> partition, hinting today does, and in this case wild (order(s) of magnitude
> more than expected) memory allocation rates can be seen (especially when the
> updates being hinted are small updates to different partitions, which can
> happen very fast on their own) - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation
> whilst not slowing down the very common un-contended case.
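The contended pattern described above looks roughly like this (a hand-written sketch, not the actual AtomicSortedColumns code; the class name and map type are mine):

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the read -> clone/update -> CAS pattern: every failed CAS
// discards a freshly allocated clone, so under heavy contention on one
// partition the allocation rate grows with the number of spinning threads.
public class CasCloneLoop {
    private final AtomicReference<TreeMap<String, byte[]>> ref =
            new AtomicReference<>(new TreeMap<>());

    public void addColumn(String name, byte[] value) {
        while (true) {
            TreeMap<String, byte[]> current = ref.get();
            // Clone-and-update: this allocation is wasted on every retry.
            TreeMap<String, byte[]> updated = new TreeMap<>(current);
            updated.put(name, value);
            if (ref.compareAndSet(current, updated)) {
                return; // un-contended case: one clone, one successful CAS
            }
            // contended case: another thread won the CAS; loop and allocate again
        }
    }

    public int size() {
        return ref.get().size();
    }
}
```

The un-contended path pays for exactly one clone; it's only when many threads spin on the same reference that the discarded clones pile up, which is the behaviour the ticket wants to eliminate without slowing the common case.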
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)