[ https://issues.apache.org/jira/browse/CASSANDRA-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Bligh reopened CASSANDRA-7103:
-------------------------------------


We're not understanding each other. I'm not updating the same row, I'm just 
inserting new ones, so 5417 doesn't seem to be relevant.

There are two fundamental and very serious performance problems in here:

1. There's a 70x performance degradation over time as I insert a few thousand 
rows, which is fixed by running "nodetool flush".

2. Scaling of parallel row inserts (just new inserts, not updates) on a single 
table per node is terrible - 10s per insert with 64 writers?

If you want me to break this out into two separate bugs, that's fine; I probably 
should have done that to start with. 

> Very poor performance with simple setup
> ---------------------------------------
>
>                 Key: CASSANDRA-7103
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7103
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Fedora 19 (also happens on Ubuntu), Cassandra 2.0.7, DSC 
> standard install
>            Reporter: Martin Bligh
>
> Single node (this is just development; 32 GB, 20-core server), single disk 
> array.
> Create the following table:
> {code}
> CREATE TABLE reut (
>   time_order bigint,
>   time_start bigint,
>   ack_us map<int, int>,
>   gc_strategy map<text, int>,
>   gc_strategy_symbol map<text, int>,
>   gc_symbol map<text, int>,
>   ge_strategy map<text, int>,
>   ge_strategy_symbol map<text, int>,
>   ge_symbol map<text, int>,
>   go_strategy map<text, int>,
>   go_strategy_symbol map<text, int>,
>   go_symbol map<text, int>,
>   message_type map<text, int>,
>   PRIMARY KEY (time_order, time_start)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={};
> {code}
> Now I just insert data into it (using the Python driver, async inserts, and a 
> prepared insert statement). Each row only fills out one of the gc_*, go_*, or 
> ge_* column groups, and there are something like 20-100 entries per map 
> column, occasionally 1000, but nothing huge. 
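> A minimal sketch of that insert path, assuming the DataStax cassandra-driver 
> for Python - the table and column names come from the schema above; the 
> contact point, keyspace, and sample values are placeholders:
> {code}
> from cassandra.cluster import Cluster
>
> cluster = Cluster(['127.0.0.1'])            # placeholder contact point
> session = cluster.connect('mykeyspace')     # placeholder keyspace
>
> # Prepare once, bind per row; each row fills only one gc_*/ge_*/go_* group.
> insert = session.prepare(
>     "INSERT INTO reut (time_order, time_start, gc_symbol, message_type) "
>     "VALUES (?, ?, ?, ?)")
>
> row = (1398900000000, 1398900000123,
>        {'IBM': 12, 'MSFT': 7},              # gc_symbol: map<text, int>
>        {'NEW_ORDER': 19, 'CANCEL': 3})      # message_type: map<text, int>
>
> future = session.execute_async(insert, row)
> future.result()                             # wait here only for illustration
> {code}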
> First run: 685 inserts in 1.004860 seconds (681.687053 operations/s).
> OK, not great, but that's fine.
> Now throw 50,000 rows at it.
> Then run the first batch again, and it takes 53s to insert the same 685 
> rows - I'm getting roughly 10 rows per second. 
> It's not IO bound - "iostat 1" shows the disk quiescent for 9 seconds, then a 
> ~640KB write, then idle again - looks like just the periodic commitlog sync.
> Run "nodetool flush" and performance goes back to what it was before!
> Not sure why this gets so slow - I think it just builds huge commit logs and 
> memtables, but never writes sstables out to the data/ directory because I 
> only have one table? That doesn't seem like a good situation. 
> Worse: if you let the Python driver just throw writes at it async (I think the 
> underlying protocol allows up to 128 in-flight requests), it gets so slow that 
> a single write takes over 10s and times out. This seems to be some sort of 
> synchronization problem in Java: if I limit the concurrent async requests to 
> the number in the left column below, I get the elapsed seconds on the right:
> 1: 103 seconds
> 2: 63 seconds
> 8: 53 seconds
> 16: 53 seconds
> 32: 66 seconds
> 64: so slow it explodes in timeouts on write (over 10s each).
> I guess there's some thundering-herd-style locking issue in whatever Java 
> primitive you are using to serialize concurrent access to a single table. I 
> know some of the java.util.concurrent primitives have this issue. For the 
> other tests above, I was limiting async writes to 16 pending.
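> Throttling the in-flight requests looks roughly like this - again just a 
> sketch, not the actual benchmark script; the semaphore limit corresponds to 
> the left-hand column above, and the connection details and rows are 
> placeholders:
> {code}
> # Cap the number of outstanding execute_async() calls with a semaphore.
> import threading
> from cassandra.cluster import Cluster
>
> MAX_IN_FLIGHT = 16                          # left-hand column above
> pending = threading.Semaphore(MAX_IN_FLIGHT)
>
> cluster = Cluster(['127.0.0.1'])            # placeholder contact point
> session = cluster.connect('mykeyspace')     # placeholder keyspace
> insert = session.prepare(
>     "INSERT INTO reut (time_order, time_start, message_type) VALUES (?, ?, ?)")
>
> # Placeholder rows; the real data carries the map columns described above.
> rows = [(i, i, {'NEW_ORDER': i}) for i in range(685)]
>
> def release(_):
>     pending.release()
>
> for params in rows:
>     pending.acquire()                       # block while 16 writes are outstanding
>     future = session.execute_async(insert, params)
>     future.add_callbacks(release, release)  # free a slot on success or error
> {code}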



--
This message was sent by Atlassian JIRA
(v6.2#6252)
