[ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219318#comment-14219318 ]

Alan Boudreault edited comment on CASSANDRA-7386 at 11/20/14 12:23 PM:
-----------------------------------------------------------------------

Devs, here are the results of my regression test, run without and with the patch.

Note: compaction concurrency is set to 4 and compaction throughput is unlimited.
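
For reference, those settings roughly correspond to the following cassandra.yaml entries (a sketch; the data directory paths below are illustrative, not my actual mount points):

{code}
concurrent_compactors: 4
compaction_throughput_mb_per_sec: 0   # 0 disables throttling (unlimited)
data_file_directories:
    - /mnt/disk01/cassandra/data
    - /mnt/disk02/cassandra/data
    # ... one entry per JBOD disk, 12 in total
{code}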

h4. Test

* 12 disks with a total size of 2G.
* Goal: run the following command to fill the disks:
cassandra-stress WRITE n=2000000 -col size=FIXED\(1000\) -mode native prepared cql3 -schema keyspace=r1

h5. Result - No Patch

[^test_regression_no_patch.jpg]

All disks are filled in ~420 seconds. cassandra-stress crashed with write timeouts at around n=650000.

h5. Result - With Patch

[^test_regression_with_patch.jpg]

cassandra-stress finished all of its work (~13 minutes, n=2000000) and all disks stay under 60% disk usage.

Any idea what's going on? Am I doing something wrong in my test case?





> JBOD threshold to prevent unbalanced disk utilization
> -----------------------------------------------------
>
>                 Key: CASSANDRA-7386
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chris Lohfink
>            Assignee: Robert Stupp
>            Priority: Minor
>             Fix For: 2.1.3
>
>         Attachments: 7386-2.0-v3.txt, 7386-2.0-v4.txt, 7386-2.0-v5.txt, 
> 7386-2.1-v3.txt, 7386-2.1-v4.txt, 7386-2.1-v5.txt, 7386-v1.patch, 
> 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, 
> patch_2_1_branch_proto.diff, sstable-count-second-run.png, 
> test1_no_patch.jpg, test1_with_patch.jpg, test2_no_patch.jpg, 
> test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg, 
> test_regression_no_patch.jpg, test_regression_with_patch.jpg
>
>
> Currently the disks are picked first by number of current tasks, then by free 
> space. This helps with performance but can lead to large differences in 
> utilization in some (unlikely but possible) scenarios. I've seen 55% vs 10% 
> and heard reports of 90% vs 10% on IRC. This happens with both LCS and STCS 
> (although my suspicion is that STCS makes it worse, since it is harder to 
> keep balanced).
> I propose the algorithm be changed a little to have some maximum range of 
> utilization within which it will pick by free space over load (acknowledging 
> it can be slower). So if disk A is 30% full and disk B is 5% full, it will 
> never pick A over B until utilization balances out.
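
For illustration, here is a minimal sketch of the pick logic described above. The class and field names, the stream-based selection, and the 10% threshold are assumptions made for this example; they are not taken from the attached patches.

{code:java}
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only, not the actual CASSANDRA-7386 patch:
// pick the data directory by pending task count while the disks stay
// within an allowed utilization spread, otherwise fall back to free space.
final class DiskPickerSketch
{
    // Hypothetical per-directory view; the field names are assumptions.
    static final class Candidate
    {
        final String path;
        final long totalBytes;
        final long freeBytes;
        final int pendingTasks;

        Candidate(String path, long totalBytes, long freeBytes, int pendingTasks)
        {
            this.path = path;
            this.totalBytes = totalBytes;
            this.freeBytes = freeBytes;
            this.pendingTasks = pendingTasks;
        }

        double utilization()
        {
            return 1.0 - (double) freeBytes / totalBytes;
        }
    }

    // Maximum allowed spread between the most and least utilized disk
    // before we stop optimizing for load and rebalance by free space.
    static final double MAX_UTILIZATION_SPREAD = 0.10;

    // Assumes a non-empty candidate list.
    static Candidate pick(List<Candidate> disks)
    {
        double min = disks.stream().mapToDouble(Candidate::utilization).min().getAsDouble();
        double max = disks.stream().mapToDouble(Candidate::utilization).max().getAsDouble();

        if (max - min > MAX_UTILIZATION_SPREAD)
        {
            // Disks have drifted apart: take the emptiest one, even if it is busier.
            return disks.stream()
                        .max(Comparator.comparingLong((Candidate d) -> d.freeBytes))
                        .get();
        }

        // Balanced enough: keep the current behaviour, fewest pending tasks first,
        // breaking ties by most free space.
        Comparator<Candidate> byLoadThenSpace =
            Comparator.comparingInt((Candidate d) -> d.pendingTasks)
                      .thenComparing(Comparator.comparingLong((Candidate d) -> d.freeBytes).reversed());
        return disks.stream().min(byLoadThenSpace).get();
    }
}
{code}

The point is just that the task-count heuristic only applies while the disks stay within the allowed spread; once they drift apart, free space wins until utilization evens out again.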


