[ 
https://issues.apache.org/jira/browse/CASSANDRA-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Meyer updated CASSANDRA-5727:
------------------------------------

    Attachment: UpdateLatency_vs_LCS.png
                Throughtput_vs_LCS.png
                ReadLatency_vs_LCS.png
                BytesRead_vs_LCS.png

I investigated the default LCS sstable file size, using YCSB for all tests.  
The system under test was a single Rackspace node with 2GB of RAM.  Every test 
ran YCSB workloada, a 50/50 read/update workload, with the total number of 
operations set to 900K.  The amount of data was varied from 4GB to 40GB.
For the 4GB tests, the LCS file size was set to 5MB, 10MB, 20MB, 160MB, 320MB, 
475MB, 640MB, and 1280MB.  For the 40GB tests, it was set to 5MB, 40MB, 80MB, 
160MB, 320MB, and 640MB.

Note that the 40GB test could not be run with the current default LCS file 
size of 5MB because of consistent OOM errors.  Those OOM errors did not occur 
with larger LCS file sizes.
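
For a rough sense of scale, the sketch below estimates how many sstables and 
LCS levels each configuration implies.  It is back-of-the-envelope arithmetic, 
not part of the measurements.  Assumptions: 1GB = 1024MB, every sstable is 
filled to the configured size, and each level Ln holds up to 10^n sstables 
(a fanout of 10).  The function names are hypothetical helpers for this 
illustration only.

```python
import math


def sstable_count(data_mb: int, sstable_mb: int) -> int:
    """Approximate number of sstables needed to hold data_mb of data."""
    return math.ceil(data_mb / sstable_mb)


def levels_needed(data_mb: int, sstable_mb: int, fanout: int = 10) -> int:
    """Approximate number of levels (L1 and up) needed, assuming level n
    holds at most fanout**n sstables.  L0 is ignored."""
    remaining = sstable_count(data_mb, sstable_mb)
    level = 0
    while remaining > 0:
        level += 1
        remaining -= fanout ** level
    return level


# Data sizes and a few of the sstable sizes from the tests described above.
for data_gb in (4, 40):
    for size_mb in (5, 160, 640):
        data_mb = data_gb * 1024  # assumption: 1GB = 1024MB
        print(
            f"{data_gb}GB data, {size_mb}MB sstables: "
            f"~{sstable_count(data_mb, size_mb)} sstables, "
            f"~{levels_needed(data_mb, size_mb)} levels"
        )
```

Under these assumptions, 40GB of data at 5MB per sstable comes to roughly 8192 
sstables, versus roughly 256 at 160MB.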

Based on the data from this experiment, an LCS file size of 160MB would be an 
optimal default value.  Please see the attached graphs.
                
> Evaluate default LCS sstable size
> ---------------------------------
>
>                 Key: CASSANDRA-5727
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5727
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Daniel Meyer
>         Attachments: BytesRead_vs_LCS.png, ReadLatency_vs_LCS.png, 
> Throughtput_vs_LCS.png, UpdateLatency_vs_LCS.png
>
>
> What we're not sure about is the effect on compaction efficiency --
> larger files mean that each level contains more data, so reads will
> have to touch fewer sstables, but we're also compacting less unchanged
> data when we merge forward.
> So the question is, how big can we make the sstables to get the benefits of
> the first effect, before the second effect starts to dominate?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira