[ 
https://issues.apache.org/jira/browse/CASSANDRA-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886529#comment-15886529
 ] 

Jeff Jirsa commented on CASSANDRA-13241:
----------------------------------------

4k chunks will give much better IO for sstables not in page cache, but 
come at the cost of significant offheap memory requirements, and compression 
ratios will suffer. There might be a better default, but I'm not sure going all 
the way to 4k is the right answer. 
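
To put rough numbers on the offheap cost, here is a back-of-the-envelope 
sketch. It assumes the compression metadata keeps one 8-byte offset per chunk 
in offheap memory (an assumption about the implementation, not something 
measured here), and it treats the 460GB from the issue description as the 
uncompressed data size. The function name is made up for illustration.

```python
def chunk_offset_memory_bytes(data_bytes, chunk_kb, bytes_per_offset=8):
    """Estimate memory needed to hold one offset per compression chunk.

    bytes_per_offset=8 is an assumption (one 64-bit offset per chunk).
    """
    chunk_bytes = chunk_kb * 1024
    # Round up: a partial trailing chunk still needs its own offset.
    chunks = -(-data_bytes // chunk_bytes)
    return chunks * bytes_per_offset


# 460 GiB, taken from the issue description and treated as uncompressed size.
data = 460 * 1024**3
for chunk_kb in (64, 16, 4):
    mib = chunk_offset_memory_bytes(data, chunk_kb) / 1024**2
    print(f"{chunk_kb:>2} KB chunks: ~{mib:.1f} MiB of offsets")
```

Under these assumptions, going from 64k to 4k chunks multiplies the offset 
memory by 16, whatever the real per-chunk overhead turns out to be.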

{quote}
Thanks for your vote, but ... maybe this is a stupid question: Who will finally 
decide if that change is accepted?
{quote}

Generally, a committer can push it as long as they have a +1 vote. However, for 
something like this, most committers will (should) look for consensus among the 
other committers. Ultimately, the final say will come from consensus among the 
PMC when it goes to voting for a release - if a member of the PMC ultimately 
decides it doesn't like the change, that member can/will/should vote -1 on the 
release until the commit is removed. 


> Lower default chunk_length_in_kb from 64kb to 4kb
> -------------------------------------------------
>
>                 Key: CASSANDRA-13241
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13241
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Core
>            Reporter: Benjamin Roth
>
> A chunk size that is too low may result in some wasted disk space. A chunk 
> size that is too high may lead to massive overreads and may have a critical 
> impact on overall system performance.
> In my case, the default chunk size led to peak read IO of up to 1GB/s and 
> avg reads of 200MB/s. After lowering the chunk size (aligned with read 
> ahead, of course), the avg read IO dropped below 20MB/s, typically 10-15MB/s.
> The risk of (physical) overreads increases as the ratio (page cache size) / 
> (total data size) gets lower.
> High chunk sizes are mostly appropriate for bigger payloads per request, but 
> if the model consists mostly of small rows or small resultsets, the read 
> overhead with 64kb chunk size is insanely high. This applies, for example, to 
> (small) skinny rows.
> Please also see here:
> https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY
> To give you some insight into the difference it can make (460GB data, 128GB 
> RAM):
> - Latency of a quite large CF: https://cl.ly/1r3e0W0S393L
> - Disk throughput: https://cl.ly/2a0Z250S1M3c
> - This shows that the request distribution remained the same, so no "dynamic 
> snitch magic": https://cl.ly/3E0t1T1z2c0J
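
The overread argument in the quoted description can be made concrete with a 
rough sketch. It assumes a read that misses the page cache must fetch and 
decompress every whole chunk it touches, counts uncompressed bytes, and takes 
the best case where the payload starts at a chunk boundary. The 200-byte row 
size and the function name are hypothetical, chosen only for illustration.

```python
def read_amplification(payload_bytes, chunk_kb):
    """Uncompressed bytes processed per byte actually requested.

    Best case: the payload starts at a chunk boundary. A payload that
    straddles a boundary touches one more chunk than computed here.
    """
    chunk_bytes = chunk_kb * 1024
    # Round up to the number of whole chunks covering the payload.
    chunks_touched = -(-payload_bytes // chunk_bytes)
    return chunks_touched * chunk_bytes / payload_bytes


row_bytes = 200  # hypothetical small skinny row
for chunk_kb in (64, 4):
    print(f"{chunk_kb:>2} KB chunks: ~{read_amplification(row_bytes, chunk_kb):.0f}x")
```

For payloads smaller than 4kb the ratio between the two settings is 16 in 
this model; for payloads of 64kb or more it approaches 1, which matches the 
point that large chunks suit bigger payloads per request.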



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
