[ 
https://issues.apache.org/jira/browse/CASSANDRA-19334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812840#comment-17812840
 ] 

Yifan Cai commented on CASSANDRA-19334:
---------------------------------------

Fixed a few checkstyle and test errors. CI is green now.

> [Analytics] Upgrade to Cassandra 4.0.12 and remove RowBufferMode and 
> BatchSize options
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19334
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19334
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Analytics Library
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> In cassandra-all:4.0.12, CQLSSTableWriter was improved: the sorted writer can 
> now produce size-capped SSTables. This replaces the need for the unsorted 
> SSTable writer, which has to buffer data and sort it on flush. The dataset 
> written by the Spark application is already sorted, so avoiding the unsorted 
> writer saves the CPU time otherwise wasted on re-sorting sorted data. Because 
> the sorted SSTable writer does not need to buffer data, its size estimation 
> is more accurate than the unsorted writer's, so the produced SSTable files 
> are closer to the expected size.
> Removing the unsorted SSTable writer means the RowBufferMode option is no 
> longer required.
> Supporting size-capping in the sorted writer means the BatchSize option is no 
> longer required.
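
The idea in the description can be illustrated with a toy model. The sketch below is not the CQLSSTableWriter API and not code from the patch. The function name, the row shape, and the byte accounting are all hypothetical, and ordering by a string key stands in for Cassandra's real token ordering. It only shows why input that is already sorted can be streamed into size-capped chunks without a buffer-and-sort step.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, bytes]  # (key, serialized payload); hypothetical row shape


def write_sorted_size_capped(rows: Iterable[Row], max_bytes: int) -> List[List[Row]]:
    """Split already-sorted rows into consecutive chunks of at most max_bytes.

    Toy model only: each chunk stands in for one SSTable. A single row larger
    than max_bytes still gets a chunk of its own.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    chunks: List[List[Row]] = []
    current: List[Row] = []
    current_bytes = 0
    previous_key = None
    for key, payload in rows:
        # Rely on the caller's ordering instead of buffering and sorting.
        if previous_key is not None and key < previous_key:
            raise ValueError(f"rows are not sorted: {key!r} after {previous_key!r}")
        previous_key = key
        row_bytes = len(key.encode("utf-8")) + len(payload)
        # Roll over to a new chunk before the cap would be exceeded.
        if current and current_bytes + row_bytes > max_bytes:
            chunks.append(current)
            current, current_bytes = [], 0
        current.append((key, payload))
        current_bytes += row_bytes
    if current:
        chunks.append(current)
    return chunks
```

Because each row is counted once as it is appended, the running size is known exactly at every step. That is the toy analogue of the more accurate size estimation the description attributes to the sorted writer.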


