[ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798371#comment-17798371
 ] 

Doug Rohrer commented on CASSANDRA-18714:
-----------------------------------------

18941 was implemented more to try to control actual data file size so that we 
didn't end up in a situation where someone was writing 1000 small rows per data 
file and then closing the sstable and opening a new one (essentially, the 
alternative to this was to control the number of rows and tune for sstable size 
by trial-and-error). I think it's acceptable for the SAIes to not be counted as 
part of the max SSTable size, but we should maybe rename the thing to "max 
primary data file size" or something like that?

> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-18714
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/SAI, Tool/bulk load
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.x
>
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be built incrementally along w/ the SSTable data 
> file, and will be finalized when the newly written SSTable is finalized.
> 3.) Provide an example (in a unit test?) of how a third-party tool might, 
> assuming access to the right C* JAR, validate/checksum SAI components outside 
> C* proper.
> 4.) {{SSTableImporter}} should have two new options:
>     a.) an option that fails import if any SSTable-attached 2i must be built 
> (i.e. has not already been built and brought along w/ the other new SSTable 
> components)
>     b.) an option that allows us to bypass full checksum validation on 
> imported/already-built SSTable-attached indexes (assuming they have just been 
> written by {{CQLSSTableWriter}})



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to