[ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804403#comment-17804403
 ] 

Caleb Rackliffe commented on CASSANDRA-18714:
---------------------------------------------

{quote}b.) an option that allows us to bypass full checksum validation on 
imported/already-built SSTable-attached indexes (assuming they have just been 
written by {{CQLSSTableWriter}})
{quote}

Take a look at {{SSTableImporter#importNewSSTables()}}:
 
{noformat}
// Validate existing SSTable-attached indexes, and then build any that are
// missing:
if (!cfs.indexManager.validateSSTableAttachedIndexes(newSSTables, false))
    cfs.indexManager.buildSSTableAttachedIndexesBlocking(newSSTables);
{noformat}

When we validate here, we also checksum. There may be cases where we want to 
bypass that and, in SAI's case, do simple header/footer validation instead 
(the idea being that {{CQLSSTableWriter}} just wrote them). This isn't 
cleanly exposed as an option via {{Group#validateSSTableAttachedIndexes()}}, 
but I think we could add a boolean for whether to checksum, and then have 
{{StorageAttachedIndexGroup}}'s implementation of that method call a modified 
version of {{checksumPerSSTableComponents()}} that simply forwards the boolean 
to the {{IndexDescriptor}} method that actually does the validation.
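
To make the shape of that change concrete, here is a rough, self-contained sketch of forwarding a "checksum or not" boolean from the group-level call down to the per-SSTable validation. Everything in it is a hypothetical stand-in: {{IndexDescriptorLike}}, {{FakeDescriptor}}, the {{validateChecksum}} parameter and the method signatures are assumptions made for illustration, not the actual Cassandra classes or signatures.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration only.
// None of these types or signatures are the real Cassandra ones.
class ChecksumBypassSketch {

    /** Stand-in for the descriptor that performs the actual validation. */
    interface IndexDescriptorLike {
        boolean headerFooterValid(); // cheap structural check
        boolean checksumValid();     // full checksum pass over the components
    }

    /** The descriptor-level method: the boolean picks the validation depth. */
    static boolean validatePerSSTableComponents(IndexDescriptorLike descriptor,
                                                boolean validateChecksum) {
        return validateChecksum ? descriptor.checksumValid()
                                : descriptor.headerFooterValid();
    }

    /** Group-level call: it only forwards the boolean for each SSTable. */
    static boolean validateSSTableAttachedIndexes(List<? extends IndexDescriptorLike> descriptors,
                                                  boolean validateChecksum) {
        for (IndexDescriptorLike descriptor : descriptors) {
            if (!validatePerSSTableComponents(descriptor, validateChecksum))
                return false;
        }
        return true;
    }

    /** Test double that counts how often the expensive path is taken. */
    static final class FakeDescriptor implements IndexDescriptorLike {
        final boolean headerFooterOk;
        final boolean checksumOk;
        int checksumCalls = 0;

        FakeDescriptor(boolean headerFooterOk, boolean checksumOk) {
            this.headerFooterOk = headerFooterOk;
            this.checksumOk = checksumOk;
        }

        @Override
        public boolean headerFooterValid() {
            return headerFooterOk;
        }

        @Override
        public boolean checksumValid() {
            checksumCalls++;
            return checksumOk;
        }
    }

    public static void main(String[] args) {
        FakeDescriptor justWritten = new FakeDescriptor(true, true);
        List<FakeDescriptor> newSSTables = Collections.singletonList(justWritten);

        // Importer-style decision: only build indexes when validation fails.
        boolean valid = validateSSTableAttachedIndexes(newSSTables, false);
        System.out.println(valid ? "reuse existing indexes" : "build missing indexes");
        System.out.println("checksum passes performed: " + justWritten.checksumCalls);
    }
}
```

The point of the sketch is that the group-level method needs no logic of its own: the caller (the importer, in this case) decides whether to checksum, and the decision travels unchanged to the one place that knows how to validate.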

> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-18714
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/SAI, Tool/bulk load
>            Reporter: Caleb Rackliffe
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> a tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be built incrementally along w/ the SSTable data 
> file, and will be finalized when the newly written SSTable is finalized.
> 3.) Provide an example (in a unit test?) of how a third-party tool might, 
> assuming access to the right C* JAR, validate/checksum SAI components outside 
> C* proper.
> 4.) {{SSTableImporter}} should have two new options:
>     a.) an option that fails import if any SSTable-attached 2i must be built 
> (i.e. has not already been built and brought along w/ the other new SSTable 
> components)
>     b.) an option that allows us to bypass full checksum validation on 
> imported/already-built SSTable-attached indexes (assuming they have just been 
> written by {{CQLSSTableWriter}})



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
