[ https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804403#comment-17804403 ]
Caleb Rackliffe commented on CASSANDRA-18714:
---------------------------------------------

{quote}b.) an option that allows us to bypass full checksum validation on imported/already-built SSTable-attached indexes (assuming they have just been written by {{CQLSSTableWriter}})
{quote}
Take a look at {{SSTableImporter#importNewSSTables()}}:
{noformat}
// Validate existing SSTable-attached indexes, and then build any that are missing:
if (!cfs.indexManager.validateSSTableAttachedIndexes(newSSTables, false))
    cfs.indexManager.buildSSTableAttachedIndexesBlocking(newSSTables);
{noformat}
When we validate here, we're checksumming as well. There may be a case where we want to just bypass that and, in SAI's case, do simple header/footer validation instead. (The idea being that {{CQLSSTableWriter}} just wrote them.) It's not exposed super cleanly as an option via {{Group#validateSSTableAttachedIndexes()}}, but I think we could just add a boolean for whether to checksum, and then have {{StorageAttachedIndexGroup}}'s implementation of that call a modified version of {{checksumPerSSTableComponents()}} that just forwards the boolean to the {{IndexDescriptor}} method that actually does the validation.

> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-18714
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/SAI, Tool/bulk load
>            Reporter: Caleb Rackliffe
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes
> inline as it writes the core SSTable components.
> With SAI, this has become a tractable problem, and we should be able to
> enhance both it and {{SSTableImporter}} to handle cases where we might want
> to write SSTables somewhere in bulk (and in parallel) and then import them
> without waiting for index building on import. It would require the
> following changes:
>
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}}
> instances opened will have those 2i defined in their index managers.
>
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups,
> allowing the proper {{SSTableFlushObservers}} to be attached to
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached
> indexes) components will be built incrementally along w/ the SSTable data
> file, and will be finalized when the newly written SSTable is finalized.
>
> 3.) Provide an example (in a unit test?) of how a third-party tool might,
> assuming access to the right C* JAR, validate/checksum SAI components outside
> C* proper.
>
> 4.) {{SSTableImporter}} should have two new options:
>   a.) an option that fails import if any SSTable-attached 2i must be built
> (i.e. has not already been built and brought along w/ the other new SSTable
> components)
>   b.) an option that allows us to bypass full checksum validation on
> imported/already-built SSTable-attached indexes (assuming they have just been
> written by {{CQLSSTableWriter}})
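
For illustration only, here is a rough, self-contained sketch of the "add a boolean for whether to checksum" idea from the comment above. It does not use any Cassandra classes: the class name, the toy on-disk layout (magic header, payload, CRC32 footer), and the methods {{write}}, {{validateComponent}} and {{validateAll}} are all hypothetical stand-ins, and the real {{Group}}, {{StorageAttachedIndexGroup}} and {{IndexDescriptor}} signatures may look quite different. It only shows a group-level call forwarding the flag to the per-component check, which either stops after a cheap structural check or goes on to recompute the checksum.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

// Toy model only: none of these names are Cassandra classes or methods.
class ChecksumBypassSketch
{
    static final int MAGIC = 0x5341_4931;          // arbitrary marker for the toy format
    static final int HEADER_BYTES = Integer.BYTES; // magic
    static final int FOOTER_BYTES = Long.BYTES;    // CRC32 of the payload

    /** Builds a toy component: magic header, payload, CRC32-of-payload footer. */
    static byte[] write(byte[] payload)
    {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(HEADER_BYTES + payload.length + FOOTER_BYTES)
                         .putInt(MAGIC)
                         .put(payload)
                         .putLong(crc.getValue())
                         .array();
    }

    /** Hypothetical per-component check that the flag is forwarded to. */
    static boolean validateComponent(byte[] component, boolean checksum)
    {
        // Cheap structural check: room for header and footer, and the magic matches.
        if (component.length < HEADER_BYTES + FOOTER_BYTES)
            return false;
        ByteBuffer buffer = ByteBuffer.wrap(component);
        if (buffer.getInt(0) != MAGIC)
            return false;
        if (!checksum)
            return true; // caller trusts freshly written components

        // Full check: recompute the CRC32 over the payload and compare with the footer.
        int payloadLength = component.length - HEADER_BYTES - FOOTER_BYTES;
        CRC32 crc = new CRC32();
        crc.update(component, HEADER_BYTES, payloadLength);
        return crc.getValue() == buffer.getLong(component.length - FOOTER_BYTES);
    }

    /** Hypothetical group-level call that simply forwards the flag. */
    static boolean validateAll(List<byte[]> components, boolean checksum)
    {
        for (byte[] component : components)
            if (!validateComponent(component, checksum))
                return false;
        return true;
    }

    public static void main(String[] args)
    {
        byte[] good = write("terms".getBytes(StandardCharsets.UTF_8));
        byte[] corrupt = good.clone();
        corrupt[HEADER_BYTES] ^= 0x01; // flip one bit in the payload

        System.out.println(validateAll(List.of(good), true));     // true
        System.out.println(validateAll(List.of(corrupt), false)); // true
        System.out.println(validateAll(List.of(corrupt), true));  // false
    }
}
```

The middle case is the trade-off being discussed: with the checksum skipped, a payload corrupted after writing still passes, so the bypass only makes sense when the components are known to have just been written.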