[ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806026#comment-17806026
 ] 

Stefan Miklosovic commented on CASSANDRA-18714:
-----------------------------------------------

So, upon testing, I realized that CQLSSTableWriterClientTest fails (1). Long 
story short, fetching SAI indexes from ColumnFamilyStore does not work in 
client mode because no CFS was ever created, no keyspaces were opened, and so 
on. The client test goes through a completely different code path than 
CQLSSTableWriterTest: the latter calls 
"DatabaseDescriptor.daemonInitialization();" whereas the client test calls 
"DatabaseDescriptor.clientInitialization();".

The "solution" is to construct it programmatically in CQLSSTableWriter (2), 
where I create a ColumnFamilyStore with a mocked keyspace and register all 
indexes into it. Then, when the writer asks which SAI groups exist, they are 
read from the CFS I just constructed.

The problem is that this gets complicated pretty fast. Because we are in 
client mode, where the descriptor was not initialized, all paths in SAI (and 
some in column family) that read values from DatabaseDescriptor would fail 
because those values would be null. One can see what I mean by fetching the 
latest branch and running the client test.
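
To make the failure mode above concrete, here is a minimal toy sketch in Java. 
It does not use any Cassandra classes: Settings, naiveBufferBytes and 
safeBufferBytes are hypothetical names invented for illustration, and the 
64 MiB value is arbitrary. It only shows how a code path written for daemon 
mode can hit a null once the process-wide configuration was set up in client 
mode, and one possible way a caller might guard against that.

```java
public class ClientModeSketch {

    /** Hypothetical process-wide config holder; NOT Cassandra's DatabaseDescriptor. */
    static final class Settings {
        // Value that only the daemon-style initialization populates.
        private static Integer indexBufferMb;

        static void daemonInit() { indexBufferMb = 64; }   // full configuration loaded
        static void clientInit() { indexBufferMb = null; } // minimal setup, value left unset

        static Integer indexBufferMb() { return indexBufferMb; }
    }

    /** Written with daemon mode in mind: unboxing a null Integer throws NullPointerException. */
    static int naiveBufferBytes() {
        return Settings.indexBufferMb() * 1024 * 1024;
    }

    /** Client-safe variant: falls back to a caller-supplied default when the value is unset. */
    static int safeBufferBytes(int fallbackMb) {
        Integer configured = Settings.indexBufferMb();
        return (configured != null ? configured : fallbackMb) * 1024 * 1024;
    }

    public static void main(String[] args) {
        Settings.daemonInit();
        System.out.println("daemon mode: " + naiveBufferBytes() + " bytes");

        Settings.clientInit();
        try {
            naiveBufferBytes();
        } catch (NullPointerException e) {
            System.out.println("client mode: naive path failed on a null setting");
        }
        System.out.println("client mode with fallback: " + safeBufferBytes(32) + " bytes");
    }
}
```

In this toy model every reader of the setting needs its own guard, which is 
the same reason the approach described above becomes complicated quickly.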

(1) 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/3755/workflows/690aa727-3957-4194-aa40-d58634440547/jobs/177832/tests#failed-test-0
(2) https://github.com/apache/cassandra/pull/3029/commits

> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-18714
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/SAI, Tool/bulk load
>            Reporter: Caleb Rackliffe
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.0-rc, 5.x
>
>          Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> a tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be built incrementally along w/ the SSTable data 
> file, and will be finalized when the newly written SSTable is finalized.
> 3.) Provide an example (in a unit test?) of how a third-party tool might, 
> assuming access to the right C* JAR, validate/checksum SAI components outside 
> C* proper.
> 4.) {{SSTableImporter}} should have two new options:
>     a.) an option that fails import if any SSTable-attached 2i must be built 
> (i.e. has not already been built and brought along w/ the other new SSTable 
> components)
>     b.) an option that allows us to bypass full checksum validation on 
> imported/already-built SSTable-attached indexes (assuming they have just been 
> written by {{CQLSSTableWriter}})



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
