[jira] [Commented] (CASSANDRA-13475) First version of pluggable storage engine API.

Dikang Gu (JIRA) Mon, 30 Oct 2017 23:47:39 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-13475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226344#comment-16226344
 ]


Dikang Gu commented on CASSANDRA-13475:
---------------------------------------

[~bdeggleston], I see what you mean, but my point is that ColumnFamilyStore is 
a very complicated class that leaks storage details, such as the sstable 
concept, in almost every function it provides. In the future, all the call 
stacks that deal with APIs leaking storage details should be moved to a 
CQLStorageEngine (or CQLColumnFamilyStore, in your words). I'm not sure that 
cleaning up ColumnFamilyStore is the top priority at this moment.

The process I have in mind is:
1. We define the new API for the common workload, which does not yet require a 
big refactoring of Cassandra's code but can hide a new storage engine 
implementation. This is demonstrated in our RocksDBEngine implementation.
2. We start to refactor/clean up ColumnFamilyStore and Keyspace, which means we 
implement a CQLStorageEngine and move the current storage-related logic into 
it. As you said, 99.99% of the work will be here. Based on our experience 
implementing the RocksDBEngine, we should be able to do it step by step, 
moving things piece by piece.
3. I imagine we will go through many iterations of steps 1 and 2, refining 
the API and cleaning up the CFS/Keyspace classes as we go. In the end, I think 
CFS/Keyspace will become a thin wrapper around the storage engine API.
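
To illustrate the end state of step 3, here is a rough Java sketch of a table-level class that keeps no storage details and only forwards calls to an engine. All names and signatures in it (StorageEngine, InMemoryEngine, TableStore, apply, query) are hypothetical and heavily simplified; they are not the API from the patch, and the in-memory map only stands in for a real engine.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: names and signatures are illustrative, not the patch's API.
final class ThinWrapperSketch {

    /** Simplified engine API; real keys and values would be partition keys and rows. */
    interface StorageEngine {
        void apply(String key, String value);
        Optional<String> query(String key);
    }

    /** Stand-in for an alternative engine; here it is just an in-memory map. */
    static final class InMemoryEngine implements StorageEngine {
        private final Map<String, String> data = new ConcurrentHashMap<>();

        @Override
        public void apply(String key, String value) {
            data.put(key, value);
        }

        @Override
        public Optional<String> query(String key) {
            return Optional.ofNullable(data.get(key));
        }
    }

    /** A CFS-like class after the cleanup: it knows nothing about how data is stored. */
    static final class TableStore {
        private final StorageEngine engine;

        TableStore(StorageEngine engine) {
            this.engine = engine;
        }

        void write(String key, String value) {
            engine.apply(key, value);
        }

        Optional<String> read(String key) {
            return engine.query(key);
        }
    }

    public static void main(String[] args) {
        TableStore store = new TableStore(new InMemoryEngine());
        store.write("k1", "v1");
        System.out.println(store.read("k1").orElse("<missing>"));
        System.out.println(store.read("k2").orElse("<missing>"));
    }
}
```

Swapping the engine then only means passing a different StorageEngine implementation to the constructor; the callers of TableStore do not change.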

I don't think there are big differences between our proposals. Even the 
IColumnFamilyStore interface will, I imagine, be pretty similar to the 
StorageEngine interface I propose. But I don't want to change the code 
everywhere to use the IColumnFamilyStore interface at step 1, since that 
requires too much refactoring work at once, and I prefer many small patches to 
one big patch for the refactoring. Small patches are also easier to test and 
to cover well with tests.

What do you think?

> First version of pluggable storage engine API.
> ----------------------------------------------
>
>                 Key: CASSANDRA-13475
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13475
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Dikang Gu
>            Assignee: Dikang Gu
>
> In order to support pluggable storage engines, we need to define a unified 
> interface/API that allows us to plug in different storage engines for 
> different requirements. 
> At a very high level, the storage engine interface should include APIs to:
> 1. Apply updates to the engine.
> 2. Query data from the engine.
> 3. Stream data into and out of the engine.
> 4. Perform table operations, such as create/drop/truncate a table.
> 5. Report various stats about the engine.
> I created this ticket to start the discussion about the interface.
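
To make the five API groups in the ticket description concrete, here is a rough Java sketch of what such an interface could look like. Every name and signature below is hypothetical (string keys and values, a list of entries as the "stream", a map of counters as the stats); it is not the interface from the patch, and the in-memory implementation exists only so the sketch runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch only: one method group per item in the ticket description.
final class EngineApiSketch {

    interface StorageEngine {
        void apply(String table, String key, String value);                 // 1. apply an update
        Optional<String> query(String table, String key);                   // 2. query data
        List<Map.Entry<String, String>> streamOut(String table);            // 3. stream data out
        void streamIn(String table, List<Map.Entry<String, String>> rows);  // 3. stream data in
        void createTable(String table);                                     // 4. table operations
        void dropTable(String table);
        void truncate(String table);
        Map<String, Long> stats();                                          // 5. engine stats
    }

    /** Toy engine that keeps every table as a sorted in-memory map. */
    static final class InMemoryEngine implements StorageEngine {
        private final Map<String, Map<String, String>> tables = new ConcurrentHashMap<>();

        private Map<String, String> table(String name) {
            Map<String, String> t = tables.get(name);
            if (t == null) throw new IllegalArgumentException("unknown table: " + name);
            return t;
        }

        @Override public void apply(String table, String key, String value) {
            table(table).put(key, value);
        }

        @Override public Optional<String> query(String table, String key) {
            return Optional.ofNullable(table(table).get(key));
        }

        @Override public List<Map.Entry<String, String>> streamOut(String table) {
            List<Map.Entry<String, String>> out = new ArrayList<>();
            table(table).forEach((k, v) -> out.add(Map.entry(k, v)));
            return out;
        }

        @Override public void streamIn(String table, List<Map.Entry<String, String>> rows) {
            for (Map.Entry<String, String> row : rows) apply(table, row.getKey(), row.getValue());
        }

        @Override public void createTable(String table) {
            tables.putIfAbsent(table, new ConcurrentSkipListMap<>());
        }

        @Override public void dropTable(String table) {
            tables.remove(table);
        }

        @Override public void truncate(String table) {
            table(table).clear();
        }

        @Override public Map<String, Long> stats() {
            long rows = 0;
            for (Map<String, String> t : tables.values()) rows += t.size();
            Map<String, Long> s = new TreeMap<>();
            s.put("tables", (long) tables.size());
            s.put("rows", rows);
            return s;
        }
    }

    public static void main(String[] args) {
        StorageEngine source = new InMemoryEngine();
        source.createTable("users");
        source.apply("users", "42", "alice");
        System.out.println(source.query("users", "42").orElse("<missing>"));

        // Copy the table into a second engine through the streaming calls.
        StorageEngine target = new InMemoryEngine();
        target.createTable("users");
        target.streamIn("users", source.streamOut("users"));
        System.out.println(target.stats());
    }
}
```

A real interface would work with Cassandra's own mutation, read command, and streaming types rather than strings; the sketch only shows how the five groups could sit side by side in one interface.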



