[ https://issues.apache.org/jira/browse/CASSANDRA-13475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239329#comment-16239329 ]
Blake Eggleston commented on CASSANDRA-13475:
---------------------------------------------

Dikang and I spoke offline, and my proposed plan seems reasonable to him. So I think the next step is to talk about the non-technical side of this: the pluggable storage project’s place in Cassandra, and some general guidelines for how to approach the subprojects. Once we’ve converged on something in this jira, we should put it up on the dev list for a wider audience and additional feedback. My thoughts are below.

First, pluggable storage’s place in the Cassandra project:

For the time being, I think we should approach this as an effort to properly modularize the storage-related parts of Cassandra. The motivation is to enable experimentation with alternate storage ideas without having to resort to awful hacks, not to ‘add pluggable storage to Cassandra’.

I think this work could definitely lead to pluggable storage being a part of Cassandra at some point, and that it could be beneficial to users. However, I don’t think it’s a good idea to start with the intention of supporting, directly or indirectly, secondary storage layers, both because of how it would impact development on core Cassandra and because of how it would affect user expectations about the storage options available to them. Let’s start with making it possible, and then see where things go from there.

The short-term implication for RocksDB would be that there may be API changes in minor releases they’d have to worry about, and they’ll still need a fork. The long-term implication would be that pluggable storage may never really become an official part of Cassandra, so there’s risk in investing a lot of time in it.

Next, guidelines on approaching each incremental component:

Whenever we commit some code modularizing something, the overarching storage modularization project itself should remain abandon-able.
In other words, if work stops on this project for some reason, there shouldn’t be any need to go back and revert any of the previous work. Each component refactor should, as much as possible, make sense on its own, especially the larger ones. Each project’s effect on internal decoupling and testability should be positive. We also can’t make core development work more difficult.

Finally, I’ve discussed this with Dikang offline, but just so no one’s surprised if I say this in the future: I don’t think a RocksDB backend makes sense for Cassandra. Cassandra is a sorted LSM, RocksDB is a sorted LSM, and we don’t need two of them. If we want RocksDB performance in Cassandra, it would probably take less time to close the gap by optimizing the existing engine than it would to do all this work to make storage pluggable.

That said, I think this project is a good thing, and I’m happy to help. I think that the modularization work will be good for Cassandra. It enables a member of our dev community to try something new without committing the entire project to it, it will clean up some of our messy internals, and I think it will help us adapt more quickly to some of the changes in storage technology that are on the horizon.

> First version of pluggable storage engine API.
> ----------------------------------------------
>
>                 Key: CASSANDRA-13475
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13475
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Dikang Gu
>            Assignee: Dikang Gu
>
> In order to support a pluggable storage engine, we need to define a unified
> interface/API, which will allow us to plug in different storage engines for
> different requirements.
> Here is a design quip we are currently working on:
> https://quip.com/bhw5ABUCi3co
> At a very high level, the storage engine interface should include APIs to:
> 1. Apply updates to the engine.
> 2. Query data from the engine.
> 3. Stream data in/out to/from the engine.
> 4. Table operations, like create/drop/truncate a table, etc.
> 5. Various stats about the engine.
> I created this ticket to start the discussions about the interface.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)