We (DataStax) have a FileSystemProvider for Astra we can provide. Works with S3/GCS/Azure.
I'll ask someone on our end to make it accessible. This would work by having a
bucket prefix per node. But there are lots of details needed to support things
like out of bound compaction (mentioned in the CEP).

Jake

On Tue, Sep 26, 2023 at 12:56 PM Benedict <bened...@apache.org> wrote:
>
> I agree with Ariel, the more suitable insertion point is probably the JDK
> level FileSystemProvider and FileSystem abstraction.
>
> It might also be that we can reuse existing work here in some cases?
>
> On 26 Sep 2023, at 17:49, Ariel Weisberg <ar...@weisberg.ws> wrote:
>
> Hi,
>
> Support for multiple storage backends including remote storage backends is a
> pretty high value piece of functionality. I am happy to see there is interest
> in that.
>
> I think that `ChannelProxyFactory` as an integration point is going to
> quickly turn into a dead end as we get into really using multiple storage
> backends. We need to be able to list files, and really the full range of
> filesystem interactions that Java supports should work with any backend to
> make development, testing, and using existing code straightforward.
>
> It's a little more work to get C* to create paths for alternate backends
> where appropriate, but that work is probably necessary even with
> `ChannelProxyFactory` and munging UNIX paths (vs supporting multiple
> Filesystems). There will probably also be backend-specific behaviors that
> show up above the `ChannelProxy` layer that will depend on the backend.
>
> Ideally there would be some config to specify several backend filesystems and
> their individual configuration that can be used, as well as configuration and
> support for a "backend file router" for file creation (and opening) that can
> be used to route files to the backend most appropriate.
>
> Regards,
> Ariel
>
> On Mon, Sep 25, 2023, at 2:48 AM, Claude Warren, Jr via dev wrote:
>
> I have just filed CEP-36 [1] to allow for keyspace/table storage outside of
> the standard storage space.
>
> There are two desires driving this change:
>
> The ability to temporarily move some keyspaces/tables to storage outside the
> normal directory tree, on another disk, so that compaction can occur in
> situations where there is not enough disk space for compaction and
> processing of the moved data cannot be suspended.
> The ability to store infrequently used data on slower, cheaper storage layers.
>
> I have a working POC implementation [2] though there are some issues still to
> be solved and much logging to be reduced.
>
> I look forward to productive discussions,
> Claude
>
> [1] https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-36%3A+A+Configurable+ChannelProxy+to+alias+external+storage+locations
> [2] https://github.com/Claudenw/cassandra/tree/channel_proxy_factory

--
http://twitter.com/tjake
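
To make the "backend file router" idea from the thread a bit more concrete, here is a minimal sketch that uses only the JDK's `java.nio.file` API. It is an illustration, not code from CEP-36, the POC branch, or Cassandra itself: the `BackendRouter` class, its `addRoute` and `resolve` methods, and the per-table routing rule are all hypothetical. Two temporary directories on the default filesystem stand in for the backends; in a real setup a backend root would presumably be a `Path` obtained from a `FileSystem` supplied by an S3/GCS/Azure `FileSystemProvider`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical sketch: picks a backend root directory per table name.
// None of these names come from Cassandra or CEP-36.
final class BackendRouter {
    private final Map<String, Path> rootsByTable = new LinkedHashMap<>();
    private final Path defaultRoot;

    BackendRouter(Path defaultRoot) {
        this.defaultRoot = defaultRoot;
    }

    // Route all files of one table to a specific backend root.
    void addRoute(String table, Path backendRoot) {
        rootsByTable.put(table, backendRoot);
    }

    // Returns <root>/<table>/<fileName>, using the default root when no
    // route is registered. The Path keeps a reference to its own FileSystem,
    // so callers can use java.nio.file.Files without knowing the backend.
    Path resolve(String table, String fileName) {
        Path root = rootsByTable.getOrDefault(table, defaultRoot);
        return root.resolve(table).resolve(fileName);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: local temp directories stand in for "fast" and "slow"
        // backends. A remote backend root would come from a FileSystem
        // created via FileSystems.newFileSystem(uri, env) instead.
        Path base = Files.createTempDirectory("router-demo");
        Path fast = Files.createDirectories(base.resolve("fast"));
        Path slow = Files.createDirectories(base.resolve("slow"));

        BackendRouter router = new BackendRouter(fast);
        router.addRoute("archive_table", slow);

        Path target = router.resolve("archive_table", "data.db");
        Files.createDirectories(target.getParent());
        Files.write(target, "hello".getBytes(StandardCharsets.UTF_8));

        // Listing goes through the same API regardless of backend.
        try (Stream<Path> files = Files.list(target.getParent())) {
            files.forEach(p -> System.out.println("found " + p.getFileName()));
        }
    }
}
```

The design point this is meant to show is the one Ariel and Benedict raise: once routing produces ordinary `Path` objects, creating, writing, and listing files all go through `java.nio.file.Files`, so existing code does not need a separate channel-level hook for each backend.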