Yes, there is a lot to do, since Cassandra (shared-nothing) is
different from systems such as HBase. So I think we can break the
final goal into multiple steps, the first being what Claude proposed. But I
suggest that the design make the interface extensible and take a future
cloud storage implementation into account, so that someone can extend
the interface later.

Josh McKenzie <jmcken...@apache.org> wrote on Tue, Sep 26, 2023 at 18:40:

> it may be better to support most cloud storage
> It simply only supports S3, which feels a bit customized for a certain
> user and is not universal enough. Am I right?
>
> I agree w/the eventual goal (and constraint on design now) of supporting
> most popular cloud storage vendors, but if we have someone with an itch to
> scratch and at the end of that we end up with first steps in a compatible
> direction to ultimately supporting decoupled / abstracted storage systems,
> that's fantastic.
>
> To Jeff's point - so long as we can think about and chart a general path
> of where we want to go, if Claude has the time and inclination to handle
> abstracting out the API in that direction and one implementation, that's
> fantastic IMO.
>
> I know there's some other folks out there who've done some interception /
> refactoring of the FileChannel stuff to support disaggregated storage;
> curious what their experiences were like.
>
>
> On Tue, Sep 26, 2023, at 4:20 AM, Claude Warren, Jr via dev wrote:
>
> The intention of the CEP is to lay the groundwork to allow development of
> ChannelProxyFactories that are pluggable in Cassandra.  In this way any
> storage system can be a candidate for Cassandra storage provided
> FileChannels can be created for the system.
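
A minimal sketch of the pluggable idea described above, assuming a hypothetical `ChannelFactory` interface. The names here are illustrative only and are not the interface defined in the CEP or in the POC branch. The point is that callers depend only on "something that can hand back a FileChannel", so a different storage backend can be swapped in behind it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PluggableChannelSketch {
    /** Hypothetical plug-in point; NOT the interface defined by the CEP. */
    interface ChannelFactory {
        FileChannel open(Path path) throws IOException;
    }

    /** Default behaviour: open the file on the local file system. */
    static final class LocalChannelFactory implements ChannelFactory {
        @Override
        public FileChannel open(Path path) throws IOException {
            return FileChannel.open(path, StandardOpenOption.READ);
        }
    }

    /** Reads the whole file through whichever factory is plugged in. */
    static String readAll(ChannelFactory factory, Path path) throws IOException {
        try (FileChannel channel = factory.open(path)) {
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            // Keep reading until the buffer is full or the channel reports EOF.
            while (buffer.hasRemaining() && channel.read(buffer) >= 0) {
                // no body needed; read() advances the buffer
            }
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("channel-sketch", ".db");
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(new LocalChannelFactory(), file));
        Files.delete(file);
    }
}
```

A backend for another storage system would supply its own `ChannelFactory`, provided it can produce a FileChannel for that system, which is the condition stated above.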
>
> As I stated before, I think that there may be a need for a
> java.nio.FileSystem implementation for the proxies, but I have not had the
> time to dig into it yet.
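
To illustrate what a java.nio.FileSystem implementation buys, here is a small sketch that uses the JDK's built-in zip file system provider as a stand-in for a non-default file system. This is an assumption made purely for illustration: it is not S3 and not the s3fs-nio library. The same `Files` and channel calls work unchanged against a `Path` that does not live on the local disk tree.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

public class NonDefaultFileSystemSketch {
    /** Writes text into a new zip-backed file system, reopens it, and reads it back. */
    static String roundTrip(Path zipFile, String text) throws IOException {
        // "jar:" + file URI selects the JDK zip provider; zipFile must not exist yet.
        URI uri = URI.create("jar:" + zipFile.toUri());

        try (FileSystem fs = FileSystems.newFileSystem(uri, Collections.singletonMap("create", "true"))) {
            Files.write(fs.getPath("/data.txt"), text.getBytes(StandardCharsets.UTF_8));
        }

        try (FileSystem fs = FileSystems.newFileSystem(uri, Collections.emptyMap());
             SeekableByteChannel channel = Files.newByteChannel(fs.getPath("/data.txt"))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ByteBuffer buffer = ByteBuffer.allocate(64);
            // Drain the channel in fixed-size chunks until EOF.
            while (channel.read(buffer) >= 0) {
                buffer.flip();
                out.write(buffer.array(), 0, buffer.limit());
                buffer.clear();
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fs-sketch");
        System.out.println(roundTrip(dir.resolve("store.zip"), "hello from a non-default FileSystem"));
    }
}
```

Note that this sketch uses `SeekableByteChannel` rather than `FileChannel`; whether a given provider can return a true `FileChannel` depends on that provider.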
>
> Claude
>
>
> On Tue, Sep 26, 2023 at 9:01 AM guo Maxwell <cclive1...@gmail.com> wrote:
>
> In my mind, it may be better to support most cloud storage: AWS,
> Azure, GCP, Aliyun and so on. We could make it pluggable. But in that case
> there may need to be a filesystem interface layer for object storage. And
> should we support distributed file systems like HDFS, or something else? We
> should first discuss what should be done and what should not be done.
> Supporting only S3 feels a bit customized for a certain user
> and is not universal enough. Am I right?
>
> Claude Warren, Jr <claude.war...@aiven.io> wrote on Tue, Sep 26, 2023 at 14:36:
>
> My intention is to develop an S3 storage system using
> https://github.com/carlspring/s3fs-nio
>
> There are several issues yet to be solved:
>
>    1. There are some internal calls that create files in the table
>    directory that do not use the channel proxy.  I believe that these are
>    making calls on File objects.  I think those File objects are Cassandra
>    File objects not Java I/O File objects, but am unsure.
>    2. Determine if the carlspring s3fs-nio library will be performant
>    enough to work in the long run.  There may be issues with it:
>       1. Downloading entire files before using them rather than using views
>       into larger remotely stored files.
>       2. Requiring a complete file to upload rather than using the
>       partial upload capability of the S3 interface.
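
The "views into larger files" access pattern mentioned in the list above can be sketched with a positional read on a local `FileChannel`. This only shows the shape of the call on local disk; `readRange` is a hypothetical helper, and whether a remote-backed channel serves such a read without fetching the whole object is exactly the open question raised above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class RangeReadSketch {
    /** Hypothetical helper: reads up to length bytes starting at offset, without touching the rest of the file. */
    static byte[] readRange(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        long position = offset;
        while (buffer.hasRemaining()) {
            // Positional read: does not move the channel's own position.
            int n = channel.read(buffer, position);
            if (n < 0) {
                break; // reached end of file before filling the buffer
            }
            position += n;
        }
        buffer.flip();
        byte[] result = new byte[buffer.remaining()];
        buffer.get(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("range-sketch", ".db");
        Files.write(file, "0123456789".getBytes(StandardCharsets.UTF_8));
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            System.out.println(new String(readRange(channel, 3, 4), StandardCharsets.UTF_8));
        }
        Files.delete(file);
    }
}
```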
>
>
>
> On Tue, Sep 26, 2023 at 4:11 AM guo Maxwell <cclive1...@gmail.com> wrote:
>
> "Rather than building this piece by piece, I think it'd be awesome if
> someone drew up an end-to-end plan to implement tiered storage, so we can
> make sure we're discussing the whole final state, and not an implementation
> detail of one part of the final state?"
>
> I do agree with Jeff on this. If this feature can be supported in OSS
> Cassandra, I think it will be very popular, whether in a private
> deployment environment or a public cloud service (our experience can prove
> it). In addition, it is also a cost-cutting option for users.
>
> Jeff Jirsa <jji...@gmail.com> wrote on Tue, Sep 26, 2023 at 00:11:
>
>
> - I think this is a great step forward.
> - Being able to move sstables around between tiers of storage is a feature
> Cassandra desperately needs, especially if one of those tiers is some sort
> of object storage
> - This looks like it's a foundational piece that enables that. Perhaps by
> a team that's already implemented this end to end?
> - Rather than building this piece by piece, I think it'd be awesome if
> someone drew up an end-to-end plan to implement tiered storage, so we can
> make sure we're discussing the whole final state, and not an implementation
> detail of one part of the final state?
>
>
>
>
>
>
> On Sun, Sep 24, 2023 at 11:49 PM Claude Warren, Jr via dev <
> dev@cassandra.apache.org> wrote:
>
> I have just filed CEP-36 [1] to allow for keyspace/table storage outside
> of the standard storage space.
>
> There are two desires driving this change:
>
>    1. The ability to temporarily move some keyspaces/tables to storage
>    outside the normal directory tree, on another disk, so that compaction can
>    occur in situations where there is not enough disk space for compaction and
>    processing of the moved data cannot be suspended.
>    2. The ability to store infrequently used data on slower cheaper
>    storage layers.
>
> I have a working POC implementation [2] though there are some issues still
> to be solved and much logging to be reduced.
>
> I look forward to productive discussions,
> Claude
>
> [1]
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-36%3A+A+Configurable+ChannelProxy+to+alias+external+storage+locations
> [2] https://github.com/Claudenw/cassandra/tree/channel_proxy_factory
>
>
>
>
> --
> you are the apple of my eye !

