External storage can be any storage for which you can produce a FileChannel.
There is an S3 library that does this, so S3 is a definite possibility for
storage in this solution.  My example code only writes to a different
directory on the same system, and there are a couple of places where I did
not catch the file creation; those have to be found and redirected to the
proxy location.  I think that it may be necessary to have a Java FileSystem
object to make the whole thing work.  The S3 library that I found also has
an S3 FileSystem class.
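
To make the FileChannel point concrete, here is a minimal sketch of the
"different directory on the same system" case.  It is not the POC code or
the CEP-36 API: the class name RedirectingChannelSketch and the methods
redirect and openForWrite are hypothetical names made up for illustration.
It only assumes that the target Path comes from a FileSystem whose provider
can hand back a FileChannel, which is true for the default local file
system; whether a given S3 provider supports that is something to verify.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch only; none of these names come from the POC branch.
class RedirectingChannelSketch {

    /** Maps a file under dataRoot to the same relative location under proxyRoot. */
    static Path redirect(Path dataRoot, Path proxyRoot, Path file) {
        Path relative = dataRoot.relativize(file);
        Path target = proxyRoot;
        // Resolve name by name as strings so proxyRoot may belong to a
        // different java.nio.file.FileSystem than dataRoot.
        for (Path part : relative) {
            target = target.resolve(part.toString());
        }
        return target;
    }

    /** Opens a writable FileChannel at the redirected location. */
    static FileChannel openForWrite(Path target) throws IOException {
        Files.createDirectories(target.getParent());
        // FileChannel.open delegates to the Path's FileSystemProvider, which
        // must support newFileChannel for this to work.
        return FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dataRoot = Files.createTempDirectory("data");
        Path proxyRoot = Files.createTempDirectory("proxy");
        Path original = dataRoot.resolve("ks1").resolve("t1-abc").resolve("nb-1-big-Data.db");

        Path target = redirect(dataRoot, proxyRoot, original);
        try (FileChannel channel = openForWrite(target)) {
            ByteBuffer buffer = ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
        System.out.println(original + " -> " + target);
    }
}
```

The caller still asks for the original path; only the code that opens the
channel knows the bytes land somewhere else.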

This solution uses the internal file name, for example an sstable name.
The proxy factory can examine the entire path and make a determination of
where to read/write the file.  So any determination that can be made based
on the information in the file path can be implemented with this approach.
There is no direct inspection of the data being written to determine
routing.  The only routing data are in the file name.
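
As a rough illustration of path-only routing, the sketch below picks a
storage root from the keyspace and table embedded in the file path.  The
class PathRoutingSketch, the route method, and the "keyspace.table" map
keys are hypothetical and are not taken from the POC.  It also assumes the
usual data directory layout of <data dir>/<keyspace>/<table>-<table id>/<file>;
anything that does not match that shape is left where it is.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch only; routing decisions use the path and nothing else.
class PathRoutingSketch {

    /**
     * Returns the location a file should be read from or written to.
     * externalRoots maps "keyspace.table" to an alternate storage root.
     */
    static Path route(Path dataRoot, Path file, Map<String, Path> externalRoots) {
        if (!file.startsWith(dataRoot)) {
            return file; // not under the data directory, leave untouched
        }
        Path relative = dataRoot.relativize(file);
        if (relative.getNameCount() < 3) {
            return file; // assumed layout needs keyspace/table-dir/file
        }
        String keyspace = relative.getName(0).toString();
        String tableDir = relative.getName(1).toString();
        // Assumed directory name format: <table>-<table id>
        int dash = tableDir.lastIndexOf('-');
        String table = dash < 0 ? tableDir : tableDir.substring(0, dash);

        Path externalRoot = externalRoots.get(keyspace + "." + table);
        if (externalRoot == null) {
            return file; // no rule for this table, keep the normal location
        }
        Path target = externalRoot;
        for (Path part : relative) {
            target = target.resolve(part.toString());
        }
        return target;
    }

    public static void main(String[] args) {
        Path dataRoot = Paths.get("/var/lib/cassandra/data");
        Map<String, Path> rules =
                Collections.singletonMap("ks1.events", Paths.get("/mnt/cold"));

        Path moved = dataRoot.resolve("ks1/events-abc123/nb-1-big-Data.db");
        Path kept = dataRoot.resolve("ks1/users-def456/nb-1-big-Data.db");
        System.out.println(route(dataRoot, moved, rules));
        System.out.println(route(dataRoot, kept, rules));
    }
}
```

With a single rule for one table, the other tables in the same keyspace
keep resolving to their normal locations.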

I ran an in-house demo where I showed that we could reroute a single table
to a different storage while leaving the rest of the tables in the same
keyspace alone.

In discussing this with a colleague we hit upon the term "tiered nodes".
If you can spread your data across the nodes so that some nodes get the
infrequently used data (cold data) and other nodes receive the frequently
used data (hot data) then the cold data nodes can use this process to store
the data on S3 or similar systems.

On Mon, Sep 25, 2023 at 10:45 AM guo Maxwell <cclive1...@gmail.com> wrote:

> Great suggestion. Can external storage only be local storage media, or
> can it be any storage medium, such as object storage like S3?
> We have previously implemented a tiered storage capability, that is, there
> are multiple storage media on one node (SSD, HDD), and data is placed based
> on requests. After briefly browsing the proposals, it seems that there are
> some differences. Can you help explain them? Thanks.
>
>
> Claude Warren, Jr via dev <dev@cassandra.apache.org> wrote on Mon,
> Sep 25, 2023 at 14:49:
>
>> I have just filed CEP-36 [1] to allow for keyspace/table storage outside
>> of the standard storage space.
>>
>> There are two desires driving this change:
>>
>>    1. The ability to temporarily move some keyspaces/tables to storage
>>    outside the normal directory tree to other disk so that compaction can
>>    occur in situations where there is not enough disk space for compaction
>>    and the processing to the moved data can not be suspended.
>>    2. The ability to store infrequently used data on slower cheaper
>>    storage layers.
>>
>> I have a working POC implementation [2] though there are some issues
>> still to be solved and much logging to be reduced.
>>
>> I look forward to productive discussions,
>> Claude
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-36%3A+A+Configurable+ChannelProxy+to+alias+external+storage+locations
>> [2] https://github.com/Claudenw/cassandra/tree/channel_proxy_factory
>>
>>
>>
>
> --
> you are the apple of my eye !
>
