[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374864#comment-16374864 ]

Jon Haddad commented on CASSANDRA-8460:
---------------------------------------

Hey [~Lerh Low]!  

First off, let me thank you for being open to alternative ideas, especially 
after writing a large chunk of code.  Not everyone is willing to take a step 
back and consider other options; I really appreciate it.

{quote}
Maybe you have stumbled upon the case where data has been resurrected in JBOD 
configuration in your experiences...? In theory since splitting by token range 
there should be no more such cases. It is safe.
{quote}

I had actually misremembered how CASSANDRA-6696 was implemented.  Looking back 
at the code and testing it manually, I see that memtables are initially flushed 
to their respective disks.  It's nice to be wrong about this.

There's quite a bit going on here; I did a quick search but didn't see anything 
related to disk failure policy.  One thing that's going to be tricky: unless 
you have a 1:1 fast-disk-to-archive-disk relationship, you can end up in some 
weird situations when using {{disk_failure_policy: best_effort}}, which is what 
CASSANDRA-6696 was all about in the first place.  If you lose your fast disk, 
will you still be able to query data that's on the archive disk for a given 
token range?

It seems to me that using this feature would have to imply 
{{disk_failure_policy: stop}}, since the failure of either the archive disk or 
one of the disks in {{data_file_directories}} could result in incorrect results 
being returned.
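
To make that concrete, here's a minimal {{cassandra.yaml}} sketch of the 
combination I have in mind (the paths are made up, and whatever option this 
patch ends up adding for the archive directories would sit alongside these):

{code}
# Illustrative only -- the directory paths are hypothetical.
data_file_directories:
    - /mnt/ssd0/cassandra/data
    - /mnt/ssd1/cassandra/data

# With an archive tier in the mix, best_effort could keep answering queries
# that silently miss the data on a failed disk, so stop looks like the safer
# setting whenever this feature is enabled.
disk_failure_policy: stop
{code}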

lvmcache uses 
[dm-cache|https://www.kernel.org/doc/Documentation/device-mapper/cache.txt] 
under the hood, which keeps hot blocks on the fast device.  It shipped in Linux 
kernel 3.9, which was released in April 2013.  

Using lvmcache, if you were to create a logical volume per disk, with the SSD 
configured as a writethrough cache, you'd still honor the disk failure policy 
in the case of an archival or SSD failure.  You'd also have the flexibility of 
keeping any hot data readily available, without explicitly needing to move it 
off to another device while it's still active.  It adapts to your read and 
write patterns rather than requiring configuration.  Take a look at the 
[man page|http://man7.org/linux/man-pages/man7/lvmcache.7.html]; it's pretty 
awesome.
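
For concreteness, here's a rough sketch of what that could look like for a 
single spinning disk fronted by an SSD (device names and sizes are made up; 
the real steps are in the man page):

{code}
# Illustrative sketch only -- device names and sizes are hypothetical.
# One volume group holding the slow (archive) disk and the fast SSD.
vgcreate vg_data0 /dev/sdb /dev/nvme0n1

# Origin LV on the spinning disk; this becomes one Cassandra data directory.
lvcreate -n data0 -l 100%PVS vg_data0 /dev/sdb

# Cache pool on the SSD, then attach it to the origin LV in writethrough mode
# so an SSD failure can't lose acknowledged writes.
lvcreate --type cache-pool -L 100G -n cache0 vg_data0 /dev/nvme0n1
lvconvert --type cache --cachepool vg_data0/cache0 --cachemode writethrough vg_data0/data0
{code}

Repeat per spinning disk, and give each cached LV its own entry in 
{{data_file_directories}}, so the per-disk token range behaviour from 
CASSANDRA-6696 stays intact.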

> Make it possible to move non-compacting sstables to slow/big storage in DTCS
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8460
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Lerh Chuan Low
>            Priority: Major
>              Labels: doc-impacting, dtcs
>             Fix For: 4.x
>
>
> It would be nice if we could configure DTCS to have a set of extra data 
> directories where we move the sstables once they are older than 
> max_sstable_age_days. 
> This would enable users to have a quick, small SSD for hot, new data, and big 
> spinning disks for data that is rarely read and never compacted.



