[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030508#comment-13030508
 ] 

Peter Schuller commented on CASSANDRA-1902:
-------------------------------------------

In general I've been thinking some more lately about this whole issue with 
caching again. Even with perfectly working direct I/O or advising the kernel in 
ways that actually work, we'll never get around the problem of needing to 
maintain a much larger amount of data in memory than is the *actual* live set 
unless we can effect some kind of incremental transition in the live set being 
served by live reads. No matter how well the migration works, it will tend to 
be seek-bound, and the longer re-population takes, the more the positive 
effects are negated, since you are still keeping the live sstables in-core 
while also trying to re-populate the new data.

I once filed CASSANDRA-1658 for similar reasons but, as has been pointed out, 
that's a lot of complexity. As jbellis noted at the time, CASSANDRA-1608 
could help too by limiting live set sizes.

In particular, I'm leaning more and more towards limiting sstable sizes in 
combination with cache advising. If sstable sizes are limited, one has a more 
natural and vaguely reasonable unit of data that is transitioned between 
sstables at a time, and you can much more easily avoid the extremes associated 
with large compactions (e.g., a smaller repair waiting 1 day for huge 
compaction work that's already submitted).

That said, maybe a "reasonable" sstable size is still too large to be an 
appropriate unit from the perspective of cache warmness and compaction/repair, 
which would lead to the need for something like CASSANDRA-1658 anyway. I can 
see the incremental transition happening in such a way that the live traffic is 
re-directed to the new sstable in a moving wave-front matched by a 
DONTNEED/WILLNEED process such that no significant amount of data would have to 
be uselessly kept in memory at all.
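
As an illustration of the moving wave-front idea, here is a minimal Python 
sketch (not Cassandra code). It assumes Linux and os.posix_fadvise, a 
hypothetical 4 MiB window, and, for simplicity, that byte offsets in the old 
and new files correspond one-to-one, which real sstables would not guarantee. 
The function names are made up for this example.

```python
import os

WINDOW = 4 * 1024 * 1024  # hypothetical window size (4 MiB), not from the ticket


def wavefront_windows(length, window=WINDOW):
    """Yield (offset, size) windows covering [0, length) in order."""
    offset = 0
    while offset < length:
        size = min(window, length - offset)
        yield offset, size
        offset += size


def advance_wavefront(old_fd, new_fd, length, window=WINDOW):
    """Walk both files window by window, warming the new and releasing the old.

    Assumption: offsets in the old and new files line up one-to-one. A real
    implementation would need a key-range-to-offset mapping instead.
    Returns the number of windows processed.
    """
    steps = 0
    for offset, size in wavefront_windows(length, window):
        # Hint that this window of the new file will be read soon.
        os.posix_fadvise(new_fd, offset, size, os.POSIX_FADV_WILLNEED)
        # (Reads for this range would be switched to the new file here.)
        # Hint that the same window of the old file is no longer needed.
        os.posix_fadvise(old_fd, offset, size, os.POSIX_FADV_DONTNEED)
        steps += 1
    return steps
```

Both calls are only hints to the kernel, so at most roughly one window per 
file is held "redundantly" at any time only if the kernel honors them.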

I dunno, there are no really easy answers. I'm currently battling with a 
situation where we have exactly this as an issue; the active set is too big to 
realistically rely on row caching, and as soon as you take the leap to rely on 
page caching you have repair/compaction issues. This is particularly true when 
repair effectively transfers most or all of the data set on other nodes, very 
significantly increasing the live size of the CF during repair work (something 
else that I hope would be possible to mitigate with smaller sstables, as 
repairs could be done more incrementally as an iteration over the key range, 
letting local compaction happen as data is streamed).

Incidentally the DONTNEED:ing in CASSANDRA-1470, to the extent that it works 
given lack of fsync, is probably working against us and we'll probably patch to 
disable the behavior. Taking a sudden cache coldness when traffic is switched 
to a newly written DONTNEED:ed set of sstables is not pretty; and if DONTNEED 
is successful you have an issue here even if your entire CF + temporary 
expansion during repair/compaction fits in memory.
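
For context on the "lack of fsync" caveat: on Linux, POSIX_FADV_DONTNEED 
generally does not evict pages that are still dirty, so the hint only takes 
full effect once the data has been written back. A minimal Python sketch of 
that ordering follows. It is an illustration only, not the CASSANDRA-1470 code; 
the function name and path handling are made up.

```python
import os


def write_then_drop(path, data):
    """Write data, force it to disk, then ask the kernel to drop its pages."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        written = 0
        while written < len(view):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, view[written:])
        # Write dirty pages back first; otherwise DONTNEED may skip them.
        os.fsync(fd)
        # offset=0 and len=0 mean "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return written
```

Doing this on a freshly written sstable is exactly what produces the sudden 
cache coldness described above once reads are switched over to it.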



> Migrate cached pages during compaction 
> ---------------------------------------
>
>                 Key: CASSANDRA-1902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: T Jake Luciani
>            Assignee: Pavel Yaskevich
>             Fix For: 1.0
>
>         Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt, 
> 1902-BufferedSegmentedFile-logandsleep.txt, 1902-formatted.txt, 
> 1902-per-column-migration-rebase2.txt, 1902-per-column-migration.txt, 
> CASSANDRA-1902-v10-trunk-rebased.patch, CASSANDRA-1902-v3.patch, 
> CASSANDRA-1902-v4.patch, CASSANDRA-1902-v5.patch, CASSANDRA-1902-v6.patch, 
> CASSANDRA-1902-v7.patch, CASSANDRA-1902-v8.patch, 
> CASSANDRA-1902-v9-trunk-rebased.patch, 
> CASSANDRA-1902-v9-trunk-with-jmx.patch, CASSANDRA-1902-v9-trunk.patch, 
> CASSANDRA-1902-v9.patch
>
>   Original Estimate: 32h
>          Time Spent: 56h
>  Remaining Estimate: 0h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  This is now important since 
> CASSANDRA-1470 caches effectively nothing.  
> For example an active CF being compacted hurts reads since nothing is cached 
> in the new SSTable. 
> The purpose of this ticket then is to make sure SOME data is cached from 
> active CFs. This can be done by monitoring which old SSTables are in the page 
> cache and caching active rows in the new SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/
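
One way to picture the "monitoring" step in the description above is as a 
per-page residency list for an old SSTable file (for example, as something 
like mincore(2) could report; that call is omitted here) collapsed into byte 
ranges worth re-caching. The Python sketch below covers only that bookkeeping. 
The function name and the 4096-byte page size are assumptions, and mapping 
old offsets to rows in the new SSTable is left out.

```python
PAGE_SIZE = 4096  # assumed page size; os.sysconf("SC_PAGE_SIZE") gives the real one


def resident_ranges(residency, page_size=PAGE_SIZE):
    """Collapse a per-page residency list into (offset, length) byte ranges."""
    ranges = []
    start = None  # index of the first page in the current resident run
    for page, is_resident in enumerate(residency):
        if is_resident and start is None:
            start = page
        elif not is_resident and start is not None:
            ranges.append((start * page_size, (page - start) * page_size))
            start = None
    if start is not None:
        # Close a run that extends to the last page.
        ranges.append((start * page_size, (len(residency) - start) * page_size))
    return ranges
```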

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
