[ https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871885#comment-13871885 ]
Sylvain Lebresne commented on CASSANDRA-5357:
---------------------------------------------

bq. The case you describe of "wanting to cache a full table" is not dependent on rows per partition but on cache size = number of partitions cached

But if you don't want to cache a full table, you still at least need to make sure that for each partition, all rows are cached. You still need "rows per partition = <n> where <n> > max number of rows per partition in that table", and all I'm saying is that "rows per partition = all" is a bit more user friendly. It's true you also need to make sure your cache is big enough if you want to cache the table in full, but that doesn't invalidate the first part (unless I'm missing something).

bq. We're talking about static CFs aka partition key == primary key, right? Then there is one row per partition, so there is no need for a special "rows per partition = all" setting.

I guess I'm saying 2 things:
# I think that what users sometimes really want is "cache full partitions". That's the basic intention. So what's the harm in adding an "all" alias that expresses that intention better, for user-friendliness' sake, provided adding it doesn't require noticeable complexity? And given "all" can just be an alias for Integer.MAX_VALUE, it doesn't add complexity, so ...
# It's somewhat a detail, but I don't think that technically "rows per partition = 1" will work equivalently to the current row cache behavior for static tables in practice, not always at least. More precisely, suppose you get a query "select * from foo where pk=3", that "pk=3" is a cache hit, and that "rows_per_partition=1" is set on that table. Then you can only serve the read from the cache hit if you know *for sure* that this is a static table, i.e. that there cannot be more rows in that partition that haven't been cached due to the per-partition limit. And, at least for Thrift, we never really know for sure whether a table is a static one.
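
To make the two points above concrete, here is a minimal Java sketch. It is not the code from the attached patch: the class and method names (RowsPerPartitionSketch, parseRowsPerPartition, hitCoversWholePartition) are hypothetical, as is the choice to accept "all" case-insensitively. It only illustrates treating "all" as an alias for Integer.MAX_VALUE, and why a hit under a limit of 1 cannot prove the partition is fully cached unless the table is known to be static.

```java
final class RowsPerPartitionSketch {

    /**
     * Hypothetical option parser: "all" is just an alias for Integer.MAX_VALUE,
     * so no separate code path is needed for "cache full partitions".
     */
    static int parseRowsPerPartition(String value) {
        String v = value.trim();
        if ("all".equalsIgnoreCase(v)) {
            return Integer.MAX_VALUE;
        }
        int n = Integer.parseInt(v);
        if (n <= 0) {
            throw new IllegalArgumentException("rows per partition must be positive: " + value);
        }
        return n;
    }

    /**
     * Returns true when a cache hit is known to hold every row of the partition.
     * If fewer rows are cached than the limit allows, nothing was cut off.
     * If the cached row count equals the limit, more rows may exist on disk,
     * unless we know for sure that the table is static (one row per partition).
     */
    static boolean hitCoversWholePartition(int cachedRows, int rowsPerPartition, boolean knownStatic) {
        return knownStatic || cachedRows < rowsPerPartition;
    }

    public static void main(String[] args) {
        int all = parseRowsPerPartition("all");
        // Limit of 1, one cached row, table not known to be static: cannot serve from cache alone.
        System.out.println(hitCoversWholePartition(1, 1, false));
        // Same hit, but the table is known to be static.
        System.out.println(hitCoversWholePartition(1, 1, true));
        // With the "all" alias, any realistic hit is provably complete.
        System.out.println(hitCoversWholePartition(1, all, false));
    }
}
```

Under these assumptions the "all" setting never hits the ambiguous case, because the cached row count stays below Integer.MAX_VALUE for any realistic partition.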

I do note that "rows_per_partition=2" would work, because if your cache hit has 1 row and you know you cache the first 2 rows of the partition, then you can infer that all rows of the partition are cached without any more info. But at that point, I think it's a lot simpler to have an "all" alias than to have to explain those implementation details.

Not saying it's a big deal, just that I think it's user friendly and has no real downside that I can see.

> Query cache / partition head cache
> ----------------------------------
>
>                 Key: CASSANDRA-5357
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Marcus Eriksson
>             Fix For: 2.1
>
>         Attachments: 0001-Cache-a-configurable-amount-of-columns.patch
>
>
> I think that most people expect the row cache to act like a query cache,
> because that's a reasonable model. Caching the entire partition is, in
> retrospect, not really reasonable, so it's not surprising that it catches
> people off guard, especially given the confusion we've inflicted on ourselves
> as to what a "row" constitutes.
> I propose replacing it with a true query cache.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)