Re: Tune cache MB settings per table.
On Sun, Jun 1, 2014 at 12:49 PM, Kevin Burton bur...@spinn3r.com wrote:

It's possible to set caching to all, keys_only, rows_only, or none for a given table. But we have one table which is MASSIVE, and we only need the most recent 4-8 hours in memory. Anything older than that can go to disk, as the queries there are very rare. … but I don't think Cassandra can do this (which is a shame).

Cassandra used to offer tunable-size key and row caches per column family. It was decided that this was too much management complexity for too little benefit.

My personal view is that there are clearly cases where one wants to prevent one very hot CF from dominating the key or row cache while still having it cached. Whether this is worth the overall complexity is not something I have a strong opinion on, but data caching (as opposed to metadata caching) within the Java heap generally seems like an exercise in futility to me.

=Rob
Re: Tune cache MB settings per table.
The OS should handle this really well as long as you're on a v3 Linux kernel.

--
Colin Clark
+1-320-221-9531

On Jun 1, 2014, at 2:49 PM, Kevin Burton bur...@spinn3r.com wrote:

It's possible to set caching to all, keys_only, rows_only, or none for a given table. But we have one table which is MASSIVE, and we only need the most recent 4-8 hours in memory. Anything older than that can go to disk, as the queries there are very rare. … but I don't think Cassandra can do this (which is a shame).

Another option is to partition our tables per hour… then tell the older tables to cache 'none'… I hate this option though.

A smarter mechanism would be to have a compaction strategy that created an SSTable for every hour and then had custom caching settings for that table. An additional upside is that TTLs would just drop the older data in the compactor.

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
Skype: burtonator
blog: http://burtonator.wordpress.com
… or check out my Google+ profile

War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.
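For illustration only, the "partition our tables per hour" option mentioned in the quoted message could be sketched roughly as below. This is a minimal sketch under stated assumptions, not something proposed in the thread: the table base name `events`, the `<base>_YYYYMMDDHH` naming scheme, and both helper functions are hypothetical, and the generated `ALTER TABLE ... WITH caching = '...'` text assumes the string-valued caching option (all, keys_only, rows_only, none) described above; newer Cassandra versions may expect a different syntax. The code only builds statement strings and does not talk to a cluster.

```python
from datetime import datetime, timedelta


def hourly_table_name(base: str, hour: datetime) -> str:
    """Hypothetical naming scheme: one table per hour, e.g. events_2014060112."""
    return f"{base}_{hour:%Y%m%d%H}"


def caching_statements(base: str, now: datetime,
                       hot_hours: int = 8, total_hours: int = 24) -> list:
    """Build CQL text that caches the newest `hot_hours` hourly tables
    and turns caching off for the older ones.

    Assumption: caching is set with a plain string value, as in the
    all / keys_only / rows_only / none list quoted in the thread.
    """
    current = now.replace(minute=0, second=0, microsecond=0)
    statements = []
    for age in range(total_hours):
        hour = current - timedelta(hours=age)
        mode = "all" if age < hot_hours else "none"
        statements.append(
            f"ALTER TABLE {hourly_table_name(base, hour)} "
            f"WITH caching = '{mode}';"
        )
    return statements


if __name__ == "__main__":
    for stmt in caching_statements("events", datetime(2014, 6, 1, 12, 49),
                                   hot_hours=8, total_hours=10):
        print(stmt)
```

The drawback raised in the thread still applies: the application has to route reads and writes to the right hourly table and create and drop tables on a schedule.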
Re: Tune cache MB settings per table.
Hello Kevin,

You'll probably be interested in this: http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1
Re: Tune cache MB settings per table.
Not in our experience… We've been using fadvise DONTNEED to purge pages that are no longer needed. Of course, YMMV based on your usage.

I tend to like to control everything explicitly instead of relying on magic. That's worked out very well for us in the past, so it would be nice to still have this in Cassandra.

On Sun, Jun 1, 2014 at 12:53 PM, Colin co...@clark.ws wrote:

The OS should handle this really well as long as you're on a v3 Linux kernel

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
Skype: burtonator
blog: http://burtonator.wordpress.com
… or check out my Google+ profile https://plus.google.com/102718274791889610666/posts
http://spinn3r.com

War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.
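As a rough illustration of the fadvise approach described in this message, the sketch below asks the kernel to drop a file's cached pages using `os.posix_fadvise` with `os.POSIX_FADV_DONTNEED` (available in Python on Unix-like systems). It is only a sketch under assumptions: the helper name `drop_from_page_cache` is hypothetical, the thread does not show how the poster's own system calls fadvise, and the call is advisory, so the kernel may keep some pages (dirty pages in particular are not dropped until they are written back).

```python
import os


def drop_from_page_cache(path: str) -> None:
    """Advise the kernel that the cached pages of `path` are no longer needed.

    Hypothetical helper for illustration. An offset of 0 with a length of 0
    covers the whole file. The advice is a hint, not a guarantee.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The file contents are untouched; only the cached copies in memory are affected, so a later read simply goes back to disk.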
Re: Tune cache MB settings per table.
Have you been unable to achieve your SLAs using Cassandra out of the box so far?

In my experience, if you try to tune Cassandra before the app is done and without simulating real-world load patterns, you might actually be doing yourself a disservice.

--
Colin
320-221-9531
Re: Tune cache MB settings per table.
I think of all the areas you could spend your time on, this will have the least return. The OS will keep the most frequently used data in memory. There's no reason to require Cassandra to do it.

If you're curious as to what's been loaded into RAM, try Al Tobey's pcstat utility: https://github.com/tobert/pcstat

--
Jon Haddad
http://www.rustyrazorblade.com
skype: rustyrazorblade
Re: Tune cache MB settings per table.
Good question. We're still migrating, but we don't want to paint ourselves into a corner.

There's an interesting line between premature optimization and painting yourself into a corner ;) Best to land somewhere between the two extremes.

On Sun, Jun 1, 2014 at 4:30 PM, Colin colpcl...@gmail.com wrote:

Have you been unable to achieve your SLAs using Cassandra out of the box so far?

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
Skype: burtonator
blog: http://burtonator.wordpress.com
… or check out my Google+ profile https://plus.google.com/102718274791889610666/posts
http://spinn3r.com

War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.
Re: Tune cache MB settings per table.
Your data model will most likely be by far the most important component of your migration. Get that right, and the rest is easy.

--
Colin Clark
+1-320-221-9531