Re: Tune cache MB settings per table.

2014-06-02 Thread Robert Coli
On Sun, Jun 1, 2014 at 12:49 PM, Kevin Burton bur...@spinn3r.com wrote:

 It's possible to set caching to:

 all, keys_only, rows_only, or none

 .. for a given table.

 But we have one table which is MASSIVE and we only need the most recent
 4-8 hours in memory.

 Anything older than that can go to disk as the queries there are very rare.

 … but I don't think cassandra can do this (which is a shame).


Cassandra used to offer tunable-size key and row caches per column family.
It was decided that this was too much management complexity for too little
benefit. My personal view is that there are clearly cases where one wants
to prevent one very hot CF from dominating the key or row cache, but where
one still wants it to be cached. Whether this is worth the overall complexity
is not something I have a strong opinion on, but data caching (as opposed
to metadata caching) within the Java heap generally seems like an exercise
in futility to me.

=Rob


Re: Tune cache MB settings per table.

2014-06-01 Thread Colin
The OS should handle this really well as long as you're on a v3 Linux kernel.

--
Colin Clark 
+1-320-221-9531
 

 On Jun 1, 2014, at 2:49 PM, Kevin Burton bur...@spinn3r.com wrote:
 
 It's possible to set caching to:
 
 all, keys_only, rows_only, or none
 
 .. for a given table.
 
 But we have one table which is MASSIVE and we only need the most recent 4-8 
 hours in memory.  
 
 Anything older than that can go to disk as the queries there are very rare.
 
 … but I don't think cassandra can do this (which is a shame).
 
 Another option is to partition our tables per hour… then tell the older 
 tables to cache 'none'… 
 
 I hate this option though.  A smarter mechanism would be to have a compaction 
 strategy that created an SSTable for every hour and then had custom caching 
 settings for that table.
 
 The additional upside for this is that TTLs would just drop the older data in 
 the compactor.. 
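
The per-hour tables idea above could be sketched roughly as follows. This is only an illustration, not anything Cassandra provides: the keyspace and table names, the `hourly_table` naming scheme, and both helper functions are hypothetical, and the string form `caching = '...'` is assumed from the values named in this thread (all, keys_only, rows_only, none); the exact CQL syntax depends on the Cassandra version. The script only builds the statements and does not talk to a cluster.

```python
from datetime import datetime, timedelta


def hourly_table(base: str, hour: datetime) -> str:
    """Hypothetical naming scheme: one table per hour, e.g. events_2014060112."""
    return f"{base}_{hour:%Y%m%d%H}"


def caching_statements(keyspace, base, now, hot_hours, total_hours):
    """Build one ALTER TABLE per hourly table.

    Tables younger than `hot_hours` get caching 'all'; older ones get 'none'.
    """
    current = now.replace(minute=0, second=0, microsecond=0)
    stmts = []
    for age in range(total_hours):
        table = hourly_table(base, current - timedelta(hours=age))
        mode = "all" if age < hot_hours else "none"
        stmts.append(f"ALTER TABLE {keyspace}.{table} WITH caching = '{mode}';")
    return stmts


if __name__ == "__main__":
    # Keep the most recent 8 hours cached, out of the last 24 hourly tables.
    for stmt in caching_statements("ks", "events", datetime.now(), 8, 24):
        print(stmt)
```

The drawback mentioned above still applies: the application has to route reads and writes to the right hourly table, which is the part that makes this option unattractive.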
 
 -- 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.


Re: Tune cache MB settings per table.

2014-06-01 Thread DuyHai Doan
Hello Kevin

You'll probably be interested in this:
http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1


Re: Tune cache MB settings per table.

2014-06-01 Thread Kevin Burton
Not in our experience… We've been using fadvise (DONTNEED) to purge pages
that are no longer necessary.

Of course YMMV based on your usage.  I tend to like to control everything
explicitly instead of having magic.

That's worked out very well for us in the past so it would be nice to still
have this on cassandra.
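
For readers unfamiliar with the fadvise approach mentioned above, here is a minimal sketch of the general technique, assuming a Unix system and Python 3.3+. It is not the code used at Spinn3r; the helper name `drop_from_page_cache` is made up for illustration. `os.posix_fadvise` with `POSIX_FADV_DONTNEED` is only a hint, so the kernel is free to ignore it, and dirty pages that have not yet been written back are generally not dropped.

```python
import os


def drop_from_page_cache(path: str) -> None:
    """Hint to the kernel that cached pages for `path` are no longer needed."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, len=0 covers the whole file (len 0 means "to end of file").
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The file contents are untouched; only the cached copies in memory may be released, so later reads go back to disk.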


Re: Tune cache MB settings per table.

2014-06-01 Thread Colin
Have you been unable to achieve your SLAs using Cassandra out of the box so
far?

In my experience, by trying to tune Cassandra before the app is done and
without simulating real-world load patterns, you might actually be doing
yourself a disservice.

--
Colin
320-221-9531


Re: Tune cache MB settings per table.

2014-06-01 Thread Jonathan Haddad
Of all the areas where you could spend your time, I think this one will have
the lowest return. The OS will keep the most frequently used data in memory.
There's no reason to require Cassandra to do it.

If you're curious as to what's been loaded into ram, try Al Tobey's pcstat
utility.  https://github.com/tobert/pcstat


-- 
Jon Haddad
http://www.rustyrazorblade.com
skype: rustyrazorblade


Re: Tune cache MB settings per table.

2014-06-01 Thread Kevin Burton
Good question. We're still migrating, but we don't want to paint ourselves
into a corner.

There's an interesting line between premature optimization and painting
yourself into a corner ;)

Best to get it right in between both extremes.


Re: Tune cache MB settings per table.

2014-06-01 Thread Colin
Your data model will most likely be by far the most important component of
your migration.  Get that right, and the rest is easy.

--
Colin Clark 
+1-320-221-9531
 
