Re: Less frequent flushing with LCS

2015-03-02 Thread Dan Kinder
Nope, they flush every 5 to 10 minutes.

On Mon, Mar 2, 2015 at 1:13 PM, Daniel Chia danc...@coursera.org wrote:

 Do the tables look like they're being flushed every hour? It seems like
 the setting memtable_flush_after_mins which I believe defaults to 60
 could also affect how often your tables are flushed.

 Thanks,
 Daniel

 On Mon, Mar 2, 2015 at 11:49 AM, Dan Kinder dkin...@turnitin.com wrote:

 I see, thanks for the input. Compression is not enabled at the moment,
 but I may try increasing that number regardless.

 Also I don't think in-memory tables would work since the dataset is
 actually quite large. The pattern is more like a given set of rows will
 receive many overwriting updates and then not be touched for a while.

 On Fri, Feb 27, 2015 at 2:27 PM, Robert Coli rc...@eventbrite.com
 wrote:

 On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder dkin...@turnitin.com
 wrote:

 Theoretically sstable_size_in_mb could be causing it to flush (it's at
 the default 160MB)... though we are flushing well before we hit 160MB. I
 have not tried changing this but we don't necessarily want all the sstables
 to be large anyway,


 I've always wished that the log message told you *why* the SSTable was
 being flushed, which of the various bounds prompted the flush.

 In your case, the size on disk may be under 160MB because compression is
 enabled. I would start by increasing that size.

 Datastax DSE has in-memory tables for this use case.

 =Rob




 --
 Dan Kinder
 Senior Software Engineer
 Turnitin – www.turnitin.com
 dkin...@turnitin.com



Re: Less frequent flushing with LCS

2015-03-02 Thread Daniel Chia
Do the tables look like they're being flushed every hour? It seems like the
setting memtable_flush_after_mins which I believe defaults to 60 could also
affect how often your tables are flushed.
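(For what it's worth, in CQL-era Cassandra the corresponding per-table knob is the table property memtable_flush_period_in_ms; 0 disables time-based flushing. A sketch, with a hypothetical keyspace/table name:

```
-- Set the periodic-flush interval for a table (milliseconds; 0 = no
-- time-based flushing, so only the memory/commitlog bounds trigger flushes).
ALTER TABLE my_ks.my_table WITH memtable_flush_period_in_ms = 0;
```
)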

Thanks,
Daniel

On Mon, Mar 2, 2015 at 11:49 AM, Dan Kinder dkin...@turnitin.com wrote:

 I see, thanks for the input. Compression is not enabled at the moment, but
 I may try increasing that number regardless.

 Also I don't think in-memory tables would work since the dataset is
 actually quite large. The pattern is more like a given set of rows will
 receive many overwriting updates and then not be touched for a while.

 On Fri, Feb 27, 2015 at 2:27 PM, Robert Coli rc...@eventbrite.com wrote:

 On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder dkin...@turnitin.com wrote:

 Theoretically sstable_size_in_mb could be causing it to flush (it's at
 the default 160MB)... though we are flushing well before we hit 160MB. I
 have not tried changing this but we don't necessarily want all the sstables
 to be large anyway,


 I've always wished that the log message told you *why* the SSTable was
 being flushed, which of the various bounds prompted the flush.

 In your case, the size on disk may be under 160MB because compression is
 enabled. I would start by increasing that size.

 Datastax DSE has in-memory tables for this use case.

 =Rob




Re: Less frequent flushing with LCS

2015-03-02 Thread Dan Kinder
I see, thanks for the input. Compression is not enabled at the moment, but
I may try increasing that number regardless.

Also I don't think in-memory tables would work since the dataset is
actually quite large. The pattern is more like a given set of rows will
receive many overwriting updates and then not be touched for a while.

On Fri, Feb 27, 2015 at 2:27 PM, Robert Coli rc...@eventbrite.com wrote:

 On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder dkin...@turnitin.com wrote:

 Theoretically sstable_size_in_mb could be causing it to flush (it's at
 the default 160MB)... though we are flushing well before we hit 160MB. I
 have not tried changing this but we don't necessarily want all the sstables
 to be large anyway,


 I've always wished that the log message told you *why* the SSTable was
 being flushed, which of the various bounds prompted the flush.

 In your case, the size on disk may be under 160MB because compression is
 enabled. I would start by increasing that size.

 Datastax DSE has in-memory tables for this use case.

 =Rob



Less frequent flushing with LCS

2015-02-27 Thread Dan Kinder
Hi all,

We have a table in Cassandra where we frequently overwrite recent inserts.
Compaction does a fine job with this but ultimately larger memtables would
reduce compactions.

The question is: can we make Cassandra use larger memtables and flush less
frequently? What currently triggers the flushes? OpsCenter shows them
flushing consistently at about 110MB in size; we have plenty of memory to
go larger.

According to
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_memtable_thruput_c.html
we can raise the commit log space threshold, but this does not help; there
is plenty of runway there.
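(For reference, these are the 2.0-era cassandra.yaml settings that bound
memtable memory globally; values below are illustrative, not recommendations:

```yaml
# Global cap on memory used by all memtables; Cassandra flushes the
# largest memtables as this threshold is approached.
memtable_total_space_in_mb: 2048
# Total commit log size; exceeding it forces flushes of the memtables
# holding the oldest dirty segments.
commitlog_total_space_in_mb: 8192
```
)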

Theoretically sstable_size_in_mb could be causing it to flush (it's at the
default 160MB)... though we are flushing well before we hit 160MB. I have
not tried changing this, but we don't necessarily want all the sstables to
be large anyway.

Thanks,
-dan


Re: Less frequent flushing with LCS

2015-02-27 Thread Robert Coli
On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder dkin...@turnitin.com wrote:

 Theoretically sstable_size_in_mb could be causing it to flush (it's at the
 default 160MB)... though we are flushing well before we hit 160MB. I have
 not tried changing this but we don't necessarily want all the sstables to
 be large anyway,


I've always wished that the log message told you *why* the SSTable was
being flushed, which of the various bounds prompted the flush.

In your case, the size on disk may be under 160MB because compression is
enabled. I would start by increasing that size.
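(The LCS target size is a per-table compaction subproperty; a sketch of the
change, with a hypothetical table name and an illustrative value:

```
-- Raise the target SSTable size for LeveledCompactionStrategy.
ALTER TABLE my_ks.my_table
  WITH compaction = {'class': 'LeveledCompactionStrategy',
                     'sstable_size_in_mb': 256};
```
)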

DataStax DSE has in-memory tables for this use case.

=Rob