Re: Commitlog questions
The incoming mutations are written per column into a Memtable (an in-memory cache). The default size for this table is 64MB, if I recall correctly. For more information, take a look here:

https://wiki.apache.org/cassandra/MemtableSSTable
http://wiki.apache.org/cassandra/MemtableThresholds

Regards,
Panagiotis
RE: Commitlog questions
Oleg,

Thanks for the response. If the commitlog is in periodic mode and the fsync happens every 10 seconds, Cassandra is storing the stuff that needs to be sync'd somewhere for a period of 10 seconds. I'm talking about before it even hits any disk. This has to be in memory, correct?

Parag
Re: Commitlog questions
> If the commitlog is in periodic mode and the fsync happens every 10 seconds, Cassandra is storing the stuff that needs to be sync'd somewhere for a period of 10 seconds. I'm talking about before it even hits any disk. This has to be in memory, correct?

The information you are referring to is stored in the OS page cache [1], so it's not part of Cassandra's memory, though I imagine Cassandra keeps a small handle of some kind on the mutation so that it can make the fsync [2] system call when appropriate.

[1] http://en.wikipedia.org/wiki/Page_cache
[2] http://linux.die.net/man/2/fsync

Thanks,
Russ
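To make the page cache versus fsync distinction concrete, here is a minimal Python sketch. It is not Cassandra code; the function name `append_durably` and the file name are made up for illustration. `os.write` hands the bytes to the kernel, where they normally sit in the page cache, and `os.fsync` asks the kernel to push them to the storage device.

```python
import os
import tempfile


def append_durably(path: str, payload: bytes) -> int:
    """Append payload to path and fsync before returning the byte count."""
    # Hypothetical helper for illustration only, not part of Cassandra.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # After write() returns, the data normally lives in the OS page
        # cache, outside the memory of the writing process.
        written = os.write(fd, payload)
        # fsync() asks the kernel to flush this file's cached data to the
        # underlying storage device.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "commitlog-demo.log")
    n = append_durably(log_path, b"mutation-1\n")
    size = os.path.getsize(log_path)
```

Skipping the `os.fsync` call would still leave the bytes readable from the file, but until the kernel writes them back they would be at risk if the machine lost power.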
Commitlog questions
1) Why is the default 4GB? Has anyone changed this? What are some aspects to consider when determining the commitlog size?

2) If the commitlog is in periodic mode, there is a property to set a time interval to flush the incoming mutations to disk. This implies that there is a queue inside Cassandra to hold this data in memory until it is flushed.
   a. Is there a name for this queue?
   b. Is there a limit for this queue?
   c. Are there any tuning parameters for this queue?

Thanks,
Parag
Re: Commitlog questions
Parag:

To answer your questions:

1) Default is just that, a default. I wouldn't advise raising it, though. The bigger it is, the longer it takes to restart the node.

2) I think they just use fsync. There is no queue. All files in Cassandra use java.nio buffers, but they need to be fsynced periodically. Look at the commitlog_sync parameters in the cassandra.yaml file; the comments there explain how it works. I believe the difference between periodic and batch is just that: if it is periodic, it will fsync every 10 seconds; if it is batch, it will fsync if there were any changes within a time window.

--
Regards,
Oleg Dulin
http://www.olegdulin.com
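As a rough illustration of the periodic behaviour described above, here is a toy Python model. It is only a sketch under stated assumptions, not how Cassandra implements commitlog_sync: the class `PeriodicSyncLog`, its `period_s` parameter, and the injectable `clock` are all hypothetical, and the period check runs on each append rather than on a background timer.

```python
import os
import tempfile
import time


class PeriodicSyncLog:
    """Toy append-only log that fsyncs at most once per period_s seconds."""

    def __init__(self, path, period_s=10.0, clock=time.monotonic):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self._period_s = period_s
        self._clock = clock            # injectable so the demo can fake time
        self._last_sync = clock()
        self._dirty = False
        self.fsync_count = 0

    def append(self, payload: bytes) -> None:
        os.write(self._fd, payload)    # data lands in the OS page cache
        self._dirty = True
        self.maybe_sync()

    def maybe_sync(self) -> bool:
        """fsync only if there is unsynced data and the period has elapsed."""
        if self._dirty and self._clock() - self._last_sync >= self._period_s:
            os.fsync(self._fd)
            self._last_sync = self._clock()
            self._dirty = False
            self.fsync_count += 1
            return True
        return False

    def close(self) -> None:
        if self._dirty:                # flush whatever is still unsynced
            os.fsync(self._fd)
            self.fsync_count += 1
        os.close(self._fd)


fake_now = [0.0]                       # fake clock, in seconds
with tempfile.TemporaryDirectory() as tmp:
    log = PeriodicSyncLog(os.path.join(tmp, "demo.log"),
                          period_s=10.0, clock=lambda: fake_now[0])
    log.append(b"a")                   # t=0: written, not yet fsynced
    fake_now[0] = 5.0
    log.append(b"b")                   # t=5: still inside the period
    syncs_before = log.fsync_count
    fake_now[0] = 10.0
    log.append(b"c")                   # t=10: period elapsed, fsync happens
    syncs_after = log.fsync_count
    log.close()
```

In this model, anything appended since the last fsync is exactly the data at risk on power loss, which is the window the periodic setting trades for throughput.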
Re: Commitlog questions
On Wed, Apr 9, 2014 at 3:06 AM, Parag Patel ppa...@clearpoolgroup.com wrote:

> some questions about the commitlog and related assumptions

https://issues.apache.org/jira/browse/CASSANDRA-6764

You might wish to get in contact with the reporter here, who has similar questions!

=Rob