Re: do I need to add more nodes? minor compaction eat all IO
On Mon, Jul 25, 2011 at 6:41 PM, aaron morton aa...@thelastpickle.com wrote:
> There are no hard and fast rules to add new nodes, but here are two guidelines: 1) Single node load is getting too high, rule of thumb is 300GB is probably too high.

What is that rule of thumb based on? I would guess that working set size would matter more than absolute size. Why isn't that the case?

Jim
Re: do I need to add more nodes? minor compaction eat all IO
I am using ordinary SATA disks. What I was actually worried about is whether it is okay for Cassandra to use all of the IO resources every time. Furthermore, when is a good time to add more nodes? With ordinary SATA disks, around 100 r/s is enough to reach 100 %util, so how large should the data size on each node be?

Below is my iostat -x 2 output while doing a node repair. I have to repair each column family separately, otherwise the load gets even crazier:

Device:  rrqm/s  wrqm/s     r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz    await  r_await   w_await  svctm   %util
sda        1.50    1.50  121.50  14.00   3.68   0.30     60.19    116.98  1569.46    59.49  14673.86   7.38  100.00

On Sun, Jul 24, 2011 at 8:04 AM, Jonathan Ellis jbel...@gmail.com wrote:
> On Sat, Jul 23, 2011 at 4:16 PM, Francois Richard frich...@xobni.com wrote:
> > My understanding is that during compaction cassandra does a lot of non sequential reads, then dumps the results with a big sequential write.
>
> Compaction reads and writes are both sequential, and 0.8 allows setting a MB/s to cap compaction at.
>
> As to the original question "do I need to add more machines", I'd say that depends more on whether your application's SLA is met than on what % io util spikes to.
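As a rough cross-check of the iostat line in this message, the average request size can be derived two ways: from avgrq-sz, and from throughput divided by request rate. The sketch below assumes avgrq-sz is reported in 512-byte sectors and that iostat's MB means 1024 KB; the function names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: avgrq-sz is reported in 512-byte sectors


def avgrq_kb(avgrq_sectors: float) -> float:
    """Convert an avgrq-sz value (in sectors) to KB per request."""
    return avgrq_sectors * SECTOR_BYTES / 1024.0


def kb_per_request(r_s: float, w_s: float, r_mb_s: float, w_mb_s: float) -> float:
    """Average KB per request, derived from throughput and request rate."""
    requests = r_s + w_s
    if requests <= 0:
        raise ValueError("request rate must be positive")
    return (r_mb_s + w_mb_s) * 1024.0 / requests


# Values copied from the sda line posted in the message.
from_avgrq = avgrq_kb(60.19)
from_rates = kb_per_request(121.50, 14.00, 3.68, 0.30)
print(f"avgrq-sz: {from_avgrq:.1f} KB/request, from rates: {from_rates:.1f} KB/request")
```

Both routes give roughly 30 KB per request, so the posted columns are consistent with each other: the disk is saturated by a modest number of fairly small requests rather than by raw bandwidth.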
Re: do I need to add more nodes? minor compaction eat all IO
As the wiki suggests (http://wiki.apache.org/cassandra/LargeDataSetConsiderations):

"Adding nodes is a slow process if each node is responsible for a large amount of data. Plan for this; do not try to throw additional hardware at a cluster at the last minute."

I would really like to know what the status of my cluster is, and whether it is normal.

On Mon, Jul 25, 2011 at 8:59 PM, Yan Chunlu springri...@gmail.com wrote:
> I am using ordinary SATA disks. What I was actually worried about is whether it is okay for Cassandra to use all of the IO resources every time. Furthermore, when is a good time to add more nodes? With ordinary SATA disks, around 100 r/s is enough to reach 100 %util, so how large should the data size on each node be?
>
> Below is my iostat -x 2 output while doing a node repair. I have to repair each column family separately, otherwise the load gets even crazier:
>
> Device:  rrqm/s  wrqm/s     r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz    await  r_await   w_await  svctm   %util
> sda        1.50    1.50  121.50  14.00   3.68   0.30     60.19    116.98  1569.46    59.49  14673.86   7.38  100.00
Re: do I need to add more nodes? minor compaction eat all IO
There are no hard and fast rules for when to add new nodes, but here are two guidelines:

1) Single node load is getting too high; as a rule of thumb, 300GB is probably too high.
2) There are times when the cluster cannot keep up with throughput, for example the client is getting TimedOutExceptions or TPStats is showing consistently high (a multiple of the available threads) read or write pending queues.

What works for you will be what keeps your site running and keeps the ops/dev team sleeping at night. In your case, high IO during repair may be OK if the cluster can keep up with demands. Or it may mean you need to upgrade the IO capacity or add nodes.

Cheers

Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 Jul 2011, at 01:17, Yan Chunlu wrote:
> As the wiki suggests (http://wiki.apache.org/cassandra/LargeDataSetConsiderations):
>
> "Adding nodes is a slow process if each node is responsible for a large amount of data. Plan for this; do not try to throw additional hardware at a cluster at the last minute."
>
> I would really like to know what the status of my cluster is, and whether it is normal.
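The second guideline above (pending queues that are consistently a multiple of the available threads) could be checked along these lines. This is only a sketch: the function name, the default multiple of 2, the 80% cutoff for "consistently", and the sample readings are all assumptions for illustration, not values from Cassandra or from this thread. The pending counts are assumed to have been collected by hand from repeated tpstats runs.

```python
from typing import Sequence


def consistently_backlogged(
    pending_samples: Sequence[int],
    pool_threads: int,
    multiple: int = 2,      # assumed: "a multiple of the available threads"
    fraction: float = 0.8,  # assumed: share of samples that counts as "consistently"
) -> bool:
    """Return True when most samples show pending >= multiple * pool_threads."""
    if not pending_samples:
        return False
    threshold = multiple * pool_threads
    over = sum(1 for pending in pending_samples if pending >= threshold)
    return over / len(pending_samples) >= fraction


# Hypothetical pending counts for a read stage with 32 threads,
# sampled a few seconds apart.
samples = [70, 85, 64, 90, 120, 75, 10, 88, 95, 101]
print(consistently_backlogged(samples, pool_threads=32))
```

A single spike does not trip the check; only a backlog that persists across most samples does, which matches the "consistently high" wording of the guideline.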
RE: do I need to add more nodes? minor compaction eat all IO
Jonathan,

Are you sure that the reads done for compaction are sequential with Cassandra 0.6.13? This is not what I am observing right now. During a minor compaction I usually observe ~1500 to 1900 r/s while rMB/s is barely around 30 to 35 MB/s.

Just asking out of curiosity.

FR

-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Saturday, July 23, 2011 5:05 PM
To: user@cassandra.apache.org
Subject: Re: do I need to add more nodes? minor compaction eat all IO

On Sat, Jul 23, 2011 at 4:16 PM, Francois Richard frich...@xobni.com wrote:
> My understanding is that during compaction cassandra does a lot of non sequential reads, then dumps the results with a big sequential write.

Compaction reads and writes are both sequential, and 0.8 allows setting a MB/s to cap compaction at.

As to the original question "do I need to add more machines", I'd say that depends more on whether your application's SLA is met than on what % io util spikes to.
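The observation in this message can be turned into an average read size, which is the figure that hints at how sequential the reads look to the disk. A minimal sketch, assuming iostat's MB is 1024 KB; the helper name is hypothetical.

```python
def avg_read_kb(reads_per_sec: float, read_mb_per_sec: float) -> float:
    """Average size of one read request in KB (assuming 1 MB = 1024 KB)."""
    if reads_per_sec <= 0:
        raise ValueError("reads_per_sec must be positive")
    return read_mb_per_sec * 1024.0 / reads_per_sec


# The two ends of the range reported in the message.
for r_s, mb_s in [(1500, 30), (1900, 35)]:
    print(f"{r_s} r/s at {mb_s} MB/s: {avg_read_kb(r_s, mb_s):.1f} KB per read")
```

The reported figures work out to roughly 20 KB per read. Small requests at a high rate are consistent with reads that are not purely sequential, but the number alone does not prove it, since request size also depends on readahead and buffer settings.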
Re: do I need to add more nodes? minor compaction eat all IO
It's sequential per-sstable. If you are compacting a lot of sstables, how closely this approximates completely sequential will deteriorate.

On Sun, Jul 24, 2011 at 1:18 PM, Francois Richard frich...@xobni.com wrote:
> Jonathan,
>
> Are you sure that the reads done for compaction are sequential with Cassandra 0.6.13? This is not what I am observing right now. During a minor compaction I usually observe ~1500 to 1900 r/s while rMB/s is barely around 30 to 35 MB/s.
>
> Just asking out of curiosity.
>
> FR

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
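One way to picture "sequential per-sstable" is a toy model in which compaction is a k-way merge of sorted runs: each run is read front to back, but the merge keeps hopping between runs. The sketch below is not Cassandra's compaction code; it uses Python's heapq.merge on made-up sorted key lists and counts how often consecutive merged rows come from different runs. Real reads are buffered, so a hop here is not one disk seek.

```python
import heapq
import random
from typing import List


def make_runs(count: int, rows_per_run: int, seed: int = 0) -> List[List[int]]:
    """Build `count` sorted runs of unique random keys (stand-ins for sstables)."""
    rng = random.Random(seed)
    total = count * rows_per_run
    keys = rng.sample(range(10 * total), total)
    return [sorted(keys[i::count]) for i in range(count)]


def switch_fraction(runs: List[List[int]]) -> float:
    """Fraction of consecutive merged rows that come from different runs."""
    tagged = [[(key, idx) for key in run] for idx, run in enumerate(runs)]
    sources = [idx for _key, idx in heapq.merge(*tagged)]
    if len(sources) < 2:
        return 0.0
    switches = sum(1 for a, b in zip(sources, sources[1:]) if a != b)
    return switches / (len(sources) - 1)


for k in (1, 2, 8, 32):
    print(f"{k:2d} runs: {switch_fraction(make_runs(k, 1000)):.2f} of rows switch source")
```

With one run the merge never switches source. With randomly distributed keys the switch fraction climbs toward 1 as the number of runs grows, which is the sense in which the combined read pattern drifts away from purely sequential.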
RE: do I need to add more nodes? minor compaction eat all IO
This really depends on your disk setup. When you run iostat under high load, do you see a high number of r/s while rMB/s is not so great? I usually use:

iostat -x -m sdb sdc 1

to monitor situations like this. In my case, my disk setup is the following:

OS -- /sda
Cassandra CommitLogs -- /sdb
Cassandra Data -- /sdc

My understanding is that during compaction Cassandra does a lot of non-sequential reads, then dumps the results with a big sequential write. Is your application mostly doing writes and few reads, or the other way around?

FR

-----Original Message-----
From: Yan Chunlu [mailto:springri...@gmail.com]
Sent: Saturday, July 23, 2011 9:16 AM
To: cassandra-u...@incubator.apache.org
Subject: do I need to add more nodes? minor compaction eat all IO

I have three nodes and RF=3. Every time a minor compaction runs, the CPU load (8 cores) gets to 30 and iostat -x 2 shows %util at 100%. Does that mean I need more nodes? The total data size is 60G.

thanks!
Re: do I need to add more nodes? minor compaction eat all IO
On Sat, Jul 23, 2011 at 4:16 PM, Francois Richard frich...@xobni.com wrote:
> My understanding is that during compaction cassandra does a lot of non sequential reads, then dumps the results with a big sequential write.

Compaction reads and writes are both sequential, and 0.8 allows setting a MB/s to cap compaction at.

As to the original question "do I need to add more machines", I'd say that depends more on whether your application's SLA is met than on what % io util spikes to.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com