Re: Many pending compactions
Try repair -pr on all nodes. If after that you still have issues, you can try to rebuild the SSTables using nodetool upgradesstables or scrub. Regards, Roni Balthazar On 18/02/2015, at 14:13, Ja Sam ptrstp...@gmail.com wrote: ad 3) I did this already yesterday (setcompactionthroughput also). But SSTables are still increasing. ad 1) What do you think: should I use -pr or try incremental? On Wed, Feb 18, 2015 at 4:54 PM, Roni Balthazar ronibaltha...@gmail.com wrote: You are right... Repair makes the data consistent between nodes. I understand that you have 2 issues going on. You need to run repair periodically without errors and need to decrease the number of pending compactions. So I suggest: 1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can use incremental repairs. There were some bugs in 2.1.2. 2) Run cleanup on all nodes 3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0, and increase setcompactionthroughput for some time and see if the number of SSTables is going down. Let us know what errors you are getting when running repairs. Regards, Roni Balthazar On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam ptrstp...@gmail.com wrote: Can you explain to me the correlation between growing SSTables and repair? I was sure, until your mail, that repair only makes data consistent between nodes. Regards On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Which error are you getting when running repairs? You need to run repair on your nodes within gc_grace_seconds (e.g. weekly). They have data that is not read frequently. You can run repair -pr on all nodes. Since you do not have deletes, you will not have trouble with that. If you have deletes, it's better to increase gc_grace_seconds before the repair. http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html After repair, try to run a nodetool cleanup. 
Check if the number of SSTables goes down after that... Pending compactions must decrease as well... Cheers, Roni Balthazar On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote: 1) We tried to run repairs but they usually do not succeed. But we had Leveled compaction before. Last week we ALTERed the tables to STCS, because the guys from DataStax suggested that we should not use Leveled and should alter the tables to STCS, because we don't have SSDs. After this change we did not run any repair. Anyway, I don't think it will change anything in the SSTable count - if I am wrong please let me know 2) I did this. My tables are 99% write-only. It is an audit system 3) Yes, I am using default values 4) In both operations I am using LOCAL_QUORUM. I am almost sure that the READ timeouts happen because of too many SSTables. Anyway, first I would like to fix the too many pending compactions. I still don't know how to speed them up. On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Are you running repairs within gc_grace_seconds? (default is 10 days) http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html Double-check that you set cold_reads_to_omit to 0.0 on tables with STCS that you do not read often. Are you using the default values for the properties min_compaction_threshold (4) and max_compaction_threshold (32)? Which Consistency Level are you using for read operations? Check that you are not reading from DC_B due to your Replication Factor and CL. http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html Cheers, Roni Balthazar On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote: I don't have problems with DC_B (the replica); only in DC_A (my system writes only to it) do I have read timeouts. 
I checked the SSTable count in OpsCenter and I have: 1) in DC_A the same +-10% for the last week, a small increase for the last 24h (it is more than 15000-2 SSTables depending on the node) 2) in DC_B the last 24h shows up to a 50% decrease, which gives a nice prognosis. Now I have less than 1000 SSTables What did you measure during system optimizations? Or do you have an idea what more I should check? 1) I look at CPU idle (one node is 50% idle, the rest 70% idle) 2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are spikes 3) system RAM usage is almost full 4) In Total Bytes Compacted most lines are below 3MB/s. For total DC_A it is less than 10MB/s; in DC_B it looks much better (avg is like 17MB/s) something else? On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can check if the number of SSTables is decreasing. Look for the SSTable count information of your tables using nodetool
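Roni's three-step suggestion above (repair -pr, cleanup, temporarily raised compaction throughput) can be sketched as a dry-run script. The hostnames are placeholders, and on a real cluster you would typically run the nodes one at a time:

```shell
#!/bin/sh
# Dry-run sketch of the suggested sequence: primary-range repair and cleanup
# on every node, then a raised compaction throughput cap while the backlog
# drains. Drop the echo prefixes to actually execute against a cluster.
HOSTS="node1 node2 node3"
for h in $HOSTS; do
  echo "nodetool -h $h repair -pr"
  echo "nodetool -h $h cleanup"
done
# Raise the throughput cap (MB/s); restore the default (16) afterwards and
# watch 'nodetool compactionstats' to see whether pending compactions drop.
echo "nodetool setcompactionthroughput 999"
```

Primary-range repair (`-pr`) avoids repairing each token range once per replica, which is why it must be run on every node to cover the whole ring.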
Re: Many pending compactions
As Al Tobey suggested, I upgraded my 2.1.0 to a snapshot version of 2.1.3. I have now installed exactly this build: https://cassci.datastax.com/job/cassandra-2.1/912/ I see many compactions which complete, but some of them are really slow. Maybe I should send some stats from OpsCenter or the servers? But it is difficult for me to choose what is important Regards On Wed, Feb 18, 2015 at 6:11 PM, Jake Luciani jak...@gmail.com wrote: Ja, Please upgrade to the official 2.1.3; we've fixed many things related to compaction. Are you seeing the compactions % complete progress at all?
Re: Many pending compactions
Ja, Please upgrade to official 2.1.3 we've fixed many things related to compaction. Are you seeing the compactions % complete progress at all?
Re: Cassandra install on JRE vs JDK
The "natural" dependency of Cassandra is the JRE (not the JDK) - e.g. in the Debian package. You should be safe using the JRE instead of the JDK. If you're asking whether to use a non-Oracle JVM - the answer would be: use the Oracle JVM. OpenJDK might work, but I'd not recommend it. On 18.02.2015 at 20:49, cass savy casss...@gmail.com wrote: Can we install the Oracle JDK instead of the JRE on Cassandra servers? We have a few clusters running the JDK since we upgraded to C* 2.0. Is there any known issue or impact with using the JDK vs the JRE? What is the reason not to use the Oracle JDK on C* servers? Is there any performance impact? Please advise. — Robert Stupp @snazy
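The JRE-vs-JDK distinction above can be checked mechanically: a JDK install ships bin/javac (plus troubleshooting tools like jstack and jmap) that a bare JRE does not. A small sketch, demonstrated against a scratch directory standing in for JAVA_HOME:

```shell
#!/bin/sh
# Tell a JDK from a plain JRE by the presence of the compiler binary.
check_java_home() {
  # JDK layouts include bin/javac; JRE-only layouts do not
  if [ -x "$1/bin/javac" ]; then echo "JDK"; else echo "JRE"; fi
}
# Demo against a scratch directory (a stand-in for a real JAVA_HOME)
demo="$(mktemp -d)"
mkdir -p "$demo/bin"
touch "$demo/bin/javac" && chmod +x "$demo/bin/javac"
check_java_home "$demo"   # prints: JDK
```

On a real host you would call `check_java_home "$JAVA_HOME"` instead of the scratch directory.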
Re: Deleting Statistics.db at startup
On Wed, Feb 18, 2015 at 4:02 AM, Tomer Pearl tomer.pe...@contextream.com wrote: My question is what are the consequences of deleting this file every time the node is starting up? Performance-wise or otherwise. You waste the time Cassandra spends regenerating it. I personally would not institute an operational practice whereby I regularly purged these files to avoid OOM. =Rob
C* 2.1.2 invokes oom-killer
Hi, A couple of times a day, 2 out of 4 nodes in the cluster are killed: root@db4:~# dmesg | grep -i oom [4811135.792657] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name [6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0 Nodes are using an 8GB heap (confirmed with *nodetool info*) and aren't using the row cache. I noticed that a couple of times a day the used RSS grows really fast within a couple of minutes, and I see CPU spikes at the same time - https://www.dropbox.com/s/khco2kdp4qdzjit/Screenshot%202015-02-18%2015.10.54.png?dl=0 . It could be related to compaction, but after compaction finishes the used RSS doesn't shrink. Output from pmap when the C* process uses 50GB RAM (out of 64GB) is available at http://paste.ofcode.org/ZjLUA2dYVuKvJHAk9T3Hjb. At the time the dump was made, heap usage was far below 8GB (~3GB) but total RSS was ~50GB. Any help will be appreciated. -- BR, Michał Łowicki
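A quick way to spot the pattern described above (heap well under -Xmx but RSS ballooning) is to compare the heap ceiling against the process RSS. The 2x allowance here for metaspace, thread stacks, and mmapped SSTables is a rough rule of thumb for illustration, not an official threshold:

```shell
#!/bin/sh
# Flag likely native/off-heap memory growth: RSS far beyond the JVM heap
# ceiling suggests the leak is outside the heap (so GC tuning won't help).
check_offheap() {
  heap_gb=$1  # -Xmx in GB
  rss_gb=$2   # process RSS in GB, e.g. from 'ps -o rss= -p <pid>'
  if [ "$rss_gb" -gt $((heap_gb * 2)) ]; then
    echo "WARN: RSS ${rss_gb}G far exceeds ${heap_gb}G heap - suspect off-heap growth"
  else
    echo "OK: RSS ${rss_gb}G plausible for ${heap_gb}G heap"
  fi
}
check_offheap 8 50   # the numbers reported above: 8 GB heap, ~50 GB RSS
```

With the reported numbers the check fires, which matches the observation that the ~3 GB heap usage cannot explain a ~50 GB RSS.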
Re: C* 2.1.2 invokes oom-killer
On Wed, Feb 18, 2015 at 10:28 AM, Michał Łowicki mlowi...@gmail.com wrote: Couple of times a day 2 out of 4 members cluster nodes are killed This sort of issue is usually best handled/debugged interactively on IRC. But briefly : - 2.1.2 is IMO broken for production. Downgrade (officially unsupported but fine between these versions) to 2.1.1 or upgrade to 2.1.3. - Beyond that, look at the steady state heap consumption. With 2.1.2, it would likely take at least 1TB of data to fill heap in steady state to near-failure. =Rob
Re: Cassandra install on JRE vs JDK
Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with Cassandra in production for years without issue. Regards, Mark On 18 February 2015 at 19:49, cass savy casss...@gmail.com wrote: Can we install the Oracle JDK instead of the JRE on Cassandra servers? We have a few clusters running the JDK since we upgraded to C* 2.0. Is there any known issue or impact with using the JDK vs the JRE? What is the reason not to use the Oracle JDK on C* servers? Is there any performance impact? Please advise.
Re: Cassandra install on JRE vs JDK
Thanks Mark for the quick response. What version of Cassandra and the JDK are you using in prod? On Wed, Feb 18, 2015 at 11:58 AM, Mark Reddy mark.l.re...@gmail.com wrote: Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with Cassandra in production for years without issue. Regards, Mark
Re: Cassandra install on JRE vs JDK
Thanks Robert for the quick response. I use the Oracle JDK, not OpenJDK. On Wed, Feb 18, 2015 at 11:54 AM, Robert Stupp sn...@snazy.de wrote: The "natural" dependency of Cassandra is the JRE (not the JDK) - e.g. in the Debian package. You should be safe using the JRE instead of the JDK.
Deleting Statistics.db at startup
Hello, I have received the following error:

ERROR [SSTableBatchOpen:2] 2015-01-19 13:55:28,478 CassandraDaemon.java (line 196) Exception in thread Thread[SSTableBatchOpen:2,5,main]
java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:335)
    at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:462)
    at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:448)
    at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:432)
    at org.apache.cassandra.io.sstable.SSTableReader.openMetadata(SSTableReader.java:225)
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:194)
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:184)
    at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:264)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

I have found a solution here: http://www.mail-archive.com/user%40cassandra.apache.org/msg23682.html which advises deleting the Statistics.db file. My question is: what are the consequences of deleting this file every time the node starts up? Performance-wise or otherwise. Thanks, Tomer.
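The workaround from the linked thread can be sketched as follows. It is demonstrated here against a scratch directory standing in for the Cassandra data directory; on a real node you would point it at the keyspace directories while Cassandra is stopped, since Cassandra regenerates the Statistics.db component when it reopens each SSTable:

```shell
#!/bin/sh
# Delete only the Statistics.db component of each SSTable; the Data.db,
# Index.db etc. components are untouched and Cassandra rebuilds the
# statistics at startup (at the cost of the regeneration time).
DATA_DIR="$(mktemp -d)"   # scratch stand-in for /var/lib/cassandra/data
touch "$DATA_DIR/ks1-cf1-jb-1-Statistics.db" "$DATA_DIR/ks1-cf1-jb-1-Data.db"
# Audit first with -print if unsure, then delete:
find "$DATA_DIR" -name '*-Statistics.db' -delete
ls "$DATA_DIR"   # only the Data.db component remains
```

The SSTable file names here are made up for the demo; real components follow the same `*-Statistics.db` suffix pattern.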
Re: Many pending compactions
I don't have problems with DC_B (the replica); only in DC_A (my system writes only to it) do I have read timeouts. I checked the SSTable count in OpsCenter and I have: 1) in DC_A the same +-10% for the last week, a small increase for the last 24h (it is more than 15000-2 SSTables depending on the node) 2) in DC_B the last 24h shows up to a 50% decrease, which gives a nice prognosis. Now I have less than 1000 SSTables What did you measure during system optimizations? Or do you have an idea what more I should check? 1) I look at CPU idle (one node is 50% idle, the rest 70% idle) 2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are spikes 3) system RAM usage is almost full 4) In Total Bytes Compacted most lines are below 3MB/s. For total DC_A it is less than 10MB/s; in DC_B it looks much better (avg is like 17MB/s) something else? On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can check if the number of SSTables is decreasing. Look for the SSTable count information of your tables using nodetool cfstats. The compaction history can be viewed using nodetool compactionhistory. About the timeouts, check this out: http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure Also try running nodetool tpstats to see the thread statistics. It can help you find out whether you are having performance problems. If you are having too many pending tasks or dropped messages, maybe you will need to tune your system (e.g. driver timeouts, concurrent reads and so on) Regards, Roni Balthazar On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote: Hi, Thanks for your tip; it looks like something changed - I still don't know if it is ok. My nodes started to do more compaction, but it looks like some compactions are really slow. In IO we have idle, CPU is quite ok (30%-40%). We set compactionthroughput to 999, but I do not see a difference. Can we check something more? Or do you have any method to monitor progress with small files? 
Regards On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was the solution... The number of SSTables decreased from many thousands to a number below a hundred, and the SSTables are now much bigger, with several gigabytes (most of them). Cheers, Roni Balthazar On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote: After some diagnostics (we didn't set cold_reads_to_omit yet). Compactions are running but VERY slowly, with idle IO. We had a lot of Data files in Cassandra. In DC_A it is about ~12 (only xxx-Data.db); DC_B has only ~4000. I don't know if this changes anything, but: 1) in DC_A the avg size of a Data.db file is ~13 mb. I have a few really big ones, but most are really small (almost 1 files are less than 100mb). 2) in DC_B the avg size of a Data.db is much bigger, ~260mb. Do you think that the above flag will help us? On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote: I set setcompactionthroughput 999 permanently and it doesn't change anything. IO is still the same. CPU is idle. On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can run nodetool compactionstats to view statistics on compactions. Setting cold_reads_to_omit to 0.0 can help to reduce the number of SSTables when you use Size-Tiered compaction. You can also create a cron job to increase the value of setcompactionthroughput during the night or when your IO is not busy. From http://wiki.apache.org/cassandra/NodeTool: 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16 Cheers, Roni Balthazar On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote: One thing I do not understand. In my case compaction is running permanently. Is there a way to check which compactions are pending? The only information is about the total count. 
On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote: Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build is available from http://cassci.datastax.com/job/cassandra-2.1/ I read about cold_reads_to_omit. It looks promising. Should I also set the compaction throughput? p.s. I am really sad that I didn't read this before: https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ On Monday, February 16, 2015, Carlos Rolo r...@pythian.com wrote: Hi, 100% in agreement with Roland; the 2.1.x series is a pain! I would never recommend the current 2.1.x series for production. Clocks are a pain, and check your connectivity! Also check tpstats to see if your
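The nodetool checks suggested through this thread (compaction statistics and history, per-table SSTable counts, threadpool stats) can be collected in one pass. Printed here as a dry run; drop the echo prefix to execute on a live node:

```shell
#!/bin/sh
# One monitoring pass over the commands recommended in this thread:
#   compactionstats     - currently running/pending compactions
#   compactionhistory   - what has already been compacted
#   cfstats             - per-table SSTable counts
#   tpstats             - pending tasks and dropped messages
for cmd in compactionstats compactionhistory cfstats tpstats; do
  echo "nodetool $cmd"
done
```

Running this on a schedule (e.g. from cron) and diffing the SSTable counts over time shows directly whether pending compactions are draining.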
Re: Adding new node to cluster
Hello, Please note that DataStax has updated the documentation for replacing a seed node. The new docs outline a simplified process to help avoid the confusion on this topic. http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_seed_node.html Jonathan Jonathan Lacefield Solution Architect | (404) 822 3487 | jlacefi...@datastax.com On Tue, Feb 17, 2015 at 8:04 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Feb 17, 2015 at 2:25 PM, sean_r_dur...@homedepot.com wrote: SimpleSnitch is not rack aware. You would want to choose seed nodes and then not change them. Seed nodes apparently don't bootstrap. No one seems to know what a seed node actually *is*, but seed nodes can in fact bootstrap. They just have to temporarily forget to tell themselves that they are a seed node while bootstrapping, and then other nodes will still gossip to it as a seed once it comes up, even though it doesn't consider itself a seed. https://issues.apache.org/jira/browse/CASSANDRA-5836?focusedCommentId=13727032page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13727032 Replacing a seed node is a very common operation, and this best practice is confusing/poorly documented. There are regular contacts to #cassandra/cassandra-user@ where people ask how to replace a seed node and are confused by the answer. The workaround also means that, if you do not restart your node after bootstrapping it (and changing the conf file back to indicate to itself that it is a seed), the node runs until the next restart without any understanding that it is a seed node. 
Being a seed node appears to mean two things : 1) I have myself as an entry in my own seed list, so I know that I am a seed. 2) Other nodes have me in their seed list, so they consider me a seed. The current code checks for 1) and refuses to bootstrap. The workaround is to remove the 1) state temporarily. But if it is unsafe to bootstrap a seed node because of either 1) or 2), the workaround is unsafe. Can you explicate the special cases here? I sincerely would like to understand why the code tries to prevent a seed from bootstrapping when one can clearly, and apparently safely, bootstrap a seed. Unfortunately, there has been no answer. =Rob
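The workaround Rob describes (temporarily removing state 1, the node's own entry in its seed list) amounts to an edit of the `- seeds:` line in cassandra.yaml before bootstrap, restored afterwards. A sketch with placeholder addresses, demonstrated against a scratch copy of the yaml; note `sed -i` as used here is the GNU form:

```shell
#!/bin/sh
# Strip this node's own address from its seed list so it will bootstrap,
# then (after bootstrap) the original line would be restored and the node
# restarted so it again knows it is a seed.
YAML="$(mktemp)"   # scratch stand-in for /etc/cassandra/cassandra.yaml
printf '          - seeds: "10.0.0.1,10.0.0.2,10.0.0.3"\n' > "$YAML"
SELF="10.0.0.2"    # this node's own address (placeholder)
# Handle the address in the middle or at the end of the list
sed -i "s/$SELF,//; s/,$SELF//" "$YAML"
grep -- '- seeds' "$YAML"
```

After the edit the seed list reads `"10.0.0.1,10.0.0.3"`, so the node no longer sees itself as a seed (state 1 removed) while the other nodes' lists (state 2) are untouched.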
Re: Many pending compactions
Hi, Thanks for your tip; it looks like something changed - I still don't know if it is ok. My nodes started to do more compaction, but it looks like some compactions are really slow. In IO we have idle, CPU is quite ok (30%-40%). We set compactionthroughput to 999, but I do not see a difference. Can we check something more? Or do you have any method to monitor progress with small files? Regards On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was the solution... The number of SSTables decreased from many thousands to a number below a hundred, and the SSTables are now much bigger, with several gigabytes (most of them). Cheers, Roni Balthazar On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote: After some diagnostics (we didn't set cold_reads_to_omit yet). Compactions are running but VERY slowly, with idle IO. We had a lot of Data files in Cassandra. In DC_A it is about ~12 (only xxx-Data.db); DC_B has only ~4000. I don't know if this changes anything, but: 1) in DC_A the avg size of a Data.db file is ~13 mb. I have a few really big ones, but most are really small (almost 1 files are less than 100mb). 2) in DC_B the avg size of a Data.db is much bigger, ~260mb. Do you think that the above flag will help us? On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote: I set setcompactionthroughput 999 permanently and it doesn't change anything. IO is still the same. CPU is idle. On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can run nodetool compactionstats to view statistics on compactions. Setting cold_reads_to_omit to 0.0 can help to reduce the number of SSTables when you use Size-Tiered compaction. You can also create a cron job to increase the value of setcompactionthroughput during the night or when your IO is not busy. 
From http://wiki.apache.org/cassandra/NodeTool: 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16 Cheers, Roni Balthazar On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote: One thing I do not understand. In my case compaction is running permanently. Is there a way to check which compactions are pending? The only information is about the total count. On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote: Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build is available from http://cassci.datastax.com/job/cassandra-2.1/ I read about cold_reads_to_omit. It looks promising. Should I also set the compaction throughput? p.s. I am really sad that I didn't read this before: https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ On Monday, February 16, 2015, Carlos Rolo r...@pythian.com wrote: Hi, 100% in agreement with Roland; the 2.1.x series is a pain! I would never recommend the current 2.1.x series for production. Clocks are a pain, and check your connectivity! Also check tpstats to see if your threadpools are being overrun. Regards, Carlos Juzarte Rolo Cassandra Consultant Pythian - Love your data rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo Tel: 1649 www.pythian.com On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer r.etzenham...@t-online.de wrote: Hi, 1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested by Al Tobey from DataStax) 7) minimal reads (usually none, sometimes a few) Those two points keep me repeating an answer I got. First, where did you get 2.1.3 from? Maybe I missed it; I will have a look. But if it is 2.1.2, which is the latest released version, that version has many bugs - most of them I got kicked by while testing 2.1.2. I got many problems with compactions not being triggered on column families not being read, and compactions and repairs not completing. 
See https://www.mail-archive.com/search?l=user@cassandra.apache.orgq=subject:%22Re%3A+Compaction+failing+to+trigger%22o=newestf=1 https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html Apart from that, how are those two datacenters connected? Maybe there is a bottleneck. Also, do you have ntp up and running on all nodes to keep all clocks in tight sync? Note: I'm no expert (yet) - just sharing my 2 cents. Cheers, Roland --
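Roland's clock-sync point matters because Cassandra resolves writes by timestamp. A small sketch for flagging skew across nodes; the hostnames are placeholders and the skew values are canned to keep the demo self-contained (on real nodes you would obtain them from your time source, e.g. ntp query output):

```shell
#!/bin/sh
# Flag clock skew beyond a tolerance. The 500 ms tolerance is an
# illustrative choice, not an official recommendation.
TOLERANCE_MS=500
report_skew() {  # args: host skew_ms (may be negative)
  abs="${2#-}"   # strip a leading minus to get the magnitude
  if [ "$abs" -gt "$TOLERANCE_MS" ]; then
    echo "$1: skew ${2}ms EXCEEDS ${TOLERANCE_MS}ms tolerance"
  else
    echo "$1: skew ${2}ms ok"
  fi
}
report_skew node1 12
report_skew node2 -730
```

Any node reported over tolerance should have its ntp daemon checked before trusting repair or timestamp-dependent behaviour on it.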
Re: Many pending compactions
1) We tried to run repairs but they usually do not succeed. But we had Leveled compaction before. Last week we ALTERed the tables to STCS, because the guys from DataStax suggested that we should not use Leveled and should alter the tables to STCS, because we don't have SSDs. After this change we did not run any repair. Anyway, I don't think it will change anything in the SSTable count - if I am wrong, please let me know. 2) I did this. My tables are 99% write-only. It is an audit system. 3) Yes, I am using default values. 4) In both operations I am using LOCAL_QUORUM. I am almost sure that the READ timeouts happen because of too many SSTables. Anyway, first I would like to fix the too many pending compactions. I still don't know how to speed them up.

On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Are you running repairs within gc_grace_seconds? (default is 10 days) http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html Double check if you set cold_reads_to_omit to 0.0 on tables with STCS that you do not read often. Are you using default values for the properties min_compaction_threshold (4) and max_compaction_threshold (32)? Which Consistency Level are you using for read operations? Check that you are not reading from DC_B due to your Replication Factor and CL. http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html Cheers, Roni Balthazar

On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote: I don't have problems with DC_B (replica); only in DC_A (my system writes only to it) do I have read timeouts. I checked the SSTable count in OpsCenter and I have: 1) in DC_A about the same +-10% for the last week, a small increase over the last 24h (it is more than 15000-2 SSTables depending on the node) 2) in DC_B the last 24h shows up to a 50% decrease, which is a good prognosis. Now I have fewer than 1000 SSTables. What did you measure during system optimizations? Or do you have an idea what more I should check?
1) I look at CPU idle (one node is 50% idle, the rest 70% idle) 2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are spikes. 3) System RAM usage is almost full. 4) In Total Bytes Compacted most lines are below 3MB/s. For DC_A the total is less than 10MB/s; in DC_B it looks much better (avg is about 17MB/s). Anything else?

On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can check if the number of SSTables is decreasing. Look for the SSTable count information for your tables using nodetool cfstats. The compaction history can be viewed using nodetool compactionhistory. About the timeouts, check this out: http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure Also try running nodetool tpstats to see the thread pool statistics. It can tell you whether you are having performance problems. If you have too many pending tasks or dropped messages, you may need to tune your system (eg: driver timeouts, concurrent reads and so on). Regards, Roni Balthazar

On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote: Hi, Thanks for your tip. It looks like something changed - I still don't know if it is ok. My nodes started to do more compaction, but it looks like some compactions are really slow. IO is idle, CPU is quite ok (30%-40%). We set compactionthroughput to 999, but I do not see a difference. Can we check something more? Or do you have any method to monitor progress with small files? Regards

On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was the solution... The number of SSTables decreased from many thousands to a number below a hundred, and the SSTables are now much bigger, several gigabytes each (most of them). Cheers, Roni Balthazar

On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote: After some diagnostics (we didn't set cold_reads_to_omit yet):
Compactions are running but VERY slowly, with idle IO. We have a lot of Data files in Cassandra. In DC_A it is about ~12 (only xxx-Data.db); DC_B has only ~4000. I don't know if this changes anything, but: 1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few really big ones, but most are really small (almost 1 files are less than 100 MB). 2) in DC_B the avg size of a Data.db file is much bigger, ~260 MB. Do you think the above flag will help us?

On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote: I set setcompactionthroughput 999 permanently and it doesn't change anything. IO is still the same. CPU is idle.

On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
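As an aside for readers following this thread: the per-DC SSTable counts being compared above can be reproduced by counting the *-Data.db files on disk. A minimal sketch; on a real node you would point DATA_DIR at the node's data directory (commonly /var/lib/cassandra/data - an assumption about the install layout), but for a self-contained demo it fakes a few files in a temporary directory:

```shell
# Count *-Data.db files, as in the DC_A vs DC_B comparison in the thread.
# Assumption: SSTable data files end in "-Data.db"; we fake a few in a temp dir.
DATA_DIR=$(mktemp -d)
for i in 1 2 3 4; do
  printf 'x' > "$DATA_DIR/keyspace-table-ka-$i-Data.db"
done
count=$(find "$DATA_DIR" -name '*-Data.db' | wc -l | tr -d ' ')
echo "SSTable data files: $count"
rm -rf "$DATA_DIR"
```

Running this against the real data directory on each node gives the count that OpsCenter reports, which makes it easy to watch the number fall as compactions catch up.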
Re: Adding new node to cluster
Hello, I have decommissioned a node, deleted the data, commitlog and saved caches, changed the yaml file to not include its own IP, and started it. For some reason I do not fully understand, OpsCenter says that the node is in an unknown datacenter. Nodetool says UJ but shows ? in the Owns column. I started the node yesterday. I still see streams towards this node, so can I assume that once it finishes joining, OpsCenter will see it properly? I remember that I might have set up the entire initial cluster with the nodes' own IPs in the yaml file. To bring the cluster to a valid state, I assume that I have to decommission the nodes one by one, delete the data, and restart with correct yaml settings. Is this correct? Also a change of the cluster name would be nice. How can this be done with minimal impact?

On Wednesday, February 18, 2015 12:56 AM, Eric Stevens migh...@gmail.com wrote: "Seed nodes apparently don't bootstrap" That's right: if a node has itself in its own seeds list, it assumes it's a foundational member of the cluster, and it will join immediately with no bootstrap. If you've done this by accident, you should run nodetool decommission on that node, and when it has fully left the cluster, wipe its data directory, edit the yaml and remove it from the seeds list.

On Tue, Feb 17, 2015 at 3:25 PM, sean_r_dur...@homedepot.com wrote: SimpleSnitch is not rack aware. You would want to choose seed nodes and then not change them. Seed nodes apparently don't bootstrap. All nodes need the same seeds in the yaml file. Here is more info: http://www.datastax.com/documentation/cassandra/2.0/cassandra/initialize/initializeSingleDS.html Sean Durity - Cassandra Admin, Big Data Team To engage the team, create a request

From: Batranut Bogdan [mailto:batra...@yahoo.com] Sent: Tuesday, February 17, 2015 3:28 PM To: user@cassandra.apache.org; reynald.bourtembo...@esrf.fr Subject: Re: Adding new node to cluster Hello, I use SimpleSnitch. All the nodes are in the same datacenter.
Not sure if all are in the same rack.

On Tuesday, February 17, 2015 8:53 PM, sean_r_dur...@homedepot.com wrote: What snitch are you using? You may need to do some work on your topology file (or rackdc) to make sure you have the topology you want. Also, it is possible you may need to restart the OpsCenter agents and/or your browser to see the nodes represented properly in OpsCenter. Sean Durity - Cassandra Admin, Home Depot

From: Batranut Bogdan [mailto:batra...@yahoo.com] Sent: Tuesday, February 17, 2015 10:20 AM To: user@cassandra.apache.org; reynald.bourtembo...@esrf.fr Subject: Re: Adding new node to cluster Hello, I know that UN is good, but what troubles me is the addition of the node's own IP in its yaml seeds section.

On Tuesday, February 17, 2015 3:40 PM, Reynald Bourtembourg reynald.bourtembo...@esrf.fr wrote: Hi Bogdan, In nodetool status: - UJ means your node is Up and Joining - UN means your node is Up and in Normal state. UN in nodetool is good ;-)

On 17/02/2015 13:56, Batranut Bogdan wrote: Hello all, I have an existing cluster. When adding a new node, I saw that OpsCenter saw the node in an unknown cluster. In the yaml, the cluster name is the same. So I stopped the node and added its own IP address to the list of seeds. Now OpsCenter sees my node. But nodetool status now sees it as UN, instead of UJ as when it first started. One other mention: even if I stop the node and remove its IP from the list of seeds, OpsCenter sees the node in the known cluster but nodetool sees it as UN. I am not sure what the implications of adding a node's IP to its own seed list are, and I think I might have done the same for the existing nodes. E.g. started with its IP in the seed list, but after removing it and having to restart the nodes for whatever reason, I did not see any changes. Is my cluster ok, or what do I need to do to bring the cluster to a good state? Thank you.
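The recovery procedure Eric describes (decommission each node that bootstrapped with itself as a seed, wipe its data, fix the yaml, rejoin) can be sketched as a per-node checklist. This is a dry-run illustration only: the node names, ssh access, and Cassandra paths are placeholders for your environment, and the yaml edit is a manual step.

```shell
# Dry-run plan for fixing nodes that have themselves in their own seeds list.
# NODES and the Cassandra paths are placeholders; this only prints the plan.
NODES="node1 node2 node3"
planned=0
for host in $NODES; do
  echo "nodetool -h $host decommission        # wait until the node has fully left the ring"
  echo "ssh $host rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog /var/lib/cassandra/saved_caches"
  echo "ssh $host vi /etc/cassandra/cassandra.yaml   # remove $host from its own seeds list"
  echo "ssh $host service cassandra start     # node now bootstraps normally"
  planned=$((planned + 1))
done
echo "nodes to fix: $planned"
```

Doing this one node at a time matters: each node must finish streaming and rejoin fully before the next one is decommissioned, or you risk being under-replicated.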
Re: Many pending compactions
Are you running repairs within gc_grace_seconds? (default is 10 days) http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html Double check if you set cold_reads_to_omit to 0.0 on tables with STCS that you do not read often. Are you using default values for the properties min_compaction_threshold (4) and max_compaction_threshold (32)? Which Consistency Level are you using for read operations? Check that you are not reading from DC_B due to your Replication Factor and CL. http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html Cheers, Roni Balthazar

On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote: I don't have problems with DC_B (replica); only in DC_A (my system writes only to it) do I have read timeouts. I checked the SSTable count in OpsCenter and I have: 1) in DC_A about the same +-10% for the last week, a small increase over the last 24h (it is more than 15000-2 SSTables depending on the node) 2) in DC_B the last 24h shows up to a 50% decrease, which is a good prognosis. Now I have fewer than 1000 SSTables. What did you measure during system optimizations? Or do you have an idea what more I should check? 1) I look at CPU idle (one node is 50% idle, the rest 70% idle) 2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are spikes. 3) System RAM usage is almost full. 4) In Total Bytes Compacted most lines are below 3MB/s. For DC_A the total is less than 10MB/s; in DC_B it looks much better (avg is about 17MB/s). Anything else?

On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can check if the number of SSTables is decreasing. Look for the SSTable count information for your tables using nodetool cfstats. The compaction history can be viewed using nodetool compactionhistory. About the timeouts, check this out: http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure Also try running nodetool tpstats to see the thread pool statistics.
It can tell you whether you are having performance problems. If you have too many pending tasks or dropped messages, you may need to tune your system (eg: driver timeouts, concurrent reads and so on). Regards, Roni Balthazar

On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote: Hi, Thanks for your tip. It looks like something changed - I still don't know if it is ok. My nodes started to do more compaction, but it looks like some compactions are really slow. IO is idle, CPU is quite ok (30%-40%). We set compactionthroughput to 999, but I do not see a difference. Can we check something more? Or do you have any method to monitor progress with small files? Regards

On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was the solution... The number of SSTables decreased from many thousands to a number below a hundred, and the SSTables are now much bigger, several gigabytes each (most of them). Cheers, Roni Balthazar

On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote: After some diagnostics (we didn't set cold_reads_to_omit yet): Compactions are running but VERY slowly, with idle IO. We have a lot of Data files in Cassandra. In DC_A it is about ~12 (only xxx-Data.db); DC_B has only ~4000. I don't know if this changes anything, but: 1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few really big ones, but most are really small (almost 1 files are less than 100 MB). 2) in DC_B the avg size of a Data.db file is much bigger, ~260 MB. Do you think the above flag will help us?

On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote: I set setcompactionthroughput 999 permanently and it doesn't change anything. IO is still the same. CPU is idle.

On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can run nodetool compactionstats to view statistics on compactions.
Setting cold_reads_to_omit to 0.0 can help reduce the number of SSTables when you use Size-Tiered compaction. You can also create a cron job to increase the value of setcompactionthroughput during the night or when your IO is not busy. From http://wiki.apache.org/cassandra/NodeTool:

0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16

Cheers, Roni Balthazar

On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote: One thing I do not understand: in my case compaction is running permanently. Is there a way to check which compaction is pending? The only information available is the total count. On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote:
Re: Many pending compactions
You are right... Repair makes the data consistent between nodes. I understand that you have 2 issues going on. You need to run repair periodically without errors, and you need to decrease the number of pending compactions. So I suggest: 1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can use incremental repairs. There were some bugs in 2.1.2. 2) Run cleanup on all nodes. 3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0, and increase setcompactionthroughput for some time and see if the number of SSTables goes down. Let us know what errors you are getting when running repairs. Regards, Roni Balthazar

On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam ptrstp...@gmail.com wrote: Can you explain to me what the correlation between growing SSTables and repair is? I was sure, until your mail, that repair is only to make data consistent between nodes. Regards

On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Which error are you getting when running repairs? You need to run repair on your nodes within gc_grace_seconds (eg: weekly). They have data that is not read frequently. You can run repair -pr on all nodes. Since you do not have deletes, you will not have trouble with that. If you have deletes, it's better to increase gc_grace_seconds before the repair. http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html After the repair, try running a nodetool cleanup. Check if the number of SSTables goes down after that... Pending compactions must decrease as well... Cheers, Roni Balthazar

On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote: 1) We tried to run repairs but they usually do not succeed. But we had Leveled compaction before. Last week we ALTERed the tables to STCS, because the guys from DataStax suggested that we should not use Leveled and should alter the tables to STCS, because we don't have SSDs. After this change we did not run any repair.
Anyway, I don't think it will change anything in the SSTable count - if I am wrong, please let me know. 2) I did this. My tables are 99% write-only. It is an audit system. 3) Yes, I am using default values. 4) In both operations I am using LOCAL_QUORUM. I am almost sure that the READ timeouts happen because of too many SSTables. Anyway, first I would like to fix the too many pending compactions. I still don't know how to speed them up.

On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Are you running repairs within gc_grace_seconds? (default is 10 days) http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html Double check if you set cold_reads_to_omit to 0.0 on tables with STCS that you do not read often. Are you using default values for the properties min_compaction_threshold (4) and max_compaction_threshold (32)? Which Consistency Level are you using for read operations? Check that you are not reading from DC_B due to your Replication Factor and CL. http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html Cheers, Roni Balthazar

On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote: I don't have problems with DC_B (replica); only in DC_A (my system writes only to it) do I have read timeouts. I checked the SSTable count in OpsCenter and I have: 1) in DC_A about the same +-10% for the last week, a small increase over the last 24h (it is more than 15000-2 SSTables depending on the node) 2) in DC_B the last 24h shows up to a 50% decrease, which is a good prognosis. Now I have fewer than 1000 SSTables. What did you measure during system optimizations? Or do you have an idea what more I should check? 1) I look at CPU idle (one node is 50% idle, the rest 70% idle) 2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are spikes. 3) System RAM usage is almost full. 4) In Total Bytes Compacted most lines are below 3MB/s.
For DC_A the total is less than 10MB/s; in DC_B it looks much better (avg is about 17MB/s). Anything else?

On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can check if the number of SSTables is decreasing. Look for the SSTable count information for your tables using nodetool cfstats. The compaction history can be viewed using nodetool compactionhistory. About the timeouts, check this out: http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure Also try running nodetool tpstats to see the thread pool statistics. It can tell you whether you are having performance problems. If you have too many pending tasks or dropped messages, you may need to tune your system (eg: driver timeouts, concurrent reads and so on). Regards, Roni Balthazar
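The rolling "repair -pr" that Roni recommends across all nodes can be scripted. A minimal sketch with placeholder hostnames, written in dry-run mode so it only prints the commands; flip DRY_RUN to 0 to actually run them one node at a time:

```shell
# Rolling primary-range repair, one node at a time, as suggested in the thread.
# NODES is a placeholder list; set DRY_RUN=0 to actually run the commands.
DRY_RUN=1
NODES="node1 node2 node3"
for host in $NODES; do
  cmd="nodetool -h $host repair -pr"
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "$cmd"
  else
    $cmd || { echo "repair failed on $host" >&2; exit 1; }
  fi
done
```

Because -pr repairs only each node's primary range, the loop must cover every node for the whole ring to be repaired; stopping on the first failure (as above) makes it easy to see which node's repair needs investigating.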
Re: Many pending compactions
Hi, You can check if the number of SSTables is decreasing. Look for the SSTable count information for your tables using nodetool cfstats. The compaction history can be viewed using nodetool compactionhistory. About the timeouts, check this out: http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure Also try running nodetool tpstats to see the thread pool statistics. It can tell you whether you are having performance problems. If you have too many pending tasks or dropped messages, you may need to tune your system (eg: driver timeouts, concurrent reads and so on). Regards, Roni Balthazar

On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote: Hi, Thanks for your tip. It looks like something changed - I still don't know if it is ok. My nodes started to do more compaction, but it looks like some compactions are really slow. IO is idle, CPU is quite ok (30%-40%). We set compactionthroughput to 999, but I do not see a difference. Can we check something more? Or do you have any method to monitor progress with small files? Regards

On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was the solution... The number of SSTables decreased from many thousands to a number below a hundred, and the SSTables are now much bigger, several gigabytes each (most of them). Cheers, Roni Balthazar

On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote: After some diagnostics (we didn't set cold_reads_to_omit yet): Compactions are running but VERY slowly, with idle IO. We have a lot of Data files in Cassandra. In DC_A it is about ~12 (only xxx-Data.db); DC_B has only ~4000. I don't know if this changes anything, but: 1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few really big ones, but most are really small (almost 1 files are less than 100 MB). 2) in DC_B the avg size of a Data.db file is much bigger, ~260 MB.
Do you think the above flag will help us?

On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote: I set setcompactionthroughput 999 permanently and it doesn't change anything. IO is still the same. CPU is idle.

On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can run nodetool compactionstats to view statistics on compactions. Setting cold_reads_to_omit to 0.0 can help reduce the number of SSTables when you use Size-Tiered compaction. You can also create a cron job to increase the value of setcompactionthroughput during the night or when your IO is not busy. From http://wiki.apache.org/cassandra/NodeTool:

0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16

Cheers, Roni Balthazar

On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote: One thing I do not understand: in my case compaction is running permanently. Is there a way to check which compaction is pending? The only information available is the total count.

On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote: Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build is available from http://cassci.datastax.com/job/cassandra-2.1/ I read about cold_reads_to_omit. It looks promising. Should I also set compaction throughput? p.s. I am really sad that I didn't read this before: https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

On Monday, February 16, 2015, Carlos Rolo r...@pythian.com wrote: Hi, 100% in agreement with Roland, the 2.1.x series is a pain! I would never recommend the current 2.1.x series for production. Clocks are a pain, and check your connectivity! Also check tpstats to see if your threadpools are being overrun.
Regards, Carlos Juzarte Rolo Cassandra Consultant Pythian - Love your data rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo Tel: 1649 www.pythian.com

On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer r.etzenham...@t-online.de wrote: Hi, "1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested by Al Tobey from DataStax)" "7) minimal reads (usually none, sometimes few)" Those two points keep me repeating an answer I got. First, where did you get 2.1.3 from? Maybe I missed it, I will have a look. But if it is 2.1.2, which is the latest released version, that version has many bugs - most of them I got kicked by while testing 2.1.2. I got many problems with compactions not being triggered on column families not being read, and compactions and repairs not being completed. See
Re: Many pending compactions
Which error are you getting when running repairs? You need to run repair on your nodes within gc_grace_seconds (eg: weekly). They have data that is not read frequently. You can run repair -pr on all nodes. Since you do not have deletes, you will not have trouble with that. If you have deletes, it's better to increase gc_grace_seconds before the repair. http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html After the repair, try running a nodetool cleanup. Check if the number of SSTables goes down after that... Pending compactions must decrease as well... Cheers, Roni Balthazar

On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote: 1) We tried to run repairs but they usually do not succeed. But we had Leveled compaction before. Last week we ALTERed the tables to STCS, because the guys from DataStax suggested that we should not use Leveled and should alter the tables to STCS, because we don't have SSDs. After this change we did not run any repair. Anyway, I don't think it will change anything in the SSTable count - if I am wrong, please let me know. 2) I did this. My tables are 99% write-only. It is an audit system. 3) Yes, I am using default values. 4) In both operations I am using LOCAL_QUORUM. I am almost sure that the READ timeouts happen because of too many SSTables. Anyway, first I would like to fix the too many pending compactions. I still don't know how to speed them up.

On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Are you running repairs within gc_grace_seconds? (default is 10 days) http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html Double check if you set cold_reads_to_omit to 0.0 on tables with STCS that you do not read often. Are you using default values for the properties min_compaction_threshold (4) and max_compaction_threshold (32)? Which Consistency Level are you using for read operations?
Check that you are not reading from DC_B due to your Replication Factor and CL. http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html Cheers, Roni Balthazar

On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote: I don't have problems with DC_B (replica); only in DC_A (my system writes only to it) do I have read timeouts. I checked the SSTable count in OpsCenter and I have: 1) in DC_A about the same +-10% for the last week, a small increase over the last 24h (it is more than 15000-2 SSTables depending on the node) 2) in DC_B the last 24h shows up to a 50% decrease, which is a good prognosis. Now I have fewer than 1000 SSTables. What did you measure during system optimizations? Or do you have an idea what more I should check? 1) I look at CPU idle (one node is 50% idle, the rest 70% idle) 2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are spikes. 3) System RAM usage is almost full. 4) In Total Bytes Compacted most lines are below 3MB/s. For DC_A the total is less than 10MB/s; in DC_B it looks much better (avg is about 17MB/s). Anything else?

On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, You can check if the number of SSTables is decreasing. Look for the SSTable count information for your tables using nodetool cfstats. The compaction history can be viewed using nodetool compactionhistory. About the timeouts, check this out: http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure Also try running nodetool tpstats to see the thread pool statistics. It can tell you whether you are having performance problems. If you have too many pending tasks or dropped messages, you may need to tune your system (eg: driver timeouts, concurrent reads and so on). Regards, Roni Balthazar

On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote: Hi, Thanks for your tip. It looks like something changed - I still don't know if it is ok.
My nodes started to do more compaction, but it looks like some compactions are really slow. IO is idle, CPU is quite ok (30%-40%). We set compactionthroughput to 999, but I do not see a difference. Can we check something more? Or do you have any method to monitor progress with small files? Regards

On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com wrote: Hi, Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was the solution... The number of SSTables decreased from many thousands to a number below a hundred, and the SSTables are now much bigger, several gigabytes each (most of them). Cheers, Roni Balthazar

On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote: After some diagnostics (we didn't set cold_reads_to_omit yet): Compactions are running but VERY slow
RE: Data tiered compaction and data model question
What is the maximum number of events that you expect in a day? What is the worst-case scenario? Mohammed

From: cass savy [mailto:casss...@gmail.com] Sent: Wednesday, February 18, 2015 4:21 PM To: user@cassandra.apache.org Subject: Data tiered compaction and data model question

We want to track events in a log CF/table and should be able to query for events that occurred in a range of minutes or hours for a given day. Multiple events can occur in a given minute. I listed 2 table designs and am leaning towards Table 1 to avoid a large wide row. Please advise.

Table 1: not a very wide row; still able to query for a range of minutes for a given day, and/or a given day and a range of hours:

Create table log_Event ( event_day text, event_hr int, event_time timeuuid, data text, PRIMARY KEY ( (event_day,event_hr), event_time ) )

Table 2: this will be a very wide row:

Create table log_Event ( event_day text, event_time timeuuid, data text, PRIMARY KEY ( event_day, event_time ) )

Date-tiered compaction: recommended for time series data as per the doc below. Our data will be kept only for 30 days, hence the thought of using this compaction strategy. http://www.datastax.com/dev/blog/datetieredcompactionstrategy I created table 1 listed above with this compaction strategy, added some rows and did a manual flush. I do not see any sstables created yet. Is that expected? compaction={'max_sstable_age_days': '1', 'class': 'DateTieredCompactionStrategy'}
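For reference, Table 1 written out as a single statement with the compaction options quoted at the end of the question (the option values are the ones from the original post, not a recommendation):

```cql
-- Table 1: partition key (event_day, event_hr) bounds partition width;
-- event_time clusters events within each hour.
CREATE TABLE log_event (
    event_day  text,
    event_hr   int,
    event_time timeuuid,
    data       text,
    PRIMARY KEY ((event_day, event_hr), event_time)
) WITH compaction = {
    'class': 'DateTieredCompactionStrategy',
    'max_sstable_age_days': '1'
};
```

With a fixed 30-day retention, the linked DateTieredCompactionStrategy blog post is worth reading before settling on max_sstable_age_days, since DTCS is designed to pair with time-ordered, TTL-style expiry.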
Logging client ID for YCSB workloads on Cassandra?
Hi, I'd like to log the client ID for every operation performed by YCSB on my Cassandra cluster. The purpose is to identify and analyze consistency measures other than eventual consistency. I wanted to know if people have done something similar in the past. Or am I missing something really basic here? Please let me know if you need more information. Thanks — Jatin Ganhotra
Data tiered compaction and data model question
We want to track events in a log CF/table and should be able to query for events that occurred in a range of minutes or hours for a given day. Multiple events can occur in a given minute. I listed 2 table designs and am leaning towards Table 1 to avoid a large wide row. Please advise.

Table 1: not a very wide row; still able to query for a range of minutes for a given day, and/or a given day and a range of hours:

Create table log_Event ( event_day text, event_hr int, event_time timeuuid, data text, PRIMARY KEY ( (event_day,event_hr), event_time ) )

Table 2: this will be a very wide row:

Create table log_Event ( event_day text, event_time timeuuid, data text, PRIMARY KEY ( event_day, event_time ) )

Date-tiered compaction: recommended for time series data as per the doc below. Our data will be kept only for 30 days, hence the thought of using this compaction strategy. http://www.datastax.com/dev/blog/datetieredcompactionstrategy I created table 1 listed above with this compaction strategy, added some rows and did a manual flush. I do not see any sstables created yet. Is that expected? compaction={'max_sstable_age_days': '1', 'class': 'DateTieredCompactionStrategy'}
Re: run cassandra on a small instance
2.1.2 is IMO broken and should not be used for any purpose. Use 2.1.1 or 2.1.3. https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ =Rob

Cool man. Thanks for the info. I just upgraded to 2.1.3. We'll see how that goes. I can let you know more once it's been running for a while. Thanks, Tim

On Wed, Feb 18, 2015 at 8:16 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy bluethu...@gmail.com wrote: I'm attempting to run Cassandra 2.1.2 on a smallish 2 GB RAM instance over at Digital Ocean. It's a CentOS 7 host. 2.1.2 is IMO broken and should not be used for any purpose. Use 2.1.1 or 2.1.3. https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ =Rob -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: run cassandra on a small instance
Robert, Let me know if I’m off base about this—but I feel like I see a lot of posts that are like this (i.e., use this arbitrary version, not this other arbitrary version). Why are releases going out if they’re “broken”? This seems like a very confusing way for new (and existing) users to approach versions... Andrew On February 18, 2015 at 5:16:27 PM, Robert Coli (rc...@eventbrite.com) wrote: On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy bluethu...@gmail.com wrote: I'm attempting to run Cassandra 2.1.2 on a smallish 2.GB ram instance over at Digital Ocean. It's a CentOS 7 host. 2.1.2 is IMO broken and should not be used for any purpose. Use 2.1.1 or 2.1.3. https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ =Rob