Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Try repair -pr on all nodes.

If after that you still have issues, you can try to rebuild the SSTables using 
nodetool upgradesstables or scrub.
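As a sketch of that sequence (the hostnames node1..node3 are placeholders; run against each node in turn):

```shell
# Primary-range repair on every node first (hypothetical hostnames)
for host in node1 node2 node3; do
    nodetool -h "$host" repair -pr
done

# If SSTables still look wrong afterwards, rewrite them on each node:
# nodetool -h node1 upgradesstables    # or: nodetool -h node1 scrub
```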

Regards,

Roni Balthazar

 On 18/02/2015, at 14:13, Ja Sam ptrstp...@gmail.com wrote:
 
 ad 3)  I did this already yesterday (setcompactionthroughput also). But the 
 SSTables are still increasing.
 
 ad 1) What do you think: should I use -pr, or try incremental repair?
 
 
 
 On Wed, Feb 18, 2015 at 4:54 PM, Roni Balthazar ronibaltha...@gmail.com 
 wrote:
 You are right... Repair makes the data consistent between nodes.
 
 I understand that you have 2 issues going on.
 
 You need to run repair periodically without errors, and you need to decrease 
 the number of pending compactions.
 
 So I suggest:
 
 1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can use 
 incremental repairs. There were some bugs in 2.1.2.
 2) Run cleanup on all nodes.
 3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0 and 
 increase setcompactionthroughput for some time, then see if the number of 
 SSTables goes down.
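The three steps might look like this as commands - a sketch only; ks.events is a placeholder table name, and note that altering any compaction subproperty in CQL requires re-stating the 'class':

```shell
# 1) Primary-range repair, one node at a time
nodetool repair -pr

# 2) Drop data the node no longer owns
nodetool cleanup

# 3) Stop omitting cold SSTables from compaction (placeholder table name),
#    then temporarily raise the compaction throughput cap (MB/s)
cqlsh -e "ALTER TABLE ks.events WITH compaction = {'class': 'SizeTieredCompactionStrategy', 'cold_reads_to_omit': 0.0};"
nodetool setcompactionthroughput 999
```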
 
 Let us know what errors you are getting when running repairs.
 
 Regards,
 
 Roni Balthazar
 
 
 On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam ptrstp...@gmail.com wrote:
 Can you explain to me the correlation between growing SSTables and 
 repair? 
 Until your mail, I was sure that repair only makes data consistent 
 between nodes.
 
 Regards
 
 
 
 On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar ronibaltha...@gmail.com 
 wrote:
 Which error are you getting when running repairs?
 You need to run repair on your nodes within gc_grace_seconds (e.g.
 weekly). They hold data that is not read frequently. You can run
 repair -pr on all nodes. Since you do not have deletes, you will not
 have trouble with that. If you do have deletes, it's better to increase
 gc_grace_seconds before the repair.
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 After repair, try to run a nodetool cleanup.
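If deletes are in play, raising gc_grace_seconds before a late repair could look like this sketch; ks/events are placeholder names and 1728000 (20 days) is an arbitrary example value:

```shell
# Check the current value (2.x schema tables), then raise it before repairing
cqlsh -e "SELECT gc_grace_seconds FROM system.schema_columnfamilies WHERE keyspace_name='ks' AND columnfamily_name='events';"
cqlsh -e "ALTER TABLE ks.events WITH gc_grace_seconds = 1728000;"
nodetool repair -pr
nodetool cleanup
```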
 
 Check if the number of SSTables goes down after that... Pending
 compactions must decrease as well...
 
 Cheers,
 
 Roni Balthazar
 
 
 
 
 On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote:
  1) We tried to run repairs, but they usually do not succeed. We had
  Leveled compaction before. Last week we ALTERed the tables to STCS, because
  the guys from DataStax suggested we should not use Leveled compaction and
  should switch the tables to STCS, because we don't have SSDs. After this
  change we did not run any repair. Anyway, I don't think it will change
  anything in the SSTable count - if I am wrong, please let me know.
 
  2) I did this. My tables are 99% write only. It is audit system
 
  3) Yes I am using default values
 
  4) In both operations I am using LOCAL_QUORUM.
 
  I am almost sure that the READ timeouts happen because of too many SSTables.
  Anyway, first I would like to fix the too-many-pending-compactions problem.
  I still don't know how to speed them up.
 
 
  On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com
  wrote:
 
  Are you running repairs within gc_grace_seconds? (default is 10 days)
 
  http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 
  Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
  that you do not read often.
 
  Are you using default values for the properties
  min_compaction_threshold(4) and max_compaction_threshold(32)?
 
  Which Consistency Level are you using for reading operations? Check if
  you are not reading from DC_B due to your Replication Factor and CL.
 
  http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
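The thresholds and consistency level mentioned above can be inspected like this (a sketch; ks.events is a placeholder table):

```shell
# Compaction settings per table (min 4 / max 32 are the defaults)
cqlsh -e "DESCRIBE TABLE ks.events;" | grep -i compaction

# Inside cqlsh, the session consistency level can be shown and changed:
#   CONSISTENCY;               -- show the current level
#   CONSISTENCY LOCAL_QUORUM;  -- restrict reads to the local DC's replicas
```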
 
 
  Cheers,
 
  Roni Balthazar
 
  On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
   I don't have problems with DC_B (the replica); only in DC_A (my system
   writes only to it) do I have read timeouts.
  
   I checked the SSTable count in OpsCenter and I have:
   1) in DC_A it is the same ±10% for the last week, with a small increase
   in the last 24h (it is more than 15000-2 SSTables, depending on the node)
   2) in DC_B the last 24h shows up to a 50% decrease, which gives a nice
   prognosis. Now I have less than 1000 SSTables
  
   What did you measure during system optimizations? Or do you have an
   idea what more I should check?
   1) I looked at CPU idle (one node is 50% idle, the rest 70% idle)
   2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are
   spikes
   3) System RAM usage is almost full
   4) In Total Bytes Compacted most lines are below 3MB/s. For DC_A in
   total it is less than 10MB/s; in DC_B it looks much better (avg is
   about 17MB/s)
  
   something else?
  
  
  
   On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   Hi,
  
   You can check if the number of SSTables is decreasing. Look for the
  SSTable count information of your tables using nodetool cfstats.

Re: Many pending compactions

2015-02-18 Thread Ja Sam
As Al Tobey suggested, I upgraded my 2.1.0 to a snapshot version of 2.1.3. I
have now installed exactly this build:
https://cassci.datastax.com/job/cassandra-2.1/912/
I see many compactions which complete, but some of them are really slow.
Maybe I should send some stats from OpsCenter or the servers? But it is
difficult for me to choose what is important.

Regards



On Wed, Feb 18, 2015 at 6:11 PM, Jake Luciani jak...@gmail.com wrote:

 Ja, please upgrade to the official 2.1.3; we've fixed many things related to
 compaction. Are you seeing the compactions' % complete progress at all?


Re: Many pending compactions

2015-02-18 Thread Jake Luciani
Ja, please upgrade to the official 2.1.3; we've fixed many things related to
compaction. Are you seeing the compactions' % complete progress at all?


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread Robert Stupp
The "natural" dependency of Cassandra is the JRE (not the JDK) - e.g. in the 
Debian package.
You should be safe using the JRE instead of the JDK.

If you're asking whether to use a non-Oracle JVM - the answer would be: use the 
Oracle JVM.
OpenJDK might work, but I would not recommend it.


 On 18.02.2015 at 20:49, cass savy casss...@gmail.com wrote:
 
 Can we install the Oracle JDK instead of the JRE on Cassandra servers? We 
 have a few clusters that have been running the JDK since we upgraded to C* 2.0. 
 
 Is there any known issue or impact with using the JDK vs the JRE?
 What is the reason not to use the Oracle JDK on C* servers?
 Is there any performance impact?
 
 Please advise.
  

—
Robert Stupp
@snazy



Re: Deleting Statistics.db at startup

2015-02-18 Thread Robert Coli
On Wed, Feb 18, 2015 at 4:02 AM, Tomer Pearl tomer.pe...@contextream.com
wrote:

  My question is: what are the consequences of deleting this file every time
 the node starts up? Performance-wise or otherwise.


You waste the time Cassandra spends to regenerate it.

I personally would not institute an operational practice whereby I
regularly purged these files to avoid OOM.

=Rob


C* 2.1.2 invokes oom-killer

2015-02-18 Thread Michał Łowicki
Hi,

A couple of times a day, 2 out of the 4 nodes in the cluster are killed:

root@db4:~# dmesg | grep -i oom
[4811135.792657] [ pid ]   uid  tgid total_vm  rss cpu oom_adj
oom_score_adj name
[6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0,
oom_adj=0, oom_score_adj=0

Nodes are using 8GB heap (confirmed with *nodetool info*) and aren't using
row cache.

Noticed that a couple of times a day the used RSS grows really fast within a
couple of minutes, and I see CPU spikes at the same time -
https://www.dropbox.com/s/khco2kdp4qdzjit/Screenshot%202015-02-18%2015.10.54.png?dl=0
.

It could be related to compaction, but after compaction finishes the used RSS
doesn't shrink. Output from pmap when the C* process uses 50GB RAM (out of
64GB) is available at http://paste.ofcode.org/ZjLUA2dYVuKvJHAk9T3Hjb. At the
time the dump was made, heap usage was far below 8GB (~3GB) but total RSS was
~50GB.
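To see which mappings account for the resident memory, the RSS column of `pmap -x` output can be summed with awk. A minimal sketch; the two sample mappings below are made up for illustration (against the live process you would pipe `pmap -x <pid>` in instead):

```shell
# Sum the RSS column (KB) of pmap -x style output. With the fake
# two-mapping sample below, the total is 512 + 1024 = 1536 KB.
awk '$3 ~ /^[0-9]+$/ { sum += $3 } END { print sum " KB resident" }' <<'EOF'
0000000000400000    1024     512     512 rw---  [ anon ]
00007f0000000000    2048    1024       0 r-x--  libc-2.19.so
EOF
```

Sorting the same output by the RSS column (`pmap -x <pid> | sort -k3 -n`) shows which individual mappings are growing.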

Any help will be appreciated.

-- 
BR,
Michał Łowicki


Re: C* 2.1.2 invokes oom-killer

2015-02-18 Thread Robert Coli
On Wed, Feb 18, 2015 at 10:28 AM, Michał Łowicki mlowi...@gmail.com wrote:

 A couple of times a day, 2 out of the 4 nodes in the cluster are killed


This sort of issue is usually best handled/debugged interactively on IRC.

But briefly :

- 2.1.2 is IMO broken for production. Downgrade (officially unsupported but
fine between these versions) to 2.1.1 or upgrade to 2.1.3.
- Beyond that, look at the steady state heap consumption. With 2.1.2, it
would likely take at least 1TB of data to fill heap in steady state to
near-failure.

=Rob


Re: Cassandra install on JRE vs JDK

2015-02-18 Thread Mark Reddy
Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with
Cassandra in production for years without issue.

Regards,
Mark

On 18 February 2015 at 19:49, cass savy casss...@gmail.com wrote:

 Can we install the Oracle JDK instead of the JRE on Cassandra servers? We
 have a few clusters that have been running the JDK since we upgraded to C* 2.0.

 Is there any known issue or impact with using the JDK vs the JRE?
 What is the reason not to use the Oracle JDK on C* servers?
 Is there any performance impact?

 Please advise.




Re: Cassandra install on JRE vs JDK

2015-02-18 Thread cass savy
Thanks, Mark, for the quick response. What version of Cassandra and the JDK
are you using in prod?


On Wed, Feb 18, 2015 at 11:58 AM, Mark Reddy mark.l.re...@gmail.com wrote:

 Yes, you can use the Oracle JDK if you prefer; I've been using the JDK with
 Cassandra in production for years without issue.

 Regards,
 Mark

 On 18 February 2015 at 19:49, cass savy casss...@gmail.com wrote:

  Can we install the Oracle JDK instead of the JRE on Cassandra servers? We
  have a few clusters that have been running the JDK since we upgraded to C* 2.0.

  Is there any known issue or impact with using the JDK vs the JRE?
  What is the reason not to use the Oracle JDK on C* servers?
  Is there any performance impact?

  Please advise.






Re: Cassandra install on JRE vs JDK

2015-02-18 Thread cass savy
Thanks, Robert, for the quick response. I use the Oracle JDK, not OpenJDK.


On Wed, Feb 18, 2015 at 11:54 AM, Robert Stupp sn...@snazy.de wrote:

 The "natural" dependency of Cassandra is the JRE (not the JDK) - e.g. in
 the Debian package.
 You should be safe using the JRE instead of the JDK.

 If you're asking whether to use a non-Oracle JVM - the answer would be:
 use the Oracle JVM.
 OpenJDK might work, but I would not recommend it.


 On 18.02.2015 at 20:49, cass savy casss...@gmail.com wrote:

 Can we install the Oracle JDK instead of the JRE on Cassandra servers? We
 have a few clusters that have been running the JDK since we upgraded to C* 2.0.

 Is there any known issue or impact with using the JDK vs the JRE?
 What is the reason not to use the Oracle JDK on C* servers?
 Is there any performance impact?

 Please advise.



 —
 Robert Stupp
 @snazy




Deleting Statistics.db at startup

2015-02-18 Thread Tomer Pearl
Hello,

I have received the following error:
ERROR [SSTableBatchOpen:2] 2015-01-19 13:55:28,478 CassandraDaemon.java (line 
196) Exception in thread Thread[SSTableBatchOpen:2,5,main]
java.lang.OutOfMemoryError: Java heap space
at 
org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:335)
at 
org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:462)
at 
org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:448)
at 
org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:432)
at 
org.apache.cassandra.io.sstable.SSTableReader.openMetadata(SSTableReader.java:225)
at 
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:194)
at 
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:184)
at 
org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:264)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

I found a solution here:
http://www.mail-archive.com/user%40cassandra.apache.org/msg23682.html
which advises deleting the Statistics.db file.
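For reference, that workaround amounts to something like the following sketch; /var/lib/cassandra/data is the default data directory and may differ on your install:

```shell
# Delete only the Statistics.db components before starting the node;
# Cassandra regenerates them (at some startup cost) when it reopens
# the SSTables.
find /var/lib/cassandra/data -name '*-Statistics.db' -delete
```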

My question is: what are the consequences of deleting this file every time the 
node starts up? Performance-wise or otherwise.

Thanks,
Tomer.



Re: Many pending compactions

2015-02-18 Thread Ja Sam
I don't have problems with DC_B (the replica); only in DC_A (my system writes
only to it) do I have read timeouts.

I checked the SSTable count in OpsCenter and I have:
1) in DC_A it is the same ±10% for the last week, with a small increase in the
last 24h (it is more than 15000-2 SSTables, depending on the node)
2) in DC_B the last 24h shows up to a 50% decrease, which gives a nice prognosis.
Now I have less than 1000 SSTables

What did you measure during system optimizations? Or do you have an idea
what more I should check?
1) I looked at CPU idle (one node is 50% idle, the rest 70% idle)
2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are
spikes
3) System RAM usage is almost full
4) In Total Bytes Compacted most lines are below 3MB/s. For DC_A in total it
is less than 10MB/s; in DC_B it looks much better (avg is about 17MB/s)

something else?



On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com
wrote:

 Hi,

 You can check if the number of SSTables is decreasing. Look for the
 SSTable count information of your tables using nodetool cfstats.
 The compaction history can be viewed using nodetool
 compactionhistory.

 About the timeouts, check this out:
 http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
 Also try to run nodetool tpstats to see the threads statistics. It
 can lead you to know if you are having performance problems. If you
 are having too many pending tasks or dropped messages, maybe will you
 need to tune your system (eg: driver's timeout, concurrent reads and
 so on)
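A quick per-node snapshot of those checks might look like this (a sketch; exact output labels vary slightly between 2.0 and 2.1, e.g. "Column Family:" vs "Table:"):

```shell
# SSTable counts per table, running/pending compactions, recent
# compaction history, and thread-pool pressure (pending/dropped)
nodetool cfstats | grep -E 'Table:|SSTable count'
nodetool compactionstats
nodetool compactionhistory | tail -n 20
nodetool tpstats
```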

 Regards,

 Roni Balthazar


Re: Adding new node to cluster

2015-02-18 Thread Jonathan Lacefield
Hello,

  Please note that DataStax has updated the documentation for replacing a
seed node. The new docs outline a simplified process to help avoid confusion
on this topic.


http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_seed_node.html

Jonathan

Jonathan Lacefield

Solution Architect | (404) 822 3487 | jlacefi...@datastax.com

On Tue, Feb 17, 2015 at 8:04 PM, Robert Coli rc...@eventbrite.com wrote:

 On Tue, Feb 17, 2015 at 2:25 PM, sean_r_dur...@homedepot.com wrote:

  SimpleSnitch is not rack aware. You would want to choose seed nodes and
 then not change them. Seed nodes apparently don’t bootstrap.


 No one seems to know what a seed node actually *is*, but seed nodes
 can in fact bootstrap. They just have to temporarily forget to tell
 themselves that they are a seed node while bootstrapping, and then other
 nodes will still gossip to it as a seed once it comes up, even though it
 doesn't consider itself a seed.
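The "temporarily forget" workaround lives in cassandra.yaml on the bootstrapping node - a sketch with placeholder addresses:

```yaml
# cassandra.yaml on the replacement node (say 10.0.0.3) while it
# bootstraps: list the OTHER seeds but omit this node's own address so
# the bootstrap check passes; restore it (and restart) after bootstrap.
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1,10.0.0.2"
```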


 https://issues.apache.org/jira/browse/CASSANDRA-5836?focusedCommentId=13727032&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13727032

 Replacing a seed node is a very common operation, and this best practice
 is confusing/poorly documented. There are regular contacts to
 #cassandra/cassandra-user@ where people ask how to replace a seed node,
 and are confused by the answer. The workaround also means that, if you do
 not restart your node after bootstrapping it (and changing the conf file
 back to indicate to itself that it is a seed) the node runs until next
 restart without any understanding that it is a seed node.

 Being a seed node appears to mean two things :

 1) I have myself as an entry in my own seed list, so I know that I am a
 seed.
 2) Other nodes have me in their seed list, so they consider me a seed.

 The current code checks for 1) and refuses to bootstrap. The workaround is
 to remove the 1) state temporarily. But if it is unsafe to bootstrap a seed
 node because of either 1) or 2), the workaround is unsafe.

 Can you explicate the special cases here? I sincerely would like to
 understand why the code tries to prevent a seed from bootstrapping when
 one can clearly, and apparently safely, bootstrap a seed.



 Unfortunately, there has been no answer.


 =Rob






Re: Many pending compactions

2015-02-18 Thread Ja Sam
Hi,
Thanks for your tip; it looks like something changed - I still don't know
if it is OK.

My nodes started to do more compaction, but it looks like some compactions
are really slow.
IO is idle and the CPU is quite OK (30%-40%). We set compactionthroughput to
999, but I do not see a difference.

Can we check something more? Or do you have any method to monitor progress
with small files?

Regards

On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com
wrote:

 HI,

 Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
 the solution...
 The number of SSTables decreased from many thousands to a number below
 a hundred and the SSTables are now much bigger with several gigabytes
 (most of them).

 Cheers,

 Roni Balthazar



 On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote:
   After some diagnostics (we didn't set cold_reads_to_omit yet): compactions
   are running, but VERY slowly, with idle IO.
  
   We have a lot of Data files in Cassandra. In DC_A it is about ~12 (only
   xxx-Data.db); DC_B has only ~4000.
  
   I don't know if this changes anything, but:
   1) in DC_A the avg size of a Data.db file is ~13 mb. I have a few really
   big ones, but most are really small (almost 1 files are less than 100mb).
   2) in DC_B the avg size of Data.db is much bigger, ~260mb.
  
   Do you think the above flag will help us?
 
 
  On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote:
 
   I set setcompactionthroughput 999 permanently and it doesn't change
   anything. IO is still the same. CPU is idle.
 
  On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar 
 ronibaltha...@gmail.com
  wrote:
 
  Hi,
 
  You can run nodetool compactionstats to view statistics on
 compactions.
  Setting cold_reads_to_omit to 0.0 can help to reduce the number of
  SSTables when you use Size-Tiered compaction.
  You can also create a cron job to increase the value of
  setcompactionthroughput during the night or when your IO is not busy.
 
  From http://wiki.apache.org/cassandra/NodeTool:
  0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
  0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
 
  Cheers,
 
  Roni Balthazar
 
  On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote:
    One thing I do not understand: in my case compaction is running
    permanently.
    Is there a way to check which compactions are pending? The only
    information available is the total count.
  
  
   On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote:
  
    Of course I made a mistake. I am using 2.1.2. Anyway, nightly builds are
    available from
   http://cassci.datastax.com/job/cassandra-2.1/
  
    I read about cold_reads_to_omit. It looks promising. Should I also set
    compaction throughput?
  
   p.s. I am really sad that I didn't read this before:
  
  
 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
  
  
  
   On Monday, February 16, 2015, Carlos Rolo r...@pythian.com wrote:
  
   Hi 100% in agreement with Roland,
  
   2.1.x series is a pain! I would never recommend the current 2.1.x
   series
   for production.
  
    Clocks are a pain, and check your connectivity! Also check tpstats to
    see if your threadpools are being overrun.
  
   Regards,
  
   Carlos Juzarte Rolo
   Cassandra Consultant
  
   Pythian - Love your data
  
   rolo@pythian | Twitter: cjrolo | Linkedin:
   linkedin.com/in/carlosjuzarterolo
   Tel: 1649
   www.pythian.com
  
   On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
   r.etzenham...@t-online.de wrote:
  
   Hi,
  
   1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested
 by
   Al
   Tobey from DataStax)
   7) minimal reads (usually none, sometimes few)
  
    those two points keep me repeating an answer I got. First, where did
    you get 2.1.3 from? Maybe I missed it; I will have a look. But if it is
    2.1.2, which is the latest released version, that version has many bugs -
    most of them I got kicked by while testing 2.1.2. I got many problems with
    compactions not being triggered on column families not being read, and
    compactions and repairs not being completed. See
  
  
  
  
 https://www.mail-archive.com/search?l=user@cassandra.apache.orgq=subject:%22Re%3A+Compaction+failing+to+trigger%22o=newestf=1
  
  
 https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
  
   Apart from that, how are those both datacenters connected? Maybe
   there
   is a bottleneck.
  
    Also, do you have ntp up and running on all nodes to keep all clocks
    in tight sync?
  
   Note: I'm no expert (yet) - just sharing my 2 cents.
  
   Cheers,
   Roland
  
  
  
   --
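The per-DC numbers quoted above (Data.db file count and average size) can be recomputed with a small shell pipeline; the data directory path below is an assumption and depends on your data_file_directories setting:

```shell
# Count *-Data.db files in one table's directory and print the average size.
# DATA_DIR is a made-up example path; point it at your own table directory.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data/mykeyspace/mytable}"
ls -l "$DATA_DIR"/*-Data.db 2>/dev/null |
awk '{ n++; bytes += $5 }
     END { if (n) printf "%d files, avg %.1f MB\n", n, bytes / n / 1048576;
           else print "no Data.db files found" }'
```

Running this per node in each datacenter should reproduce the "many small files in DC_A, fewer large files in DC_B" picture described in the message.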
  
  
  
  
 
 
 



Re: Many pending compactions

2015-02-18 Thread Ja Sam
1) We tried to run repairs, but they usually do not succeed. We had
Leveled compaction before. Last week we ALTERed the tables to STCS, because the
guys from DataStax suggested we should not use Leveled and should switch the
tables to STCS, since we don't have SSDs. After this change we did not run any
repair. Anyway, I don't think it will change anything in the SSTable count - if
I am wrong, please let me know.

2) I did this. My tables are 99% write-only. It is an audit system.

3) Yes I am using default values

4) In both operations I am using LOCAL_QUORUM.

I am almost sure that the READ timeouts happen because of too many SSTables.
Anyway, first I would like to fix the excessive number of pending compactions.
I still don't know how to speed them up.
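For reference, the ALTER to STCS described in (1) is a per-table statement along these lines (the keyspace and table names are made up for illustration):

```sql
ALTER TABLE mykeyspace.audit_events
  WITH compaction = { 'class': 'SizeTieredCompactionStrategy' };
```

After such a change, existing SSTables are gradually recompacted under the new strategy, which by itself adds pending compactions for a while.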


On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com
wrote:

 Are you running repairs within gc_grace_seconds? (default is 10 days)

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

 Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
 that you do not read often.

 Are you using default values for the properties
 min_compaction_threshold(4) and max_compaction_threshold(32)?

 Which Consistency Level are you using for reading operations? Check if
 you are not reading from DC_B due to your Replication Factor and CL.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html


 Cheers,

 Roni Balthazar

 On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
  I don't have problems with DC_B (the replica); only in DC_A (my system
  writes only to it) do I have read timeouts.
 
  I checked the SSTable count in OpsCenter and I have:
  1) in DC_A, about the same (+-10%) for the last week, with a small increase
  in the last 24h (it is more than 15000-2 SSTables, depending on the node)
  2) in DC_B, the last 24h shows up to a 50% decrease, which gives a nice
  prognosis. Now I have fewer than 1000 SSTables
 
  What did you measure during system optimizations? Or do you have an idea
  what more I should check?
  1) I look at CPU idle (one node is 50% idle, the rest 70% idle)
  2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are spikes
  3) system RAM usage is almost full
  4) In Total Bytes Compacted, most lines are below 3 MB/s. For all of DC_A
  it is less than 10 MB/s; DC_B looks much better (avg is like 17 MB/s)
 
  something else?
 
 
 
  On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com
 
  wrote:
 
  Hi,
 
  You can check if the number of SSTables is decreasing. Look for the
  SSTable count information of your tables using nodetool cfstats.
  The compaction history can be viewed using nodetool
  compactionhistory.
 
  About the timeouts, check this out:
 
 http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
   Also try to run nodetool tpstats to see the thread pool statistics. It
   can help you find out if you are having performance problems. If you
   are having too many pending tasks or dropped messages, maybe you will
   need to tune your system (e.g. driver timeouts, concurrent reads and
   so on)
 
  Regards,
 
  Roni Balthazar
 
  On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote:
   Hi,
    Thanks for your tip; it looks like something has changed - I still don't
    know if it is ok.
  
   My nodes started to do more compaction, but it looks that some
   compactions
   are really slow.
    IO is idle, CPU is quite ok (30%-40%). We set compactionthroughput to
    999, but I do not see a difference.
  
   Can we check something more? Or do you have any method to monitor
   progress
   with small files?
  
   Regards
  
   On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   HI,
  
   Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
   the solution...
   The number of SSTables decreased from many thousands to a number
 below
   a hundred and the SSTables are now much bigger with several gigabytes
   (most of them).
  
   Cheers,
  
   Roni Balthazar
  
  
  
   On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com
 wrote:
After some diagnostics (we haven't set cold_reads_to_omit yet), compactions
are running but VERY slowly, with idle IO.
   
We have a lot of Data files in Cassandra. In DC_A it is about ~12
(counting only xxx-Data.db files); DC_B has only ~4000.
   
I don't know if this change anything but:
1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few really big
ones, but most are really small (almost 1 files are less than 100 MB).
2) in DC_B avg size of Data.db is much bigger ~260mb.
   
Do you think that the above flag will help us?
   
   
On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com
 wrote:
   
I set setcompactionthroughput 999 permanently and it doesn't change
anything. IO is still the same. CPU is idle.
   
On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar

Re: Adding new node to cluster

2015-02-18 Thread Batranut Bogdan
Hello,
I have decommissioned a node, deleted its data, commitlog and saved caches,
changed the yaml file to not include its own IP, and started it. For some
reason I do not fully understand, OpsCenter says that the node is in an unknown
datacenter. Nodetool says UJ but shows ? in the Owns column. I started the node
yesterday. I still see streams towards this node, so can I assume that once it
finishes joining, OpsCenter will see it properly?
I remember that I might have set up the entire initial cluster with the nodes'
own IPs in the yaml file. To bring the cluster to a valid state, I assume that
I have to decommission the nodes one by one, delete the data, and restart with
correct yaml settings. Is this correct?
Also, a change of the cluster name would be nice. How can this be done with
minimal impact?
 On Wednesday, February 18, 2015 12:56 AM, Eric Stevens migh...@gmail.com 
wrote:
   

  Seed nodes apparently don’t bootstrap

That's right, if a node has itself in its own seeds list, it assumes it's a 
foundational member of the cluster, and it will join immediately with no 
bootstrap.
If you've done this by accident, you should do nodetool decommission on that 
node, and when it's fully left the cluster, wipe its data directory, edit the 
yaml and remove it from the seeds list.
On Tue, Feb 17, 2015 at 3:25 PM, sean_r_dur...@homedepot.com wrote:

SimpleSnitch is not rack aware. You would want to choose seed nodes and then 
not change them. Seed nodes apparently don’t bootstrap. All nodes need the same 
seeds in the yaml file. Here is more info: 
http://www.datastax.com/documentation/cassandra/2.0/cassandra/initialize/initializeSingleDS.html
Sean Durity – Cassandra Admin, Big Data Team
To engage the team, create a request.

From: Batranut Bogdan [mailto:batra...@yahoo.com]
Sent: Tuesday, February 17, 2015 3:28 PM
To: user@cassandra.apache.org; reynald.bourtembo...@esrf.fr
Subject: Re: Adding new node to cluster

Hello, I use SimpleSnitch. All the nodes are in the same datacenter. Not sure
if all are in the same rack.

On Tuesday, February 17, 2015 8:53 PM, sean_r_dur...@homedepot.com wrote:

What snitch are you using? You may need to do some work on your topology file
(or rackdc) to make sure you have the topology you want. Also, it is possible
you may need to restart OpsCenter agents and/or your browser to see the nodes
represented properly in OpsCenter.

Sean Durity – Cassandra Admin, Home Depot

From: Batranut Bogdan [mailto:batra...@yahoo.com]
Sent: Tuesday, February 17, 2015 10:20 AM
To: user@cassandra.apache.org; reynald.bourtembo...@esrf.fr
Subject: Re: Adding new node to cluster

Hello, I know that UN is good, but what troubles me is the addition of the
node's own IP in its yaml seeds section.

On Tuesday, February 17, 2015 3:40 PM, Reynald Bourtembourg
reynald.bourtembo...@esrf.fr wrote:

Hi Bogdan

In nodetool status:   
   - UJ: means your node is Up and Joining
   - UN: means your node is Up and in Normal state
UN in nodetool is good ;-)

On 17/02/2015 13:56, Batranut Bogdan wrote:

Hello all, I have an existing cluster. When adding a new node, I saw that
OpsCenter put the node in an unknown cluster. In the yaml, the cluster name is
the same. So I stopped the node and added its IP address to the list of seeds.
Now OpsCenter sees my node, but nodetool status now sees it as UN instead of
UJ as when it first started. One other mention: even if I stop the node and
remove its IP from the list of seeds, OpsCenter sees the node in the known
cluster but nodetool still sees it as UN. I am not sure what the implications
of adding a node's IP to its own seed list are, and I think I might have done
the same for the existing nodes, e.g. started with the node's IP in the seed
list, but after removing it and having to restart the nodes for whatever
reason, I did not see any changes. Is my cluster ok, or what do I need to do
to bring the cluster to a good state? Thank you.
   
The information in this Internet Email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this Email by 
anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be taken in 
reliance on it, is prohibited and may be unlawful. When addressed to our 
clients any opinions or advice contained in this Email are subject to the terms 
and conditions expressed in any applicable governing The Home Depot terms of 
business or client engagement letter. The Home Depot disclaims all 
responsibility and liability for the accuracy and content of this attachment 
and for any damages or losses arising from any inaccuracies, errors, viruses, 
e.g., worms, trojan horses, etc., or other items of a destructive nature, which 
may be contained in this attachment and shall not be liable for direct, 
indirect, consequential or special damages in connection with this e-mail 
message or its attachment.  


Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Are you running repairs within gc_grace_seconds? (default is 10 days)
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
that you do not read often.
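In 2.1, cold_reads_to_omit is a subproperty of SizeTieredCompactionStrategy, so it is set through the table's compaction options; a sketch with placeholder keyspace and table names:

```sql
ALTER TABLE mykeyspace.audit_events
  WITH compaction = { 'class': 'SizeTieredCompactionStrategy',
                      'cold_reads_to_omit': 0.0 };
```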

Are you using default values for the properties
min_compaction_threshold(4) and max_compaction_threshold(32)?

Which Consistency Level are you using for reading operations? Check if
you are not reading from DC_B due to your Replication Factor and CL.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html


Cheers,

Roni Balthazar

On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
 I don't have problems with DC_B (the replica); only in DC_A (my system writes
 only to it) do I have read timeouts.

 I checked in OpsCenter SSTable count  and I have:
 1) in DC_A, about the same (+-10%) for the last week, with a small increase in
 the last 24h (it is more than 15000-2 SSTables, depending on the node)
 2) in DC_B, the last 24h shows up to a 50% decrease, which gives a nice
 prognosis. Now I have fewer than 1000 SSTables

 What did you measure during system optimizations? Or do you have an idea
 what more should I check?
 1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
 2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are
 spikes
 3) system RAM usage is almost full
 4) In Total Bytes Compacted, most lines are below 3MB/s. For total DC_A
 it is less than 10MB/s, in DC_B it looks much better (avg is like 17MB/s)

 something else?



 On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Hi,

 You can check if the number of SSTables is decreasing. Look for the
 SSTable count information of your tables using nodetool cfstats.
 The compaction history can be viewed using nodetool
 compactionhistory.

 About the timeouts, check this out:
 http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
  Also try to run nodetool tpstats to see the thread pool statistics. It
  can help you find out if you are having performance problems. If you
  are having too many pending tasks or dropped messages, maybe you will
  need to tune your system (e.g. driver timeouts, concurrent reads and
  so on)

 Regards,

 Roni Balthazar

 On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote:
  Hi,
   Thanks for your tip; it looks like something has changed - I still don't
   know if it is ok.
 
  My nodes started to do more compaction, but it looks that some
  compactions
  are really slow.
   IO is idle, CPU is quite ok (30%-40%). We set compactionthroughput to
   999, but I do not see a difference.
 
  Can we check something more? Or do you have any method to monitor
  progress
  with small files?
 
  Regards
 
  On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
  ronibaltha...@gmail.com
  wrote:
 
  HI,
 
  Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
  the solution...
  The number of SSTables decreased from many thousands to a number below
  a hundred and the SSTables are now much bigger with several gigabytes
  (most of them).
 
  Cheers,
 
  Roni Balthazar
 
 
 
  On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote:
    After some diagnostics (we haven't set cold_reads_to_omit yet),
    compactions are running but VERY slowly, with idle IO.
  
   We had a lot of Data files in Cassandra. In DC_A it is about
   ~12
   (only
   xxx-Data.db) in DC_B has only ~4000.
  
   I don't know if this change anything but:
    1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few really
    big ones, but most are really small (almost 1 files are less than 100 MB).
   2) in DC_B avg size of Data.db is much bigger ~260mb.
  
    Do you think that the above flag will help us?
  
  
   On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote:
  
   I set setcompactionthroughput 999 permanently and it doesn't change
    anything. IO is still the same. CPU is idle.
  
   On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   Hi,
  
   You can run nodetool compactionstats to view statistics on
   compactions.
   Setting cold_reads_to_omit to 0.0 can help to reduce the number of
   SSTables when you use Size-Tiered compaction.
   You can also create a cron job to increase the value of
   setcompactionthroughput during the night or when your IO is not
   busy.
  
   From http://wiki.apache.org/cassandra/NodeTool:
   0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
   0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
  
   Cheers,
  
   Roni Balthazar
  
   On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com
   wrote:
 One thing I do not understand. In my case compaction is running
permanently.
Is there a way to check which compaction is pending? The only
information is
about total count.
   
   
On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote:
   
 

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
You are right... Repair makes the data consistent between nodes.

I understand that you have 2 issues going on.

You need to run repair periodically without errors, and you need to decrease
the number of pending compactions.

So I suggest:

1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can
use incremental repairs. There were some bugs on 2.1.2.
2) Run cleanup on all nodes
3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0,
and increase setcompactionthroughput for some time and see if the number of
SSTables is going down.
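Steps 1-3 above can be sketched as nodetool invocations (run per node, one node at a time; the keyspace name is a placeholder, and these obviously need a live cluster):

```shell
nodetool repair -pr mykeyspace          # 1) repair this node's primary range
nodetool cleanup mykeyspace             # 2) drop data the node no longer owns
nodetool setcompactionthroughput 999    # 3) temporarily unthrottle compaction
nodetool compactionstats                # then watch pending compactions drop
```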

Let us know what errors you are getting when running repairs.

Regards,

Roni Balthazar


On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam ptrstp...@gmail.com wrote:

 Can you explain to me what the correlation between growing SSTables and
 repair is?
 I was sure, until your mail, that repair was only to make data consistent
 between nodes.

 Regards


 On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Which error are you getting when running repairs?
 You need to run repair on your nodes within gc_grace_seconds (eg:
 weekly). They have data that are not read frequently. You can run
 repair -pr on all nodes. Since you do not have deletes, you will not
 have trouble with that. If you have deletes, it's better to increase
 gc_grace_seconds before the repair.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 After repair, try to run a nodetool cleanup.

 Check if the number of SSTables goes down after that... Pending
 compactions must decrease as well...

 Cheers,

 Roni Balthazar




 On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote:
  1) We tried to run repairs, but they usually do not succeed. We had
  Leveled compaction before. Last week we ALTERed the tables to STCS, because
  the guys from DataStax suggested we should not use Leveled and should switch
  the tables to STCS, since we don't have SSDs. After this change we did not
  run any repair. Anyway, I don't think it will change anything in the SSTable
  count - if I am wrong, please let me know.
 
  2) I did this. My tables are 99% write-only. It is an audit system.
 
  3) Yes I am using default values
 
  4) In both operations I am using LOCAL_QUORUM.
 
  I am almost sure that the READ timeouts happen because of too many SSTables.
  Anyway, first I would like to fix the excessive number of pending
  compactions. I still don't know how to speed them up.
 
 
  On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar 
 ronibaltha...@gmail.com
  wrote:
 
  Are you running repairs within gc_grace_seconds? (default is 10 days)
 
 
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
 
  Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
  that you do not read often.
 
  Are you using default values for the properties
  min_compaction_threshold(4) and max_compaction_threshold(32)?
 
  Which Consistency Level are you using for reading operations? Check if
  you are not reading from DC_B due to your Replication Factor and CL.
 
 
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
 
 
  Cheers,
 
  Roni Balthazar
 
  On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
   I don't have problems with DC_B (replica) only in DC_A(my system
 write
   only
   to it) I have read timeouts.
  
   I checked in OpsCenter SSTable count  and I have:
   1) in DC_A  same +-10% for last week, a small increase for last 24h
 (it
   is
   more than 15000-2 SSTables depends on node)
    2) in DC_B, the last 24h shows up to a 50% decrease, which gives a nice
    prognosis. Now I have fewer than 1000 SSTables
  
   What did you measure during system optimizations? Or do you have an
 idea
   what more should I check?
   1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
    2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there
 are
   spikes
   3) system RAM usage is almost full
    4) In Total Bytes Compacted, most lines are below 3MB/s. For
 total
   DC_A
   it is less than 10MB/s, in DC_B it looks much better (avg is like
   17MB/s)
  
   something else?
  
  
  
   On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   Hi,
  
   You can check if the number of SSTables is decreasing. Look for the
   SSTable count information of your tables using nodetool cfstats.
   The compaction history can be viewed using nodetool
   compactionhistory.
  
   About the timeouts, check this out:
  
  
 http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
    Also try to run nodetool tpstats to see the thread pool statistics. It
    can help you find out if you are having performance problems. If you
    are having too many pending tasks or dropped messages, maybe you will
    need to tune your system (e.g. driver timeouts, concurrent reads and
    so on)
  
   Regards,
  
   Roni Balthazar
  
   

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Hi,

You can check if the number of SSTables is decreasing. Look for the
SSTable count information of your tables using nodetool cfstats.
The compaction history can be viewed using nodetool
compactionhistory.

About the timeouts, check this out:
http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
Also try to run nodetool tpstats to see the thread pool statistics. It
can help you find out if you are having performance problems. If you
are having too many pending tasks or dropped messages, maybe you will
need to tune your system (e.g. driver timeouts, concurrent reads and
so on)
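The checks above, as concrete commands (the table name is a placeholder):

```shell
nodetool cfstats mykeyspace.audit_events | grep -i 'SSTable count'
nodetool compactionhistory | tail -n 20
nodetool tpstats
```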

Regards,

Roni Balthazar

On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote:
 Hi,
 Thanks for your tip; it looks like something has changed - I still don't know
 if it is ok.
 
 My nodes started to do more compactions, but it looks like some compactions
 are really slow.
 IO is idle, CPU is quite ok (30%-40%). We set compactionthroughput to
 999, but I do not see a difference.

 Can we check something more? Or do you have any method to monitor progress
 with small files?

 Regards

 On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 HI,

 Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
 the solution...
 The number of SSTables decreased from many thousands to a number below
 a hundred and the SSTables are now much bigger with several gigabytes
 (most of them).

 Cheers,

 Roni Balthazar



 On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com wrote:
  After some diagnostics (we haven't set cold_reads_to_omit yet),
  compactions are running but VERY slowly, with idle IO.
 
  We had a lot of Data files in Cassandra. In DC_A it is about ~12
  (only
  xxx-Data.db) in DC_B has only ~4000.
 
  I don't know if this change anything but:
  1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few really big
  ones, but most are really small (almost 1 files are less than 100 MB).
  2) in DC_B avg size of Data.db is much bigger ~260mb.
 
  Do you think that the above flag will help us?
 
 
  On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam ptrstp...@gmail.com wrote:
 
  I set setcompactionthroughput 999 permanently and it doesn't change
  anything. IO is still the same. CPU is idle.
 
  On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
  ronibaltha...@gmail.com
  wrote:
 
  Hi,
 
  You can run nodetool compactionstats to view statistics on
  compactions.
  Setting cold_reads_to_omit to 0.0 can help to reduce the number of
  SSTables when you use Size-Tiered compaction.
  You can also create a cron job to increase the value of
  setcompactionthroughput during the night or when your IO is not busy.
 
  From http://wiki.apache.org/cassandra/NodeTool:
  0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
  0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
 
  Cheers,
 
  Roni Balthazar
 
  On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam ptrstp...@gmail.com wrote:
   One thing I do not understand. In my case compaction is running
   permanently.
   Is there a way to check which compaction is pending? The only
   information is
   about total count.
  
  
   On Monday, February 16, 2015, Ja Sam ptrstp...@gmail.com wrote:
  
    Of course I made a mistake. I am using 2.1.2. Anyway, nightly builds are
    available from
   http://cassci.datastax.com/job/cassandra-2.1/
  
    I read about cold_reads_to_omit. It looks promising. Should I also set
    compaction throughput?
  
   p.s. I am really sad that I didn't read this before:
  
  
   https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
  
  
  
   On Monday, February 16, 2015, Carlos Rolo r...@pythian.com wrote:
  
   Hi 100% in agreement with Roland,
  
   2.1.x series is a pain! I would never recommend the current 2.1.x
   series
   for production.
  
    Clocks are a pain, and check your connectivity! Also check tpstats to
    see if your threadpools are being overrun.
  
   Regards,
  
   Carlos Juzarte Rolo
   Cassandra Consultant
  
   Pythian - Love your data
  
   rolo@pythian | Twitter: cjrolo | Linkedin:
   linkedin.com/in/carlosjuzarterolo
   Tel: 1649
   www.pythian.com
  
   On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
   r.etzenham...@t-online.de wrote:
  
   Hi,
  
   1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested
   by
   Al
   Tobey from DataStax)
   7) minimal reads (usually none, sometimes few)
  
    those two points keep me repeating an answer I got. First, where did
    you get 2.1.3 from? Maybe I missed it; I will have a look. But if it is
    2.1.2, which is the latest released version, that version has many bugs -
    most of them I got kicked by while testing 2.1.2. I got many problems with
    compactions not being triggered on column families not being read, and
    compactions and repairs not being completed. See
  
  
  
  
   

Re: Many pending compactions

2015-02-18 Thread Roni Balthazar
Which error are you getting when running repairs?
You need to run repair on your nodes within gc_grace_seconds (eg:
weekly). They have data that are not read frequently. You can run
repair -pr on all nodes. Since you do not have deletes, you will not
have trouble with that. If you have deletes, it's better to increase
gc_grace_seconds before the repair.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
After repair, try to run a nodetool cleanup.

Check if the number of SSTables goes down after that... Pending
compactions must decrease as well...
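In the spirit of the setcompactionthroughput cron entries quoted elsewhere in this thread, a weekly primary-range repair could be scheduled like this (the schedule and use of root are assumptions; stagger the day or hour per node so repairs do not overlap):

```shell
0 2 * * 0 root nodetool -h `hostname` repair -pr
```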

Cheers,

Roni Balthazar




On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam ptrstp...@gmail.com wrote:
 1) We tried to run repairs, but they usually do not succeed. We had
 Leveled compaction before. Last week we ALTERed the tables to STCS, because
 the guys from DataStax suggested we should not use Leveled and should switch
 the tables to STCS, since we don't have SSDs. After this change we did not
 run any repair. Anyway, I don't think it will change anything in the SSTable
 count - if I am wrong, please let me know.

 2) I did this. My tables are 99% write-only. It is an audit system.

 3) Yes I am using default values

 4) In both operations I am using LOCAL_QUORUM.

 I am almost sure that the READ timeouts happen because of too many SSTables.
 Anyway, first I would like to fix the excessive number of pending compactions.
 I still don't know how to speed them up.


 On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar ronibaltha...@gmail.com
 wrote:

 Are you running repairs within gc_grace_seconds? (default is 10 days)

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

 Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
 that you do not read often.

 Are you using default values for the properties
 min_compaction_threshold(4) and max_compaction_threshold(32)?

 Which Consistency Level are you using for reading operations? Check if
 you are not reading from DC_B due to your Replication Factor and CL.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html


 Cheers,

 Roni Balthazar

 On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam ptrstp...@gmail.com wrote:
  I don't have problems with DC_B (the replica); only in DC_A (my system
  writes only to it) do I have read timeouts.
 
  I checked in OpsCenter SSTable count  and I have:
  1) in DC_A  same +-10% for last week, a small increase for last 24h (it
  is
  more than 15000-2 SSTables depends on node)
  2) in DC_B, the last 24h shows up to a 50% decrease, which gives a nice
  prognosis. Now I have fewer than 1000 SSTables
 
  What did you measure during system optimizations? Or do you have an idea
  what more should I check?
  1) I look at CPU Idle (one node is 50% idle, rest 70% idle)
  2) Disk queue - mostly it is near zero: avg 0.09. Sometimes there are
  spikes
  3) system RAM usage is almost full
  4) In Total Bytes Compacted, most lines are below 3MB/s. For total
  DC_A
  it is less than 10MB/s, in DC_B it looks much better (avg is like
  17MB/s)
 
  something else?
 
 
 
  On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar
  ronibaltha...@gmail.com
  wrote:
 
  Hi,
 
  You can check if the number of SSTables is decreasing. Look for the
  SSTable count information of your tables using nodetool cfstats.
  The compaction history can be viewed using nodetool
  compactionhistory.
 
  About the timeouts, check this out:
 
  http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
   Also try to run nodetool tpstats to see the thread pool statistics. It
   can help you find out if you are having performance problems. If you
   are having too many pending tasks or dropped messages, maybe you will
   need to tune your system (e.g. driver timeouts, concurrent reads and
   so on)
 
  Regards,
 
  Roni Balthazar
 
  On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam ptrstp...@gmail.com wrote:
   Hi,
   Thanks for your tip; it looks like something has changed - I still don't
   know if it is ok.
  
   My nodes started to do more compaction, but it looks that some
   compactions
   are really slow.
   IO is idle, CPU is quite ok (30%-40%). We set compactionthroughput to
   999, but I do not see a difference.
  
   Can we check something more? Or do you have any method to monitor
   progress
   with small files?
  
   Regards
  
   On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
   ronibaltha...@gmail.com
   wrote:
  
   HI,
  
    Yes... I had the same issue, and setting cold_reads_to_omit to 0.0
    was
    the solution...
    The number of SSTables decreased from many thousands to a number
    below
    a hundred, and the SSTables are now much bigger, most of them at
    several
    gigabytes.
  
   Cheers,
  
   Roni Balthazar
  
  
  
   On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam ptrstp...@gmail.com
   wrote:
After some diagnostics (we didn't set cold_reads_to_omit yet):
compactions
are running, but VERY slowly 

RE: Data tiered compaction and data model question

2015-02-18 Thread Mohammed Guller
What is the maximum number of events that you expect in a day? What is the 
worst-case scenario?

Mohammed

From: cass savy [mailto:casss...@gmail.com]
Sent: Wednesday, February 18, 2015 4:21 PM
To: user@cassandra.apache.org
Subject: Data tiered compaction and data model question

We want to track events in a log CF/table and should be able to query for 
events that occurred in a range of minutes or hours for a given day. Multiple 
events can occur in a given minute.  I have listed 2 table designs and am 
leaning towards table 1 to avoid large wide rows.  Please advise on:

Table 1: not a very wide row; still able to query for a range of minutes for a 
given day,
and/or a given day and a range of hours
Create table log_Event
(
 event_day text,
 event_hr int,
 event_time timeuuid,
 data text,
PRIMARY KEY ( (event_day,event_hr),event_time)
)
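
With that schema, a query for a range of minutes within a given day and
hour could look something like this (the day, hour, and times are just
placeholder values):

SELECT event_time, data
FROM log_Event
WHERE event_day = '2015-02-18'
  AND event_hr = 14
  AND event_time >= minTimeuuid('2015-02-18 14:10:00')
  AND event_time < minTimeuuid('2015-02-18 14:20:00');

A range of hours within one day spans multiple partitions, so it would
need an IN clause on event_hr (or one query per hour).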
Table 2: This will be a very wide row

Create table log_Event
( event_day text,
 event_time timeuuid,
 data text,
PRIMARY KEY ( event_day,event_time)
)

Date-tiered compaction: recommended for time-series data as per the doc below. 
Our data will be kept for only 30 days, hence we thought of using this 
compaction strategy.
http://www.datastax.com/dev/blog/datetieredcompactionstrategy
I created table 1 listed above with this compaction strategy, added some rows, 
and did a manual flush.  I do not see any SSTables created yet. Is that expected?
 compaction={'max_sstable_age_days': '1', 'class': 
'DateTieredCompactionStrategy'}
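
For what it's worth, a full statement combining table 1 with this
strategy would look something like the sketch below; expressing the
30-day retention with default_time_to_live is my assumption, not part of
the original design:

CREATE TABLE log_Event (
 event_day text,
 event_hr int,
 event_time timeuuid,
 data text,
 PRIMARY KEY ( (event_day, event_hr), event_time )
) WITH compaction = {'class': 'DateTieredCompactionStrategy',
                     'max_sstable_age_days': '1'}
  AND default_time_to_live = 2592000;  -- 30 days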



Logging client ID for YCSB workloads on Cassandra?

2015-02-18 Thread Jatin Ganhotra
Hi,

I'd like to log the client ID for every operation performed by YCSB on
my Cassandra cluster.

The purpose is to identify & analyze various consistency measures other
than eventual consistency.

I wanted to know if people have done something similar in the past. Or am I
missing something really basic here?

Please let me know if you need more information. Thanks
—
Jatin Ganhotra


Data tiered compaction and data model question

2015-02-18 Thread cass savy
We want to track events in a log CF/table and should be able to query for
events that occurred in a range of minutes or hours for a given day. Multiple
events can occur in a given minute.  I have listed 2 table designs and am
leaning towards table 1 to avoid large wide rows.  Please advise on:

*Table 1*: not a very wide row; still able to query for a range of minutes
for a given day,
and/or a given day and a range of hours

Create table log_Event
(
 event_day text,
 event_hr int,
 event_time timeuuid,
 data text,
PRIMARY KEY ( (event_day, event_hr), event_time )
)
*Table 2: This will be a very wide row*

Create table log_Event
( event_day text,
 event_time timeuuid,
 data text,
PRIMARY KEY ( event_day, event_time )
)


*Date-tiered compaction: recommended for time-series data as per the doc
below. Our data will be kept for only 30 days, hence we thought of using
this compaction strategy.*

http://www.datastax.com/dev/blog/datetieredcompactionstrategy

I created table 1 listed above with this compaction strategy, added some
rows, and did a manual flush.  I do not see any SSTables created yet. Is
that expected?

 compaction={'max_sstable_age_days': '1', 'class':
'DateTieredCompactionStrategy'}


Re: run cassandra on a small instance

2015-02-18 Thread Tim Dunphy

 2.1.2 is IMO broken and should not be used for any purpose.
 Use 2.1.1 or 2.1.3.
 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
 =Rob


Cool man. Thanks for the info. I just upgraded to 2.1.3. We'll see how that
goes. I can let you know more once it's been running for a while.

Thanks
Tim

On Wed, Feb 18, 2015 at 8:16 PM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy bluethu...@gmail.com wrote:

  I'm attempting to run Cassandra 2.1.2 on a smallish 2 GB RAM instance
  over at Digital Ocean. It's a CentOS 7 host.


 2.1.2 is IMO broken and should not be used for any purpose.

 Use 2.1.1 or 2.1.3.

 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

 =Rob





-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


Re: run cassandra on a small instance

2015-02-18 Thread Andrew
Robert,

Let me know if I’m off base about this—but I feel like I see a lot of posts 
like this (i.e., use this arbitrary version, not this other arbitrary 
version).  Why are releases going out if they’re “broken”?  This seems like a 
very confusing way for new (and existing) users to approach versions...

Andrew

On February 18, 2015 at 5:16:27 PM, Robert Coli (rc...@eventbrite.com) wrote:

On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy bluethu...@gmail.com wrote:
I'm attempting to run Cassandra 2.1.2 on a smallish 2 GB RAM instance over at 
Digital Ocean. It's a CentOS 7 host.

2.1.2 is IMO broken and should not be used for any purpose.

Use 2.1.1 or 2.1.3.

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

=Rob