Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-26 Thread Sylvain Lebresne
 If they are, and repair has completed, use nodetool cleanup to remove the
 data the node is no longer responsible for. See the bootstrap section above.

I've seen that said a few times so allow me to correct: cleanup is useless after
a repair. 'nodetool cleanup' removes rows the node is no longer responsible for,
and is thus useful only after operations that change the range a node is
responsible for (bootstrap, move, decommission). After a repair, you will need
compaction to kick in to see your disk usage come back to normal.
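
To make the distinction concrete, here is a toy sketch (plain Python, not Cassandra code; the ring, tokens, and rows are invented for illustration) of what cleanup conceptually does: drop rows whose token falls outside the ranges the node owns, something that only changes after a bootstrap, move, or decommission.

```python
# Toy illustration of what 'nodetool cleanup' conceptually does.
# All names and values here are made up; this is not Cassandra code.

def owned(token, ranges):
    """True if token falls in any (start, end] range this node owns."""
    return any(start < token <= end for start, end in ranges)

def cleanup(rows, ranges):
    """Drop rows whose token the node is no longer responsible for."""
    return {token: value for token, value in rows.items() if owned(token, ranges)}

# Before a move, the node owned (0, 100]; after the move it owns (0, 50].
rows = {10: "a", 40: "b", 75: "c"}
print(cleanup(rows, [(0, 100)]))  # nothing to drop: every token is still owned
print(cleanup(rows, [(0, 50)]))   # the row at token 75 is dropped
```

A repair, by contrast, never changes the owned ranges, so this filter would drop nothing; only compaction reclaims the space.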

--
Sylvain

 Hope that helps.

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 On 26 Jul 2011, at 12:44, Sameer Farooqui wrote:

 Looks like the repair finished successfully the second time. However, the
 cluster is still severely unbalanced. I was hoping the repair would balance
 the nodes. We're using random partitioner. One node has 900GB and others
 have 128GB, 191GB, 129GB, 257GB, etc. The 900GB and the 646GB are just
 insanely high. Not sure why, or how to troubleshoot.



 On Fri, Jul 22, 2011 at 1:28 PM, Sameer Farooqui cassandral...@gmail.com
 wrote:

 I don't see a JVM crash log (hs_err_pid[pid].log) in
 ~/brisk/resources/cassandra/bin or /tmp. So maybe the JVM didn't crash?

 We're running a pretty up-to-date Sun Java:

 ubuntu@ip-10-2-x-x:/tmp$ java -version
 java version "1.6.0_24"
 Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

 I'm gonna restart the Repair process in a few more hours. If there are any
 additional debug or troubleshooting logs you'd like me to enable first,
 please let me know.

 - Sameer



 On Thu, Jul 21, 2011 at 5:31 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Did you check for a JVM crash log?

 You should make sure you're running the latest Sun JVM, older versions
 and OpenJDK in particular are prone to segfaulting.

 On Thu, Jul 21, 2011 at 6:53 PM, Sameer Farooqui
 cassandral...@gmail.com wrote:
  We are starting Cassandra with brisk cassandra, so as a stand-alone
  process, not a service.
 
  The syslog on the node doesn't show anything regarding the Cassandra
  Java
  process around the time the last entries were made in the Cassandra
  system.log (2011-07-21 13:01:51):
 
  Jul 21 12:35:01 ip-10-2-206-127 CRON[12826]: (root) CMD (command -v
  debian-sa1 > /dev/null && debian-sa1 1 1)
  Jul 21 12:45:01 ip-10-2-206-127 CRON[13420]: (root) CMD (command -v
  debian-sa1 > /dev/null && debian-sa1 1 1)
  Jul 21 12:55:01 ip-10-2-206-127 CRON[14021]: (root) CMD (command -v
  debian-sa1 > /dev/null && debian-sa1 1 1)
  Jul 21 14:26:07 ip-10-2-206-127 kernel: imklog 4.2.0, log source =
  /proc/kmsg started.
  Jul 21 14:26:07 ip-10-2-206-127 rsyslogd: [origin software="rsyslogd"
  swVersion="4.2.0" x-pid="663" x-info="http://www.rsyslog.com"]
  (re)start
 
 
  The last thing in the Cassandra log before INFO Logging initialized is:
 
   INFO [ScheduledTasks:1] 2011-07-21 13:01:51,187 GCInspector.java (line
  128)
  GC for ParNew: 202 ms, 153219160 reclaimed leaving 2040879600 used; max
  is
  4030726144
 
 
  I can start Repair again, but am worried that it will crash Cassandra
  again,
  so I want to turn on any debugging or helpful logs to diagnose the
  crash if
  it happens again.
 
 
  - Sameer
 
 
  On Thu, Jul 21, 2011 at 4:30 PM, aaron morton aa...@thelastpickle.com
  wrote:
 
  The default init.d script will direct std out/err to that file, how
  are
  you starting brisk / cassandra ?
  Check the syslog and other logs in /var/log to see if the OS killed
  cassandra.
  Also, what was the last thing in the cassandra log before "INFO [main]
  2011-07-21 15:48:07,233 AbstractCassandraDaemon.java (line 78) Logging
  initialised" ?
 
  Cheers
 
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  On 22 Jul 2011, at 10:50, Sameer Farooqui wrote:
 
  Hey Aaron,
 
  I don't have any output.log files in that folder:
 
  ubuntu@ip-10-2-x-x:~$ cd /var/log/cassandra
  ubuntu@ip-10-2-x-x:/var/log/cassandra$ ls
  system.log system.log.11  system.log.4  system.log.7
  system.log.1   system.log.2   system.log.5  system.log.8
  system.log.10  system.log.3   system.log.6  system.log.9
 
 
 
  On Thu, Jul 21, 2011 at 3:40 PM, aaron morton
  aa...@thelastpickle.com
  wrote:
 
  Check /var/log/cassandra/output.log (assuming the default init
  scripts)
  A
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  On 22 Jul 2011, at 10:13, Sameer Farooqui wrote:
 
  Hmm. Just looked at the log more closely.
 
  So, what actually happened is while Repair was running on this
  specific
  node, the Cassandra java process terminated itself automatically. The
  last
  entries in the log are:
 
   INFO [ScheduledTasks:1] 2011-07-21 13:00:20,285 GCInspector.java
  (line
  128) GC for ParNew: 214 ms, 162748656 reclaimed leaving 1845274888
  used; max
  is 

Re: Cassandra 0.7.8 and 0.8.1 fail when major compaction on 37GB database

2011-07-26 Thread aaron morton
Have you tried some of the ideas about reducing the memory pressure ? 

How many CF's + second indexes do you have?

Cheers


-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 Jul 2011, at 17:10, lebron james wrote:

 I have only 4GB on the server, so I gave the JVM 3GB of heap, but this didn't
 help; cassandra still falls over when I launch a major compaction on the 37GB
 database.
 
 
 



Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-26 Thread aaron morton
Was guessing something like a token move may have happened in the past. 

Good suggestion to also kick off a major compaction. I've seen that make a big 
difference even for apps that do not do deletes, but do do overwrites. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 Jul 2011, at 19:00, Sylvain Lebresne wrote:

 If they are, and repair has completed, use nodetool cleanup to remove the
 data the node is no longer responsible for. See the bootstrap section above.
 
 I've seen that said a few times so allow me to correct: cleanup is useless after
 a repair. 'nodetool cleanup' removes rows the node is no longer responsible for,
 and is thus useful only after operations that change the range a node is
 responsible for (bootstrap, move, decommission). After a repair, you will need
 compaction to kick in to see your disk usage come back to normal.
 
 --
 Sylvain
 

cassandra server disk full

2011-07-26 Thread Donna Li

Actually I was wrong -- our patch will disable gossip and thrift but
leave the process running:

Could the cassandra server return to normal, without a restart, after clearing the disk?


Best Regards
Donna li

-----Original Message-----
From: Ryan King [mailto:r...@twitter.com]
Sent: 26 July 2011 1:53
To: user@cassandra.apache.org
Subject: Re: cassandra server disk full

Actually I was wrong -- our patch will disable gossip and thrift but
leave the process running:

https://issues.apache.org/jira/browse/CASSANDRA-2118

If people are interested in that I can make sure its up to date with
our latest version.

-ryan

On Mon, Jul 25, 2011 at 10:07 AM, Ryan King r...@twitter.com wrote:
 We have a patch somewhere that will kill the node on IOErrors, since
 those tend to be of the class that are unrecoverable.

 -ryan

 On Thu, Jul 7, 2011 at 8:02 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Yeah, ideally it should probably die or drop into read-only mode if it
 runs out of space.
 (https://issues.apache.org/jira/browse/CASSANDRA-809)

 Unfortunately dealing with disk-full conditions tends to be a low
 priority for many people because it's relatively easy to avoid with
 decent monitoring, but if it's critical for you, we'd welcome the
 assistance.

 On Thu, Jul 7, 2011 at 8:34 PM, Donna Li donna...@utstar.com wrote:
 All:

 When one of the cassandra servers' disks is full, the cluster cannot work
 normally, even after I free space. I must reboot the server whose disk is full
 before the cluster can work normally.



 Best Regards

 Donna li



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




Re: Cassandra 0.7.8 and 0.8.1 fail when major compaction on 37GB database

2011-07-26 Thread lebron james
I have only one CF with one UTF8 column and no indexes. The column always
holds 1 byte of data, and the keys are 16-byte strings.


Re: Capacity Planning

2011-07-26 Thread aaron morton
See Edward Capriolo's (media6degrees) talk "Real World Capacity Planning:
Cassandra on Blades and Big Iron" at
http://www.datastax.com/events/cassandrasf2011/presentations

Open-ended questions like this are really hard to answer. It's a lot easier for
people if you provide some context: some idea of the data, the expected load, or
what the app does.

Cheers
 
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 Jul 2011, at 19:51, CASSANDRA learner wrote:

 Can you guys please explain how to do capacity planning in cassandra?



Re: Stress test using Java-based stress utility

2011-07-26 Thread Nilabja Banerjee
Thank you everyone, it is working fine.

I was watching the jconsole behavior... can you tell me where exactly I can
find RecentHitRates?
Tuning for Optimal Caching:
Here they have given one example of that:
http://www.datastax.com/docs/0.8/operations/cache_tuning#configuring-key-and-row-caches
In my jconsole, within MBeans, I am unable to find that one.
What is the value of long[36] and long[90]? From the JConsole attributes,
how can I find the performance of cassandra while stress testing?
Thank You

On 26 July 2011 14:33, aaron morton aa...@thelastpickle.com wrote:

 It's in the source distribution under tools/stress see the instructions in
 the README file and then look at the command line help (bin/stress --help).

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 26 Jul 2011, at 19:40, CASSANDRA learner wrote:

 Hi,
 I too want to know what this stress tool does. What is the usage of this
 tool? Please explain.

 On Fri, Jul 22, 2011 at 6:39 PM, Jonathan Ellis jbel...@gmail.com wrote:

 What does nodetool ring say?

 On Fri, Jul 22, 2011 at 12:43 AM, Nilabja Banerjee
 nilabja.baner...@gmail.com wrote:
  Hi All,
 
  I am following this link
  http://www.datastax.com/docs/0.7/utilities/stress_java for a stress
 test.
  I am getting this notification after running this command
 
  xxx.xxx.xxx.xx= my ip
 
  contrib/stress/bin/stress -d xxx.xxx.xxx.xx
 
  Created keyspaces. Sleeping 1s for propagation.
  total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
  Operation [44] retried 10 times - error inserting key 044
  ((UnavailableException))
 
  Operation [49] retried 10 times - error inserting key 049
  ((UnavailableException))
 
  Operation [7] retried 10 times - error inserting key 007
  ((UnavailableException))
 
  Operation [6] retried 10 times - error inserting key 006
  ((UnavailableException))
 
  Any idea why I am getting these things?
 
  Thank You
 
 
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com






cassandra server disk full

2011-07-26 Thread Donna Li


I mean that, ideally, a server with a full disk would not affect the service of
the cluster, and the server would come back to work automatically after the disk
is cleaned.


Best Regards
Donna li
-----Original Message-----
From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: 26 July 2011 6:25
To: user@cassandra.apache.org
Subject: Re: cassandra server disk full

If the commit log or data disk is full, it's not possible for the server to
process any writes; the best it could do is serve reads. But reads may result
in a write due to read repair, and the server will also need to do some app
logging, so IMHO it's really down / dead.
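
The practical takeaway from this thread is to watch free space before it runs out. A minimal sketch of such a check in Python (the path and the 10% threshold are arbitrary examples, not Cassandra defaults):

```python
import shutil

def disk_ok(path="/", min_free_fraction=0.10):
    """True if at least min_free_fraction of the disk at `path` is free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / usage.total >= min_free_fraction

# A monitoring job could alert, or stop accepting writes, when this fails,
# long before the commit log or data directory actually fills up.
print(disk_ok("/"))
```

Run from cron or a monitoring agent against the commit log and data volumes, this is the "decent monitoring" mentioned later in the thread.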

You should free space and restart the cassandra service. Restarting a cassandra 
service should be something your installation can handle. 

Is there something else I'm missing here ? 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 25 Jul 2011, at 20:06, Donna Li wrote:

 
 All:
   Could anyone help me?
 
 
 Best Regards
 Donna li
 
 -----Original Message-----
 From: Donna Li [mailto:donna...@utstar.com]
 Sent: 22 July 2011 11:23
 To: user@cassandra.apache.org
 Subject: cassandra server disk full
 
 
 All:
 
 Is there an easy way to fix the bug by changing the server's code?
 
 
 Best Regards
 Donna li
 
 -----Original Message-----
 From: Donna Li [mailto:donna...@utstar.com]
 Sent: 8 July 2011 11:29
 To: user@cassandra.apache.org
 Subject: cassandra server disk full
 
 
 Has CASSANDRA-809 been resolved, or is there any other patch that can resolve
 the problem? Is there any way to avoid rebooting the cassandra server?
 Thanks!
 
 Best Regards
 Donna li
 
 -----Original Message-----
 From: Jonathan Ellis [mailto:jbel...@gmail.com]
 Sent: 8 July 2011 11:03
 To: user@cassandra.apache.org
 Subject: Re: cassandra server disk full
 
 Yeah, ideally it should probably die or drop into read-only mode if it
 runs out of space.
 (https://issues.apache.org/jira/browse/CASSANDRA-809)
 
 Unfortunately dealing with disk-full conditions tends to be a low
 priority for many people because it's relatively easy to avoid with
 decent monitoring, but if it's critical for you, we'd welcome the
 assistance.
 
 On Thu, Jul 7, 2011 at 8:34 PM, Donna Li donna...@utstar.com wrote:
 All:
 
 When one of the cassandra servers' disks is full, the cluster cannot work
 normally, even after I free space. I must reboot the server whose disk is full
 before the cluster can work normally.
 
 
 
 Best Regards
 
 Donna li
 
 
 
 -- 
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



[RELEASE] Apache Cassandra 0.8.2 released

2011-07-26 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra
version 0.8.2.

Cassandra is a highly scalable second-generation distributed database,
bringing together Dynamo's fully distributed design and Bigtable's
ColumnFamily-based data model. You can read more here:

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is primarily a bug fix release[1] (in particular, it fixes a
regression introduced in 0.8.1 that prevented hinted handoff from being
delivered correctly) and upgrading is highly encouraged.
Please, however, always pay attention to the release notes[2] before upgrading.

If you encounter any problem, let us know[3].

Have fun!


[1]: http://goo.gl/z61nT (CHANGES.txt)
[2]: http://goo.gl/Swjk5 (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA


Re: do I need to add more nodes? minor compaction eat all IO

2011-07-26 Thread Jim Ancona
On Mon, Jul 25, 2011 at 6:41 PM, aaron morton aa...@thelastpickle.com wrote:
 There are no hard and fast rules to add new nodes, but here are two 
 guidelines:

 1) Single node load is getting too high, rule of thumb is 300GB is probably 
 too high.

What is that rule of thumb based on? I would guess that working set
size would matter more than absolute size. Why isn't that the case?

Jim


Re: Stress test using Java-based stress utility

2011-07-26 Thread Jonathan Ellis
cassandra.db.Caches

On Tue, Jul 26, 2011 at 2:11 AM, Nilabja Banerjee
nilabja.baner...@gmail.com wrote:
 Thank you every one it is working fine.

 I was watching the jconsole behavior... can you tell me where exactly I can
 find RecentHitRates?

 Tuning for Optimal Caching:

 Here they have given one example of that
 http://www.datastax.com/docs/0.8/operations/cache_tuning#configuring-key-and-row-caches
 RecentHitRates...  In my jconsole within MBean I am unable to find that
 one.
 What is the value of long[36] and long[90]? From the Jconsole attributes,
 how can I find the performance of cassandra while stress testing?
 Thank You




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: sstableloader throws storage_port error

2011-07-26 Thread John Conwell
If I have Cassandra already running on my machine, how do I configure
sstableloader to run on a different IP (127.0.0.2)? Also, does that mean that in
order to use sstableloader on the same machine as a running Cassandra node,
I have to have two NIC cards?

I looked around for info on how to configure and run sstableloader,
but other than what the cmdline spits out I can't find anything. Are there
any examples or best practices? Is it designed to be run on a machine that
isn't running a cassandra node?


On Mon, Jul 25, 2011 at 8:24 PM, Jonathan Ellis jbel...@gmail.com wrote:

 sstableloader uses gossip to discover the Cassandra ring, so you'll
 need to run it on a different IP (127.0.0.2 is fine).

 On Mon, Jul 25, 2011 at 2:41 PM, John Conwell j...@iamjohn.me wrote:
  I'm trying to figure out how to use the sstableloader tool. For my test I
  have a single node cassandra instance running on my local machine. I have
  cassandra running, and validate this by connecting to it with cassandra-cli.
  I run sstableloader using the following command:
  bin/sstableloader /Users/someuser/cassandra/mykeyspace
  and I get the following error:
  org.apache.cassandra.config.ConfigurationException: localhost/127.0.0.1:7000
  is in use by another process. Change listen_address:storage_port in
  cassandra.yaml to values that do not conflict with other services

  I've played around with different ports, but nothing works. Is it because
  I'm trying to run sstableloader on the same machine that cassandra is
  running on? It would be odd I think, but I can't think of another reason I
  would get that error.
  Thanks,
  John



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




-- 

Thanks,
John C


Slow Reads

2011-07-26 Thread Priyanka

Hello All,

  I am doing some read tests on Cassandra on a single node, but they
are turning out to be very slow.
Here is the data model in detail.
I am using a super column family. Cassandra has 970 rows in total, each row
has 620901 super columns, and each super column has 2 columns. Total data in
the database would be around 45GB.
I am trying to retrieve the data of a particular super column (trying to pull
the row key associated with the super column and the column values within
the super column).
It is taking 2.5 secs with the Java code and 4.7 secs with the Python code.

Here is the python code:
 result = col_fam.get_range(start="", finish="", columns=None,
     column_start="", column_finish="", column_reversed=False,
     column_count=2, row_count=None, include_timestamp=False,
     super_column='23', read_consistency_level=None, buffer_size=None)

This is very slow compared to MySQL.
I am not sure what's going wrong here. Could someone let me know if there is
any problem with my model?


Any help in this regard is highly appreciated.

Thank you.

Regards,
Priyanka

 



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Slow-Reads-tp6622680p6622680.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Slow Reads

2011-07-26 Thread Philippe
I believe it's because it needs to read the whole row to get to your super
column.

You might have to reconsider your model.
On 26 Jul 2011 at 17:39, Priyanka priya...@gmail.com wrote:



Re: Slow Reads

2011-07-26 Thread Jake Luciani
It doesn't read the entire row, but it does read a section of the row from
disk...

How big is each supercolumn?  If you re-read the data does the query time
get faster?



On Tue, Jul 26, 2011 at 11:59 AM, Philippe watche...@gmail.com wrote:

 i believe it's because it needs to read the whole row to get to your super
 column.

 you might have to reconsider your model.



-- 
http://twitter.com/tjake


Re: Slow Reads

2011-07-26 Thread Sylvain Lebresne
On Tue, Jul 26, 2011 at 5:39 PM, Priyanka priya...@gmail.com wrote:

 Hello All,

          I am doing some read tests on Cassandra on a single node.But they
 are turning up to be very slow.
 Here is the data model in detail.
 I am using a super column family.Cassandra has total 970 rows and each row
 has 620901 super columns and each super column has 2 columns.Total data in
 the database would be around 45GB.
 I am trying to retrieve the data of a particular super column[Trying to pull
 the row key associated with the super column and the column values with in
 the super column.
 It is taking 2.5 secs with java code and 4.7 secs with the python code.

 Here is the python code.
   result = col_fam.get_range(start="", finish="", columns=None,
       column_start="", column_finish="", column_reversed=False,
       column_count=2, row_count=None, include_timestamp=False,
       super_column='23', read_consistency_level=None, buffer_size=None)

What are you trying to query exactly? All the rows, or only one?
I'm no expert in pycassa, but if I read this code and the pycassa code
correctly, the request will query 1024 rows upfront and return an iterator
that will eventually read all the rows in the database if you iterate.
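
The buffered iteration Sylvain describes can be sketched generically. This is a simplified illustration of the behavior, not pycassa's actual implementation; `fetch_page` is an invented stand-in for the underlying Thrift call, and integer keys keep the toy simple:

```python
def buffered_range(fetch_page, buffer_size=1024):
    """Yield rows one at a time, fetching them from the server in pages.

    `fetch_page(start, count)` stands in for the underlying call and must
    return up to `count` (key, row) pairs with keys >= `start`.
    """
    start = 0
    while True:
        page = fetch_page(start, buffer_size)
        if not page:
            return
        for key, row in page:
            yield key, row       # the caller may stop early, but each page
        start = page[-1][0] + 1  # was already fetched whole from the server

# Fake backend with 2500 rows to show the paging: 1024 + 1024 + 452 rows.
data = [(i, "row-%d" % i) for i in range(2500)]
def fetch_page(start, count):
    return [kv for kv in data if kv[0] >= start][:count]

rows = list(buffered_range(fetch_page))
print(len(rows))  # iterating to the end reads every row in the "database"
```

So the first page alone costs a 1024-row query, and exhausting the iterator walks the entire dataset, which is why a whole-range scan is the wrong tool for fetching one super column.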




Re: Cassandra start/stop scripts

2011-07-26 Thread Priyanka
I do it the same way...

On Tue, Jul 26, 2011 at 1:07 PM, mcasandra wrote:

 I need to write a cassandra start/stop script. Currently I run cassandra to
 start and kill -9 to stop.

 Is this the best way? kill -9 doesn't sound right :) Wondering how others
 do it.






Re: Slow Reads

2011-07-26 Thread Priyanka Ganuthula
Thanks Philippe, I have a question here... I am specifying the required
super column. Does it still need to read the entire row?
Or is it because I am listing all the slices and then going through each slice
and picking the data for the required super column?

SlicePredicate slicePredicate = new SlicePredicate();
SliceRange sliceRange = new SliceRange();
sliceRange.setStart(new byte[] {});
sliceRange.setFinish(new byte[] {});
slicePredicate.setSlice_range(sliceRange);


ColumnParent columnParent = new ColumnParent(COLUMN_FAMILY);
KeyRange keyRange = new KeyRange();
keyRange.start_key = ByteBuffer.wrap(lastkey.getBytes());
keyRange.end_key = ByteBuffer.wrap("".getBytes());

List<KeySlice> slices = client.get_range_slices(columnParent,
    slicePredicate, keyRange, ConsistencyLevel.ONE);

Then I loop over the slices, list the super columns, and pick out the one
with the name I am looking for.

Am I missing something here?

On Tue, Jul 26, 2011 at 11:59 AM, Philippe watche...@gmail.com wrote:

 i believe it's because it needs to read the whole row to get to your super
 column.

 you might have to reconsider your model.
  On 26 Jul 2011 at 17:39, Priyanka priya...@gmail.com wrote:

 
  Hello All,
 
  I am doing some read tests on Cassandra on a single node.But they
  are turning up to be very slow.
  Here is the data model in detail.
  I am using a super column family.Cassandra has total 970 rows and each
 row
  has 620901 super columns and each super column has 2 columns.Total data
 in
  the database would be around 45GB.
   I am trying to retrieve the data of a particular super column (trying to
  pull the row key associated with the super column and the column values
  within the super column).
  It is taking 2.5 secs with java code and 4.7 secs with the python code.
 
  Here is the python code.
   result = col_fam.get_range(start='', finish='', columns=None,
       column_start='', column_finish='', column_reversed=False,
       column_count=2, row_count=None, include_timestamp=False,
       super_column='23', read_consistency_level=None, buffer_size=None)
 
  This is very slow compared to MySQL.
  Am not sure whats going wrong here.Could some one let me know if there is
  any problem with my model.
 
 
  Any help in this regard is highly appreciated.
 
  Thank you.
 
  Regards,
  Priyanka
 
 
 
 
 



Re: Slow Reads

2011-07-26 Thread Priyanka Ganuthula
Each super column has two columns, and each column holds only one byte.
Re-reading is a bit faster, but not significantly.

On Tue, Jul 26, 2011 at 12:49 PM, Jake Luciani jak...@gmail.com wrote:

 It doesn't read the entire row, but it does read a section of the row from
 disk...

 How big is each supercolumn?  If you re-read the data does the query time
 get faster?



 On Tue, Jul 26, 2011 at 11:59 AM, Philippe watche...@gmail.com wrote:

 i believe it's because it needs to read the whole row to get to your super
 column.

 you might have to reconsider your model.
  On 26 Jul 2011 at 17:39, Priyanka priya...@gmail.com wrote:

 
  Hello All,
 
  I am doing some read tests on Cassandra on a single node.But they
  are turning up to be very slow.
  Here is the data model in detail.
  I am using a super column family.Cassandra has total 970 rows and each
 row
  has 620901 super columns and each super column has 2 columns.Total data
 in
  the database would be around 45GB.
   I am trying to retrieve the data of a particular super column (trying to
  pull the row key associated with the super column and the column values
  within the super column).
  It is taking 2.5 secs with java code and 4.7 secs with the python code.
 
  Here is the python code.
   result = col_fam.get_range(start='', finish='', columns=None,
       column_start='', column_finish='', column_reversed=False,
       column_count=2, row_count=None, include_timestamp=False,
       super_column='23', read_consistency_level=None, buffer_size=None)
 
  This is very slow compared to MySQL.
  Am not sure whats going wrong here.Could some one let me know if there
 is
  any problem with my model.
 
 
  Any help in this regard is highly appreciated.
 
  Thank you.
 
  Regards,
  Priyanka
 
 
 
 
 




 --
 http://twitter.com/tjake



Re: Slow Reads

2011-07-26 Thread Priyanka
this is how my data looks

"rowkey1": {
    "supercol1": { "col1": T, "col2": C }
    "supercol2": { "col1": C, "col2": T }
    "supercol3": { "col1": C, "col2": T }
}
"rowkey2": {
    "supercol1": { "col1": A, "col2": A }
    "supercol2": { "col1": A, "col2": T }
    "supercol3": { "col1": C, "col2": T }
}

Each row has 620901 super columns, with 2 columns per super column.
The names of the super columns are the same for all rows, but the data in
each super column is different.
I am trying to get the data of a particular super column, which is spread
across all the rows but with different data.

So yes, it is getting data from all rows.
Please suggest a better way to do this.
Thank you.

The output of my query will be (supposing I query for supercol1):
rowkey1,T,C
rowkey2,A,A
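Since Philippe's suggestion is to reconsider the model, here is one hedged option, as an illustration only: invert the layout so the super column name becomes the row key, and "supercol1 for every row" becomes a single-row read instead of a range scan over all 970 rows. Plain Python dicts stand in for the column families, using the example values above:

```python
# Hypothetical inversion of the layout above: keyed by super column
# name, so each query's data lives in a single row.
original = {
    "rowkey1": {"supercol1": {"col1": "T", "col2": "C"},
                "supercol2": {"col1": "C", "col2": "T"},
                "supercol3": {"col1": "C", "col2": "T"}},
    "rowkey2": {"supercol1": {"col1": "A", "col2": "A"},
                "supercol2": {"col1": "A", "col2": "T"},
                "supercol3": {"col1": "C", "col2": "T"}},
}

inverted = {}
for row_key, supers in original.items():
    for sc_name, cols in supers.items():
        inverted.setdefault(sc_name, {})[row_key] = cols

# The query "supercol1 for every row" is now a single lookup:
for row_key, cols in inverted["supercol1"].items():
    print(row_key, cols["col1"], cols["col2"])
# rowkey1 T C
# rowkey2 A A
```

Under the original layout the server has to touch every row to answer this; under the inverted one it is a single (admittedly wide) row per super column name.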





Re: sstableloader throws storage_port error

2011-07-26 Thread John Conwell
After much research and experimentation, I figured out how to get
sstableloader running on the same machine as a live cassandra node instance.

The key, as Jonathan stated, is to configure sstableloader to use a different
IP address than the one the running Cassandra instance is using.  To do this,
I ran this command, which creates a loopback alias for 127.0.0.2:

   sudo ifconfig lo0 alias 127.0.0.2

Now you can have Cassandra configured to listen on 127.0.0.1, and
sstableloader configured to listen on 127.0.0.2

By the way, to remove this ipaddress, run

sudo ifconfig lo0 -alias 127.0.0.2

But that's not all.  Because sstableloader reads the cassandra.yaml file to
get the gossip IP address, you need to make a copy of the Cassandra install
directory (or at least the bin and conf folders): one folder with the yaml
configured for Cassandra, the other with the yaml configured for
sstableloader.

Hope this helps people. I've written an in depth description of how to do
all this, and can post it if people want, but I'm not sure the etiquette of
posting blog links in the email list.

Thanks,
John

On Tue, Jul 26, 2011 at 7:40 AM, John Conwell j...@iamjohn.me wrote:

 If I have Cassandra already running on my machine, how do I configure
 sstableloader to run on a different IP (127.0.0.2).  Also, does that mean in
 order to use sstableloader on the same machine as an running Cassandra node,
 I have to have two NIC cards?

 I looked around for any info about how to configure and run sstableloader,
 but other than what the cmdline spits out I cant find anything.  Are there
 any examples or best practices?  Is it designed to be run on a machine that
 isn't running a cassandra node?


 On Mon, Jul 25, 2011 at 8:24 PM, Jonathan Ellis jbel...@gmail.com wrote:

 sstableloader uses gossip to discover the Cassandra ring, so you'll
 need to run it on a different IP (127.0.0.2 is fine).

 On Mon, Jul 25, 2011 at 2:41 PM, John Conwell j...@iamjohn.me wrote:
  I'm trying to figure out how to use the sstableloader tool.  For my test
 I
  have a single node cassandra instance running on my local machine.  I
 have
  cassandra running, and validate this by connecting to it with
 cassandra-cli.
  I run sstableloader using the following command:
  bin/sstableloader /Users/someuser/cassandra/mykeyspace
  and I get the following error:
  org.apache.cassandra.config.ConfigurationException: localhost/
 127.0.0.1:7000
  is in use by another process.  Change listen_address:storage_port in
  cassandra.yaml to values that do not conflict with other services
 
  I've played around with different ports, but nothing works.  It it
 because
  I'm trying to run sstableloader on the same machine that cassandra is
  running on?  It would be odd I think, but cant thing of another reason I
  would get that eror.
  Thanks,
  John



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




 --

 Thanks,
 John C




-- 

Thanks,
John C


Re: Cassandra start/stop scripts

2011-07-26 Thread Joaquin Casares
Did you install via a package or tarball binaries?

Packages allow you to run cassandra as a service with
sudo service cassandra start|stop

But if you are running via tarballs, then yes, running a kill command
against Cassandra is the way to do it, since Cassandra runs in crash-only
mode. A plain kill pid (SIGTERM) works, though; kill -9 isn't required.
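For the tarball case, a small pidfile wrapper saves hunting for the process by hand. This is only a sketch: CASSANDRA_CMD and PIDFILE are placeholders, and "sleep 60" stands in for the real launcher (e.g. bin/cassandra).

```shell
#!/bin/sh
# Minimal pidfile-based start/stop wrapper (sketch; "sleep 60" is a
# stand-in for the real cassandra launch command).
CASSANDRA_CMD="${CASSANDRA_CMD:-sleep 60}"
PIDFILE="${PIDFILE:-/tmp/cassandra-demo.pid}"

start() {
    $CASSANDRA_CMD &
    echo $! > "$PIDFILE"
    echo "started"
}

stop() {
    # Plain SIGTERM; since Cassandra is crash-only, SIGKILL is also safe,
    # but TERM lets the JVM run its shutdown hooks.
    kill "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
    echo "stopped"
}

start
test -f "$PIDFILE" && echo "pidfile written"
stop
```

The packaged init scripts do essentially the same thing with extra sanity checks.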

Thanks,

Joaquin Casares
DataStax
Software Engineer/Support



On Tue, Jul 26, 2011 at 12:19 PM, Priyanka priya...@gmail.com wrote:

 I do the same way...

 On Tue, Jul 26, 2011 at 1:07 PM, mcasandra wrote:

 I need to write cassandra start/stop script. Currently I run cassandra
 to start and kill -9 to stop.

 Is this the best way? kill -9 doesn't sound right :) Wondering how others
 do it.







Re: Predictable low RW latency, SLABS and STW GC

2011-07-26 Thread Peter Schuller
 Restarting the service will drop all the memmapped caches, cassandra caches 
 are saved / persistent and you can also use memcachd if you want.

Well, the OS won't evict everything from page cache just because the
last process to map them exits. That said, since restarts tend to have
secondary effects on caches like streaming through all the bf and
index files, restarts are certainly detrimental to the page cache.
Also you may still see some eviction (even if it doesn't *necessarily*
happen) depending (particularly if not running with numactl set to
interleave).

-- 
/ Peter Schuller (@scode on twitter)


Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-26 Thread Sameer Farooqui
Thanks for the info guys.

I'm running compaction on the two very highly loaded nodes now in hopes of
the data volume going down. But I'm skeptical because I don't see how it got
so unbalanced in the first place (all nodes were up while the writes were
being injected).

I should have an update tomorrow on whether compaction rebalanced the nodes.

The tokens are evenly distributed across the ring:

Address DC Rack Status State Load Owns Token
148873535527910577765226390751398592512
10.192.143.x DC1 RAC1 Up Normal 643.42 GB 12.50% 0
10.192.171.x DC1 RAC1 Up Normal 128.96 GB 6.25%
21267647932558653966460912964485513216
10.210.95.x DC1 RAC1 Up Normal 128.34 GB 12.50%
42535295865117307932921825928971026432
10.211.19.x DC1 RAC1 Up Normal 128.55 GB 6.25%
63802943797675961899382738893456539648
10.68.58.x DC1 RAC2 Up Normal 643.05 GB 12.50%
85070591730234615865843651857942052864
10.110.31.x DC1 RAC2 Up Normal 128.84 GB 6.25%
106338239662793269832304564822427566080
10.96.58.x DC1 RAC2 Up Normal 128.11 GB 12.50%
127605887595351923798765477786913079296
10.210.195.x DC1 RAC2 Up Normal 129.33 GB 6.25%
148873535527910577765226390751398592512
10.114.138.x DC2 RAC1 Up Normal 258.04 GB 6.25%
10633823966279326983230456482242756608
10.203.79.x DC2 RAC1 Up Normal 257.14 GB 6.25%
53169119831396634916152282411213783040
10.242.209.x DC2 RAC1 Up Normal 256.58 GB 6.25%
95704415696513942849074108340184809472
10.38.25.x DC2 RAC1 Up Normal 257.08 GB 6.25%
138239711561631250781995934269155835904
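For what it's worth, the DC1 tokens above are exactly the evenly spaced RandomPartitioner tokens, token_i = i * 2**127 // N for an 8-way split; the uneven-looking Owns percentages come from two datacenters sharing one token space rather than from bad token assignment. A quick Python check:

```python
# Evenly spaced RandomPartitioner tokens for an n-node ring.
def balanced_tokens(n):
    return [i * 2**127 // n for i in range(n)]

# The 8 DC1 tokens in the ring output above are the 8-way split;
# the first two are 0 and 21267647932558653966460912964485513216.
for t in balanced_tokens(8):
    print(t)
```

So the imbalance here is in the data itself (two nodes at ~643 GB), not in the token layout.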

On Tue, Jul 26, 2011 at 1:59 AM, aaron morton aa...@thelastpickle.com wrote:

 Was guessing something like a token move may have happened in the past.

 Good suggestion to also kick off a major compaction. I've seen that make a
 big difference even for apps that do not do deletes, but do do overwrites.

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 26 Jul 2011, at 19:00, Sylvain Lebresne wrote:

  If they are and repair has completed use node tool cleanup to remove the
  data the node is no longer responsible. See bootstrap section above.
 
  I've seen that said a few times so allow me to correct. Cleanup is
 useless after
  a repair. 'nodetool cleanup' removes rows the node is not responsible
 anymore
  and is thus useful only after operations that change the range a node is
  responsible for (bootstrap, move, decommission). After a repair, you will
 need
  compaction to kick in to see you disk usage come back to normal.
 
  --
  Sylvain
 
  Hope that helps.
 
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  On 26 Jul 2011, at 12:44, Sameer Farooqui wrote:
 
  Looks like the repair finished successfully the second time. However,
 the
  cluster is still severely unbalanced. I was hoping the repair would
 balance
  the nodes. We're using random partitioner. One node has 900GB and others
  have 128GB, 191GB, 129GB, 257 GB, etc. The 900GB and the 646GB are just
  insanely high. Not sure why or how to troubleshoot.
 
 
 
  On Fri, Jul 22, 2011 at 1:28 PM, Sameer Farooqui 
 cassandral...@gmail.com
  wrote:
 
  I don't see a JVM crashlog ( hs_err_pid[pid].log) in
  ~/brisk/resources/cassandra/bin or /tmp. So maybe JVM didn't crash?
 
  We're running a pretty up to date with Sun Java:
 
  ubuntu@ip-10-2-x-x:/tmp$ java -version
  java version 1.6.0_24
  Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
  Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
 
  I'm gonna restart the Repair process in a few more hours. If there are
 any
  additional debug or troubleshooting logs you'd like me to enable first,
  please let me know.
 
  - Sameer
 
 
 
  On Thu, Jul 21, 2011 at 5:31 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
 
  Did you check for a JVM crash log?
 
  You should make sure you're running the latest Sun JVM, older versions
  and OpenJDK in particular are prone to segfaulting.
 
  On Thu, Jul 21, 2011 at 6:53 PM, Sameer Farooqui
  cassandral...@gmail.com wrote:
  We are starting Cassandra with brisk cassandra, so as a stand-alone
  process, not a service.
 
  The syslog on the node doesn't show anything regarding the Cassandra
  Java
  process around the time the last entries were made in the Cassandra
  system.log (2011-07-21 13:01:51):
 
  Jul 21 12:35:01 ip-10-2-206-127 CRON[12826]: (root) CMD (command -v
  debian-sa1  /dev/null  debian-sa1 1 1)
  Jul 21 12:45:01 ip-10-2-206-127 CRON[13420]: (root) CMD (command -v
  debian-sa1  /dev/null  debian-sa1 1 1)
  Jul 21 12:55:01 ip-10-2-206-127 CRON[14021]: (root) CMD (command -v
  debian-sa1  /dev/null  debian-sa1 1 1)
  Jul 21 14:26:07 ip-10-2-206-127 kernel: imklog 4.2.0, log source =
  /proc/kmsg started.
  Jul 21 14:26:07 ip-10-2-206-127 rsyslogd: [origin software=rsyslogd
  swVersion=4.2.0 x-pid=663 x-info=http://www.rsyslog.com;]
  (re)start
 
 
  The last thing in the Cassandra log before INFO Logging initialized
 is:
 
   INFO 

Re: Cassandra start/stop scripts

2011-07-26 Thread Jason Pell
Check out the rpm packages from Cassandra they have init.d scripts that work 
very nicely, there are debs as well for ubuntu 

Sent from my iPhone

On Jul 27, 2011, at 3:19, Priyanka priya...@gmail.com wrote:

 I do the same way...
 
 On Tue, Jul 26, 2011 at 1:07 PM, mcasandra wrote:
 I need to write cassandra start/stop script. Currently I run cassandra to 
 start and kill -9 to stop. 
 
 Is this the best way? kill -9 doesn't sound right :) Wondering how others do 
 it. 
 
 
 


Internal error processing get during bootstrap

2011-07-26 Thread Rafael Almeida
Hello,

I'm evaluating Cassandra for use in my system. I could add approximately 16
million items using a single node. I'm using libcassandra (I can find my way
through its code when I need to) to connect to it, and I already have some
infrastructure for handling and adding those items (I was using Tokyo Cabinet
before).

I couldn't find much documentation regarding how to set up a cluster, but it
seemed simple enough. On Cassandra server A (10.0.0.2) I had seeds: localhost.
On server B (10.0.0.3) I configured seeds: 10.0.0.2 and auto_bootstrap: true.
Then I created a keyspace and a few column families in it.

I immediately began to add items and got all these Internal error processing
get errors. I found it quite odd; I thought it had to do with the load I was
putting in, seeing that a few small tests had worked before. I spent quite
some time debugging, when I finally decided to write this e-mail. I wanted to
double check stuff, so I ran nodetool to see if everything was right. To my
surprise, only one of the nodes was visible. It took a little while for the
other one to show up as Joining and then as Normal.

After I waited that period, I was able to insert items into the cluster with
no errors at all. Is that expected behaviour? What is the recommended way to
set up a cluster? Should it be done manually: setting up the machines,
creating all keyspaces and column families, then checking nodetool and
waiting for it to get stable?

On a side note, sometimes I get a Default TException (that seems to happen
when the machine is under a heavier load than usual); commonly, retrying the
read or insert right after works fine.  Is that what's supposed to happen?
Perhaps I should raise some timeout somewhere?

This is what ./bin/nodetool -h localhost ring reports me:

Address    DC           Rack   Status  State   Load     Owns     Token
                                                                 119105113551249187083945476614048008053
10.0.0.3   datacenter1  rack1  Up      Normal  3.43 GB  65.90%   61078635599166706937511052402724559481
10.0.0.2   datacenter1  rack1  Up      Normal  1.77 GB  34.10%   119105113551249187083945476614048008053

It's still adding stuff. I have no idea why B owns so many more keys than A.

I'm sorry if what I'm asking is trivial. But I have been having a hard time 
finding documentation. I've found a lot of outdated stuff, which was 
frustrating. I hope you guys have the time to help me out or -- if not -- I 
hope you can give me good reading material.

Thank you,
Rafael



Cassandra allocation unit size

2011-07-26 Thread Andres Rodriguez Contreras
Hi, which is the allocation unit size to format a hard drive to use
Cassandra, usingUbuntu server and a SAN (Storage Area Network).


Re: Cassandra allocation unit size

2011-07-26 Thread Jonathan Ellis
You should see this thread about Cassandra with a SAN:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-on-iSCSI-td5945217.html

On Tue, Jul 26, 2011 at 4:38 PM, Andres Rodriguez Contreras
anrocoubu...@gmail.com wrote:
 Hi, which is the allocation unit size to format a hard drive to
 use Cassandra, usingUbuntu server and a SAN (Storage Area Network).



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Recovering from a multi-node cluster failure caused by OOM on repairs

2011-07-26 Thread Teijo Holzer

Hi,

I thought I share the following with this mailing list as a number of other
users seem to have had similar problems.

We have the following set-up:

OS: CentOS 5.5
RAM: 16GB
JVM heap size: 8GB (also tested with 14GB)
Cassandra version: 0.7.6-2 (also tested with 0.7.7)
Oracle JDK version: 1.6.0_26
Number of nodes: 5
Load per node: ~40GB
Replication factor: 3
Number of requests/day: 2.5 Million (95% inserts)
Total net insert data/day: 1GB
Default TTL for most of the data: 10 days

This set-up had been operating successfully for a few months; however,
recently we started seeing multi-node failures, usually triggered by a repair,
but occasionally also under normal operation. A repair on nodes 3, 4 and 5
would always cause the cluster as a whole to fail, whereas nodes 1 & 2
completed their repair cycles successfully.

These failures would usually result in 2 or 3 nodes becoming unresponsive and
dropping out of the cluster, resulting in client failure rates to spike up to
~10%. We normally operate with a failure rate of 0.1%.

The relevant log entries showed a complete heap memory exhaustion within 1
minute (see log lines below where we experimented with a larger heap size of
14GB). Also of interest was a number of huge SliceQueryFilter collections
running concurrently on the nodes in question (see log lines below).

The way we ended recovering from this situation was as follows. Remember these
steps were taken to get an unstable cluster back under control, so you might
want to revert some of the changes once the cluster is stable again.

Set disk_access_mode: standard in cassandra.yaml
This allowed us to prevent the JVM blowing out the hard limit of 8GB via large
mmaps. Heap size was set to 8GB (RAM/2). That meant the JVM was never using
more than 8GB total. mlockall didn't seem to make a difference for our
particular problem.

Turn off all row & key caches via cassandra-cli, e.g.
update column family Example with rows_cached=0;
update column family Example with keys_cached=0;
We were seeing compacted row maximum sizes of ~800MB from cfstats, that's why
we turned them all off. Again, we saw a significant drop in the actual memory
used from the available maximum of 8GB. Obviously, this will affect reads, but
as 95% of our requests are inserts, it didn't matter so much for us.

Bootstrap problematic node:
Kill Cassandra
Change auto_bootstrap: true in cassandra.yaml, remove own IP address from
list of seeds (important)
Delete all data directories (i.e. commit-log, data, saved-caches)
Start Cassandra
Wait for bootstrap to finish (see log  nodetool)
Change auto_bootstrap: false
(Run repair)
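The cassandra.yaml half of those steps can be scripted. The following is a sketch against a scratch copy of the file — every path here is a placeholder, so do not point this at a live node's files:

```shell
# Scratch copy standing in for the node's cassandra.yaml (placeholder path).
CONF=/tmp/cassandra-scratch.yaml
printf 'auto_bootstrap: false\nseeds:\n    - 10.0.0.2\n' > "$CONF"

# Step: turn auto_bootstrap on for the re-join (and remove the node's own
# IP from the seed list by hand -- important, per the steps above).
sed -i 's/^auto_bootstrap: false/auto_bootstrap: true/' "$CONF"
grep '^auto_bootstrap' "$CONF"

# Step: wipe local state (placeholder directories standing in for the
# commit-log, data and saved-caches directories).
rm -rf /tmp/scratch-commitlog /tmp/scratch-data /tmp/scratch-saved_caches

# Step: once bootstrap has finished (watch the log and nodetool), turn
# auto_bootstrap back off.
sed -i 's/^auto_bootstrap: true/auto_bootstrap: false/' "$CONF"
grep '^auto_bootstrap' "$CONF"
```

The kill/restart of the Cassandra process and the final repair still happen around these steps as listed above.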

The first bootstrap completed very quickly, so we decided to bootstrap every
node in the cluster (not just the problematic ones). This resulted in some
data loss. Next time we will follow each bootstrap with a repair before
bootstrapping & repairing the next node, to minimize data loss.

After this procedure, the cluster was operating normally again.

We now run a continuous rolling repair, followed by a (major) compaction and a
manual garbage collection. As the repairs are required anyway, we decided to
run them all the time in a continuous fashion; therefore, potential problems
can be identified earlier.


The major compaction followed by a manual GC allows us to keep the disk usage 
low on each node. The manual GC is necessary as the unused files on disk are 
only really deleted when the reference is garbage collected inside the JVM (a 
restart would achieve the same).


We also collected some statistics in regards to the duration of some of the
operations:

cleanup/compact: ~1 min/GB
repair: ~2-3 min/GB
bootstrap: ~1 min/GB

This means that if you have a node with 60GB of data, it will take ~1hr to
compact and ~2-3hrs to repair. Therefore, it is advisable to keep the data per
node below ~120GB. We achieve this by using an aggressive TTL on most of our
writes.
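Those per-GB figures make a handy back-of-the-envelope estimator (a trivial sketch; the rates are the ones quoted above, taking 3 min/GB as the repair worst case):

```python
# min/GB figures from the measurements above; repair uses the 3 min/GB
# worst case of the quoted 2-3 min/GB range.
RATES_MIN_PER_GB = {"cleanup": 1, "compact": 1, "bootstrap": 1, "repair": 3}

def estimated_minutes(op, gb):
    return RATES_MIN_PER_GB[op] * gb

print(estimated_minutes("compact", 60))  # 60  -> ~1 hour
print(estimated_minutes("repair", 60))   # 180 -> ~3 hours
```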

Cheers,

   Teijo

Here are the relevant log entries showing the OOM conditions:


[2011-07-21 11:12:11,059] INFO: GC for ParNew: 1141 ms, 509843976 reclaimed
leaving 1469443752 used; max is 14675869696 (ScheduledTasks:1 
GCInspector.java:128)
[2011-07-21 11:12:15,226] INFO: GC for ParNew: 1149 ms, 564409392 reclaimed
leaving 2247228920 used; max is 14675869696 (ScheduledTasks:1 
GCInspector.java:128)
...
[2011-07-21 11:12:55,062] INFO: GC for ParNew: 1110 ms, 564365792 reclaimed
leaving 12901974704 used; max is 14675869696 (ScheduledTasks:1
GCInspector.java:128)

[2011-07-21 10:57:23,548] DEBUG: collecting 4354206 of 2147483647:
940657e5b3b0d759eb4a14a7228ae365:false:41@1311102443362542 (ReadStage:27
SliceQueryFilter.java:123)


Re: Stress test using Java-based stress utility

2011-07-26 Thread Nilabja Banerjee
Thank you Jonathan.. :)




On 26 July 2011 20:08, Jonathan Ellis jbel...@gmail.com wrote:

 cassandra.db.Caches

 On Tue, Jul 26, 2011 at 2:11 AM, Nilabja Banerjee
 nilabja.baner...@gmail.com wrote:
  Thank you every one it is working fine.
 
  I was watching jconsole behavior...can tell me where exactly I can find 
  RecentHitRates :
 
  Tuning for Optimal Caching:
 
  Here they have given one example of that
 
 http://www.datastax.com/docs/0.8/operations/cache_tuning#configuring-key-and-row-caches
  RecentHitRates...  In my jconsole, within MBeans, I am unable to find that
  one.
  What are the values long[36] and long[90]?  From the jconsole attributes,
  how can I find the performance of Cassandra while stress testing?
  Thank You
 
 
  On 26 July 2011 14:33, aaron morton aa...@thelastpickle.com wrote:
 
  It's in the source distribution under tools/stress see the instructions
 in
  the README file and then look at the command line help (bin/stress
 --help).
  Cheers
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  On 26 Jul 2011, at 19:40, CASSANDRA learner wrote:
 
  Hi,,
  I too wanna know what this stress tool do? What is the usage of this
  tool... Please explain
 
  On Fri, Jul 22, 2011 at 6:39 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
 
  What does nodetool ring say?
 
  On Fri, Jul 22, 2011 at 12:43 AM, Nilabja Banerjee
  nilabja.baner...@gmail.com wrote:
   Hi All,
  
   I am following this following link 
   http://www.datastax.com/docs/0.7/utilities/stress_java  for a
 stress
   test.
   I am getting this notification after running this command
  
   xxx.xxx.xxx.xx= my ip
  
   contrib/stress/bin/stress -d xxx.xxx.xxx.xx
  
   Created keyspaces. Sleeping 1s for propagation.
   total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
   Operation [44] retried 10 times - error inserting key 044
   ((UnavailableException))
  
   Operation [49] retried 10 times - error inserting key 049
   ((UnavailableException))
  
   Operation [7] retried 10 times - error inserting key 007
   ((UnavailableException))
  
   Operation [6] retried 10 times - error inserting key 006
   ((UnavailableException))
  
   Any idea why I am getting these things?
  
   Thank You
  
  
  
 
 
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder of DataStax, the source for professional Cassandra support
  http://www.datastax.com
 
 
 
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com