Re: Out of Memory Issues - SERIOUS

2010-10-07 Thread Jonathan Ellis
if you don't want to lose data, don't wipe your commit logs. that part seems pretty obvious to me. :) cassandra aggressively logs its state when it is running out of memory so you can troubleshoot. look for the GCInspector lines in the log. but in this case it sounds pretty simple; you will be

Out of Memory Issues - SERIOUS

2010-10-07 Thread Dan Hendry
There seems to have been a fair amount of discussion on memory-related issues, so I apologize if this exact situation has come up before. I am currently in the process of load testing a metrics platform I have written which uses Cassandra and I have run into some very troubling issues. The app

Re: Dazed and confused with Cassandra on EC2 ...

2010-10-07 Thread Matthew Dennis
Also, in general, you probably want to set Xms = Xmx (regardless of the value you eventually decide on for that). If you set them equal, the JVM will just go ahead and allocate that amount on startup. If they're different, then when you grow above Xms it has to allocate more and move a bunch of s

Re: Heap Settings suggestions

2010-10-07 Thread kannan chandrasekaran
Good point.. Thanks to both of you for the replies. Kannan From: Matthew Dennis To: user@cassandra.apache.org Sent: Thu, October 7, 2010 4:59:28 PM Subject: Re: Heap Settings suggestions Keep in mind that .7 and on will have per-CF settings for most things s

Re: Retrieving dead node's token from system keyspace

2010-10-07 Thread Matthew Dennis
Allan, I'm confused about why removetoken doesn't do anything and would be interested in finding out why, but to answer your question: You can shut down your last node, nuke the system directory (make a backup just in case), restart the node, load the schema (export it first if need be) and be o

Re: Newbie Question about restarting Cassandra

2010-10-07 Thread Matthew Dennis
Yes. You probably shouldn't ever be using CL.ANY (though I'm certain there are others that disagree with me; I wish them the best of luck with that). CL.ONE + periodic sync can potentially lose recently written data, but if you care about that then you better care enough about your data to use so

Re: Retrieving dead node's token from system keyspace

2010-10-07 Thread Aaron Morton
Allan, I'm a bit confused about what you are trying to do here. You have 2 nodes with RF = ?, you lost one node completely and now you want to... Just get a cluster running again, don't worry about the data. OR Restore the data from the dead node. OR Create a cluster with the data from the remaining n

RE: Newbie Question about restarting Cassandra

2010-10-07 Thread David McIntosh
Are there any data loss concerns if you have the commit log sync set to periodic and are writing with CL One or Any? From: Matthew Dennis [mailto:mden...@riptano.com] Sent: Wednesday, October 06, 2010 8:53 PM To: user@cassandra.apache.org Subject: Re: Newbie Question about restarting Cassandra

Re: Tuning cassandra to use less memory

2010-10-07 Thread Matthew Dennis
+1 on disabling swap On Oct 7, 2010 3:27 PM, "Peter Schuller" wrote: >> The nodes are still swapping, even though the swappiness is set to zero >> right now. After swapping comes the OOM. > > In addition to what's already been said, consider just flat out > disabling swap completely, unless you ha

Re: Heap Settings suggestions

2010-10-07 Thread Matthew Dennis
Keep in mind that .7 and on will have per-CF settings for most things so there will be even more control over the tuning... On Oct 7, 2010 3:10 PM, "Peter Schuller" wrote: >> What if there is more than one keyspace in the system? Assuming each >> keyspace has the same number of column familie

Cassandra and EC2 performance testing

2010-10-07 Thread Corey Hulen
I recently posted a blog article about Cassandra and EC2 performance testing for small vs large, EBS vs ephemeral storage, compared to real HW with and without an SSD. Hope people find it interesting. http://www.coreyhulen.org/?p=326 Highlights: - The variance in test results from run to run

Re: Retrieving dead node's token from system keyspace

2010-10-07 Thread Allan Carroll
I was able to figure out how to use the sstable2json tool to get the values out of the system keyspace. Unfortunately, the node that went down took all of its data with it and I only have access to the system keyspace of the remaining live node. There were only two nodes and the one left should ha

Retrieving dead node's token from system keyspace

2010-10-07 Thread Allan Carroll
Hey all, I had a node go down that I'm not able to get a token for from nodetool ring. The wiki says: "You can obtain the dead node's token by running nodetool ring on any live node, unless there was some kind of outage, and the others came up but not the down one -- in that case, you can ret

Re: Dazed and confused with Cassandra on EC2 ...

2010-10-07 Thread Peter Schuller
>  There's some words on the 'Net that - the recent pages on >  Riptano's site in fact - that strongly encourage scaling left >  and right, rather than beefing up the boxes - and certainly >  we're seeing far less bother from GC using a much smaller >  heap - previously we'd been going up to 16GB,

Re: Tuning cassandra to use less memory

2010-10-07 Thread Peter Schuller
> The nodes are still swapping, even though the swappiness is set to zero > right now. After swapping comes the OOM. In addition to what's already been said, consider just flat out disabling swap completely, unless you have other things on the machine that cause swap to be significantly useful (i.

Re: Heap Settings suggestions

2010-10-07 Thread Peter Schuller
> What if there is more than one keyspace in the system? Assuming each > keyspace has the same number of column families, can I linearly scale the > above recommendation to the number of keyspaces in the system, i.e., if > "X" is the heap size for a single keyspace and there are "Y" keyspaces, I

Heap Settings suggestions

2010-10-07 Thread kannan chandrasekaran
From the Cassandra documentation @ riptano I see the following recommendation for the heap size setting: MemtableThroughputInMB * 3 * (number of ColumnFamilies) + 1G + (size of internal caches). What if there is more than one keyspace in the system? Assuming each keyspace has the same number of c
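
The rule of thumb quoted in this message can be worked through as plain arithmetic. The sketch below is only an illustration: the function name `estimate_heap_mb`, the sample values (64 MB memtable throughput, 5 column families, 256 MB of caches), and the reading of "1G" as 1024 MB are all assumptions, not settings taken from the thread. It does not answer the multi-keyspace question asked here; it only evaluates the formula for a given total number of column families.

```python
def estimate_heap_mb(memtable_throughput_mb, column_families, internal_caches_mb=0):
    """Evaluate the quoted rule of thumb, in megabytes.

    MemtableThroughputInMB * 3 * (number of ColumnFamilies) + 1G + (size of internal caches)
    "1G" is treated as 1024 MB here (an assumption).
    """
    memtable_part = memtable_throughput_mb * 3 * column_families
    return memtable_part + 1024 + internal_caches_mb


# Hypothetical inputs, chosen only to show the arithmetic.
heap_mb = estimate_heap_mb(memtable_throughput_mb=64, column_families=5, internal_caches_mb=256)
print(f"Estimated heap: {heap_mb} MB")
```

With these made-up inputs the memtable term is 64 * 3 * 5 = 960 MB, so the estimate comes to 960 + 1024 + 256 = 2240 MB.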

Re: Advice on settings

2010-10-07 Thread Dave Viner
Also, as a note related to EC2, choose whether you want to be in multiple availability zones. The highest performance possible is to be in a single AZ, as all those machines will have *very* high speed interconnects. But, individual AZs also can suffer outages. You can distribute your instances

Re: Advice on settings

2010-10-07 Thread B. Todd Burruss
if you are updating columns quite rapidly, you will scatter the columns over many sstables as you update them over time. this means that a read of a specific column will require looking at more sstables to find the data. performing a compaction (using nodetool) will merge the sstables into on

Re: Advice on settings

2010-10-07 Thread Peter Schuller
> However Ben Black suggests here that the cleanup will actually only > impact data deleted through the API: > > http://comments.gmane.org/gmane.comp.db.cassandra.user/4437 > > In this case, I guess that we need not worry too much about the > setting since we are actually updating, never deleting.

Re: Creating and using indices

2010-10-07 Thread Jonathan Ellis
On Thu, Oct 7, 2010 at 10:13 AM, Christian Decker wrote: > So basically my indices should work? Is there a simple way to check that, so > that we can exclude that? > > Are LTE working (or on the roadmap for the 0.7.0 release)? No, LT[E] is not on the roadmap for primary index clauses (GT[E] is, f

Re: Creating and using indices

2010-10-07 Thread Tyler Hobbs
Actually, you're trying to add an index to an already existing column family here, right? That's not yet supported, but should be soon. - Tyler On Thu, Oct 7, 2010 at 10:13 AM, Christian Decker < decker.christ...@gmail.com> wrote: > So basically my indices should work? Is there a simple way to

Re: Creating and using indices

2010-10-07 Thread Christian Decker
So basically my indices should work? Is there a simple way to check that, so that we can exclude that? Are LTE working (or on the roadmap for the 0.7.0 release)? Or does the EQ operator have to match anything or can I just add an EQ to a nonexistent value to get LTE working too? Regards, Chris O

Re: Creating and using indices

2010-10-07 Thread Petr Odut
From what I've tested, you must include at least one expression with the EQ operator. On Thu, Oct 7, 2010 at 3:45 PM, Matthew Dennis wrote: > If I remember correctly the only operator supported for secondary indexes > right now is EQ, not LTE (or the others). > > > On Thu, Oct 7, 2010 at 6:13 AM, Christian

Re: Creating secondary indices after startup

2010-10-07 Thread Jonathan Ellis
This is not in beta2 but will be in 0.7.0 (https://issues.apache.org/jira/browse/CASSANDRA-1532) On Thu, Oct 7, 2010 at 7:30 AM, wrote: > Hello, > > I am trying to work out the new secondary index code on my own, as there > is no documentation. I've seen the 'Cassandra explained' presentation an

Re: Creating and using indices

2010-10-07 Thread Matthew Dennis
If I remember correctly the only operator supported for secondary indexes right now is EQ, not LTE (or the others). On Thu, Oct 7, 2010 at 6:13 AM, Christian Decker wrote: > I'm currently trying to get started on secondary indices in Cassandra > 0.7.0svn, but without any luck so far. I have the

Creating and using indices

2010-10-07 Thread Christian Decker
I'm currently trying to get started on secondary indices in Cassandra 0.7.0svn, but without any luck so far. I have the following code that should create an index on ColA: KsDef ksDef = client.describe_keyspace("MyKeyspace"); > List cfs = ksDef.cf_defs; > String columnFamil

Advice on settings

2010-10-07 Thread Dave Gardner
Hi all, We're rolling out a Cassandra cluster on EC2 and I've got a couple of questions about settings. I'm interested to hear what other people have experienced with different values and generally seek advice. *gcgraceseconds* Currently we configure one setting for all CFs. We experimented with