Re: ideal cluster size

2012-01-23 Thread aaron morton
I second Peter's point: big servers are not always the best. My experience (using spinning disks) is that 200 to 300 GB of live data load per node (including replicated data) is a sweet spot. Above this, the time taken for compaction, repair, off-node backups, node moves, etc. starts to be a pain.
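As a rough illustration of what that rule of thumb implies for cluster size, here is a minimal sketch. The function name `nodes_needed`, the 250 GB default (the midpoint of the 200 to 300 GB range above), and the assumption that load spreads evenly across nodes are all illustrative choices, not anything Cassandra itself provides.

```python
import math


def nodes_needed(unique_data_gb: float,
                 replication_factor: int,
                 target_load_gb: float = 250.0) -> int:
    """Estimate node count from a per-node live-data target.

    Hypothetical helper: assumes data is spread evenly across nodes and
    that target_load_gb already includes replicated data, as in the
    200-300 GB rule of thumb quoted above.
    """
    if unique_data_gb < 0 or replication_factor < 1 or target_load_gb <= 0:
        raise ValueError("sizes must be non-negative/positive and RF >= 1")
    # Total on-disk load is the unique data multiplied by the replication factor.
    total_load_gb = unique_data_gb * replication_factor
    by_load = math.ceil(total_load_gb / target_load_gb)
    # A cluster needs at least RF nodes to hold RF distinct replicas.
    return max(replication_factor, by_load)


if __name__ == "__main__":
    # 10 TB of unique data at RF=3 with a 250 GB per-node target.
    print(nodes_needed(10_000, 3))
```

With these assumed numbers, 10 TB of unique data at RF=3 lands in the same "order of 100 nodes" range discussed in the thread; lowering the per-node target increases the count proportionally.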

Re: ideal cluster size

2012-01-21 Thread Thorsten von Eicken
Good point. One thing I'm wondering about Cassandra is what happens when there is a massive failure, for example if 1/3 of the nodes go down or become unreachable. This could happen in EC2 if an AZ has a failure, or in a datacenter if a whole rack or UPS goes dark. I'm not so concerned about the t

Re: ideal cluster size

2012-01-21 Thread Peter Schuller
> Thanks for the responses! We'll definitely go for powerful servers to
> reduce the total count. Beyond a dozen servers there really doesn't seem
> to be much point in trying to increase count anymore for

Just be aware that if "big" servers imply *lots* of data (especially in relation to memory s

Re: ideal cluster size

2012-01-21 Thread Thorsten von Eicken
> We're embarking on a project where we estimate we will need on the order
> of 100 cassandra nodes. The data set is perfectly partitionable, meaning
> we have no queries that need to have access to all the data at once. We

Re: ideal cluster size

2012-01-21 Thread Eric Czech
>> odes. The data set is perfectly partitionable, meaning
>> we have no queries that need to have access to all the data at once. We
>> expect to run with RF=2 or =3. Is there some notion of ideal cluster
>> size? Or perhaps asked differently, would it be easier to run one large

Re: ideal cluster size

2012-01-20 Thread Maxim Potekhin
Is there some notion of ideal cluster size? Or perhaps asked differently, would it be easier to run one large cluster or would it be easier to run a bunch of, say, 16 node clusters? Everything we've done to date has fit into 4-5 node clusters.

Re: ideal cluster size

2012-01-19 Thread Peter Schuller
> We're embarking on a project where we estimate we will need on the order
> of 100 cassandra nodes. The data set is perfectly partitionable, meaning
> we have no queries that need to have access to all the data at once. We
> expect to run with RF=2 or =3. Is there some notion

ideal cluster size

2012-01-19 Thread Thorsten von Eicken
We're embarking on a project where we estimate we will need on the order of 100 cassandra nodes. The data set is perfectly partitionable, meaning we have no queries that need to have access to all the data at once. We expect to run with RF=2 or =3. Is there some notion of ideal cluster siz