Re: Ring management and load balance

2010-03-29 Thread Jonathan Ellis
On Fri, Mar 26, 2010 at 4:35 PM, Mike Malone wrote:
> With the random partitioner there's no need to suggest a token. The key
> space is statistically random so you should be able to just split 2^128 into
> equal sized segments and get fairly equal storage load. Your read / write
> load could get
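A minimal sketch of the even-split idea above: divide the key space the thread cites (2^128) into equal segments, one token per node. The node count below is illustrative, not from the thread.

```python
# Evenly spaced tokens for a random partitioner: split the key space
# into equal segments so each node owns an equal share of the ring.

def balanced_tokens(num_nodes, key_space=2**128):
    """Return one evenly spaced token per node."""
    return [i * key_space // num_nodes for i in range(num_nodes)]

for token in balanced_tokens(6):  # e.g. a six-node ring
    print(token)
```

Because keys hash to statistically random positions, equal token spacing should give roughly equal storage load, which is Mike's point.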

Re: Ring management and load balance

2010-03-27 Thread Roland Hänel
Mike, If you have the assumption that your rows are roughly equal in size (at least statistically), then you could also just take a node's total load (this is exposed via JMX) and divide by the number of keys/rows on that node. Not sure how to get the latter, but shouldn't be such a big deal to int
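A quick sketch of Roland's estimate, with hypothetical figures standing in for the JMX-reported load and the (harder to obtain) row count:

```python
# Average row size ~= node's total on-disk load / number of rows on the node.
# Both numbers below are made up; in practice the load comes from JMX.

total_load_bytes = 120 * 1024**3   # hypothetical: 120 GiB reported load
row_count = 50_000_000             # hypothetical: rows stored on the node

avg_row_size = total_load_bytes / row_count
print(f"~{avg_row_size:.0f} bytes per row")   # ~2577 bytes per row
```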

Re: Ring management and load balance

2010-03-27 Thread Roland Hänel
Sorry for the last mail, hit the wrong button. This JMX property gives per-CF granularity, right? I think it doesn't solve the problem completely here, because the problem of key load-balancing effectively demands per-key granularity. But this could help with statistical sampling. Roland
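One way the statistical sampling Roland mentions could work, sketched with made-up data: sample (token, row size) pairs and propose a split at the weighted median, so each side carries about half the bytes. Nothing below beyond the idea itself is from the thread.

```python
import random

# Hypothetical samples of (token, row size in bytes) drawn from the ring.
random.seed(1)
samples = sorted((random.randrange(2**128), random.randrange(1, 10_000))
                 for _ in range(1000))

# Walk the samples in token order until half the sampled bytes are behind us.
total = sum(size for _, size in samples)
running = 0
for token, size in samples:
    running += size
    if running >= total / 2:
        print(f"proposed split token: {token}")
        break
```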

Re: Ring management and load balance

2010-03-26 Thread Roland Hänel
But this

On 26.03.2010 22:29, "Rob Coli" wrote:
> On 3/26/10 1:36 PM, Roland Hänel wrote:
>> If I was going to write such a tool: do you think the th...
> The JMX interface exposes an Attribute which seems appropriate to this use. It is called "TotalDiskSpaceUsed," and is available on a per-column

Re: Ring management and load balance

2010-03-26 Thread Mike Malone
2010/3/26 Roland Hänel
> Jonathan,
>
> I agree with your idea about a tool that could 'propose' good token choices
> for optimal load-balancing.
>
> If I was going to write such a tool: do you think the thrift API provides
> the necessary information? I think with the RandomPartitioner you cannot

Re: Ring management and load balance

2010-03-26 Thread Rob Coli
On 3/26/10 1:36 PM, Roland Hänel wrote:
> If I was going to write such a tool: do you think the thrift API provides the necessary information? I think with the RandomPartitioner you cannot scan all your rows to actually find out how big certain ranges of rows are. And even with the OPP (that is the

Re: Ring management and load balance

2010-03-26 Thread Roland Hänel
Jonathan, I agree with your idea about a tool that could 'propose' good token choices for optimal load-balancing. If I was going to write such a tool: do you think the thrift API provides the necessary information? I think with the RandomPartitioner you cannot scan all your rows to actually find

RE: Ring management and load balance

2010-03-25 Thread Stu Hood
From: "Daniel Kluesing"
Sent: Thursday, March 25, 2010 6:59pm
To: "user@cassandra.apache.org"
Subject: RE: Ring management and load balance

I agree it's only a problem with 'small' clusters - but it seems like 'small' is 'most users'? Even with 10 nodes it looks like

RE: Ring management and load balance

2010-03-25 Thread Daniel Kluesing
Have folks not had trouble with incremental scalability?

-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Thursday, March 25, 2010 2:27 PM
To: user@cassandra.apache.org
Subject: Re: Ring management and load balance

One problem is if the heaviest node is next to a node

Re: Ring management and load balance

2010-03-25 Thread Jonathan Ellis
One problem is if the heaviest node is next to a node that's lighter than average, instead of heavier. Then if the new node takes extra from the heaviest, say 75% instead of just 1/2, and then we take 1/2 of the heaviest's neighbor and put it on the heaviest, you made that lighter-than-average
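A small worked example of the pathology Jonathan describes, with made-up loads:

```python
# B is the heaviest node; its neighbor C is lighter than average.
loads = {"A": 40, "B": 100, "C": 20}

# The new node N takes 75% of B's range instead of half...
loads["N"] = loads["B"] * 0.75
loads["B"] *= 0.25
# ...then B takes half of its neighbor C's range to compensate.
loads["B"] += loads["C"] / 2
loads["C"] /= 2

print(loads)  # {'A': 40, 'B': 35.0, 'C': 10.0, 'N': 75.0}
```

The lighter-than-average neighbor C ends up lighter still, which is the problem being raised.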

Re: Ring management and load balance

2010-03-25 Thread Jeremy Dunck
On Thu, Mar 25, 2010 at 1:26 PM, Jonathan Ellis wrote:
> Pretty much everything assumes that there is a 1:1 correspondence
> between IP and Token.  It's probably in the ballpark of "one month to
> code, two to get the bugs out."  Gossip is one of the trickier parts
> of our code base, and this wou

Re: Ring management and load balance

2010-03-25 Thread Jonathan Ellis
On Thu, Mar 25, 2010 at 1:17 PM, Mike Malone wrote:
> On Thu, Mar 25, 2010 at 9:56 AM, Jonathan Ellis wrote:
>>
>> The advantage to doing it the way Cassandra does is that you can keep
>> keys sorted with OrderPreservingPartitioner for range scans.  grabbing
>> one token of many from each node in

Re: Ring management and load balance

2010-03-25 Thread Mike Malone
On Thu, Mar 25, 2010 at 9:56 AM, Jonathan Ellis wrote:
> The advantage to doing it the way Cassandra does is that you can keep
> keys sorted with OrderPreservingPartitioner for range scans. grabbing
> one token of many from each node in the ring would prohibit that.
>
> So we rely on active load

Re: Ring management and load balance

2010-03-25 Thread Jonathan Ellis
On Thu, Mar 25, 2010 at 11:40 AM, Jeremy Dunck wrote:
> On Thu, Mar 25, 2010 at 10:56 AM, Jonathan Ellis wrote:
>> The advantage to doing it the way Cassandra does is that you can keep
>> keys sorted with OrderPreservingPartitioner for range scans.  grabbing
>> one token of many from each node in

Re: Ring management and load balance

2010-03-25 Thread Jeremy Dunck
On Thu, Mar 25, 2010 at 10:56 AM, Jonathan Ellis wrote:
> The advantage to doing it the way Cassandra does is that you can keep
> keys sorted with OrderPreservingPartitioner for range scans.  grabbing
> one token of many from each node in the ring would prohibit that.
>
> So we rely on active load

Re: Ring management and load balance

2010-03-25 Thread Jonathan Ellis
The advantage to doing it the way Cassandra does is that you can keep keys sorted with OrderPreservingPartitioner for range scans. Grabbing one token of many from each node in the ring would prohibit that.

So we rely on active load balancing to get to a "good enough" balance, say within 50%. It
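To see why per-node multi-token assignment conflicts with sorted range scans, here is a toy ring (tokens are made up): with one token per node each node owns a single contiguous arc, while with several tokens per node the arcs interleave.

```python
def owners_in_ring_order(assignment):
    """List the ring's segment owners in token order."""
    ring = sorted((t, node) for node, tokens in assignment.items()
                  for t in tokens)
    return [node for _, node in ring]

one_token = {"A": [10], "B": [50], "C": [90]}
many_tokens = {"A": [10, 40, 70], "B": [20, 50, 80], "C": [30, 60, 90]}

print(owners_in_ring_order(one_token))    # ['A', 'B', 'C']: contiguous arcs
print(owners_in_ring_order(many_tokens))  # ['A', 'B', 'C', 'A', 'B', 'C', ...]: interleaved
```

With interleaved arcs, a sorted key range no longer maps to one node's contiguous slice, which is the objection to grabbing one token of many from each node.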

Ring management and load balance

2010-03-25 Thread Daniel Kluesing
I wanted to check my understanding of the load balance operation. Let's say I have 5 nodes, each assigned 1/5 of the ring at startup, and the load is equal across them (say using the random partitioner). The load on the cluster gets high, so I add a sixth server. During bootstrap, t
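A sketch of the bootstrap behavior the thread goes on to discuss: the new node picks a token that bisects the range of the most heavily loaded node. Tokens and loads below are hypothetical, and the ring wrap-around case is ignored for brevity.

```python
tokens = {"n1": 0, "n2": 20, "n3": 40, "n4": 60, "n5": 80}  # ring positions
loads = {"n1": 10, "n2": 30, "n3": 12, "n4": 11, "n5": 9}   # e.g. GiB used

ring = sorted(tokens.values())
heaviest = max(loads, key=loads.get)

# A node owns the range (predecessor's token, its own token].
end = tokens[heaviest]
pred = max((t for t in ring if t < end), default=ring[-1])
new_token = (pred + end) // 2

print(f"bootstrap token {new_token} splits {heaviest}'s range ({pred}, {end}]")
```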