Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Peter Schuller
Point of clarification: My use of the term "bucket" is completely unrelated to the term "bucket" used in the CRUSH paper.
-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Peter Schuller
> *The summary is*: we believe virtual nodes are the way forward. We would
> like to add virtual nodes to Cassandra and we are asking for comments,
> criticism and collaboration!
I am very happy to see some momentum on this, and I would like to go even further than what you propose. The main reaso

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Eric Evans
On Sat, Mar 17, 2012 at 3:22 PM, Zhu Han wrote:
> On Sat, Mar 17, 2012 at 7:38 AM, Sam Overton wrote:
>> This is a long email. It concerns a significant change to Cassandra, so
>> deserves a thorough introduction.
>>
>> *The summary is*: we believe virtual nodes are the way forward. We would
>> l

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Edward Capriolo
I agree having smaller regions would help the rebalancing situation both with RP and BOP. However, I am not sure if dividing tables across disks will give any better performance. You will have more seeking spindles and can possibly subdivide token ranges into separate files. But fs cache will ge

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Eric Evans
On Sat, Mar 17, 2012 at 11:15 AM, Radim Kolar wrote:
> I don't like that every node will have same portion of data.
>
> 1. We are using nodes with different HW sizes (number of disks)
> 2. especially with ordered partitioner there tends to be hotspots and you
> must assign smaller portion of data

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Zhu Han
On Sat, Mar 17, 2012 at 7:38 AM, Sam Overton wrote:
> Hello cassandra-dev,
>
> This is a long email. It concerns a significant change to Cassandra, so
> deserves a thorough introduction.
>
> *The summary is*: we believe virtual nodes are the way forward. We would
> like to add virtual nodes to Ca

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Sam Overton
On 17 March 2012 11:15, Radim Kolar wrote:
> I don't like that every node will have same portion of data.
>
> 1. We are using nodes with different HW sizes (number of disks)
> 2. especially with ordered partitioner there tends to be hotspots and you
> must assign smaller portion of data to nodes

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Radim Kolar
I don't like that every node will have the same portion of data.
1. We are using nodes with different HW sizes (number of disks).
2. Especially with the ordered partitioner there tend to be hotspots, and you must assign a smaller portion of data to nodes holding hotspots.
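
To make the concern about unequal hardware concrete, here is a minimal sketch, assuming a hypothetical hash ring in which each host is given a configurable number of tokens. It is not taken from the proposal: the names `token`, `ownership` and `RING_SIZE`, the host names, and the use of MD5 are all illustrative assumptions. It only shows that, on such a ring, a host holding more tokens tends to own a proportionally larger share of the key space.

```python
import hashlib

RING_SIZE = 2 ** 32  # hypothetical token space, chosen only for this sketch


def token(label: str) -> int:
    """Map a label to a ring position (MD5 is used purely as a stable hash)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:4], "big")


def ownership(tokens_per_host: dict) -> dict:
    """Return the fraction of the ring owned by each host.

    Each token owns the range from the previous token (exclusive) up to itself.
    """
    ring = sorted(
        (token(f"{host}#{i}"), host)
        for host, count in tokens_per_host.items()
        for i in range(count)
    )
    owned = {host: 0 for host in tokens_per_host}
    previous = ring[-1][0] - RING_SIZE  # wrap around from the last token
    for position, host in ring:
        owned[host] += position - previous
        previous = position
    return {host: size / RING_SIZE for host, size in owned.items()}


# Hypothetical cluster: one host with twice the token count of the others.
shares = ownership({"small-1": 256, "small-2": 256, "big": 512})
for host, share in sorted(shares.items()):
    print(f"{host}: {share:.3f}")
```

With equal token counts the shares come out roughly equal, which is the situation the objection describes. Whether per-host token counts would be adjustable in the proposed design is not something this sketch can answer.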