Hi Todd,

Entity Groups: https://issues.apache.org/jira/browse/CASSANDRA-1684

-Jake

On Wed, Nov 9, 2011 at 6:44 AM, Todd Burruss <bburr...@expedia.com> wrote:

> I believe I heard someone talk at the Cassandra SF conference about creating a
> partitioner that was a derivation of RandomPartitioner. It essentially
> would look for keys that adhere to a certain pattern, like <key>:<subkey>.
> The <key> portion would be used for determining the location on the ring,
> but <key>:<subkey> for actually storing. This would allow groups of data
> (all having the same <key>) to reside on the same node, while still
> maintaining uniqueness across the entire keyspace.
>
> Unbalanced nodes could still occur, but I don't think any worse than
> wide/large rows can cause.
>
>
> On 11/8/11 1:29 AM, "Daniel Doubleday" <daniel.double...@gmx.net> wrote:
>
> >Ah cool - thanks for the pointer!
> >
> >On Nov 7, 2011, at 5:25 PM, Ed Anuff wrote:
> >
> >> This is basically what entity groups are about -
> >> https://issues.apache.org/jira/browse/CASSANDRA-1684
> >>
> >> On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin <wool...@gmail.com> wrote:
> >>> This feature interests me, so I thought I'd add some comments.
> >>>
> >>> Having used partition features in existing databases like DB2, Oracle
> >>> and manual partitioning, one of the biggest challenges is keeping the
> >>> partitions balanced. What I've seen with manual partitioning is that
> >>> often the partitions get unbalanced. Usually the developers take a
> >>> best guess and hope it ends up balanced.
> >>>
> >>> Some of the approaches I've used in the past were zip code, area code,
> >>> state and some kind of hash.
> >>>
> >>> So my question related to deterministic sharding is this, "what rebalance
> >>> feature(s) would be useful or needed once the partitions get
> >>> unbalanced?"
> >>>
> >>> Without a decent plan for rebalancing, it often ends up being a very
> >>> painful problem to solve in production. Back when I worked on mobile
> >>> apps, we saw issues with how OpenWave WAP servers partitioned the
> >>> accounts.
> >>> The early versions randomly assigned a phone to a server
> >>> when it was provisioned the first time. Once the phone was associated
> >>> with that server, it was stuck on that server. If the load on that
> >>> server was heavier than the others, the only choice was to "scale up"
> >>> the hardware.
> >>>
> >>> My understanding of Cassandra's current sharding is consistent and
> >>> random. Does the new feature sit somewhere in between? Are you
> >>> thinking of a pluggable API so that you can provide your own hash
> >>> algorithm for Cassandra to use?
> >>>
> >>>
> >>>
> >>> On Mon, Nov 7, 2011 at 7:54 AM, Daniel Doubleday
> >>> <daniel.double...@gmx.net> wrote:
> >>>> Allow for deterministic / manual sharding of rows.
> >>>>
> >>>> Right now it seems that there is no way to force rows with different
> >>>> row keys to be stored on the same nodes in the ring.
> >>>> This is our number one reason why we get data inconsistencies when
> >>>> nodes fail.
> >>>>
> >>>> Sometimes a logical transaction requires writing rows with different
> >>>> row keys. If we could use something like this:
> >>>>
> >>>> prefix.uniquekey and let the partitioner use only the prefix, the
> >>>> probability that only part of the transaction would be written could
> >>>> be reduced considerably.
> >>>>
> >>>>
> >>>>
> >>>> On Nov 1, 2011, at 11:59 PM, Jonathan Ellis wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> Two years ago I asked for Cassandra use cases and feature requests.
> >>>>> [1] The results [2] have been extremely useful in setting and
> >>>>> prioritizing goals for Cassandra development. But with the release of
> >>>>> 1.0 we've accomplished basically everything from our original wish
> >>>>> list. [3]
> >>>>>
> >>>>> I'd love to hear from modern Cassandra users again, especially if
> >>>>> you're usually a quiet lurker. What does Cassandra do well? What are
> >>>>> your pain points? What's your feature wish list?
> >>>>>
> >>>>> As before, if you're in stealth mode or don't want to say anything in
> >>>>> public, feel free to reply to me privately and I will keep it off the
> >>>>> record.
> >>>>>
> >>>>> [1] http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
> >>>>> [2] http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
> >>>>> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
> >>>>>
> >>>>> --
> >>>>> Jonathan Ellis
> >>>>> Project Chair, Apache Cassandra
> >>>>> co-founder of DataStax, the source for professional Cassandra support
> >>>>> http://www.datastax.com

--
http://twitter.com/tjake
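
For readers who want to see the <key>:<subkey> idea from this thread in concrete form, here is a minimal sketch in Python. It is not a Cassandra partitioner and is not taken from CASSANDRA-1684. The names token_for and Ring, the ":" separator, and the four evenly spaced node tokens are all assumptions made for illustration. Only the <key> portion is hashed (with MD5, in the spirit of RandomPartitioner) to choose a position on the ring, while the full <key>:<subkey> string remains the row key.

```python
import bisect
import hashlib


def token_for(row_key: str, separator: str = ":") -> int:
    """Hash only the part of the row key before the first separator.

    Hypothetical helper: keys without a separator hash as a whole.
    """
    prefix = row_key.split(separator, 1)[0]
    digest = hashlib.md5(prefix.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")  # value in [0, 2**128)


class Ring:
    """Toy token ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens):
        # node_tokens maps a node name to its token on the ring.
        self._entries = sorted((tok, name) for name, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def node_for(self, row_key: str) -> str:
        # First node whose token is >= the key's token, wrapping around
        # to the first node when the key's token exceeds every node token.
        idx = bisect.bisect_left(self._tokens, token_for(row_key))
        return self._entries[idx % len(self._entries)][1]


# Four illustrative nodes spaced evenly over the 128-bit MD5 token space.
ring = Ring({f"node{i}": i * (2**128 // 4) for i in range(4)})

# Rows sharing the "user42" prefix land on the same node,
# yet their full row keys stay distinct.
same_group = ["user42:profile", "user42:settings", "user42:orders"]
owners = {ring.node_for(k) for k in same_group}
```

As Todd notes in the thread, a popular prefix concentrates all of its rows on one node, so this trades balance for locality in much the same way a wide row does.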