Hi I noticed there are new blueprints for libdrizzle sharding functions on launchpad https://blueprints.launchpad.net/drizzle/+spec/libdrizzle-sharding-phase1
It mentions the ketama consistent sharding technique commonly used with memcached. I don't think (but I don't claim to know, so please comment) that this is at all applicable to Drizzle. As explained in http://www.mikeperham.com/2009/01/14/consistent-hashing-in-memcache-client/ ...the point in using a consistent hashing technique is to minimize cache misses after re-shard. But Drizzle is not a cache. Let me explain: In memcached, and using naive sharding [key % num_servers], when you have 2 servers and add a 3rd, it means that the sharding key changes for all your records/objects, and all data is lost. Well, in fact 1/3 of the keys will still map to the old server, and as you add more servers this number decreases to 1/4, 1/5 and so on... The point with using consistent hashing is to minimize cache misses, so that after adding servers, approximately 2/3 (and then 3/4, 4/5, and so on...) of the keys will still map to the old server and find the old record. But for the remaining 1/n fraction, the records are still lost. It is a cache, this is acceptable behavior., and it is very simple, no-maintenance solution. Drizzle is a database. So we cannot just discard 1/n of the data after adding a server. The data has to actually be moved from old location to the new ones. Consistent hashing doesn't make it any easier to know which data should go where, in fact it makes it harder. Imho the solutions available for Drizzle are: 1) back to naive approach (which I believe the stock memcached client uses). When scaling out, you just move the data record, by record into their correct shards. I think MySQL Cluster does something similar. The main drawback here is that if the sharding key doesn't result in an even spread of data, some shards are full and some empty and you cannot do much about it. 2) virtual buckets. - You start with 1 server. There can already be data in the database. - You divide your data into N virtual shards. N >> max number of shards you'll ever have. Say N=10k or more. (Ask Facebook :-) - At this point all data is still physically in one and the same InnoDB table. - When scaling out, you add more servers and move virtual buckets one by one to new servers. - You need to maintain a mapping from virtual bucket to actual Drizzle instances. All clients and servers need to agree on this. This could be a system table on one or all Drizzle instances. - you can move virtual buckets back and forth to re-balance your data. In both 1 and 2 the act of moving data is complicated, it has to happen somehow "atomically": - clients use old shard layout - copy data to new shards - keep in sync old shards and new shards - start using new shard layout - discard moved data from old shards henrik -- [email protected] +358-40-8211286 skype: henrik.ingo irc: hingo www.openlife.cc My LinkedIn profile: http://www.linkedin.com/profile/view?id=9522559 _______________________________________________ Mailing list: https://launchpad.net/~drizzle-discuss Post to : [email protected] Unsubscribe : https://launchpad.net/~drizzle-discuss More help : https://help.launchpad.net/ListHelp

