Hey Henrik,

> I noticed there are new blueprints for libdrizzle sharding functions
> on launchpad
> https://blueprints.launchpad.net/drizzle/+spec/libdrizzle-sharding-phase1
That blueprint is slightly dated :|. You will find a more recent
discussion of libdrizzle sharding support in the mailing list archives.
Don't consider that discussion complete in any regard; it is just a
direction. My proposal on Melange was a little more up to date, but it
is not showing the proposal now :(. Anyway, I will restart the
discussion soon.

> It mentions the ketama consistent sharding technique commonly used
> with memcached. I don't think (but I don't claim to know, so please
> comment) that this is at all applicable to Drizzle.

I will slightly disagree with the 'at all' point here.

[...]

> ...the point in using a consistent hashing technique is to minimize
> cache misses after re-shard. But Drizzle is not a cache. Let me

In the case of Drizzle, minimizing cache misses roughly translates to
minimizing the amount of data that needs to be re-sharded.

[...]

> servers this number decreases to 1/4, 1/5 and so on... The point with
> using consistent hashing is to minimize cache misses, so that after
> adding servers, approximately 2/3 (and then 3/4, 4/5, and so on...) of
> the keys will still map to the old server and find the old record. But
> for the remaining 1/n fraction, the records are still lost. It is a
> cache, this is acceptable behavior, and it is a very simple,
> no-maintenance solution.

A figure of 1/3 (of keys that no longer map to the same server) is a
little unsettling; 2/1000 or 3/1000 is more comforting. The more nodes
there are, the smaller the percentage of data you need to redistribute.
So I feel that sharding with ketama is still relevant to Drizzle. IMO,
the probabilistic distribution of data in a hash-based partitioning
scheme is not very attractive for a DBMS.

> Drizzle is a database. So we cannot just discard 1/n of the data after
> adding a server. The data has to actually be moved from old location
> to the new ones. Consistent hashing doesn't make it any easier to know
> which data should go where, in fact it makes it harder.

For ketama, one way that I can think of to move data from the old
server to the new one is to 'replay' the creation of rows that map to
the now-orphaned key space.

> 2) virtual buckets.

As you will see in the links [1][2] that I pointed out, virtual buckets
with some changes/additions is the current plan.

> - You start with 1 server. There can already be data in the database.
> - You divide your data into N virtual shards. N >> max number of
> shards you'll ever have. Say N=10k or more. (Ask Facebook :-)

With over 6 million people, Facebook has got to define some upper
limit :).

[...]

> In both 1 and 2 the act of moving data is complicated, it has to
> happen somehow "atomically":
> - clients use old shard layout
> - copy data to new shards
> - keep in sync old shards and new shards
> - start using new shard layout
> - discard moved data from old shards

Yup. I have not thought about the implementation details here, but
that would be the essence of it.

--
Anurag Priyam
http://about.me/yeban/
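PS: In case it helps make the discussion above concrete, a few rough,
untested Python sketches follow. First, the ketama point: the server
names (db1..db4), the 160 points per server, and md5 are placeholders I
made up, not what a real ketama client does exactly. The last few lines
estimate what fraction of rows would have to move when a fourth server
joins:

import bisect
import hashlib

class Ring(object):
    """Toy ketama-style consistent hash ring, for illustration only."""

    def __init__(self, servers, points_per_server=160):
        self._ring = {}     # point on the ring -> server name
        self._points = []   # the same points, kept sorted
        for server in servers:
            self.add(server, points_per_server)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, server, points_per_server=160):
        # Many points per server cut the key space into thin slices, so a
        # new server only steals thin slices from each existing server.
        for i in range(points_per_server):
            self._ring[self._hash("%s-%d" % (server, i))] = server
        self._points = sorted(self._ring)

    def server_for(self, key):
        # Walk clockwise from the key's hash to the next server point.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[self._points[idx]]

old = Ring(["db1", "db2", "db3"])
new = Ring(["db1", "db2", "db3", "db4"])
keys = ["row-%d" % i for i in range(10000)]
moved = sum(1 for k in keys if old.server_for(k) != new.server_for(k))
print("%.1f%% of rows move" % (100.0 * moved / len(keys)))
# roughly 25% when going from 3 to 4 servers; about 1/n in general

Run the same comparison with a few hundred servers and the moved
fraction falls towards the few-per-thousand range I mentioned above.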
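The virtual bucket idea, in the same spirit. NUM_BUCKETS, the server
names, and the plain dict standing in for whatever metadata store would
actually hold the bucket-to-server map are all made up; the point is
that the key-to-bucket function never changes, only the small map does,
so a re-shard touches exactly the buckets you choose to reassign:

import hashlib

NUM_BUCKETS = 10000   # N, picked much larger than we will ever have servers

def bucket_for(key):
    # A row's bucket never changes, no matter how many servers exist.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % NUM_BUCKETS

# A re-shard only rewrites this small map; in reality it would live in
# shared metadata that every client can read.
bucket_to_server = {b: ("db1" if b < NUM_BUCKETS // 2 else "db2")
                    for b in range(NUM_BUCKETS)}

def server_for(key):
    return bucket_to_server[bucket_for(key)]

# Adding db3: reassign some buckets, copy only those buckets' rows over,
# then flip their entries in the map.
for b in range(0, NUM_BUCKETS, 3):
    bucket_to_server[b] = "db3"

print(bucket_for("user:42"), server_for("user:42"))

That determinism is what makes this more attractive to me for a DBMS
than the probabilistic distribution you get from ketama.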
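And a toy, in-memory version of the "atomic" bucket move you listed,
with plain dicts standing in for servers so each step is visible.
move_bucket() and everything inside it is hypothetical; a real
implementation would need replication or double-writes to keep the old
and new shards in sync during the copy:

def move_bucket(bucket, src, dst, layout, servers):
    # 1. clients still use the old layout; bulk-copy a snapshot
    servers[dst][bucket] = set(servers[src].get(bucket, set()))
    # 2. keep old and new shards in sync until the switch (here just a
    #    second catch-up copy; really replication or double-writes)
    servers[dst][bucket] |= servers[src].get(bucket, set())
    # 3. start using the new shard layout
    layout[bucket] = dst
    # 4. discard the moved data from the old shard
    servers[src].pop(bucket, None)

servers = {"db1": {7: {"row-a", "row-b"}}, "db2": {}}
layout = {7: "db1"}
move_bucket(7, "db1", "db2", layout, servers)
print(layout)    # {7: 'db2'}
print(servers)   # {'db1': {}, 'db2': {7: {'row-a', 'row-b'}}}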

