Hey Henrik,

> I noticed there are new blueprints for libdrizzle sharding functions
> on launchpad
> https://blueprints.launchpad.net/drizzle/+spec/libdrizzle-sharding-phase1

That blueprint is slightly dated :|. You will find a more recent
discussion on libdrizzle sharding support in the mailing list
archives. Don't consider that discussion complete in any regard; it is
just a direction. My proposal on Melange was a little more up to date,
but it's not showing the proposal now :(. Anyway, I will restart the
discussion soon.

> It mentions the ketama consistent sharding technique commonly used
> with memcached. I don't think (but I don't claim to know, so please
> comment) that this is at all applicable to Drizzle.

I will slightly disagree with the 'at all' point here.

[...]

> ...the point in using a consistent hashing technique is to minimize
> cache misses after re-shard. But Drizzle is not a cache. Let me

In the case of Drizzle, minimizing cache misses roughly translates to
minimizing the amount of data that needs to be re-sharded.

[...]

> servers this number decreases to 1/4, 1/5 and so on... The point with
> using consistent hashing is to minimize cache misses, so that after
> adding servers, approximately 2/3 (and then 3/4, 4/5, and so on...) of
> the keys will still map to the old server and find the old record. But
> for the remaining 1/n fraction, the records are still lost. It is a
> cache, this is acceptable behavior, and it is a very simple,
> no-maintenance solution.

A figure of 1/3 (of keys that do not map to the same server) is a
little unsettling; 2/1000 or 3/1000 is more comforting. The larger the
number of nodes, the smaller the percentage of data you need to
redistribute.

So I feel that sharding with ketama is still relevant to Drizzle. IMO,
the probabilistic distribution of data in a hash-based partitioning
scheme is not very attractive for a DBMS.
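
To make the intuition concrete, here is a toy ketama-style ring in C++.
It is only a sketch: std::hash stands in for the md5-based hash real
ketama uses, and the class name and 160 points per server are purely
illustrative, not anything from libdrizzle. Each server is hashed to
many points on a ring, a key belongs to the first server point at or
after the key's hash, and a newly added server only claims the ranges
just before its own points, which is why only roughly 1/(n+1) of the
data moves.

#include <functional>
#include <map>
#include <string>

// Toy ketama-style ring (illustrative only; std::hash stands in for the
// md5-based hash real ketama uses). Each server gets many points on the
// ring; a key belongs to the first server point at or after its hash.
class HashRing {
 public:
  explicit HashRing(unsigned points_per_server = 160)
      : points_per_server_(points_per_server) {}

  void add_server(const std::string& name) {
    for (unsigned i = 0; i < points_per_server_; ++i)
      ring_[hash_(name + "#" + std::to_string(i))] = name;
  }

  void remove_server(const std::string& name) {
    for (unsigned i = 0; i < points_per_server_; ++i)
      ring_.erase(hash_(name + "#" + std::to_string(i)));
  }

  const std::string& server_for(const std::string& key) const {
    auto it = ring_.lower_bound(hash_(key));    // first point >= hash(key)
    if (it == ring_.end()) it = ring_.begin();  // wrap around the ring
    return it->second;
  }

 private:
  unsigned points_per_server_;
  std::map<std::size_t, std::string> ring_;  // ring position -> server name
  std::hash<std::string> hash_;
};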

> Drizzle is a database. So we cannot just discard 1/n of the data after
> adding a server. The data has to actually be moved from old location
> to the new ones. Consistent hashing doesn't make it any easier to know
> which data should go where, in fact it makes it harder.

For ketama, one way that I can think of to move data from the old
server to the new one is to 'replay' the creation of rows that map to
the now-orphaned key space.
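
A rough sketch of that, building on the toy HashRing above
(copy_row()/delete_row() are made-up placeholders for whatever
mechanism would actually move a row): compare where each key mapped
before and after the ring change, and replay only the rows whose owner
changed.

#include <string>
#include <vector>

// Hypothetical re-shard pass: only rows whose key now maps to a different
// server are touched; for a ring of n servers growing to n+1, that is
// roughly 1/(n+1) of the rows.
void rebalance(const HashRing& old_ring, const HashRing& new_ring,
               const std::vector<std::string>& row_keys) {
  for (const std::string& key : row_keys) {
    const std::string& src = old_ring.server_for(key);
    const std::string& dst = new_ring.server_for(key);
    if (src != dst) {
      // copy_row(src, dst, key);  // placeholder: copy the row to its new home
      // delete_row(src, key);     // placeholder: then drop it from the old one
    }
  }
}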

> 2) virtual buckets.

As you will see in the links[1][2] that I pointed out, virtual buckets
with some changes/additions are the current plan.

>  - You start with 1 server. There can already be data in the database.
>  - You divide your data into N virtual shards. N >> max number of
> shards you'll ever have. Say N=10k or more. (Ask Facebook :-)

With over 6 million people, Facebook has got to define some upper limit :).
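
For reference, here is a minimal sketch of what that layout could look
like; all the names and the 10k figure are placeholders, not anything
decided for Drizzle. The point is that the key-to-bucket mapping is
fixed forever, and only the small bucket-to-server table changes when
servers are added, so re-sharding means moving whole buckets.

#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical virtual-bucket layout. N is fixed up front and never
// changes; adding a server just reassigns some buckets to it in
// bucket_to_server and copies those buckets' rows over.
constexpr std::size_t kNumBuckets = 10000;  // N >> max number of servers

struct BucketMap {
  std::vector<std::string> bucket_to_server;  // size kNumBuckets

  static std::size_t bucket_for(const std::string& key) {
    return std::hash<std::string>{}(key) % kNumBuckets;
  }

  const std::string& server_for(const std::string& key) const {
    return bucket_to_server[bucket_for(key)];
  }
};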

[...]

> In both 1 and 2 the act of moving data is complicated, it has to
> happen somehow "atomically":
>  - clients use old shard layout
>  - copy data to new shards
>  - keep in sync old shards and new shards
>  - start using new shard layout
>  - discard moved data from old shards

Yup. I have not thought about the implementation details here, but
that would be the essence of it.
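
If it helps, the sequence you list sketches out roughly like this;
every call below is an invented placeholder, nothing like it exists in
libdrizzle today.

#include <cstddef>
#include <string>

// Rough outline of moving one virtual bucket, following the steps quoted
// above. All four helpers are hypothetical placeholders.
void move_bucket(std::size_t bucket,
                 const std::string& src, const std::string& dst) {
  // 1. Clients keep reading/writing through the old layout while we copy.
  // copy_bucket(src, dst, bucket);
  // 2. Keep the copy in sync with ongoing writes (e.g. tail src's changes).
  // sync_bucket(src, dst, bucket);
  // 3. Atomically publish the new layout so clients switch to dst.
  // publish_layout(bucket, dst);
  // 4. Once no one reads the old copy any more, drop it.
  // drop_bucket(src, bucket);
}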

-- 
Anurag Priyam
http://about.me/yeban/

