On Sun, Mar 20, 2011 at 8:12 PM, Anurag Priyam <[email protected]> wrote:
[...]
> I am interested in implementing native sharding support in libdrizzle,
> and would like to take it up as a summer of code project. I want to

Here is an overview of what I am thinking.

*****

Overview
========

The idea behind sharding is to split data across multiple nodes in an order
preserving manner. Queries are therefore distributed across various nodes.

Mapping Queries to Shards
-------------------------

Use a vBucket[1][2] style partitioning scheme coupled with libhashkit[3] for a
pluggable hashing solution.

vBucket offers greater flexibility over ketama in partitioning data. It follows
a two stage mapping:
a. a hashing fucntion maps data to virtual buckets
b. virtual buckets statically map to nodes

This scheme lets one effectively manage hotspots, and easily add/remove nodes
over any cluster size.

Libhashkit lets one choose from a number of pre-defined hashing functions, or
define one of your own, which can be leveraged to control the spread of data
across virtual buckets.

Shard groups
---------------

For a vBucket we can have an entry in a vBucket map that is an array of indices
into a server list. Something like:

server_list = ['yeban.in:3306', 'yeban.in:3307', 'yeban.in:3308']
vbucket_map = [ [0, 1, 2], [1, 2, 0], [2, 1, 0] ]

In the above configuration, each server is capable of serving a vbucket.

vBucket has the concept of states. Typically (as used in libvbucket[4]), a
vBucket can have one of the following states:
a. Active  - This server is servicing all requests for this vbucket.
b. Dead    - This server is not in any way responsible for this vbucket
c. Replica - Handle replication but do not serve.
d. Pending - Do not serve.

Since, vBucket is the acutal partition entity, and more than one server can
handle a vBucket, we can have shard groups. The responsibility of a member
within a shard group can be controlled by modifying states.

Shard Group Topology
--------------------

Drizzle has a pluggable replication system, so that one can write a replication
scheme of varying complextity.

For eg:
a. The default slave plugin's one to one replication[5]
a. Joe Daly's Heirearchial Replication[6]
b. David Shrewsbury's Master Master Replication[7]

Let us slightly modify the default semantics of a vBucket state, so that, a
replica can server read requests.

It is then possible to setup various shard group topologies, ranging from simple
master-slave, to master-master, to multiple master - multiple slave, depending
on an applications requirements.

A simple master-slave scheme would be something like:
a. for a given vBucket, we have one active, and several replicas
b. write to the active server, which replicates to the replicas
c. read from the replicas

Handling Failure
----------------

In the event of the failure of an active node, we change the state of one of the
replicas to active.

Addition/Removal of New Servers
-------------------------------

On addition/removal of a server we need to migrate vBuckets from one server to
the other.

To add a new server:
a. map a vBucket to the new server (add it to a group)
b. mark the new one as dead
c. replicate to the new server
d. mark the new server active, and the old one dead, or replica

Implementation
==============

TODO

Testing
=======

TODO

Andrew was suggestive of providing access to Jenkins build slaves.

About me
========

TODO

Mentor
=====
Andrew, Brian, Stewart?

Links
=====
[1] http://techzone.couchbase.com/wiki/display/membase/vBuckets
[2] http://dustin.github.com/2010/06/29/memcached-vbuckets.html
[3] 
http://bazaar.launchpad.net/~libmemcached-developers/libmemcached/trunk/files/head:/libhashkit/
[4] https://github.com/membase/libvbucket
[5] http://docs.drizzle.org/plugins/slave/index.html
[6] http://www.8bitsofbytes.com/?p=28
[7] http://dshrewsbury.blogspot.com/2011/03/multi-master-support-in-drizzle.html

*****

-- 
Anurag Priyam
http://about.me/yeban/

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

Reply via email to