Hello Dmitry,

This is a follow-up to our discussion today.

I do think you can make a client-only clustering solution
with automatic rebalance when a node is added or deleted.

You need to use the standard technique with a consistent hash ring
and virtual nodes. You also need to maintain data redundancy.

The technique is described, for example, here:

www.christof-strauch.de/nosqldbs.pdf (in this paper you can also
find a link to the original paper).

In a nutshell, you create a consistent hash ring of, say, 10k
virtual servers.
Each virtual server is mapped to a physical server:
- if, say, there are 10 physical servers and the redundancy factor
  is 2, server 1 contains virtual servers 1-1000 plus copies
  of virtual servers 9001-10000, server 2 contains virtual
  servers 1001-2000 plus copies of virtual servers 8001-9000.
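
A minimal sketch of the two mappings (key to virtual server, virtual
server to physical server), assuming an even split of the ring into
contiguous ranges. The names virtual_server and primary_server, the
constants, and the use of MD5 are illustrative assumptions only:

```python
import hashlib

NUM_VIRTUAL = 10_000   # size of the hash ring (assumed, as in the example)
NUM_PHYSICAL = 10      # number of physical servers (assumed)


def virtual_server(key: str) -> int:
    """Hash a key onto the ring; returns a virtual server in 1..NUM_VIRTUAL."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_VIRTUAL + 1


def primary_server(vserver: int) -> int:
    """Map a virtual server to its physical owner using contiguous ranges."""
    per_server = NUM_VIRTUAL // NUM_PHYSICAL
    return (vserver - 1) // per_server + 1


if __name__ == "__main__":
    v = virtual_server("user:42")
    print("virtual server", v, "lives on physical server", primary_server(v))
```

Because the number of virtual servers is fixed, a key's virtual server
never changes; only the virtual-to-physical mapping does.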

Actually, which virtual servers are replicated on which physical
server can also be calculated using a hash function (essentially,
another hash ring).
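
One way to sketch this second ring: hash the physical server names
onto a ring and, for each virtual server, walk clockwise from its own
hash position, taking as many distinct servers as the redundancy
factor. The function name replica_servers, the server names, and the
choice of MD5 are assumptions made for illustration:

```python
import hashlib


def _h(text: str) -> int:
    """Stable 64-bit hash used for ring positions."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[:8], "big")


def replica_servers(vserver: int, servers: list[str], redundancy: int = 2) -> list[str]:
    """Return the physical servers holding a virtual server.

    The first entry is the primary, the rest hold the copies.
    """
    ring = sorted(servers, key=_h)            # physical servers in ring order
    if not ring:
        return []
    start = _h(f"vserver-{vserver}")          # position of the virtual server
    # First physical server at or after that position, wrapping to index 0.
    idx = next((i for i, s in enumerate(ring) if _h(s) >= start), 0)
    count = min(redundancy, len(ring))
    return [ring[(idx + k) % len(ring)] for k in range(count)]


if __name__ == "__main__":
    servers = [f"server{i}" for i in range(1, 11)]
    print(replica_servers(1234, servers))
```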

Then your client can have the following operations:

- add a physical server: this operation only adds the server
  to the mapping, i.e. assigns a bunch of virtual servers
  to it, and that's all. No data is moved at this point.

- remove a physical server: this operation removes the server
  from the consistent hash ring and redistributes the keys
  assigned to its virtual servers among adjacent virtual
  servers on the hash ring. (The keys can be looked up on the
  physical servers that contain replicas of the virtual servers
  which are gone.) This redistribution needs to happen
  to keep the redundancy factor.

- insert a key: this operation inserts the key into the correct
  virtual server, and thus into the physical server that holds it,
  and into its replicas.

- find a key: this tries to find the key on the server chosen
  according to the current state of the consistent hash ring.
  If the key is not found, it checks the replicas.
  If the key is not found on the replicas either, it looks for
  the key on the physical server which is responsible for the
  next range of virtual servers in the hash ring, and so on until
  the key is found.

  Here, of course, we can end up with a lot of checks for missing
  keys: but this can later on be optimized by keeping information
  about whether or not the configuration is in a degraded state,
  and only doing the extra checks on degraded virtual servers.
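
The four operations can be sketched as follows. This is only an
illustration under simplifying assumptions: physical servers are
modelled as in-memory dicts, the class name Client and its methods are
made up, and remove_server re-inserts every surviving key instead of
scanning only the affected virtual servers.

```python
import hashlib

NUM_VIRTUAL = 10_000   # virtual servers on the ring (assumed)
REDUNDANCY = 2         # redundancy factor (assumed)


def _h(text):
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[:8], "big")


class Client:
    def __init__(self):
        self.stores = {}   # server name -> dict, a stand-in for a real server

    def _servers_for(self, key):
        """All servers in lookup order: primary, replicas, then the rest."""
        ring = sorted(self.stores, key=_h)
        vserver = _h(key) % NUM_VIRTUAL
        start = _h(f"vserver-{vserver}")
        idx = next((i for i, s in enumerate(ring) if _h(s) >= start), 0)
        return [ring[(idx + k) % len(ring)] for k in range(len(ring))]

    def add_server(self, name):
        # Only the mapping changes; existing keys stay where they are.
        self.stores[name] = {}

    def insert(self, key, value):
        for name in self._servers_for(key)[:REDUNDANCY]:
            self.stores[name][key] = value

    def find(self, key):
        # Primary first, then replicas, then the next servers on the ring.
        for name in self._servers_for(key):
            if key in self.stores[name]:
                return self.stores[name][key]
        return None

    def remove_server(self, name):
        del self.stores[name]
        # Restore the redundancy factor from the surviving copies.
        survivors = [(k, v) for s in self.stores.values() for k, v in s.items()]
        for key, value in survivors:
            self.insert(key, value)


if __name__ == "__main__":
    c = Client()
    for n in ("s1", "s2", "s3"):
        c.add_server(n)
    c.insert("user:42", "Dmitry")
    c.add_server("s4")
    c.remove_server("s1")
    print(c.find("user:42"))
```

Note that find degrades to a walk over all servers for a missing key,
which is exactly the cost mentioned above.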

-- 

_______________________________________________
Mailing list: https://launchpad.net/~tarantool-developers
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~tarantool-developers
More help   : https://help.launchpad.net/ListHelp
