Hello,

I’m new to Slony, but have been evaluating it to examine the
feasibility of splitting up a national database so that it can run on a
farm of simple servers.

Briefly, I'll describe the scheme and my tests so far, and then ask my
questions:

------------------------------------------------

THE SCHEME
----------

a) The entire country is
subdivided into a grid of cells (the size of which is to be decided;
let’s say 2 km square for now).

b) All the objects in the database are located by grid reference.

c) All queries are designed to return results which are geographically
close (let's say no more than 2 km away from a particular point). No
queries need ever return data further than one cell away.

d) Each cell maintains its own database.

e) A cell's database is the 'master' database for all data located
within the cell's boundaries.

f) The tables in each cell
are slaved to the adjoining neighbours (with Slony-I).

g) In order to provide cross border searches (to prevent the problem of
people living near the edge of a cell only seeing half the stuff
nearby), queries served by the cell use a union of the master tables
and the sets slaved from the adjoining cells.

h) Cells are distributed across a server farm. The number of cells on
each server depends upon the activity in each cell and the capability
of the server. The worst case scenario is that a single cell occupies
its own server. To start with, many cells (20 or so) may occupy a
single server, but they will be migrated to new servers as they become
busier.

What this means in practice
is best shown with a diagram:

----------------------------
|        |        |        |
| Cell A | Cell B | Cell C |
|        |        |        |
----------------------------
|        |        |        |
| Cell D | Cell E | Cell F |
|        |        |        |
----------------------------
|        |        |        |
| Cell G | Cell H | Cell I |
|        |        |        |
----------------------------

Considering Cell E:

a) Cell E's database is the
master database for information located geographically within cell E.

b) The 8 adjoining cells slave this data.

c) Cell E slaves data from all 8 adjoining cells.

---------------------------------------------------------

MY TESTS SO FAR
---------------

So far, I've manually set up
a test case with just the top row of the example above (3 cells, A, B
and C, in a row).

It works. Add something to A, it appears in B. Add something to B, it
appears in A and C. Add something to C, it appears in B.

So far, so good.

---------------------------------------------------------

MY CONCERNS
-----------

1. For cell E, the number of
slon threads would be 18 (it seems to be two for each node). Is this
within acceptable parameters, or is it a really bad idea? What are the
system overheads? In my example of a single server running 20 (not very
active) cells, this would inflate to 360 processes.

2. In order to provide
'cross border' local searches, all 'cross border data' is effectively
slaved 8 times. Should I be concerned by this?

The aim is that both the web serving and the database for a particular
cell are managed by the same server, and that because the cells are
small, the local data will easily fit within RAM (allowing for Apache
and other services), even with the local slaved copies of adjacent
cells' data.

3. Has this been tried
before with disastrous consequences?!

Many thanks,

Andy Ballingall
_______________________________________________
Slony1-general mailing list
http://gborg.postgresql.org/mailman/listinfo/slony1-general
