Re: [GENERAL] Partitioning Vs. Split Databases - performance?

Benjamin Smith Thu, 21 Dec 2006 14:07:46 -0800

On Thursday 21 December 2006 11:47, Ron Johnson wrote:
> This gives you linear growth potential, since if your current box
> gets over-utilized, buy a 2nd box and move some of the databases to it.


So far, I'm inclined to go this way, due to the option for linear scaling. 

> >> 2) Copy out the data specific to a customer and load into separate tables
> >> (with slightly different names, EG table "dates" becomes "cust1_dates")
> >> and 
> >> use data partitioning to help with performance as needed.
> 
> Definitely *not* scalable.  And *very* messy.  Yech.

Scales better than present, methinks, but still not the best idea. I'd have to 
revisit all my queries to make sure that they use the correct tablename. 

> > 3) Put each customer in their own schema/namespace which resides within
> > its own table space.
> >
> > Then you can move customers wherever you need in terms of IO.

How is that functionally different than using a separate database? What's the 
advantage here? I don't *need* to restrict myself to one database, and doing 
this does require that I revisit 100% of the SQL queries to make sure that 
I'm referencing the right schema. 

This solution seems to have the same problems as using dynamic tablenames. 

> Splitting like mentioned in these three tactics means that you've
> now got 2x as many tables.  Add more customers and you've got that
> many more tables.  Perfect candidate for "schema drift".
> 
> If each table has cust_id in it, then you could:
> 
> 4) retain 1 database and partition each table on cust_id.
> 
> http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html

Intriguing idea, and one that I might do in the future. However, I have 170 
normalized tables defined at present. The task of auditing each of these 
tables for the constraint ranges sounds somewhat nightmarish. Not all the 
tables have the cust_id record defined - some are implicit. 

> >> Given the same physical hardware, which one is likely to perform better? 
Does
> >> it make any difference? Does using separate databases use more RAM than a
> >> single database with a bunch of different tables?
> 
> Config files are global, so I doubt it.
> 
> >> Company is growing rapidly, so growth room is important...
> 
> Then go for Option 1.

My conclusion, too. Another poster mentioned schema drift, and that's a real 
concern, but we're already updating the schema through a script which could 
be extended to update all databases, not just one, so I feel this problem 
would be minor to non-existent. 

Thanks! 

-Ben 
-- 
"The best way to predict the future is to invent it."
- XEROX PARC slogan, circa 1978

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: [GENERAL] Partitioning Vs. Split Databases - performance?

Reply via email to