On Fri, 2007-05-25 at 20:16 +0200, Arnau wrote:
> The point I'm worried about is performance.  Do you think performance would
> be better executing exactly the same queries, only adding an extra column
> (e.g. customer_id) to all the tables, than opening a connection to that one
> customer's DB and executing the query there?

There is no simple answer to this question; it depends too much on your data.  
In many cases, adding a customer_id to every table, and perhaps also 
per-customer views (per Jeff's suggestion), can work really well.
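
To make that concrete, here's a rough Python/psycopg2 sketch of the
shared-database design; the table, column, and connection names (shared_db,
orders, customer_id) are all invented for illustration:

    import psycopg2

    conn = psycopg2.connect(dbname="shared_db", user="admin")
    cur = conn.cursor()

    # Every table carries a customer_id column...
    cur.execute("""
        CREATE TABLE orders (
            order_id    serial PRIMARY KEY,
            customer_id integer NOT NULL,
            total       numeric
        )
    """)

    # ...and each customer can get a view restricted to their own rows.
    cur.execute("""
        CREATE VIEW acme_orders AS
            SELECT order_id, total
            FROM orders
            WHERE customer_id = 42
    """)
    conn.commit()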

However, performance is not the only consideration, or even the main 
consideration.  We run about 150 separate databases; in our case, 
administration and software-design issues outweighed performance.

For example, with separate databases, security is simpler, *and* it's easy to 
convince the customer that their data is protected.  Creating views only helps 
for read-only access.  When the customer wants to modify their data, how will 
you keep them from accessing and overwriting one another's data?  Even with 
views, can you convince the customer you've done it right?  With separate 
databases, you use the built-in security of Postgres, and don't have to 
duplicate it in your schema and apps.
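
For example, here's roughly what that built-in security looks like with one
role and one database per customer; all the names here are hypothetical:

    import psycopg2

    conn = psycopg2.connect(dbname="postgres", user="admin")
    conn.autocommit = True  # CREATE DATABASE can't run inside a transaction
    cur = conn.cursor()

    # One role and one database per customer.
    cur.execute("CREATE ROLE customer_acme LOGIN PASSWORD 'secret'")
    cur.execute("CREATE DATABASE acme OWNER customer_acme")

    # Nobody else can even connect to it.
    cur.execute("REVOKE CONNECT ON DATABASE acme FROM PUBLIC")

Postgres does the enforcement; your applications and schema don't have to.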

With separate databases, it's really easy to discard a customer.  This can be 
particularly important for a big customer with millions of linked records.  In 
a one-database-for-everyone design, you'll have lots of foreign keys, indexes, 
etc. that make deleting a whole customer a REALLY big job.  Contrast that with 
just dropping the whole database, which typically takes a couple of seconds.
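
Roughly, the difference looks like this (table and database names invented):

    import psycopg2

    # Shared database: one DELETE per table, with every foreign key and
    # index checked along the way, then a VACUUM to reclaim the space.
    conn = psycopg2.connect(dbname="shared_db", user="admin")
    cur = conn.cursor()
    cur.execute("DELETE FROM orders   WHERE customer_id = %s", (42,))
    cur.execute("DELETE FROM invoices WHERE customer_id = %s", (42,))
    # ...and so on, for every table in the schema...
    conn.commit()

    # Separate databases: one statement, typically a couple of seconds.
    admin = psycopg2.connect(dbname="postgres", user="admin")
    admin.autocommit = True  # DROP DATABASE can't run in a transaction
    admin.cursor().execute("DROP DATABASE acme")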

But even more important (to us) is the simplicity of the applications and management.  
It's far more than just an extra " ... and customer = xyz" added to every 
query.  Throwing the customers together means every application has to understand 
security, and many operations that would otherwise be simple become horribly 
tangled.  Want to back up one customer's data?  You can't use pg_dump; you have 
to write your own dump app.  Want to restore one customer's data?  Same.  Want 
to do a big update?  Your whole database is affected and probably needs a 
VACUUM ANALYZE afterwards.  On and on, at every turn, management and 
applications are more complex.
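
With separate databases, those first two jobs shrink to one command each.  A
sketch, assuming a customer database named acme:

    import subprocess

    # Back up one customer: plain pg_dump, custom-format archive.
    subprocess.run(["pg_dump", "-Fc", "-f", "acme.dump", "acme"], check=True)

    # Restore that customer into a freshly created, empty database.
    subprocess.run(["pg_restore", "-d", "acme", "acme.dump"], check=True)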

If you have hundreds of separate databases, it's also easy to scale: just buy 
more servers and move some of the databases over.  With a single monster 
database, as load increases, sooner or later you'll hit a wall.
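
One way to picture it (hostnames and customer names invented): the application
keeps a small customer-to-server map, and moving a customer is just a
dump/restore plus one entry change:

    import psycopg2

    # Which server each customer's database lives on.
    CUSTOMER_HOSTS = {
        "acme":    "db1.example.com",
        "initech": "db2.example.com",  # moved here when db1 got busy
    }

    def connect_for(customer):
        # Each customer's database is named after the customer.
        return psycopg2.connect(host=CUSTOMER_HOSTS[customer],
                                dbname=customer, user="app")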

Postgres is really good at maintaining many separate databases.  Why do it 
yourself?

There are indeed performance issues, but even that's not black and white.  
Depending on the specifics of your queries and the load on your servers, you 
may get better performance from a single monster database, or from hundreds of 
separate databases.

So, your question has no simple answer.  You should indeed evaluate the 
performance, but other issues may dominate your decision.

Craig


