On 21 Nov 2010, at 5:08, Kent Tong wrote:

> Hi,
> 
> Let's say that there is some data that should be logically shared by
> many applications in the company, such as some core information about
> its customers (name, address, contact info). In principle, such data
> should be stored in a DB for sharing.
> 
> But then if a certain application needs to access such shared data, it
> will need to access its own DB plus this shared DB, which is
> troublesome and requires distributed transactions.

I think most companies have ended up at that point simply through the passage of 
time. They have several different databases (often even from different vendors) 
that they need to aggregate their information from.

Usually the different databases contain different kinds of content, so the 
need for distributed transactions and such is quite minimal. Where there is a 
need to keep relational integrity, the related data is usually in one big 
central database. You'd be amazed how much a database like that can handle!

That said, separating things out can be an improvement. For example, generating 
reports (big aggregations over many data sources at once) tends to produce 
heavy-weight queries that are likely to impact other queries (unless your 
database server is capable of limiting I/O and such per process, which helps 
some).
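As far as I know Postgres can't throttle I/O per backend, but you can at least 
stop a reporting user's queries from running away with the server. A rough 
sketch (the role name and limits are just examples I made up):

    -- Hypothetical reporting role with capped runtime and memory,
    -- so heavy aggregations can't hog the server indefinitely.
    CREATE ROLE reporting LOGIN;
    ALTER ROLE reporting SET statement_timeout = '5min';
    ALTER ROLE reporting SET work_mem = '64MB';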

> An alternative is to store everything into a single DB and use, say,
> schemas to separate the data. The shared data can be put into one
> schema and that application can have its own schema. This eliminates
> the need for distributed transactions but then you'll have a single
> DB for the whole company! This creates problems in:
> 1) availability: if the DB is down, everything is down.
> 2) maintenance: if we need to deploy a new application, we'll need to
> create a new schema in that single DB, potentially impacting other
> apps.
> 3) performance: all apps access that single DB server.
> 
> I was wondering in practice, how people address this problem?


You limit access to who can touch what and you replicate.

For example, the people in your sales department will need to be able to add 
and modify customer information, but the guys in the IT department don't need 
to. So the first group gets access to a database server where the customer 
database is a master, while the others get access to a read-only slave.
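
In Postgres terms that boils down to schemas plus grants; roughly something 
like this (the schema, table and role names are made up, of course):

    -- Shared customer data gets its own schema.
    CREATE SCHEMA shared;
    CREATE TABLE shared.customer (
        id      serial PRIMARY KEY,
        name    text NOT NULL,
        address text,
        contact text
    );

    -- Sales can modify it, everybody else only reads.
    CREATE ROLE sales;
    CREATE ROLE readonly;
    GRANT USAGE ON SCHEMA shared TO sales, readonly;
    GRANT SELECT, INSERT, UPDATE ON shared.customer TO sales;
    GRANT SELECT ON shared.customer TO readonly;

The read-only copy for the rest of the company can then be a replicated slave 
that only ever receives SELECTs.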

I wouldn't go so far as to create a separate database for every business unit 
though; the distinction is more a role-based one than an organisational one - 
there will be overlap in who has access to what.

That said, unless you're in a very large company, a central database will 
probably do for almost anything. The exceptions are more likely to be along the 
lines of reporting and frequent, short but specialised queries for things like 
machine performance statistics.


At the company where I work we have a central DB2/400 database for production, 
shipping and sales. That database gets replicated for reporting. We also have a 
few separate MS SQL databases where, for example, machines in our production 
facility send their performance statistics, which they get back in aggregated 
form every 5 minutes or so.

It isn't ideal, but that 20-year-old DB2/400 database (although the hardware has 
been upgraded to the i5-series or so) can handle its load just fine.

I suspect that Postgres would actually perform better, but you can't just 
switch a big 24/7 company from one database to another (assuming I had anything 
to say about it at all, which I don't). That's an expensive, time-consuming and 
risky process.
Just saying, I don't know from experience how well Postgres would fare there, 
as it's not what we're using. I have no reason to suspect it would perform any 
worse, though.

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.





