On 25/11/12 11:11, Kevin Grittner wrote:
Gavin Flower wrote:
We found that the real-world production performance of a web
application servicing millions of we hits per day with thousands
of concurrent users improved when we reconfigured our database
connection pool to be about 35 instead of 55, on a 16 core box
with a 40 drive RAID.
Curious, what is your RAID configuration?
We have the OS on RAID 1 on one RAID controller, WAL on another RAID
1 array on its own controller, and the PostgreSQL data directory on
RAID 5 on a pair of controllers which each has a path to every drive.
(Don't ask me how that works, I just trust the hardware guys to get
that part right.) The active portion of the database is cached, the
read-to-write ratio is about 10 to 1, and the controllers all have
BBU cache with write-back. It is actually one pool of about 30 for
the read-only web app with so many hits, and a separate pool of 6
database connections for logical replication from about 80 source
databases handling an OLTP load of 3000 directly connected users.
We were seeing some performance degradation at peak loads, and
improved both user response time and overall throughput by reducing
the database connections used by the connection pool. Peak load was
handled much better that way.
I cringe every time I see someone say they need a large number of
database connections in order to support that many users. Having one
database connection for each user is, in my experience, guaranteed to
make things fall apart under load.
-Kevin
Thanks for a very comprehensive answer! (almost simply asked what 'type'
of RAID did you have, I am glad I rephrased that! I learnt far more
than I expected, which is good - more than merely satisfied my curiosity!.
I use RAID-6; but in development, reliability (for me, I've had 5 hard
disk crashes in 20 years) is orders of magnitude more important than
performance, with rare exceptions!
I can understand the mentality of wanting one connection per user,
though I fully know why it is the wrong approach - I would also want to
use connection pooling in your situation.
Backing the 1980's I was a Mainframe COBOL analyst/programmer and just
starting to use databases. Initially I thought of a transaction in terms
of the user wanting to ensure their display and update were in the same
transaction. Took me a while to fully appreciate and accept that
transactions should be short to protect the database while not blocking
other users too much (multi-user access was also a whole new concept),
and that keeping a transaction open for user actions was not the 'done
thing'! It requires a whole new change of mindset – something a lot of
people find difficult, as it conflicts with part of their implicit
belief system (just try discussing even the most obvious flaws in
Creationism with a Believer!).
Now I'm quite comfortable with the idea of multi-user and am much more
database centric while still appreciating the needs of users. Now my
weapons of choice are Java & Linux, with Postgres being my preferred
database.
Cheers,
Gavin