On Wed, May 11, 2005 at 08:57:57AM +0100, David Roussel wrote:
> For an interesting look at scalability, clustering, caching, etc for a
> large site have a look at how livejournal did it.
> http://www.danga.com/words/2004_lisa/lisa04.pdf
> 
> They have 2.6 Million active users, posting 200 new blog entries per
> minute, plus many comments and countless page views.

Neither of which is that horribly impressive. 200 TPM is less than 4TPS.
While I haven't run high transaction rate databases under PostgreSQL, I
suspect others who have will say that 4TPS isn't that big of a deal.

> Although this system is of a different sort to the type I work on it's
> interesting to see how they've made it scale.
> 
> They use mysql on dell hardware! And found single master replication did
> not scale.  There's a section on multimaster replication, not sure if
Probably didn't scale because they used to use MyISAM.

> they use it.  The main approach they use is to parition users into
> spefic database clusters.  Caching is done using memcached at the
Which means they've got a huge amount of additional code complexity, not
to mention how many times you can't post something because 'that cluster
is down for maintenance'.

> application level to avoid hitting the db for rendered pageviews.
Memcached is about the only good thing I've seen come out of
livejournal.

> It's interesting that the solution livejournal have arrived at is quite
> similar in ways to the way google is set up.

Except that unlike LJ, google stays up and it's fast. Though granted, LJ
is quite a bit faster than it was 6 months ago.
-- 
Jim C. Nasby, Database Consultant               [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Reply via email to