On Wed, May 11, 2005 at 08:57:57AM +0100, David Roussel wrote:
> For an interesting look at scalability, clustering, caching, etc for a
> large site have a look at how livejournal did it.
> http://www.danga.com/words/2004_lisa/lisa04.pdf
>
> They have 2.6 Million active users, posting 200 new blog entries per
> minute, plus many comments and countless page views.
Neither of which is that horribly impressive. 200 TPM is less than 4 TPS.
While I haven't run high transaction rate databases under PostgreSQL, I
suspect others who have will say that 4 TPS isn't that big of a deal.

> Although this system is of a different sort to the type I work on it's
> interesting to see how they've made it scale.
>
> They use mysql on dell hardware! And found single master replication did
> not scale. There's a section on multimaster replication, not sure if

Probably didn't scale because they used to use MyISAM.

> they use it. The main approach they use is to partition users into
> specific database clusters. Caching is done using memcached at the

Which means they've got a huge amount of additional code complexity, not
to mention how many times you can't post something because 'that cluster
is down for maintenance'.

> application level to avoid hitting the db for rendered pageviews.

Memcached is about the only good thing I've seen come out of
livejournal.

> It's interesting that the solution livejournal have arrived at is quite
> similar in ways to the way google is set up.

Except that unlike LJ, google stays up and it's fast. Though granted, LJ
is quite a bit faster than it was 6 months ago.
-- 
Jim C. Nasby, Database Consultant               [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"