Re: Sharding, and all that

2008-12-19 Thread Andy Wardley

Richard Huxton wrote:

Hmm - skimming these articles, I'm not hugely impressed. The chap(ess?)
behind them is clearly a developer rather than a DBA.


You're right.  I perhaps should have qualified that better: it's a good
*introduction* to the subject.  It was a bit hand-wavy on the detail,
but it got me thinking about the subject (speaking as more of a developer
than a DBA).

He said:

Split the database into smaller, specialized db’s. Use a DB for users,
one for messanging functionalities, one for product orders, etc. Each of
these databases must exist independently, 


You said:

Brilliant! So now your messages don't necessarily have a valid user
associated with them.


It's a shame he didn't go into the details of how you were supposed to
make the "must exist independently" bit happen.  Key management is clearly
going to be a critical part of this.

My take on this is that you would have to (re-)build your application and
underlying DB to be distributed from the start.  Your user DB/app would
provide an authentication service, perhaps something similar to OpenID, that
your message board would use to authenticate users.

The problem you mention is that the message.user foreign key then goes out
the window.

One approach would be to have a local users table in the messaging DB which
maps internal user IDs to some kind of external user identifier.  That could
be their email address (in the simplest case), a URL that identifies the user
at your authentication app (e.g. http://auth.your-app.com/joe90) or perhaps
a UUID.
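A minimal sketch of that mapping, in Python with SQLite standing in for the messaging DB (all table and column names here are my own illustration, not from any real schema). The point is that the messages table keeps a real foreign key to a *local* users table, which in turn carries only the external identifier:

```python
import sqlite3

# In-memory stand-in for the messaging DB; schema is hypothetical.
db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite needs FKs enabled per-connection

db.executescript("""
CREATE TABLE users (
    id          INTEGER PRIMARY KEY,    -- internal user id
    external_id TEXT NOT NULL UNIQUE    -- email, auth URL, or UUID
);
CREATE TABLE messages (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    body    TEXT NOT NULL
);
""")

db.execute("INSERT INTO users (id, external_id) "
           "VALUES (1, 'http://auth.your-app.com/joe90')")
db.execute("INSERT INTO messages (user_id, body) VALUES (1, 'hello')")

# Internal integrity still holds: a message must reference a local user row,
# whatever the remote authentication service does with its own records.
try:
    db.execute("INSERT INTO messages (user_id, body) VALUES (99, 'orphan')")
except sqlite3.IntegrityError:
    print("rejected orphan message")
```

The external_id column is the only thread back to the authentication service; everything inside the messaging DB stays referentially intact on its own.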

Either way, you're explicitly decoupling the internal requirement that a
message must have a valid user id (i.e. a record in the users table) from
the assumption that you actually know anything valuable about that user that
doesn't relate explicitly to the message board.

You can then maintain referential integrity in your message DB regardless of
what happens in the user DB.  It would be possible to delete a user from the
user system, for example, without having any *internal* effect on the message
board DB (no more cascading delete woes).

Of course, there is still an *external* effect, namely a broken link from
the user in your message board to the one in your auth app.  For example,
your message board may have messages written by user 42 who was identified at
the time as jo...@example.com, but there's no guarantee that the user still
has login access, or even exists.  You'll have to go and ask the user
application for further information on that *and* accept the implicit
assumption that you may get back the SaaS equivalent of a blank stare.

I appreciate that decoupling is a fancy way of saying broken, but I'm
beginning to see it as a feature rather than a liability in this context.

I should point out that I haven't implemented anything like this yet so I
could be way off course. But I'm about to implement something like it for a
$work project so any pointers on this would be welcome if I'm sailing in the
wrong direction.

This sort of stuff is very easy to get wrong. 


Agreed.  I don't think there's any easy substitute for proper data design.
Scalability doesn't work as an afterthought.


A


Re: Sharding, and all that

2008-12-19 Thread Richard Huxton
Andy Wardley wrote:
 My take on this is that you would have to (re-)build your application and
 underlying DB to be distributed from the start.  Your user DB/app would
 provide an authentication service, perhaps something similar to OpenID, that
 your message board would use to authenticate users.

Yep - that's what sharding is all about - separate disconnected silos
of data. You know, like the ones that were all the rage in the 60s that
drove people to invent RDBMS. The good thing is, if your application is
successful and is still in use a couple of years from now you get to
either spend all your time fire-fighting or re-implementing integrity
constraints.

 The problem you mention is that the message.user foreign key then goes out
 the window.
 
 One approach would be to have a local users table in the messaging DB which
 maps internal user IDs to some kind of external user identifier.  That could
 be their email address (in the simplest case), a URL that identifies the
 user at your authentication app (e.g. http://auth.your-app.com/joe90) or
 perhaps a UUID.

Beside the point. Nothing stopping you running separate services talking
to one big database. Nothing stopping you replicating the user table to
various slaves and having separate message tables on each. That's not
sharding though - sharding (if it means anything) means to have
*disconnected* databases. Which works fine for Google since they
effectively need only one table for web-searching.

 Either way, you're explicitly decoupling the internal requirement that a
 message must have a valid user id (i.e. a record in the users table) from
 the assumption that you actually know anything valuable about that user
 that doesn't relate explicitly to the message board.
 
 You can then maintain referential integrity in your message DB regardless
 of what happens in the user DB.  It would be possible to delete a user
 from the user system, for example, without having any *internal* effect
 on the message board DB (no more cascading delete woes).

Then you're not maintaining referential integrity. There's no point in
having a user-id that doesn't *mean* anything. Primary keys, foreign
keys and all the other bits and pieces of RI in a SQL database are there
to maintain the *meaning* of your data.

 I appreciate that decoupling is a fancy way of saying broken, but I'm
 beginning to see it as a feature rather than a liability in this context.

I can only think of two contexts:
  1. Don't care
  2. Last resort - the project is pushing the boundaries of what is
 possible.
Now Google were lucky in that they scored on both.

 I should point out that I haven't implemented anything like this yet so I
 could be way off course. But I'm about to implement something like it for a
 $work project so any pointers on this would be welcome if I'm sailing in
 the wrong direction.

It's not uncommon to find a project with one very large table (sensor
readings, apache log lines, messages, text from documents). It's a lot
less common to have several very large tables. If you're in the first
camp then you can replicate all the (comparatively) small tables and
partition the single large one. That lets you maintain the view of a
single database from the outside.
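One way to picture that arrangement is a small query router: reads of the replicated small tables can go to any node, while the one large table is partitioned by id range. A hedged sketch (node and table names are my own, not from the thread):

```python
# Hypothetical router: small tables are fully replicated on every node,
# the single large table is split into contiguous id ranges.
NODES = ["db1", "db2", "db3"]            # illustrative node names
REPLICATED = {"users", "categories"}     # small tables, full copy everywhere
ROWS_PER_PARTITION = 1_000_000

def node_for(table, row_id=None):
    if table in REPLICATED:
        return NODES[0]  # any replica will do; pick one (or round-robin)
    # large table: each id range maps to one node
    return NODES[(row_id // ROWS_PER_PARTITION) % len(NODES)]

print(node_for("users"))                 # replicated: any node
print(node_for("readings", 1_500_000))   # partitioned: second range
```

From the application's point of view this still looks like one database; only the routing layer knows about the split.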

Before that though, make sure you do have a problem. Pick the right tool
for the job - if high concurrency/complex queries/procedural code for
constraints is a requirement then it's probably not MySQL. Always
consider what an extra couple of grand on hardware will gain you.

The article you linked to and the one Nigel Hamilton mentions further
down the thread are both being clever. In neither case (from the small
amount of detail available) does it look justified.

-- 
  Richard Huxton
  Archonet Ltd



Re: Sharding, and all that

2008-12-19 Thread James Laver
On Fri, Dec 19, 2008 at 9:52 AM, Richard Huxton d...@archonet.com wrote:

 Yep - that's what sharding is all about - separate disconnected silos
 of data. You know, like the ones that were all the rage in the 60s that
 drove people to invent RDBMS. The good thing is, if your application is
 successful and is still in use a couple of years from now you get to
 either spend all your time fire-fighting or re-implementing integrity
 constraints.

 Then you're not maintaining referential integrity. There's no point in
 having a user-id that doesn't *mean* anything. Primary keys, foreign
 keys and all the other bits and pieces of RI in a SQL database are there
 to maintain the *meaning* of your data.


It depends on what you're using a database for.

I had a long, drawn-out discussion with my boss a couple of weeks ago
regarding the purpose of a database. I argued it was there to store
data reliably (e.g. maintaining integrity, cascading deletes, or
ensuring you can't delete something being referenced in another
relation, etc.); he argued it was to improve the performance of the
application that was using it (I'm talking in the context of web-apps
since that's what pays the bills).

At my previous job we used a MySQL database. I chose the InnoDB engine
because I wanted to make sure the site data was going to stay as
unbroken as MySQL can enforce - the important thing was actual foreign
key checks (MyISAM will not enforce foreign keys). The site workflow
naturally suited this, since the data was updated overnight into these
tables with real referential integrity. After the synchronisation, you
can then regenerate cache tables which don't have to be perfect -
they're there to be faster.
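The "regenerate after sync" step can be as blunt as dropping and rebuilding a denormalised table from the authoritative ones. A sketch using SQLite as a stand-in (schema and names are invented for illustration):

```python
import sqlite3

# Authoritative tables, populated by the overnight sync (toy data).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE posts (id INTEGER PRIMARY KEY,
                    user_id INTEGER NOT NULL REFERENCES users(id));
INSERT INTO users VALUES (1, 'joe90'), (2, 'jet');
INSERT INTO posts (user_id) VALUES (1), (1), (2);
""")

def rebuild_cache(db):
    # Throwaway denormalised table: no FKs, rebuilt wholesale after each
    # sync, so it never has to be kept consistent incrementally.
    db.executescript("""
    DROP TABLE IF EXISTS user_post_counts;
    CREATE TABLE user_post_counts AS
        SELECT u.id AS user_id, u.name, COUNT(p.id) AS posts
        FROM users u LEFT JOIN posts p ON p.user_id = u.id
        GROUP BY u.id, u.name;
    """)

rebuild_cache(db)
print(db.execute(
    "SELECT name, posts FROM user_post_counts ORDER BY user_id").fetchall())
```

Integrity lives in the source tables; the cache table only has to be fast, and a bad rebuild costs you nothing but another rebuild.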

On the other hand, it's more work to generate cache tables.

Really it depends what you want. I fall strongly on the side of good
data first, putting other measures in place if it's not fast enough.
Other people are purely concerned with performance, wanting only to
use database indexes to make things fast and easy. As a result you're
often implementing data integrity in your code, which I don't feel is
the right place for it.

Cheers,
--James


Re: Sharding, and all that

2008-12-19 Thread Andy Wardley

Richard Huxton wrote:

Yep - that's what sharding is all about - separate disconnected silos
of data. 


I thought sharding specifically related to horizontal partitioning.  i.e.
splitting one table across several databases,  e.g. records with even row
ids in one DB, odd in another.  Apologies if my terminology is wrong.

I was thinking more specifically about vertical partitioning along the
functional boundaries which wouldn't be sharding by my (possibly incorrect)
definition.  Apologies for being off-topic, too  :-)


Then you're not maintaining referential integrity. There's no point in
having a user-id that doesn't *mean* anything. 


I'm not suggesting that the user id doesn't mean anything.  It means
exactly the same thing as it does in any other database - a unique
identifier for a user record.

I'm saying that the user record in the message DB doesn't need to
store email address, username, password, favourite colour, challenge
question/response, and all the other cruft that goes with user administration.
All it needs is a unique id, an authentication realm (identifying the
authentication service) and authentication token (identifying the user
at that service).  And perhaps any message-board specific niceties such
as your local nick and display preferences.

Similarly in the authentication DB there's no need to store information
in the users table relating to message board preferences.

I can see that trivially splitting one user table into two leads to all
sorts of integrity problems.  But I'm thinking of them as two separate
user databases from the outset and accepting that they're potentially
disjoint.

The best (but poor) example I can think of right now is how my gravatar
popped up automatically when I signed up at github.  Not because the github
user database is referentially at one with the gravatar database, but because
I used the same public identifier (my email address) on both systems.

So it could be argued that there is *a* point in having a user id (such
as email address) that doesn't *mean* anything to the *current* database,
because it might have meaning to *other* databases.  It's the closest
thing we've got to referential integrity in a distributed world.


Primary keys, foreign
keys and all the other bits and pieces of RI in a SQL database are there
to maintain the *meaning* of your data.


Sure, I recognise the fact that you lose referential integrity at the
boundary between your db and the outside world.  But internally,
the DB remains referentially intact.  The message board still has its
own user records for messages to reference.  The fact that the authentication
realm/token may at some point in the future become invalid is really no
different to the current situation where a user's email address changes
and they can no longer log in or get a password reminder/reset.


Before that though, make sure you do have a problem. Pick the right tool
for the job - if high concurrency/complex queries/procedural code for
constraints is a requirement then it's probably not MySQL. Always
consider what an extra couple of grand on hardware will gain you.


For this particular project we don't really have a database problem that
can't be solved with a bit of replication and a whack of the hardware hammer.
The majority of the load will be searches reading from a single read-only
table.  So we can replicate that easily across as many web heads as
required for performance, and it gives us simple redundancy in case of
machine failure, etc.

All the read/write functionality (mostly CMS-like stuff) will happen
(initially) on a single server with master read/write database.  It'll
be fairly low-load and it's not mission critical (in the sense that having
the CMS offline for a few hours is an inconvenience not a hanging offence).

The impetus was more about turning one large and tangled CMS-like system into
several smaller, simpler systems.  90% of the tables fall neatly into one
sub-system or the other (front-end search and content delivery, back-end
content management, image management, accounting and billing, and so on).
It's the last few tables (mostly relating to users, unsurprisingly) that
sprawl spread-eagled across the database, dangling their tools in the works.

So I'm really coming at this from the perspective of someone thinking about
building distributed apps and wondering how the database is going to work
behind the scenes, rather than someone with a database performance problem
considering partitioning.

Cheers
A


Re: Sharding, and all that

2008-12-19 Thread Richard Huxton
Andy Wardley wrote:
 Richard Huxton wrote:
 Yep - that's what sharding is all about - separate disconnected silos
 of data. 
 
 I thought sharding specifically related to horizontal partitioning.  i.e.
 splitting one table across several databases,  e.g. records with even row
 ids in one DB, odd in another.  Apologies if my terminology is wrong.
 
 I was thinking more specifically about vertical partitioning along the
 functional boundaries which wouldn't be sharding by my (possibly incorrect)
 definition.  Apologies for being off-topic, too  :-)

If sharding means anything at all, then it has to be something other
than partitioning or partial replication, otherwise we could just say
"partitioning" or "partial replication". Of course it's entirely
possible it *doesn't* mean anything at all, and is just partitioning 2.0

 Then you're not maintaining referential integrity. There's no point in
 having a user-id that doesn't *mean* anything. 
 
 I'm not suggesting that the user id doesn't mean anything.  It means
 exactly the same thing as it does in any other database - a unique
 identifier to a user record.
 
 I'm saying that the user record in the message DB doesn't need to
 store email address, username, password, favourite colour, challenge
 question/response, and all the other cruft that goes with user
 administration.  All it needs is a unique id, an authentication realm
 (identifying the authentication service) and authentication token
 (identifying the user at that service).  And perhaps any message-board
 specific niceties such as your local nick and display preferences.
 
 Similarly in the authentication DB there's no need to store information
 in the users table relating to message board preferences.

Having one 300-column user_details table is probably a bad idea anyway.
However, if you can't say "uid=123456 on your message is the same as
uid=123456 in your user-login table" then I'd say it doesn't mean
anything. You can't even say two messages posted by uid=123456 are the
same user without providing guarantees about the security of your uid
transfer code (which is back to providing your own referential integrity).

 I can see that trivially splitting one user table into two leads to all
 sorts of integrity problems.  But I'm thinking of them as two separate
 user databases from the outset and accepting that they're potentially
 disjoint.
 
 The best (but poor) example I can think of right now is how my gravatar
 popped up automatically when I signed up at github.  Not because the
 github user database is referentially at one with the gravatar database,
 but because I used the same public identifier (my email address) on both
 systems.
 
 So it could be argued that there is *a* point in having a user id (such
 as email address) that doesn't *mean* anything to the *current* database,
 because it might have meaning to *other* databases.  It's the closest
 thing we've got to referential integrity in a distributed world.

Yep, but that's option #1 - don't care. If your gravatar didn't work,
or you got the wrong one - well, whatever.

Option #2 would be something like OpenID where you can't (and don't want
to) have a single large database accessed by everyone. So - you have to
build in trust mechanisms. Or GPG key-signing, say.

 Primary keys, foreign
 keys and all the other bits and pieces of RI in a SQL database are there
 to maintain the *meaning* of your data.
 
 Sure, I recognise the fact that you lose referential integrity at the
 boundary between your db and the outside world.  But internally,
 the DB remains referentially intact.  The message board still has its
 own user records for messages to reference.  The fact that the
 authentication realm/token may at some point in the future become
 invalid is really no different to the current situation where a user's
 email address changes and they can no longer log in or get a password
 reminder/reset.

No, there's a vital difference. If I want to I can track every change a
user makes to their account, so long as I can trust the uid. Barring
errors in the RDBMS code it's a relatively simple task for me to
guarantee that. If the police come knocking because someone's posted
threatening messages or whatever, then once they've got their warrant
they can have their data. Of course maybe it doesn't matter, but that's
putting you back in #1 - don't care.

 Before that though, make sure you do have a problem. Pick the right tool
 for the job - if high concurrency/complex queries/procedural code for
 constraints is a requirement then it's probably not MySQL. Always
 consider what an extra couple of grand on hardware will gain you.
 
 For this particular project we don't really have a database problem that
 can't be solved with a bit of replication and a whack of the hardware hammer.
 The majority of the load will be searches that we're reading from a single
 read-only table.  So we can replicate that easily across as many web heads as
 required for performance and it gives

Re: Sharding, and all that

2008-12-19 Thread Paul Makepeace
On Fri, Dec 19, 2008 at 1:43 PM, Richard Huxton d...@archonet.com wrote:

 Andy Wardley wrote:
  Richard Huxton wrote:
  Yep - that's what sharding is all about - separate disconnected silos
  of data.
 
  I thought sharding specifically related to horizontal partitioning.  i.e.
  splitting one table across several databases,  e.g. records with even row
  ids in one DB, odd in another.  Apologies if my terminology is wrong.
 
  I was thinking more specifically about vertical partitioning along the
  functional boundaries which wouldn't be sharding by my (possibly incorrect)
  definition.  Apologies for being off-topic, too  :-)

 If sharding means anything at all, then it has to be something other
 than partitioning or partial replication, otherwise we could just say
 "partitioning" or "partial replication". Of course it's entirely
 possible it *doesn't* mean anything at all, and is just partitioning 2.0

Sharding is horizontal partitioning: splitting your data by a primary
key, for example having a few hundred million email boxes spread over
thousands of machines keyed by email address.
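The "keyed by email address" routing in that example amounts to a stable hash of the shard key. A hedged sketch (shard count and normalisation are assumptions for illustration):

```python
import hashlib

N_SHARDS = 1024  # illustrative shard count

def shard_for(email):
    # Stable hash of the shard key; never use Python's built-in hash()
    # for this, since it varies between interpreter runs.
    digest = hashlib.sha1(email.lower().encode()).hexdigest()
    return int(digest, 16) % N_SHARDS

print(shard_for("joe90@example.com"))
```

Every mailbox operation for a given address then lands on the same machine, with no central lookup needed.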

Having your user accounts data in a separate database from say those
users' preferences is a form of vertical partitioning. Both in a big
system are very likely to be sharded by some kind of internal key.

P


Sharding, and all that

2008-12-18 Thread Mark Fowler

So,

It's nearly the new year and the time for new projects.  On my project  
radar for next year is consider (better) sharding our MySQL database.


Now I understand the basic principles behind this and can, if needs  
be, implement my own wheel, but before I do I'd love to find out what  
everyone else is doing. Are there any good reference sites out there -  
MySQL sharding success stories if you will - and how did they do it?   
I'm pretty sure that everyone does it slightly differently but the  
more things I can look at before considering my own implementation,  
the better.


Hot technologies (by which I mean what blogs and google have been  
throwing at me recently) include:


- Spock Proxy
- MySQL Clustering
- Implementing it at the app level.

What's the collective group think on these?  Obviously (for this list,  
at least) the app that will be accessing the database is Perl based,  
so I'm only considering compatible technologies at the moment.


Mark.


Re: Sharding, and all that

2008-12-18 Thread Leo Lapworth
2008/12/18 Mark Fowler m...@twoshortplanks.com

 It's nearly the new year and the time for new projects.  On my project
 radar for next year is consider (better) sharding our MySQL database.


I came across this and noted it for future reference:

http://spockproxy.sourceforge.net/ - doh, just read the rest of your email -
you've already found it

http://www.mysqlperformanceblog.com/ - seems to know their stuff so you
might find something in here.

Leo


Re: Sharding, and all that

2008-12-18 Thread Andy Wardley

Mark Fowler wrote:

What's the collective group think on these?


There's a good series of articles on sharding starting here:

http://lifescaler.com/2008/04/database-sharding-unraveled-part-i/

The conclusion I drew from it was that functional partitioning
(where possible) was much easier to implement than horizontal partitioning.

A



Re: Sharding, and all that

2008-12-18 Thread Dave Hodgkinson


On 18 Dec 2008, at 10:19, Mark Fowler wrote:


So,

It's nearly the new year and the time for new projects.  On my  
project radar for next year is consider (better) sharding our MySQL  
database.




There's really good stuff in the Sybase Unleashed book on horizontal
and vertical partitioning. And as people have said, you can partition
by kind of data.

"Sharding" is a horrible term.

--
Dave HodgkinsonMSN: daveh...@hotmail.com
Site: http://www.davehodgkinson.com  UK: +44 7768 490620
Blog: http://davehodg.blogspot.com
Photos: http://www.flickr.com/photos/davehodg


Re: Sharding, and all that

2008-12-18 Thread Jonathan Stowe
2008/12/18 Dave Hodgkinson daveh...@gmail.com:


 sharding is a horrible term.


But the computing 2.0 neophytes have to have obscure neologisms - it
would be an affront to their freedom of thought to use a term that
everyone else understands.


Re: Sharding, and all that

2008-12-18 Thread Martin A. Brooks

Mark Fowler wrote:

consider (better) sharding our MySQL database.


Is that when you throw it against the wall really, really, hard?




Re: Sharding, and all that

2008-12-18 Thread Richard Huxton
Andy Wardley wrote:
 Mark Fowler wrote:
 What's the collective group think on these?
 
 There's a good series of articles on sharding starting here:
 
 http://lifescaler.com/2008/04/database-sharding-unraveled-part-i/
 
 The conclusion I drew from it was that functional partitioning
 (where possible) was much easier to implement than horizontal partitioning.

Hmm - skimming these articles, I'm not hugely impressed. The chap(ess?)
behind them is clearly a developer rather than a DBA.

http://lifescaler.com/2008/06/database-sharding-unraveled-part-iii/

I’ll give you a practical example. It involves a forum with about 150k
registered users, 600k posts and about 10k unique visitors/day (peaks
reaching 50k unique visitors/day). The DB server is a 8 dual-core XEON
processors with 8gb/ram. The whole DB is about 1GB in size and the
server is MySQL.
When the number of simultaneous DB connections reaches a critical level
(it’s variable), MySQL will freeze. That’s not all, as the sessions and
posts tables will often get corrupted.
The first solution would be to put the tables in memory, so as not to
stress the HDD. This actually turns out to be the worst idea possible.
So what to do?

Well, how about (1) connection-pooling and (2) running memcached using
some of the 6GB of RAM you have free. Even if that only reduces your
queries by 90% (and you'd expect better than that for a forum) you're
not going to be seeing more than 16 simultaneous queries - which is the
number of cores you have. While we're on the topic you've probably got
too many cores and not enough disks (and I'm basing that solely on the
fact that a DB server has been described *without* mentioning disks at all).
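The memcached suggestion is the classic cache-aside pattern: check the cache first and only fall through to MySQL on a miss. A sketch with a plain dict standing in for memcached and a hypothetical query function (real code would use a memcached client and a TTL):

```python
# Cache-aside sketch; `cache` stands in for memcached and
# `fetch_from_db` for an expensive MySQL query.
cache = {}
db_queries = 0

def fetch_from_db(key):
    global db_queries
    db_queries += 1          # count how often MySQL actually gets hit
    return f"row-for-{key}"

def get(key):
    if key not in cache:     # miss: hit the database once...
        cache[key] = fetch_from_db(key)
    return cache[key]        # ...then serve repeats from memory

for _ in range(100):
    get("thread:42")
print(db_queries)  # the repeated reads never reach the database again
```

For read-heavy forum pages, even this naive version cuts the query rate by orders of magnitude, which is the point Richard is making.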

Oh, and if you're routinely finding your tables are getting corrupted
either (a) replace your hardware or (b) replace your database.

Split the database into smaller, specialized db’s. Use a DB for users,
one for messanging functionalities, one for product orders, etc. Each of
these databases must exist independently, that is, the splitting must be
made so that I could, for example, create a new functionality
(implicitly a new database), let’s say discussion board, and just take
the users DB as a whole and use it, without affecting anything else.

Brilliant! So now your messages don't necessarily have a valid user
associated with them. That's OK though, because you're checking that in
your application aren't you? Everywhere? Including any admin scripts?
And manual interactions with the database? And you can prove that you've
not missed any checking?

This sort of stuff is very easy to get wrong. It's very easy to make the
sort of mistake that isn't noticed for months when it's impossible to
correct corrupt data. Now I know the likes of ebay and skype do it, but
you can bet they wouldn't unless they had to. These articles seem to put
it on a par with trying out a new theme in firefox.

-- 
  Richard Huxton
  Archonet Ltd



Re: Sharding, and all that

2008-12-18 Thread Jonathan Stowe
2008/12/18 Martin A. Brooks mar...@hinterlands.org:
 Mark Fowler wrote:

 consider (better) sharding our MySQL database.

 Is that when you throw it against the wall really, really, hard?

Or make the DBAs roll naked in a bathtub of broken glass until they
sort the performance problems out.


Re: Sharding, and all that

2008-12-18 Thread Richard Huxton
Jonathan Stowe wrote:
 2008/12/18 Martin A. Brooks mar...@hinterlands.org:
 Mark Fowler wrote:
 consider (better) sharding our MySQL database.
 Is that when you throw it against the wall really, really, hard?
 
 Or make the DBAs roll naked in a bathtub of broken glass until they
 sort the performance problems out.

You're already making them admin MySQL - rolling in broken glass is what
they do to relax.

-- 
  Richard Huxton
  Archonet Ltd


Re: Sharding, and all that

2008-12-18 Thread James Laver
On Thu, Dec 18, 2008 at 12:49 PM, Jonathan Stowe
j...@integration-house.com wrote:
 2008/12/18 Martin A. Brooks mar...@hinterlands.org:
 Mark Fowler wrote:

 consider (better) sharding our MySQL database.

 Is that when you throw it against the wall really, really, hard?

 Or make the DBAs roll naked in a bathtub of broken glass until they
 sort the performance problems out.


I thought after they sharded it in a particularly awful manner it
just felt like rolling in a bathtub of broken glass, as per Richard
Huxton's response...

--James


Re: Sharding, and all that

2008-12-18 Thread Jacqui Caren

Mark Fowler wrote:

So,

It's nearly the new year and the time for new projects.  On my project 
radar for next year is consider (better) sharding our MySQL database.


Standard consultant question: Why?

go faster/scale up is not a good reason.

If you just wanted to go faster you would have moved to Pg or similar :-)

There is no magic solution - the DB and application and usage should be
reviewed to see if some minor tweaks can make major improvements.

Picking a solution without first knowing what the existing problems and
bottlenecks are smells of I wanna play :-)

Jacqui

p.s. I have wanted to play with a Pg SSI over an OpenMosix image for
ages but have yet to make the time. The original question sounds
similar to the reasons for my wanting to play with Pg over OpenMosix
(because it sounds like fun and may help scale up an app) - I had a
good excuse a year or two ago but today with 16+ CPU servers, Hadoop
etc there are better things to do with my time.



Re: Sharding, and all that

2008-12-18 Thread Simon Wistow
On Thu, Dec 18, 2008 at 10:19:06AM +, Mark Fowler said:
 What's the collective group think on these?  Obviously (for this list, at 
 least) the app that will be accessing the database is Perl based, so I'm 
 only considering compatible technologies at the moment.

We use Data::ObjectDriver which has built-in support for sharding and
memcache (somewhat unsurprisingly) plus MySQL with replication.

The main advantages are that it's conceptually simple, resilient,
flexible (especially with the ability to move entities between shards)
and cheap to scale.

The big question is whether to use a hashing scheme for shard selection 
(keywords: Consistent Hashing, libketama, DHTs) or whether to use a 
master lookup device (which is what I believe Flickr uses). 
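The consistent-hashing option mentioned there can be sketched in a few lines: hash each node to many points on a ring, then send each key to the next point clockwise. This is a minimal illustration, not libketama itself; node names and vnode count are invented:

```python
import bisect
import hashlib

def h(key):
    # Stable hash onto the ring (md5 here purely for determinism).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    # Minimal consistent-hash ring; real deployments (e.g. libketama)
    # tune the number of virtual points per node for smoother balance.
    def __init__(self, nodes, vnodes=100):
        self.points = sorted((h(f"{n}#{i}"), n)
                             for n in nodes for i in range(vnodes))
        self.keys = [p for p, _ in self.points]

    def node_for(self, key):
        i = bisect.bisect(self.keys, h(key)) % len(self.points)
        return self.points[i][1]

ring = Ring(["shard-a", "shard-b", "shard-c"])
before = {k: ring.node_for(k) for k in map(str, range(1000))}

# Add a node: only keys adjacent to the new node's points move,
# which is the whole appeal over naive modulo hashing.
ring2 = Ring(["shard-a", "shard-b", "shard-c", "shard-d"])
moved = sum(before[k] != ring2.node_for(k) for k in before)
print(moved)
```

The master-lookup alternative trades this math for a small indirection table, which makes moving entities between shards trivial at the cost of one more lookup (and one more thing to keep available).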

As always, YMMV. 





Re: Sharding, and all that

2008-12-18 Thread Nigel Hamilton
Hi Mark,

So,

 It's nearly the new year and the time for new projects.  On my project
 radar for next year is consider (better) sharding our MySQL database.


I think it depends on your write/read ratio. If it is read heavy then
replication and/or memcache will probably do the trick - on the other hand
if you expect more writes than reads then I saw an interesting technique for
handling excess DB writes by using thousands of temporary MySQL MyISAM
tables[1].

Writing to MyISAM files is quick - but the downside is locking - by using
_lots_ of temporary MyISAM tables you can improve overall write capacity.
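The mechanics of that trick reduce to routing each write to one of many identical tables, so MyISAM's table-level lock is only ever contended by writers hitting the same bucket. A hedged sketch of the routing (table naming and count are assumptions; the linked talk's actual scheme may differ):

```python
import zlib

N_TABLES = 20_000  # the talk reportedly used 20,000+ MyISAM tables

def table_for(key):
    # Spread writes across many identical tables so each table-level
    # lock covers only a small slice of the write traffic.
    return f"writes_{zlib.crc32(key.encode()) % N_TABLES:05d}"

print(table_for("user:42"))
```

Reads then have to aggregate across buckets (or the buckets get merged periodically), which is the price paid for the cheap writes.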

NIge

[1] These guys used 20,000+ MyISAM tables to scale writes
http://en.oreilly.com/mysql2008/public/schedule/detail/592