Re: Sharding, and all that
Richard Huxton wrote:
> Hmm - skimming these articles, I'm not hugely impressed. The chap(ess?) behind them is clearly a developer rather than a DBA.

You're right. I should perhaps have qualified that better as a good *introduction* to the subject. It was a bit hand-wavy on the detail, but it got me thinking about the subject (speaking as more of a developer than a DBA).

He said:
> Split the database into smaller, specialized DBs. Use a DB for users, one for messaging functionality, one for product orders, etc. Each of these databases must exist independently.

You said:
> Brilliant! So now your messages don't necessarily have a valid user associated with them.

It's a shame he didn't go into the details of how you were supposed to make the "must exist independently" bit happen. Key management is clearly going to be a critical part of this.

My take on this is that you would have to (re-)build your application and underlying DB to be distributed from the start. Your user DB/app would provide an authentication service, perhaps something similar to OpenID, that your message board would use to authenticate users.

The problem you mention is that the message.user foreign key then goes out the window. One approach would be to have a local users table in the messaging DB which maps internal user IDs to some kind of external user identifier. That could be their email address (in the simplest case), a URL that identifies the user at your authentication app (e.g. http://auth.your-app.com/joe90) or perhaps a GUID.

Either way, you're explicitly decoupling the internal requirement that a message must have a valid user id (i.e. a record in the users table) from the assumption that you actually know anything valuable about that user that doesn't relate explicitly to the message board. You can then maintain referential integrity in your message DB regardless of what happens in the user DB.
It would be possible to delete a user from the user system, for example, without having any *internal* effect on the message board DB (no more cascading delete woes). Of course, there is still an *external* effect, namely a broken link from the user in your message board to the one in your auth app. For example, your message board may have messages written by user 42 who was identified at the time as jo...@example.com, but there's no guarantee that the user still has login access, or even exists. You'll have to go and ask the user application for further information on that *and* accept the implicit assumption that you may get back the SaaS equivalent of a blank stare.

I appreciate that "decoupling" is a fancy way of saying "broken", but I'm beginning to see it as a feature rather than a liability in this context.

I should point out that I haven't implemented anything like this yet, so I could be way off course. But I'm about to implement something like it for a $work project, so any pointers would be welcome if I'm sailing in the wrong direction.

> This sort of stuff is very easy to get wrong.

Agreed. I don't think there's any easy substitute for proper data design. Scalability doesn't work as an afterthought.

A
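A minimal sketch of that local users table idea, using SQLite purely for brevity (all table, column and realm names here are illustrative, not from the thread): the messaging DB enforces its own foreign key internally, while the user record is nothing but a pointer to an external identity.

```python
import sqlite3

# Sketch of the "local users table" approach: the messaging DB keeps its
# own users table (internal id plus external identifier) so messages can
# enforce a foreign key locally, regardless of what the remote auth
# service does. Names are hypothetical.
db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")
db.execute("""CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    auth_realm TEXT NOT NULL,   -- e.g. 'http://auth.your-app.com/'
    auth_token TEXT NOT NULL,   -- e.g. 'joe90', an email address, or a GUID
    UNIQUE (auth_realm, auth_token))""")
db.execute("""CREATE TABLE messages (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    body    TEXT NOT NULL)""")

db.execute("INSERT INTO users (id, auth_realm, auth_token) "
           "VALUES (42, 'http://auth.your-app.com/', 'joe90')")
db.execute("INSERT INTO messages (user_id, body) VALUES (42, 'hello')")

# Internal referential integrity still holds: a message must reference a
# local user record, even though that record is only a pointer to an
# external identity that may no longer exist.
try:
    db.execute("INSERT INTO messages (user_id, body) VALUES (99, 'orphan')")
except sqlite3.IntegrityError:
    print("orphan message rejected")
```

Deleting the user at the auth service leaves this DB internally intact; only the external realm/token link goes stale, which is exactly the decoupling being described.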
Re: Sharding, and all that
Andy Wardley wrote:
> My take on this is that you would have to (re-)build your application and underlying DB to be distributed from the start. Your user DB/app would provide an authentication service, perhaps something similar to OpenID, that your message board would use to authenticate users.

Yep - that's what sharding is all about - separate disconnected silos of data. You know, like the ones that were all the rage in the 60s and that drove people to invent RDBMS. The good thing is, if your application is successful and is still in use a couple of years from now, you get to either spend all your time fire-fighting or re-implementing integrity constraints.

> The problem you mention is that the message.user foreign key then goes out the window. One approach would be to have a local users table in the messaging DB which maps internal user IDs to some kind of external user identifier. That could be their email address (in the simplest case), a URL that identifies the user at your authentication app (e.g. http://auth.your-app.com/joe90) or perhaps a GUID.

Beside the point. There's nothing stopping you running separate services talking to one big database. Nothing stopping you replicating the user table to various slaves and having separate message tables on each. That's not sharding though - sharding (if it means anything) means having *disconnected* databases. Which works fine for Google, since they effectively need only one table for web searching.

> Either way, you're explicitly decoupling the internal requirement that a message must have a valid user id (i.e. a record in the users table) from the assumption that you actually know anything valuable about that user that doesn't relate explicitly to the message board. You can then maintain referential integrity in your message DB regardless of what happens in the user DB.
> It would be possible to delete a user from the user system, for example, without having any *internal* effect on the message board DB (no more cascading delete woes).

Then you're not maintaining referential integrity. There's no point in having a user id that doesn't *mean* anything. Primary keys, foreign keys and all the other bits and pieces of RI in a SQL database are there to maintain the *meaning* of your data.

> I appreciate that decoupling is a fancy way of saying broken, but I'm beginning to see it as a feature rather than a liability in this context.

I can only think of two contexts:
1. Don't care.
2. Last resort - the project is pushing the boundaries of what is possible.
Now, Google were lucky in that they scored on both.

> I should point out that I haven't implemented anything like this yet so I could be way off course. But I'm about to implement something like it for a $work project so any pointers on this would be welcome if I'm sailing in the wrong direction.

It's not uncommon to find a project with one very large table (sensor readings, apache log lines, messages, text from documents). It's a lot less common to have several very large tables. If you're in the first camp then you can replicate all the (comparatively) small tables and partition the single large one. That lets you maintain the view of a single database from the outside.

Before that though, make sure you do have a problem. Pick the right tool for the job - if high concurrency, complex queries or procedural code for constraints is a requirement, then it's probably not MySQL. Always consider what an extra couple of grand on hardware will gain you.

The article you linked to and the one Nigel Hamilton mentions further down the thread are both being clever. In neither case (from the small amount of detail available) does it look justified.

-- 
Richard Huxton
Archonet Ltd
Re: Sharding, and all that
On Fri, Dec 19, 2008 at 9:52 AM, Richard Huxton d...@archonet.com wrote:
> Yep - that's what sharding is all about - separate disconnected silos of data. You know, like the ones that were all the rage in the 60s that drove people to invent RDBMS. The good thing is, if your application is successful and is still in use a couple of years from now you get to either spend all your time fire-fighting or re-implementing integrity constraints.
>
> Then you're not maintaining referential integrity. There's no point in having a user-id that doesn't *mean* anything. Primary keys, foreign keys and all the other bits and pieces of RI in a SQL database are there to maintain the *meaning* of your data.

It depends on what you're using a database for. I had a long, drawn-out discussion with my boss a couple of weeks ago regarding the purpose of a database. I argued it was there to store data reliably (e.g. maintaining integrity, cascading deletes, or ensuring you can't delete something being referenced in another relation); he argued it was to improve the performance of the application using it (I'm talking in the context of web apps, since that's what pays the bills).

At my previous job we used a MySQL database. I chose the InnoDB engine because I wanted to make sure the site data was going to stay as unbroken as MySQL can enforce. The important thing was actual foreign key checks (MyISAM will not enforce foreign keys). The site workflow naturally suited this, since it was updated overnight into these tables with real referential integrity. After the synchronisation, you can then regenerate cache tables, which don't have to be perfect - they're there to be fast. On the other hand, it's more work to generate cache tables.

Really it depends what you want. I fall strongly on the side of good data, and then putting other measures in place if it's not fast enough. Other people are purely concerned with performance, just wanting to use database indexes to make things fast and easy.
As a result you're often implementing data security in your code, which I don't feel is the right place for it.

Cheers,
--James
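The "regenerate cache tables after the overnight sync" pattern James describes can be sketched like this (SQLite stands in for MySQL so the example is self-contained; all table and column names are made up): the canonical tables carry the integrity constraints, while the cache table is derived, disposable, and rebuilt wholesale.

```python
import sqlite3

# Canonical tables hold the real data with real constraints; the cache
# table is a denormalised copy that only needs to be fast, because it can
# always be thrown away and rebuilt from the canonical tables.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (id INTEGER PRIMARY KEY,
                        user_id INTEGER NOT NULL REFERENCES users(id));
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1), (2, 1), (3, 2);
""")

def rebuild_cache(db):
    # Drop the old cache and regenerate it in full from canonical data.
    db.executescript("""
        DROP TABLE IF EXISTS user_post_counts;
        CREATE TABLE user_post_counts AS
            SELECT u.id AS user_id, u.name, COUNT(p.id) AS posts
            FROM users u LEFT JOIN posts p ON p.user_id = u.id
            GROUP BY u.id, u.name;
    """)

rebuild_cache(db)
print(db.execute(
    "SELECT name, posts FROM user_post_counts ORDER BY name").fetchall())
# → [('alice', 2), ('bob', 1)]
```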
Re: Sharding, and all that
Richard Huxton wrote:
> Yep - that's what sharding is all about - separate disconnected silos of data.

I thought sharding specifically related to horizontal partitioning, i.e. splitting one table across several databases, e.g. records with even row ids in one DB, odd in another. Apologies if my terminology is wrong. I was thinking more specifically about vertical partitioning along functional boundaries, which wouldn't be sharding by my (possibly incorrect) definition. Apologies for being off-topic, too :-)

> Then you're not maintaining referential integrity. There's no point in having a user-id that doesn't *mean* anything.

I'm not suggesting that the user id doesn't mean anything. It means exactly the same thing as it does in any other database - a unique identifier for a user record. I'm saying that the user record in the message DB doesn't need to store email address, username, password, favourite colour, challenge question/response, and all the other cruft that goes with user administration. All it needs is a unique id, an authentication realm (identifying the authentication service) and an authentication token (identifying the user at that service). And perhaps any message-board-specific niceties such as your local nick and display preferences. Similarly, in the authentication DB there's no need for the users table to store information relating to message board preferences.

I can see that trivially splitting one user table into two leads to all sorts of integrity problems. But I'm thinking of them as two separate user databases from the outset and accepting that they're potentially disjoint. The best (but poor) example I can think of right now is how my gravatar popped up automatically when I signed up at github. Not because the github user database is referentially at one with the gravatar database, but because I used the same public identifier (my email address) on both systems.
So it could be argued that there is *a* point in having a user id (such as an email address) that doesn't *mean* anything to the *current* database, because it might have meaning to *other* databases. It's the closest thing we've got to referential integrity in a distributed world.

> Primary keys, foreign keys and all the other bits and pieces of RI in a SQL database are there to maintain the *meaning* of your data.

Sure, I recognise that you lose referential integrity at the boundary between your DB and the outside world. But internally, the DB remains referentially intact. The message board still has its own user records for messages to reference. The fact that the authentication realm/token may at some point in the future become invalid is really no different to the current situation where a user's email address changes and they can no longer log in or get a password reminder/reset.

> Before that though, make sure you do have a problem. Pick the right tool for the job - if high concurrency/complex queries/procedural code for constraints is a requirement then it's probably not MySQL. Always consider what an extra couple of grand on hardware will gain you.

For this particular project we don't really have a database problem that can't be solved with a bit of replication and a whack of the hardware hammer. The majority of the load will be searches reading from a single read-only table. So we can replicate that easily across as many web heads as required for performance, and it gives us simple redundancy in case of machine failure, etc. All the read/write functionality (mostly CMS-like stuff) will happen (initially) on a single server with a master read/write database. It'll be fairly low-load and it's not mission critical (in the sense that having the CMS offline for a few hours is an inconvenience, not a hanging offence).

The impetus was more about turning one large and tangled CMS-like system into several smaller, simpler systems.
90% of the tables fall neatly into one sub-system or another (front-end search and content delivery, back-end content management, image management, accounting and billing, and so on). It's the last few tables (mostly relating to users, unsurprisingly) that sit spread-eagled across the database, dangling their tools in the works.

So I'm really coming at this from the perspective of someone thinking about building distributed apps and wondering how the database is going to work behind the scenes, rather than someone with a database performance problem considering partitioning.

Cheers
A
Re: Sharding, and all that
Andy Wardley wrote:
> Richard Huxton wrote:
>> Yep - that's what sharding is all about - separate disconnected silos of data.
>
> I thought sharding specifically related to horizontal partitioning. i.e. splitting one table across several databases, e.g. records with even row ids in one DB, odd in another. Apologies if my terminology is wrong. I was thinking more specifically about vertical partitioning along the functional boundaries which wouldn't be sharding by my (possibly incorrect) definition. Apologies for being off-topic, too :-)

If sharding means anything at all, then it has to be something other than partitioning or partial replication, otherwise we could just say "partitioning" or "partial replication". Of course it's entirely possible it *doesn't* mean anything at all, and is just Partitioning 2.0.

>> Then you're not maintaining referential integrity. There's no point in having a user-id that doesn't *mean* anything.
>
> I'm not suggesting that the user id doesn't mean anything. It means exactly the same thing as it does in any other database - a unique identifier for a user record. I'm saying that the user record in the message DB doesn't need to store email address, username, password, favourite colour, challenge question/response, and all the other cruft that goes with user administration. All it needs is a unique id, an authentication realm (identifying the authentication service) and authentication token (identifying the user at that service). And perhaps any message-board specific niceties such as your local nick and display preferences. Similarly in the authentication DB there's no need to store information in the users table relating to message board preferences.

Having one 300-column user_details table is probably a bad idea anyway. However, if you can't say that uid=123456 on your message is the same as uid=123456 in your user-login table, then I'd say it doesn't mean anything.
You can't even say two messages posted by uid=123456 are from the same user without providing guarantees about the security of your uid transfer code (which is back to providing your own referential integrity).

> I can see that trivially splitting one user table into two leads to all sorts of integrity problems. But I'm thinking of them as two separate user databases from the outset and accepting that they're potentially disjoint. The best (but poor) example I can think of right now is how my gravatar popped up automatically when I signed up at github. Not because the github user database is referentially at one with the gravatar database, but because I used the same public identifier (my email address) on both systems.
>
> So it could be argued that there is *a* point in having a user id (such as email address) that doesn't *mean* anything to the *current* database, because it might have meaning to *other* databases. It's the closest thing we've got to referential integrity in a distributed world.

Yep, but that's option #1 - don't care. If your gravatar didn't work, or you got the wrong one, well, whatever. Option #2 would be something like OpenID, where you can't (and don't want to) have a single large database accessed by everyone. So you have to build in trust mechanisms. Or GPG key-signing, say.

>> Primary keys, foreign keys and all the other bits and pieces of RI in a SQL database are there to maintain the *meaning* of your data.
>
> Sure, I recognise the fact that you lose referential integrity at the boundary between your db and the outside world. But internally, the DB remains referentially intact. The message board still has its own user records for messages to reference. The fact that the authentication realm/token may at some point in the future become invalid is really no different to the current situation where a user's email address changes and they can no longer login or get a password reminder/reset.

No, there's a vital difference.
If I want to, I can track every change a user makes to their account, so long as I can trust the uid. Barring errors in the RDBMS code, it's a relatively simple task for me to guarantee that. If the police come knocking because someone's posted threatening messages or whatever, then once they've got their warrant they can have their data. Of course, maybe it doesn't matter - but that puts you back in #1, don't care.

> Before that though, make sure you do have a problem. Pick the right tool for the job - if high concurrency/complex queries/procedural code for constraints is a requirement then it's probably not MySQL. Always consider what an extra couple of grand on hardware will gain you.
>
> For this particular project we don't really have a database problem that can't be solved with a bit of replication and a whack of the hardware hammer. The majority of the load will be searches that we're reading from a single read-only table. So we can replicate that easily across as many web heads as required for performance and it gives
Re: Sharding, and all that
On Fri, Dec 19, 2008 at 1:43 PM, Richard Huxton d...@archonet.com wrote:
> Andy Wardley wrote:
>> I thought sharding specifically related to horizontal partitioning. i.e. splitting one table across several databases, e.g. records with even row ids in one DB, odd in another. Apologies if my terminology is wrong. I was thinking more specifically about vertical partitioning along the functional boundaries which wouldn't be sharding by my (possibly incorrect) definition. Apologies for being off-topic, too :-)
>
> If sharding means anything at all, then it has to be something other than partitioning or partial replication, otherwise we could just say "partitioning" or "partial replication". Of course it's entirely possible it *doesn't* mean anything at all, and is just Partitioning 2.0.

Sharding is horizontal partitioning: splitting your data by a primary key, for example having a few hundred million email boxes spread over thousands of machines, keyed by email address. Having your user accounts data in a separate database from, say, those users' preferences is a form of vertical partitioning. In a big system, both are very likely to be sharded by some kind of internal key.

P
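The key-based split described above can be sketched in a few lines (a hypothetical mailbox lookup; the hash scheme, shard count and DSN naming are all illustrative):

```python
import hashlib

# Horizontal partitioning: one logical table (mailboxes) spread over many
# machines, with the shard chosen from the primary key - here an email
# address. The shard count and host naming are made up for illustration.
N_SHARDS = 1024

def shard_for(email: str) -> int:
    # A stable hash (not Python's per-process randomised hash()) so every
    # client agrees on the key-to-shard mapping.
    digest = hashlib.md5(email.lower().encode()).hexdigest()
    return int(digest, 16) % N_SHARDS

def dsn_for(email: str) -> str:
    return f"mysql://mailbox-db-{shard_for(email):04d}/mailboxes"

print(dsn_for("joe90@example.com"))
```

The catch, as the rest of the thread points out, is that once the rows live on different machines the database can no longer enforce cross-shard constraints for you.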
Sharding, and all that
So, it's nearly the new year and the time for new projects. On my project radar for next year is considering (better) sharding of our MySQL database.

Now, I understand the basic principles behind this and can, if needs be, implement my own wheel, but before I do I'd love to find out what everyone else is doing. Are there any good reference sites out there - MySQL sharding success stories, if you will - and how did they do it? I'm pretty sure that everyone does it slightly differently, but the more things I can look at before considering my own implementation, the better.

Hot technologies (by which I mean what blogs and Google have been throwing at me recently) include:
- Spock Proxy
- MySQL Clustering
- Implementing it at the app level

What's the collective group think on these? Obviously (for this list, at least) the app that will be accessing the database is Perl based, so I'm only considering compatible technologies at the moment.

Mark.
Re: Sharding, and all that
2008/12/18 Mark Fowler m...@twoshortplanks.com:
> It's nearly the new year and the time for new projects. On my project radar for next year is consider (better) sharding our MySQL database.

I came across this and noted it for future reference: http://spockproxy.sourceforge.net/ - doh, just read the rest of your email - you've already found it.

http://www.mysqlperformanceblog.com/ - they seem to know their stuff, so you might find something in here.

Leo
Re: Sharding, and all that
Mark Fowler wrote:
> What's the collective group think on these?

There's a good series of articles on sharding starting here: http://lifescaler.com/2008/04/database-sharding-unraveled-part-i/

The conclusion I drew from it was that functional partitioning (where possible) was much easier to implement than horizontal partitioning.

A
Re: Sharding, and all that
On 18 Dec 2008, at 10:19, Mark Fowler wrote:
> So, It's nearly the new year and the time for new projects. On my project radar for next year is consider (better) sharding our MySQL database.

There's really good stuff in the Sybase Unleashed book on horizontal and vertical partitioning. And as people have said, you can partition by kind of data. "Sharding" is a horrible term.

-- 
Dave Hodgkinson
MSN: daveh...@hotmail.com
Site: http://www.davehodgkinson.com
UK: +44 7768 490620
Blog: http://davehodg.blogspot.com
Photos: http://www.flickr.com/photos/davehodg
Re: Sharding, and all that
2008/12/18 Dave Hodgkinson daveh...@gmail.com:
> sharding is a horrible term.

But the computing 2.0 neophytes have to have obscure neologisms - it would be an affront to their freedom of thought to use a term that everyone else understands.
Re: Sharding, and all that
Mark Fowler wrote:
> consider (better) sharding our MySQL database.

Is that when you throw it against the wall really, really hard?
Re: Sharding, and all that
Andy Wardley wrote:
> Mark Fowler wrote:
>> What's the collective group think on these?
>
> There's a good series of articles on sharding starting here: http://lifescaler.com/2008/04/database-sharding-unraveled-part-i/
> The conclusion I drew from it was that functional partitioning (where possible) was much easier to implement than horizontal partitioning.

Hmm - skimming these articles, I'm not hugely impressed. The chap(ess?) behind them is clearly a developer rather than a DBA.

From http://lifescaler.com/2008/06/database-sharding-unraveled-part-iii/:
> I'll give you a practical example. It involves a forum with about 150k registered users, 600k posts and about 10k unique visitors/day (peaks reaching 50k unique visitors/day). The DB server has 8 dual-core XEON processors and 8GB of RAM. The whole DB is about 1GB in size and the server is MySQL. When the number of simultaneous DB connections reaches a critical level (it's variable), MySQL will freeze. That's not all, as the sessions and posts tables will often get corrupted. The first solution would be to put the tables in memory, so as not to stress the HDD. This actually turns out to be the worst idea possible.

So what to do? Well, how about (1) connection pooling and (2) running memcached using some of the 6GB of RAM you have free? Even if that only reduces your queries by 90% (and you'd expect better than that for a forum), you're not going to be seeing more than 16 simultaneous queries - which is the number of cores you have. While we're on the topic, you've probably got too many cores and not enough disks (and I'm basing that solely on the fact that a DB server has been described *without* mentioning disks at all). Oh, and if you're routinely finding your tables are getting corrupted, either (a) replace your hardware or (b) replace your database.

> Split the database into smaller, specialized db's. Use a DB for users, one for messaging functionalities, one for product orders, etc.
> Each of these databases must exist independently, that is, the splitting must be made so that I could, for example, create a new functionality (implicitly a new database), let's say a discussion board, and just take the users DB as a whole and use it, without affecting anything else.

Brilliant! So now your messages don't necessarily have a valid user associated with them. That's OK though, because you're checking that in your application, aren't you? Everywhere? Including any admin scripts? And manual interactions with the database? And can you prove that you've not missed any checking?

This sort of stuff is very easy to get wrong. It's very easy to make the sort of mistake that isn't noticed for months, by which point it's impossible to correct the corrupted data. Now, I know the likes of eBay and Skype do it, but you can bet they wouldn't unless they had to. These articles seem to put it on a par with trying out a new theme in Firefox.

-- 
Richard Huxton
Archonet Ltd
Re: Sharding, and all that
2008/12/18 Martin A. Brooks mar...@hinterlands.org:
> Mark Fowler wrote:
>> consider (better) sharding our MySQL database.
>
> Is that when you throw it against the wall really, really, hard?

Or make the DBAs roll naked in a bathtub of broken glass until they sort the performance problems out.
Re: Sharding, and all that
Jonathan Stowe wrote:
> 2008/12/18 Martin A. Brooks mar...@hinterlands.org:
>> Mark Fowler wrote:
>>> consider (better) sharding our MySQL database.
>>
>> Is that when you throw it against the wall really, really, hard?
>
> Or make the DBAs roll naked in a bathtub of broken glass until they sort the performance problems out.

You're already making them admin MySQL - rolling in broken glass is what they do to relax.

-- 
Richard Huxton
Archonet Ltd
Re: Sharding, and all that
On Thu, Dec 18, 2008 at 12:49 PM, Jonathan Stowe j...@integration-house.com wrote:
> 2008/12/18 Martin A. Brooks mar...@hinterlands.org:
>> Mark Fowler wrote:
>>> consider (better) sharding our MySQL database.
>>
>> Is that when you throw it against the wall really, really, hard?
>
> Or make the DBAs roll naked in a bathtub of broken glass until they sort the performance problems out.

I thought that after they've sharded it in a particularly awful manner, it just feels like rolling in a bathtub of broken glass, as per Richard Huxton's response...

--James
Re: Sharding, and all that
Mark Fowler wrote:
> So, It's nearly the new year and the time for new projects. On my project radar for next year is consider (better) sharding our MySQL database.

Standard consultant question: why? "Go faster/scale up" is not a good reason. If you just wanted to go faster you would have moved to Pg or similar :-)

There is no magic solution - the DB, the application and the usage should be reviewed to see if some minor tweaks can make major improvements. Picking a solution without first knowing what the existing problems and bottlenecks are smells of "I wanna play" :-)

Jacqui

p.s. I have wanted to play with a Pg SSI over an OpenMosix image for ages but have yet to make the time. The original question sounds similar to my reasons for wanting to play with Pg over OpenMosix (because it sounds like fun and *may* help scale up an app) - I had a good excuse a year or two ago, but today, with 16+ CPU servers, Hadoop etc., there are better things to do with my time.
Re: Sharding, and all that
On Thu, Dec 18, 2008 at 10:19:06AM +, Mark Fowler said:
> What's the collective group think on these? Obviously (for this list, at least) the app that will be accessing the database is Perl based, so I'm only considering compatible technologies at the moment.

We use Data::ObjectDriver, which has built-in support for sharding and memcache (somewhat unsurprisingly), plus MySQL with replication. The main advantages are that it's conceptually simple, resilient, flexible (especially with the ability to move entities between shards) and cheap to scale.

The big question is whether to use a hashing scheme for shard selection (keywords: consistent hashing, libketama, DHTs) or whether to use a master lookup device (which is what I believe Flickr uses). As always, YMMV.
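The hashing-scheme option can be sketched as a minimal consistent-hash ring, in the spirit of libketama but hugely simplified (real implementations use many more virtual points per server, plus weighting; shard names here are illustrative). The point of the ring, versus a plain `hash % n`, is that removing a shard only remaps the keys that lived on it.

```python
import bisect
import hashlib

# Minimal consistent-hash ring. Each shard contributes many virtual
# points on the ring; a key maps to the first point clockwise from its
# own hash. Removing a shard removes only that shard's points, so keys
# on the surviving shards keep their mapping.
class Ring:
    def __init__(self, shards, points=64):
        self.ring = sorted(
            (self._hash(f"{shard}-{i}"), shard)
            for shard in shards for i in range(points)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def shard_for(self, key):
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        i = bisect.bisect(self.keys, self._hash(key)) % len(self.keys)
        return self.ring[i][1]

ring = Ring(["db1", "db2", "db3"])
print(ring.shard_for("user:42"))
```

The master-lookup alternative trades this arithmetic for a small table mapping each entity to its shard, which makes moving entities between shards trivial at the cost of an extra lookup.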
Re: Sharding, and all that
Hi Mark,

> So, It's nearly the new year and the time for new projects. On my project radar for next year is consider (better) sharding our MySQL database.

I think it depends on your write/read ratio. If it is read heavy, then replication and/or memcache will probably do the trick. On the other hand, if you expect more writes than reads, then I saw an interesting technique for handling excess DB writes by using thousands of temporary MySQL MyISAM tables [1]. Writing to MyISAM files is quick, but the downside is locking - MyISAM locks the whole table on write - so by using _lots_ of temporary MyISAM tables you can improve overall write capacity.

Nige

[1] These guys used 20,000+ MyISAM tables to scale writes: http://en.oreilly.com/mysql2008/public/schedule/detail/592
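The routing half of that trick is just deriving a table name from the key, so that MyISAM's table-level lock is only contended by writers hashing to the same bucket. A hypothetical sketch (the table count echoes the talk linked above, but the naming and SQL shape are made up; real code would use placeholders via DBI rather than building strings):

```python
import zlib

# Spread writes across many tables: each write locks only its own small
# bucket table instead of one giant table. Table count and naming are
# illustrative, not taken from the linked talk.
N_TABLES = 20000

def table_for(key: str) -> str:
    bucket = zlib.crc32(key.encode()) % N_TABLES
    return f"writes_{bucket:05d}"

def insert_sql(key: str) -> str:
    # Shows which table the row lands in; values would be bound as
    # placeholders in real code, not interpolated.
    return f"INSERT INTO {table_for(key)} (k, payload) VALUES (?, ?)"

print(table_for("user:42"))
```

A reader then has to know the same hashing rule to find the row again, which is the usual price of this sort of scheme.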