To: Naz Gassiep
Cc: mysql@lists.mysql.com
Subject: Re: Integrity on large sites
Naz,
*Really* big sites don't ever have referential integrity. Or if the few spots
they do (like with financial transactions) it's implemented on the application
level (via, say, optimistic locking), never
B. Keith Murphy wrote:
Here is the kicker. Each box was a top of the line Sun server that
had 32 processors and 32 gigs of RAM. They could handle up to 64
procs and 64 gigs. And each cost well over a million dollars for
the hardware alone. Running Oracle on it must have cost over 100,000
You youngsters may not realize that there were billing applications
serving millions of customers long, long before there were any kind of
database management systems. They employed concepts called flat
files and batch processing. And they ran on machines far weaker
than anything any of
Hey there, thanks for your comments. There are issues where sharding may
be appropriate, but you are talking about the heaviest of heavy duty
loads. Not only that, hardware is getting to the point where it is
surpassing our needs. Remember the days when it cost $200k to run a
library database?
It is my contention that as the clustering capabilities of MySQL
continue to grow and mature (think of when version 6.0 goes stable)
companies will move to MySQL in droves. THEN you have the ability to
build a single virtual database (at least from the point of view of
your application)
Hi Naz,
Just to throw out (plug) an ongoing project:
http://www.hivedb.org/
From the site:
HiveDB is an open source framework for horizontally partitioning MySQL
systems. Building scalable and high performance MySQL-backed systems
requires a good deal of expertise in designing the system
I'm working in a project at the moment that is using MySQL, and people keep
making assertions like this one:
*Really* big sites don't ever have referential integrity. Or if the few spots
they do (like with financial transactions) it's implemented on the application
level (via, say, optimistic
Naz,
*Really* big sites don't ever have referential integrity. Or if the few spots
they do (like with financial transactions) it's implemented on the application
level (via, say, optimistic locking), never the database level.
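For readers unfamiliar with the term, here is a rough sketch of what application-level optimistic locking can look like. It is only an illustration, not anything posted in this thread: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the accounts table, its version column, and the function names are all hypothetical. The idea is that each row carries a version number, and a write only succeeds if the version is still the one the writer originally read.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with one hypothetical account row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, balance INTEGER NOT NULL, "
        "version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts (id, balance, version) VALUES (1, 100, 0)")
    conn.commit()
    return conn


def read_account(conn, account_id):
    """Return (balance, version) as seen by this reader."""
    row = conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    if row is None:
        raise KeyError(account_id)
    return row


def write_balance(conn, account_id, new_balance, expected_version):
    """Write only if nobody else has bumped the version since we read it.

    Returns True when the update was applied and False when the row had
    already changed, in which case the caller should re-read and retry.
    """
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = make_demo_db()
    # Two writers read the same row before either one writes.
    balance_a, version_a = read_account(conn, 1)
    balance_b, version_b = read_account(conn, 1)
    print(write_balance(conn, 1, balance_a - 30, version_a))  # first writer
    print(write_balance(conn, 1, balance_b + 50, version_b))  # stale writer
    print(read_account(conn, 1))
```

The second writer's update matches no rows because the version has moved on, so the application has to notice the failure and retry. No lock is held between the read and the write, and that is why it gets called "optimistic".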
Mebbe that view was common in the MySQL community in the time of
I'm working in a project at the moment that is using MySQL, and people
keep making assertions like this one:
*Really* big sites don't ever have referential integrity. Or if the few
spots they do (like with financial transactions) it's implemented on the
application level (via, say, optimistic
Naz,
Without going into detail about various projects I've seen, suffice it to
say that I have witnessed some true horrors. In defence, however, the
largest abomination I have ever witnessed was from an MS shop that had grown
a database from a MS Access system upward and had then, bluntly
Since the question was about *really* big websites, the answer is both
yes and no.
Yes, they do turn off RI on the database side, simply because it's not
possible to enforce RI on a database system where data is partitioned
across server farms (or shards) both vertically and horizontally. And
Data partitioning? Sorry, I disagree that partitioning a table into more
and more servers is the way to scale properly. Perhaps putting
databases' tables onto different servers with different hardware
designed to meet different usage patterns is a good idea, but data
partitioning was a very short
Sometimes partitioning is absolutely necessary. If you can't run a
cluster - how else can you really scale writes to the database? Some
companies can't use clustering because in 5.0.x (the non-beta release)
clustering is all done in memory - all tables have to be in memory (just
like the old
You certainly have a right to disagree, but pretty much every
scalability talk at the MySQL conference a few weeks ago was focused
on data partitioning and sharding. And those talks were given by folks
working for some of the most popular (top 100) websites in the world.
It certainly looks like
Wow.
The problem I have with sharding is the large amount of code
required in the app to make it work. IMHO the app should be agnostic to
the underlying database system (by that I don't mean the DB in use such
as MySQL or whatever or the schema, I mean the way the DB has been
deployed) so that
OK. Going to try this again. After reading through these emails I
think I have learned a little more about the way you are thinking.
I DO NOT want to start some kind of flame war.
However, I disagree very strongly with what you are saying. Yes, you
are right, sharding does require more