RE: Integrity on large sites

2007-05-25 Thread Rhys Campbell
Naz, *Really* big sites don't ever have referential integrity. Or if the few spots they do (like with financial transactions) it's implemented on the application level (via, say, optimistic locking), never

Re: Integrity on large sites

2007-05-25 Thread Barry Newton
B. Keith Murphy wrote: Here is the kicker. Each box was a top of the line Sun server that had 32 processors and 32 gigs of RAM. They could handle up to 64 procs and 64 gigs. And each cost well over a million dollars for the hardware alone. Running Oracle on it must have cost over 100,000

Re: Integrity on large sites

2007-05-25 Thread Naz Gassiep
You youngsters may not realize that there were billing applications serving millions of customers long, long before there were any kind of database management systems. They employed concepts called flat files and batch processing. And they ran on machines far weaker than anything any of

Re: Integrity on large sites

2007-05-25 Thread Naz Gassiep
Hey there, thanks for your comments. There are issues where sharding may be appropriate, but you are talking about the heaviest of heavy duty loads. Not only that, hardware is getting to the point where it is surpassing our needs. Remember the days when it cost $200k to run a library database?

Re: Integrity on large sites

2007-05-25 Thread Martijn Tonies
It is my contention that as the clustering capabilities of MySQL continue to grow and mature (think of when version 6.0 goes stable) companies will move to MySQL in droves. THEN you have the ability to build a single virtual database (at least from the point of view of your application)

Re: Integrity on large sites

2007-05-25 Thread Jeremy Cole
Hi Naz, Just to throw out (plug) an ongoing project: http://www.hivedb.org/ From the site: HiveDB is an open source framework for horizontally partitioning MySQL systems. Building scalable and high performance MySQL-backed systems requires a good deal of expertise in designing the system

Integrity on large sites

2007-05-24 Thread Naz Gassiep
I'm working in a project at the moment that is using MySQL, and people keep making assertions like this one: *Really* big sites don't ever have referential integrity. Or if the few spots they do (like with financial transactions) it's implemented on the application level (via, say, optimistic

Re: Integrity on large sites

2007-05-24 Thread Peter Brawley
Naz, *Really* big sites don't ever have referential integrity. Or if the few spots they do (like with financial transactions) it's implemented on the application level (via, say, optimistic locking), never the database level. Mebbe that view was common in the MySQL community in the time of

Re: Integrity on large sites

2007-05-24 Thread Martijn Tonies
I'm working in a project at the moment that is using MySQL, and people keep making assertions like this one: *Really* big sites don't ever have referential integrity. Or if the few spots they do (like with financial transactions) it's implemented on the application level (via, say, optimistic

Re: Integrity on large sites

2007-05-24 Thread Philip Mather
Naz, Without going into detail about various projects I've seen, suffice it to say that I have witnessed some true horrors. In defence however, the largest abomination I have ever witnessed was from an MS shop that had grown a database from a MS Access system upward and had then, bluntly

Re: Integrity on large sites

2007-05-24 Thread Evaldas Imbrasas
Since the question was about *really* big websites, the answer is both yes and no. Yes, they do turn off RI on the database side, simply because it's not possible to enforce RI on a database system where data is partitioned across server farms (or shards) both vertically and horizontally. And

Re: Integrity on large sites

2007-05-24 Thread Naz Gassiep
Data partitioning? Sorry, I disagree that partitioning a table into more and more servers is the way to scale properly. Perhaps putting databases' tables onto different servers with different hardware designed to meet different usage patterns is a good idea, but data partitioning was a very short

Re: Integrity on large sites

2007-05-24 Thread B. Keith Murphy
Sometimes partitioning is absolutely necessary. If you can't run a cluster - how else can you really scale writes to the database? Some companies can't use clustering because in 5.0.x (the non-beta release) clustering is all done in memory - all tables have to be in memory (just like the old

Re: Integrity on large sites

2007-05-24 Thread Evaldas Imbrasas
You certainly have a right to disagree, but pretty much every scalability talk at the MySQL conference a few weeks ago was focused on data partitioning and sharding. And those talks were given by folks working for some of the most popular (top 100) websites in the world. It certainly looks like

Re: Integrity on large sites

2007-05-24 Thread Naz Gassiep
Wow. The problem with sharding I have is the large amount of code required in the app to make it work. IMHO the app should be agnostic to the underlying database system (by that I don't mean the DB in use such as MySQL or whatever or the schema, I mean the way the DB has been deployed) so that

Re: Integrity on large sites

2007-05-24 Thread B. Keith Murphy
OK. Going to try this again. After reading through these emails I think I have learned a little more about the way you are thinking. I DO NOT want to start some kind of flame war. However, I disagree very strongly with what you are saying. Yes, you are right, sharding does require more