Re: [GENERAL] postgre vs MySQL

paul rivers Wed, 12 Mar 2008 13:03:32 -0700

Alvaro Herrera wrote:

Ivan Sergio Borgonovo wrote:

On Wed, 12 Mar 2008 09:13:14 -0700
paul rivers <[EMAIL PROTECTED]> wrote:

For a database of InnoDB tables, people tend to replicate the
database, and then backup the slave (unless the db is trivially

That reminded me of the *unsupported* feeling I have that it is easier
to set up an HA replication solution on MySQL.


Well, if you have a crappy system that cannot sustain concurrent load or
even be backed up concurrently with regular operation, one solution is
to write a kick-ass replication system.

The other solution is to enhance the ability of the system to deal with
concurrent operation.

We keep hearing how great all those Web 2.0 sites are; Slashdot, Flickr,
etc; and they all run on farms and farms of MySQL servers, "because
MySQL replication is so good".  I wonder if replication is an actual
_need_ or it's there just because the other aspects of the system are so
crappy

"Kick-ass" imho really means "really simple to set up" and included as part of the standard db.

There are all kinds of corner cases that can bite you with MySQL replication. Offhand, I wager most of these (at least in InnoDB) result from the fact that the replication "commit" status of a transaction is in the binlogs, which is not the same as the InnoDB database commit status in the .ibd files. Writing out binlog entries happens at a higher level than the storage engine, and so it's not hard to imagine what can go wrong there. There are a few my.cnf settings that let you really roll the dice with data integrity based on this dichotomy, if you so choose.
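
As a rough illustration of that dichotomy, here is a toy Python model with two separate records of "committed": one list standing in for the binlog and one for the storage engine. Everything in it (the ToyServer class, the crash_after_binlog flag, the order of the two writes) is an assumption made for the sketch and not a description of how MySQL actually sequences these steps.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical stand-in for a master with two separate commit records."""
    binlog: list = field(default_factory=list)  # what replicas would replay
    engine: list = field(default_factory=list)  # what the master's engine holds

    def commit(self, txn, crash_after_binlog=False):
        # Assumed ordering for this sketch: record for replication first.
        self.binlog.append(txn)
        if crash_after_binlog:
            # Simulated crash: the engine never records the change.
            return False
        self.engine.append(txn)
        return True


def divergence(server):
    """Transactions present in the binlog but missing from the engine."""
    return [t for t in server.binlog if t not in server.engine]


server = ToyServer()
server.commit("INSERT 1")
server.commit("INSERT 2", crash_after_binlog=True)
print(divergence(server))  # what a replica would apply that the master lost
```

The only point of the sketch is that two independently written records can disagree after a crash, so master and replicas can end up with different data.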

In those high volume shops, imho replication is a requirement, but in part to overcome technical limitations of MySQL. Or to phrase it from a MySQL point of view, to do it the MySQL way. If you have 50-ish minutes, this video by the YouTube people talks about their evolution with MySQL (among many other things):


http://video.google.com/videoplay?docid=-6304964351441328559

The summary from the video is:

- Start with a MySQL instance using InnoDB
- Go to 1-M replication, and use the replicas as read-only copies.
- Eventually the cost of replication outweighs the gains, so go to database sharding
- Keep 1-M replication within a shard group to allow easy backups of a slave, some read-only use of the slaves, and a new master in case of master failure (i.e. high availability)
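
The end state in that summary (shards, each with its own 1-M replication group) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the ShardGroup class, the host names, the crc32-modulo shard choice and the "first slave becomes master" promotion rule are all made up here and are not taken from the video.

```python
import zlib
from dataclasses import dataclass
from itertools import cycle


@dataclass
class ShardGroup:
    """One shard: a master plus its read-only slaves (hypothetical names)."""
    master: str
    replicas: list

    def __post_init__(self):
        self._round_robin = cycle(self.replicas)

    def host_for(self, write=False):
        # Writes always go to the master; reads rotate over the slaves,
        # falling back to the master if no slave is left.
        if write or not self.replicas:
            return self.master
        return next(self._round_robin)

    def promote(self):
        # On master failure, the first slave becomes the new master.
        if not self.replicas:
            raise RuntimeError("no slave available to promote")
        self.master = self.replicas.pop(0)
        self._round_robin = cycle(self.replicas)


def shard_for(key, groups):
    # crc32 gives the same answer in every process, unlike hash() on str.
    return groups[zlib.crc32(str(key).encode()) % len(groups)]


groups = [
    ShardGroup("db0-master", ["db0-slave1", "db0-slave2"]),
    ShardGroup("db1-master", ["db1-slave1"]),
]
group = shard_for("user:42", groups)
print(group.host_for(write=True), group.host_for())
```

A plain modulo over the number of shard groups means adding a group moves most keys, which is one reason real deployments tend to use a lookup table or a more careful hashing scheme instead.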

Almost everyone finds MyISAM unworkable in large scale environments because of the repairs necessary post-crash.



The big complaints from high-volume MySQL shops often, imho, come back to:

- You can only have so many active threads in the InnoDB storage engine module at a time. See e.g.:


http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#option_mysqld_innodb_thread_concurrency

- Auto_increment columns as pkeys in InnoDB tables are practically required, yet severely limit scalability because of how a transaction locks the structure to get the next auto-increment value (significantly improved in 5.1)

- Shutting down a MySQL engine can take forever, due partly to dirty page writes, partly to insert buffer merging. See:


http://dev.mysql.com/doc/refman/5.1/en/innodb-insert-buffering.html

There are other complaints you'd expect people to have, but don't seem to get talked about much, because people are so used to (from my point of view) working around them. For example, statistics on an InnoDB table are calculated when the table is first accessed, but not stored anywhere, so there are extra costs on database startup. The backup issue with InnoDB has already been covered. Tablespace management in InnoDB seems exceptionally primitive, and is helped somewhat by the tablespace-per-table option. There are many more, again imho.


Paul







--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
