Hello Wes,
On 4/2/2012 4:05 PM, Wes Modes wrote:
Thanks, Shawn and Karen, for the suggestions, even given my vague
requirements. To clarify some of them:
*Application: *We are using an open-source application called Omeka,
which is a "free, flexible, and open source web-publishing platform for
the display of library, museum, archives, and scholarly collections and
exhibitions." Without getting into how free (or scalable) free software
really is, we can view it as one aspect we cannot change, having been
written into the grant requirements we received for the project.
Experienced Omeka developers and our own developer have suggested
that /it is not feasible to separate database writes from reads in the
application/ (given time and resources).
That's a shame. Sounds like you are back to one big server or several
smaller servers with in-program sharding.
*SAN: *The SAN is a Dell EqualLogic 6100 which has redundant everything,
including multiple NICs, controllers, and power. So we are less
concerned about the SAN being a SPoF. On the other hand, if we have a
single big MySQL server that fails, we could bring up another copy of it
via VMware, but until that server came up, the application would be dead
in the water. If the database is corrupted, service will be interrupted
for a considerable time.
Again, each MySQL instance needs its own copy of the data. Having only
one big powerful disk system means that every instance you fire up must
share both spindles and network bandwidth to reach its data. Just like a
freeway at rush hour, you may find the traffic into and out of this one
device slowing to a crawl exactly when you can least afford it.
*High Availability:* It sounds like there is some debate over how to
provide HA best, but do people really disagree on the desired results?
Without getting into the many meanings of this buzz word, here's what we
mean: /We desire to maintain high availability of service, allowing a
small fraction of users to experience outage for only seconds at a
time. We desire to provide this through horizontal scaling, redundancy,
failover planning, and external monitoring./
"Small fraction of users" - this implies data sharding. Multiple MySQL
instances each with enough data to operate independently for one slice
of your most important data and an application smart enough to know
which shard to go to for each slice of data.
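To make that concrete, here is a rough Python sketch of in-program
sharding. The host names, credentials, and the user-id shard key are
all made up for illustration; the point is only that the application,
not the database, decides where each slice of data lives.

    # In-program sharding sketch: the application picks the shard.
    # Hosts, credentials, and the user-id key are hypothetical.
    import mysql.connector

    SHARD_HOSTS = ["db-shard0.example.com",
                   "db-shard1.example.com",
                   "db-shard2.example.com"]

    def shard_for(user_id):
        # Every row whose key maps to the same value lives on the
        # same independent MySQL instance (no shared storage).
        return SHARD_HOSTS[user_id % len(SHARD_HOSTS)]

    def connect_for(user_id):
        return mysql.connector.connect(host=shard_for(user_id),
                                       user="app", password="secret",
                                       database="omeka")

A node failure then affects only the fraction of users whose data lives
on that shard, which is exactly the "small fraction of users" behavior
you describe.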
"For a few seconds at a time" - you do not want a shared disk. Should
the active MySQL die, it's data will be in an inconsistent state. Once
you fire up the passive daemon it will need to perform a recovery
restart. This down time is more than likely not going to take only a few
seconds. The more data you have, the longer the checks will take. An
independent copy maintained by a slave instance, provides a logically
consistent copy of the master's data as it will only replicate complete
transactions.
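This is also where your "external monitoring" requirement fits in. A
rough Python health-check sketch (host and credentials are made up)
that asks a slave whether it is running and how far behind it is:

    # Slave health check sketch: both replication threads must be
    # running and the slave must be reasonably current.
    import mysql.connector

    def slave_is_healthy(host, max_lag_seconds=30):
        conn = mysql.connector.connect(host=host, user="monitor",
                                       password="secret")
        cur = conn.cursor(dictionary=True)
        cur.execute("SHOW SLAVE STATUS")
        status = cur.fetchone()
        conn.close()
        if status is None:
            return False  # not configured as a slave at all
        return (status["Slave_IO_Running"] == "Yes"
                and status["Slave_SQL_Running"] == "Yes"
                and status["Seconds_Behind_Master"] is not None
                and status["Seconds_Behind_Master"] <= max_lag_seconds)

Seconds_Behind_Master is NULL when the SQL thread is stopped, so the
check treats that case as unhealthy.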
"horizontal scaling" - one master, multiple slaves. This requires the
separation of writes and reads.
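If the Omeka team ever relaxes that constraint, the split itself is
small. A minimal Python sketch (hosts and credentials are invented) of
routing writes to the master and reads to a slave pool:

    # Read/write split sketch: one master takes all writes; reads are
    # spread across the slaves at random.
    import random
    import mysql.connector

    MASTER = "db-master.example.com"
    SLAVES = ["db-slave1.example.com", "db-slave2.example.com"]

    def connect(for_write=False):
        host = MASTER if for_write else random.choice(SLAVES)
        return mysql.connector.connect(host=host, user="app",
                                       password="secret",
                                       database="omeka")

Keep in mind that slightly stale reads are possible on the slaves,
because replication is asynchronous.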
*Scalability: *Again, it seems like there are lots of applications and
implementations, but people agree on the general concept. Here's what we
mean for this project: /We desire to scale our services so that a
usage surge does not cause unavailability of the services for some
users. We prefer to horizontally increase scalability using
load-balancing strategies to treat clusters of servers as single logical
units./
"Horizontal scalability" - see above
"clusters of servers" - unless they share a common data set (which
cannot happen via file sharing) they will not be useful as a logical
unit. Multiple slaves all replicating from the same master does match
this requirement.
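Attaching each new slave to the master is a one-time operation per
slave. A sketch, assuming made-up hosts, a replication account, and
binary log coordinates taken from a consistent backup of the master:

    # New-slave setup sketch: point the slave at the master's binary
    # log position recorded when the backup was taken, then start it.
    import mysql.connector

    conn = mysql.connector.connect(host="db-slave2.example.com",
                                   user="root", password="secret")
    cur = conn.cursor()
    cur.execute("CHANGE MASTER TO"
                " MASTER_HOST='db-master.example.com',"
                " MASTER_USER='repl', MASTER_PASSWORD='replpass',"
                " MASTER_LOG_FILE='mysql-bin.000042',"  # from the backup
                " MASTER_LOG_POS=107")
    cur.execute("START SLAVE")
    conn.close()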
The application may not have been designed with great scalability in
mind, but if multiple application instances are accessing multiple
database servers treated as one logical unit, that may not be too
relevant.
"logical unit" - see above
"multiple database servers" - remember! each instance of MySQL requires
its own set of data. No sharing is allowed.
I am responsible for creating an architecture upon which this project
will run. I am not responsible for redesigning the application. So
far, no one has suggested anything that approached meeting our
requirements, even our vague ones. Perhaps I am asking the wrong list?
No, you are asking the correct list. You are just not believing our
answers. The application design is your stumbling block. It is written
to function only with one large database (a SPoF). Until you can fix
that (as painful as it may be), you cannot achieve your availability
goals.
Does anyone have any experience with MySQL Multi-Master Replication?
Perhaps that should be a separate post.
Replication rings are possible, but you must design your application to
take special care to NOT update the same row in multiple nodes of the
ring at the same time. This is even harder to design and code for than
splitting writes and reads between master and slaves.
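One safeguard people do use in a two-node ring is interleaving the
auto-increment sequences so the nodes can never generate the same new
primary key. A sketch with made-up hosts; note this only prevents
insert collisions and does nothing about two nodes updating the same
existing row:

    # Ring safeguard sketch: node 1 generates keys 1,3,5,... and
    # node 2 generates 2,4,6,... so inserts cannot collide.
    import mysql.connector

    for host, offset in [("db-node1.example.com", 1),
                         ("db-node2.example.com", 2)]:
        conn = mysql.connector.connect(host=host, user="root",
                                       password="secret")
        cur = conn.cursor()
        cur.execute("SET GLOBAL auto_increment_increment = 2")
        cur.execute("SET GLOBAL auto_increment_offset = %d" % offset)
        conn.close()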
Also, the loss of one node of a replication ring is not as easy to
recover from as simply promoting one slave to become the new master of a
replication tree (demoting the recovered former master to become yet
another slave). There may be pending events in the relay logs of the
lost node that have not yet been relayed to its downstream node.
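The promotion itself is a well-documented procedure ("Switching Masters
During Failover" in the manual). A rough Python sketch, with made-up
host names and credentials:

    # Promotion sketch: drain the candidate's relay log, make it a
    # master, then repoint the surviving slaves at it.
    import time
    import mysql.connector

    NEW_MASTER = "db-slave1.example.com"
    OTHER_SLAVES = ["db-slave2.example.com"]

    def connect(host):
        return mysql.connector.connect(host=host, user="root",
                                       password="secret")

    conn = connect(NEW_MASTER)
    cur = conn.cursor(dictionary=True)
    cur.execute("STOP SLAVE IO_THREAD")
    while True:
        cur.execute("SHOW SLAVE STATUS")
        s = cur.fetchone()
        # Done when the SQL thread has executed everything the IO
        # thread ever read (same file AND same position).
        if (s["Master_Log_File"] == s["Relay_Master_Log_File"] and
                s["Read_Master_Log_Pos"] == s["Exec_Master_Log_Pos"]):
            break
        time.sleep(1)
    cur.execute("STOP SLAVE")
    cur.execute("RESET MASTER")  # fresh binary log for the new tree
    conn.close()

    for host in OTHER_SLAVES:
        conn = connect(host)
        cur = conn.cursor()
        cur.execute("STOP SLAVE")
        cur.execute("CHANGE MASTER TO MASTER_HOST='%s',"
                    " MASTER_USER='repl', MASTER_PASSWORD='replpass'"
                    % NEW_MASTER)
        cur.execute("START SLAVE")
        conn.close()

The recovered former master can then be rebuilt from a backup of the
new master and attached as just another slave.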
I may not have every answer, but I have seen nearly every kind of
failure. Everyone else is encouraged to add their views to the discussion.
--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN