Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-04 Thread Wes Modes
Thanks, Ian.

W.

On 4/4/2012 4:02 AM, Ian wrote:
> On 04/04/2012 01:11, Wes Modes wrote:
>> On 4/3/2012 3:04 AM, Ian wrote:
>>> On 03/04/2012 00:47, Wes Modes wrote:
 Thanks again for sharing your knowledge.  I do believe the answers I've been
 receiving, but since I have requirements that I cannot easily alter, I'm
 also gently pushing my expert advisers here to look beyond their own
 preferences and direct experience.

 RE: Shared storage.  I can easily let go of the preference to take
 advantage of shared storage.  I understand duplicated databases are the
 essence of database redundancy.  You make good points.

 In terms of the acceptability of a small fraction of users being
 temporarily unable to access services:  rather than sharding, which
 again requires more control over the application than we have, I was
 more envisioning that would be the fraction of users who hit the one
 peer MySQL server that is temporarily unavailable due to h/w or s/w
 failure or DB corruption while its failover is powered up.

 Does MySQL Cluster seem like it will address my requirements to allow us
 to horizontally scale a number of MySQL nodes as peers without
 separating reads and writes, or slaves and masters?

 Wes
>>> Hi Wes,
>>>
>>> If you can't alter the application to split reads and writes, why not
>>> let MySQL Proxy do it for you?
>>>
>>> http://forge.mysql.com/wiki/MySQL_Proxy
>>>
>>> Combine this with haproxy and you could build a multi-master environment
>>> with each master having any number of slaves.  Set MySQL Proxy to send
>>> writes to the masters and reads to the slaves.
>>>
>>> Regards
>>>
>>> Ian
>> Ian, what is the best place to ask specific questions about mysql-proxy? 
>>
>> In general, here are my questions:
>>
>>  1. Does the proxy sit on a separate server and route all MySQL
>> requests, or is it installed on each of the MySQL nodes and
>> re-shuffle MySQL requests to the appropriate place?
>>
>>  2. Can multiple proxies be run in concert to provide redundancy and
>> scalability as well as eliminate SPoF and bottlenecks?
>>
>>  3. In 2007 when RW Splitting was new, there were a few problems and
>> limitations.  What is the current status of development of this
>> important feature?
>>
>> Thanks!
>>
>> And again I appreciate the brainstorming that many have done here to
>> find a solution that fits my requirements.
>>
>> Wes
> Hi Wes,
>
> I'm afraid I can't help you with your questions as I have never actually
> used MySQL Proxy!  I was aware of its existence so brought it up in case
> it fitted your needs.
>
> I sympathize with your position and hope you find the answers you need.
>
> Regards
>
> Ian

-- 
Wes Modes
Systems Designer, Developer, and Administrator
University Library ITS
University of California, Santa Cruz


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql



Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-03 Thread Wes Modes
On 4/3/2012 3:04 AM, Ian wrote:
> On 03/04/2012 00:47, Wes Modes wrote:
>> Thanks again for sharing your knowledge.  I do believe the answers I've been
>> receiving, but since I have requirements that I cannot easily alter, I'm
>> also gently pushing my expert advisers here to look beyond their own
>> preferences and direct experience.
>>
>> RE: Shared storage.  I can easily let go of the preference to take
>> advantage of shared storage.  I understand duplicated databases are the
>> essence of database redundancy.  You make good points.
>>
>> In terms of the acceptability of a small fraction of users being
>> temporarily unable to access services:  rather than sharding, which
>> again requires more control over the application than we have, I was
>> more envisioning that would be the fraction of users who hit the one
>> peer MySQL server that is temporarily unavailable due to h/w or s/w
>> failure or DB corruption while its failover is powered up.
>>
>> Does MySQL Cluster seem like it will address my requirements to allow us
>> to horizontally scale a number of MySQL nodes as peers without
>> separating reads and writes, or slaves and masters?
>>
>> Wes
> Hi Wes,
>
> If you can't alter the application to split reads and writes, why not
> let MySQL Proxy do it for you?
>
> http://forge.mysql.com/wiki/MySQL_Proxy
>
> Combine this with haproxy and you could build a multi-master environment
> with each master having any number of slaves.  Set MySQL Proxy to send
> writes to the masters and reads to the slaves.
>
> Regards
>
> Ian

Ian, what is the best place to ask specific questions about mysql-proxy? 

In general, here are my questions:

 1. Does the proxy sit on a separate server and route all MySQL
requests, or is it installed on each of the MySQL nodes and
re-shuffle MySQL requests to the appropriate place?

 2. Can multiple proxies be run in concert to provide redundancy and
scalability as well as eliminate SPoF and bottlenecks?

 3. In 2007 when RW Splitting was new, there were a few problems and
limitations.  What is the current status of development of this
important feature?

Thanks!

And again I appreciate the brainstorming that many have done here to
find a solution that fits my requirements.

Wes

-- 
Wes Modes
Systems Designer, Developer, and Administrator
University Library ITS
University of California, Santa Cruz



Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-03 Thread Wes Modes
Am I right in seeing that if you can split reads and writes without the
application having to be replication-aware, one does not need multiple
masters?  One can simply have standard MySQL replication, yes?

For us, the only appeal of multiple masters was being able to read or
write from any of the available MySQL nodes.

W.

On 4/3/2012 3:04 AM, Ian wrote:
> On 03/04/2012 00:47, Wes Modes wrote:
>> Thanks again for sharing your knowledge.  I do believe the answers I've been
>> receiving, but since I have requirements that I cannot easily alter, I'm
>> also gently pushing my expert advisers here to look beyond their own
>> preferences and direct experience.
>>
>> RE: Shared storage.  I can easily let go of the preference to take
>> advantage of shared storage.  I understand duplicated databases are the
>> essence of database redundancy.  You make good points.
>>
>> In terms of the acceptability of a small fraction of users being
>> temporarily unable to access services:  rather than sharding, which
>> again requires more control over the application than we have, I was
>> more envisioning that would be the fraction of users who hit the one
>> peer MySQL server that is temporarily unavailable due to h/w or s/w
>> failure or DB corruption while its failover is powered up.
>>
>> Does MySQL Cluster seem like it will address my requirements to allow us
>> to horizontally scale a number of MySQL nodes as peers without
>> separating reads and writes, or slaves and masters?
>>
>> Wes
> Hi Wes,
>
> If you can't alter the application to split reads and writes, why not
> let MySQL Proxy do it for you?
>
> http://forge.mysql.com/wiki/MySQL_Proxy
>
> Combine this with haproxy and you could build a multi-master environment
> with each master having any number of slaves.  Set MySQL Proxy to send
> writes to the masters and reads to the slaves.
>
> Regards
>
> Ian

-- 
Wes Modes
Systems Designer, Developer, and Administrator
University Library ITS
University of California, Santa Cruz





Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-03 Thread Johan De Meersman
- Original Message -
> From: "Ian" 
> 
> with each master having any number of slaves.  Set MySQL Proxy to
> send writes to the masters and reads to the slaves.

Yes, except that when you have replication delays (asynchronous, remember?), as 
someone else recently posted, your application writes a value and consequently 
can't find it again. Boom goes the software :-)
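[Editor's note: a toy in-memory model of that read-after-write hazard under
asynchronous replication. The Master/LaggedSlave classes are hypothetical
illustrations, not real MySQL components.]

```python
class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []          # ordered log of committed writes

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))

class LaggedSlave:
    """Applies the master's binlog asynchronously, one event at a time."""
    def __init__(self, master):
        self.master = master
        self.data = {}
        self.applied = 0          # how far into the binlog we have replayed

    def catch_up_one(self):
        if self.applied < len(self.master.binlog):
            key, value = self.master.binlog[self.applied]
            self.data[key] = value
            self.applied += 1

    def read(self, key):
        return self.data.get(key)

master = Master()
slave = LaggedSlave(master)

master.write("user:1", "alice")   # the app writes to the master...
stale = slave.read("user:1")      # ...then immediately reads the slave
print(stale)                      # None: the write has not replicated yet

slave.catch_up_one()              # replication applies the event
print(slave.read("user:1"))       # now "alice"
```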

I'm sure Proxy has its uses, but I remain extremely sceptical about both the 
performance and trustworthiness of the read/write splitting scheme as long as 
the application isn't aware of it.

Now, if you could somehow cough up a *context*, say, by having proxy pass all 
requests from a connection that has done a write to the master, that would 
help, but...
 * it would yield less scaling - master must again serve some selects
 * with replication delays, you still risk updating the master based on stale 
data from a not-yet-up-to-date slave
 * in a web context, connections are not permanent; to keep track of a single 
stream you'd need the proxy to be aware of something like a browser session or 
whatnot
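[Editor's note: the "context" idea above can be sketched as a session-sticky
router: once a session has written, pin it to the master. The class and
routing policy below are hypothetical, not a real MySQL Proxy API.]

```python
class StickyRwRouter:
    def __init__(self):
        self.dirty_sessions = set()   # sessions that have performed a write

    def route(self, session_id, sql):
        is_write = not sql.lstrip().lower().startswith("select")
        if is_write:
            self.dirty_sessions.add(session_id)   # pin session to the master
            return "master"
        if session_id in self.dirty_sessions:
            return "master"                       # read-your-writes
        return "slave"

router = StickyRwRouter()
print(router.route("s1", "SELECT * FROM items"))    # slave
print(router.route("s1", "UPDATE items SET x = 1")) # master
print(router.route("s1", "SELECT * FROM items"))    # master (pinned)
print(router.route("s2", "SELECT * FROM items"))    # slave
```

Note the first bullet above still applies: pinned sessions send their reads to
the master, so this trades away some scaling for consistency.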

No, if you *really* want to do horizontal scaling, have a look at SchoonerSQL, 
Clustrix and the like. They're third-party, but they (claim to) have proper 
multimaster replication. I've not actually looked at any of them in depth 
myself - we're in an environment where we have a lot of smaller clusters.


-- 
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel




Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-03 Thread Ian
On 03/04/2012 00:47, Wes Modes wrote:
> Thanks again for sharing your knowledge.  I do believe the answers I've been
> receiving, but since I have requirements that I cannot easily alter, I'm
> also gently pushing my expert advisers here to look beyond their own
> preferences and direct experience.
> 
> RE: Shared storage.  I can easily let go of the preference to take
> advantage of shared storage.  I understand duplicated databases are the
> essence of database redundancy.  You make good points.
> 
> In terms of the acceptability of a small fraction of users being
> temporarily unable to access services:  rather than sharding, which
> again requires more control over the application than we have, I was
> more envisioning that would be the fraction of users who hit the one
> peer MySQL server that is temporarily unavailable due to h/w or s/w
> failure or DB corruption while its failover is powered up.
> 
> Does MySQL Cluster seem like it will address my requirements to allow us
> to horizontally scale a number of MySQL nodes as peers without
> separating reads and writes, or slaves and masters?
> 
> Wes

Hi Wes,

If you can't alter the application to split reads and writes, why not
let MySQL Proxy do it for you?

http://forge.mysql.com/wiki/MySQL_Proxy

Combine this with haproxy and you could build a multi-master environment
with each master having any number of slaves.  Set MySQL Proxy to send
writes to the masters and reads to the slaves.
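[Editor's note: for reference, an invocation of this shape uses the
rw-splitting.lua script bundled with mysql-proxy 0.8; the hostnames and
the script path are placeholders for your environment.]

```
mysql-proxy \
  --proxy-address=:4040 \
  --proxy-backend-addresses=master.example.edu:3306 \
  --proxy-read-only-backend-addresses=slave1.example.edu:3306 \
  --proxy-lua-script=/usr/share/mysql-proxy/rw-splitting.lua
```

The application then connects to port 4040 instead of the database directly.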

Regards

Ian
-- 




> On 4/2/2012 2:25 PM, shawn green wrote:
>> Hello Wes,
>>
>> On 4/2/2012 4:05 PM, Wes Modes wrote:
>>> Thanks Shawn and Karen, for the suggestions, even given my vague
>>> requirements.
>>>
>>> To clarify some of my requirements.
>>>
>>> *Application:  *We are using an open-source application called Omeka,
>>> which is a "free, flexible, and open source web-publishing platform for
>>> the display of library, museum, archives, and scholarly collections and
>>> exhibitions."  Without getting into how free (or scalable) free software
>>> really is, we can view it as one aspect we cannot change, having been
>>> written into the grant requirements we received for the project.
>>> Experienced Omeka developers and our own developer have suggested
>>> that/it is not feasible to separate database writes from reads in the
>>> application/ (given time and resources).
>>>
>>
>> That's a shame. Sounds like you are back to one big server or several
>> smaller servers with in-program sharding.
>>
>>> *SAN: *The SAN is a Dell EqualLogic 6100 which has redundant everything,
>>> including multiple NICs, controllers, and power.  So we are less
>>> concerned about the SAN being a SPoF.  On the other hand, if we have a
>>> single big MySQL server that fails, we could bring up another copy of it
>>> via VMWare, but until the server came up, the application would be dead
>>> in the water.  If the database is corrupted, service will be interrupted
>>> for a considerable time.
>>>
>>
>> Again, each MySQL instance needs its own copy of the data. Having
>> only one big powerful disk system means that each instance you fire up
>> must both share spindles and networking to access its data. Just like
>> a freeway at rush hour, you may find the traffic into and out of this
>> one device crawling to a halt exactly when you don't want it to.
>>
>>> *High Availability:*  It sounds like there is some debate over how to
>>> provide HA best, but do people really disagree on the desired results?
>>> Without getting into the many meanings of this buzz word, here's what we
>>> mean: /We desire to maintain high availability of service, allowing a
>>> small fraction of users to experience outage for only seconds at a
>>> time.  We desire to provide this through horizontal scaling, redundancy,
>>> failover planning, and external monitoring.  /
>>>
>>
>> "Small fraction of users" - this implies data sharding. Multiple MySQL
>> instances each with enough data to operate independently for one slice
>> of your most important data and an application smart enough to know
>> which shard to go to for each slice of data.
>>
>> "For a few seconds at a time" - you do not want a shared disk. Should
>> the active MySQL die, its data will be in an inconsistent state. Once
>> you fire up the passive daemon it will need to perform a recovery
>> restart. This down time is more than likely not going to take only a
>> few seconds. The more data you have, the longer the checks will take. 
>> An independent copy maintained by a slave instance provides a
>> logically consistent copy of the master's data as it will only
>> replicate complete transactions.
>>
>> "horizontal scaling" - one master, multiple slaves. This requires the
>> separation of writes and reads.
>>
>>
>>> *Scalability:  *Again, seems like there are lots of applications and
>>> implementation, but people agree on the general concept.  Here's what we
>>> mean for this project:  /We desire to  scale our services so that a
>>> usage surge does

Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-02 Thread Tim Dunphy
Why not use a load balancer such as HAProxy as a way of maintaining high 
availability on your MySQL nodes?

http://blogs.reliablepenguin.com/2011/03/31/mysql-load-balancing-with-haproxy

http://linuxadminzone.com/how-to-install-setup-and-config-haproxy-loadbalancer-for-content-switching/

http://agiletesting.blogspot.com/2010/10/mysql-load-balancing-with-haproxy.html

I've done this a few times over the years and (depending on your setup) 
HAProxy can be dead simple to set up and rock solid in terms of reliability.

Regards,
tim

 

- Original Message -
From: "Wes Modes" 
To: mysql@lists.mysql.com
Sent: Monday, April 2, 2012 7:47:18 PM
Subject: Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion 
Wanted

Thanks again for sharing your knowledge.  I do believe the answers I've been
receiving, but since I have requirements that I cannot easily alter, I'm
also gently pushing my expert advisers here to look beyond their own
preferences and direct experience.

RE: Shared storage.  I can easily let go of the preference to take
advantage of shared storage.  I understand duplicated databases are the
essence of database redundancy.  You make good points.

In terms of the acceptability of a small fraction of users being
temporarily unable to access services:  rather than sharding, which
again requires more control over the application than we have, I was
more envisioning that would be the fraction of users who hit the one
peer MySQL server that is temporarily unavailable due to h/w or s/w
failure or DB corruption while its failover is powered up.

Does MySQL Cluster seem like it will address my requirements to allow us
to horizontally scale a number of MySQL nodes as peers without
separating reads and writes, or slaves and masters?

Wes

On 4/2/2012 2:25 PM, shawn green wrote:
> Hello Wes,
>
> On 4/2/2012 4:05 PM, Wes Modes wrote:
>> Thanks Shawn and Karen, for the suggestions, even given my vague
>> requirements.
>>
>> To clarify some of my requirements.
>>
>> *Application:  *We are using an open-source application called Omeka,
>> which is a "free, flexible, and open source web-publishing platform for
>> the display of library, museum, archives, and scholarly collections and
>> exhibitions."  Without getting into how free (or scalable) free software
>> really is, we can view it as one aspect we cannot change, having been
>> written into the grant requirements we received for the project.
>> Experienced Omeka developers and our own developer have suggested
>> that/it is not feasible to separate database writes from reads in the
>> application/ (given time and resources).
>>
>
> That's a shame. Sounds like you are back to one big server or several
> smaller servers with in-program sharding.
>
>> *SAN: *The SAN is a Dell EqualLogic 6100 which has redundant everything,
>> including multiple NICs, controllers, and power.  So we are less
>> concerned about the SAN being a SPoF.  On the other hand, if we have a
>> single big MySQL server that fails, we could bring up another copy of it
>> via VMWare, but until the server came up, the application would be dead
>> in the water.  If the database is corrupted, service will be interrupted
>> for a considerable time.
>>
>
> Again, each MySQL instance needs its own copy of the data. Having
> only one big powerful disk system means that each instance you fire up
> must both share spindles and networking to access its data. Just like
> a freeway at rush hour, you may find the traffic into and out of this
> one device crawling to a halt exactly when you don't want it to.
>
>> *High Availability:*  It sounds like there is some debate over how to
>> provide HA best, but do people really disagree on the desired results?
>> Without getting into the many meanings of this buzz word, here's what we
>> mean: /We desire to maintain high availability of service, allowing a
>> small fraction of users to experience outage for only seconds at a
>> time.  We desire to provide this through horizontal scaling, redundancy,
>> failover planning, and external monitoring.  /
>>
>
> "Small fraction of users" - this implies data sharding. Multiple MySQL
> instances each with enough data to operate independently for one slice
> of your most important data and an application smart enough to know
> which shard to go to for each slice of data.
>
> "For a few seconds at a time" - you do not want a shared disk. Should
> the active MySQL die, its data will be in an inconsistent state. Once
> you fire up the passive daemon it will need to perform a recovery
> restart. This down time is more than likely not going 

Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-02 Thread Wes Modes
Thanks again for sharing your knowledge.  I do believe the answers I've been
receiving, but since I have requirements that I cannot easily alter, I'm
also gently pushing my expert advisers here to look beyond their own
preferences and direct experience.

RE: Shared storage.  I can easily let go of the preference to take
advantage of shared storage.  I understand duplicated databases are the
essence of database redundancy.  You make good points.

In terms of the acceptability of a small fraction of users being
temporarily unable to access services:  rather than sharding, which
again requires more control over the application than we have, I was
more envisioning that would be the fraction of users who hit the one
peer MySQL server that is temporarily unavailable due to h/w or s/w
failure or DB corruption while its failover is powered up.

Does MySQL Cluster seem like it will address my requirements to allow us
to horizontally scale a number of MySQL nodes as peers without
separating reads and writes, or slaves and masters?

Wes

On 4/2/2012 2:25 PM, shawn green wrote:
> Hello Wes,
>
> On 4/2/2012 4:05 PM, Wes Modes wrote:
>> Thanks Shawn and Karen, for the suggestions, even given my vague
>> requirements.
>>
>> To clarify some of my requirements.
>>
>> *Application:  *We are using an open-source application called Omeka,
>> which is a "free, flexible, and open source web-publishing platform for
>> the display of library, museum, archives, and scholarly collections and
>> exhibitions."  Without getting into how free (or scalable) free software
>> really is, we can view it as one aspect we cannot change, having been
>> written into the grant requirements we received for the project.
>> Experienced Omeka developers and our own developer have suggested
>> that/it is not feasible to separate database writes from reads in the
>> application/ (given time and resources).
>>
>
> That's a shame. Sounds like you are back to one big server or several
> smaller servers with in-program sharding.
>
>> *SAN: *The SAN is a Dell EqualLogic 6100 which has redundant everything,
>> including multiple NICs, controllers, and power.  So we are less
>> concerned about the SAN being a SPoF.  On the other hand, if we have a
>> single big MySQL server that fails, we could bring up another copy of it
>> via VMWare, but until the server came up, the application would be dead
>> in the water.  If the database is corrupted, service will be interrupted
>> for a considerable time.
>>
>
> Again, each MySQL instance needs its own copy of the data. Having
> only one big powerful disk system means that each instance you fire up
> must both share spindles and networking to access its data. Just like
> a freeway at rush hour, you may find the traffic into and out of this
> one device crawling to a halt exactly when you don't want it to.
>
>> *High Availability:*  It sounds like there is some debate over how to
>> provide HA best, but do people really disagree on the desired results?
>> Without getting into the many meanings of this buzz word, here's what we
>> mean: /We desire to maintain high availability of service, allowing a
>> small fraction of users to experience outage for only seconds at a
>> time.  We desire to provide this through horizontal scaling, redundancy,
>> failover planning, and external monitoring.  /
>>
>
> "Small fraction of users" - this implies data sharding. Multiple MySQL
> instances each with enough data to operate independently for one slice
> of your most important data and an application smart enough to know
> which shard to go to for each slice of data.
>
> "For a few seconds at a time" - you do not want a shared disk. Should
> the active MySQL die, its data will be in an inconsistent state. Once
> you fire up the passive daemon it will need to perform a recovery
> restart. This down time is more than likely not going to take only a
> few seconds. The more data you have, the longer the checks will take. 
> An independent copy maintained by a slave instance provides a
> logically consistent copy of the master's data as it will only
> replicate complete transactions.
>
> "horizontal scaling" - one master, multiple slaves. This requires the
> separation of writes and reads.
>
>
>> *Scalability:  *Again, seems like there are lots of applications and
>> implementation, but people agree on the general concept.  Here's what we
>> mean for this project:  /We desire to  scale our services so that a
>> usage surge does not cause unavailability of the services for some
>> users.  We prefer to horizontally increase scalability using
>> load-balancing strategies to treat clusters of servers as single logical
>> units./
>>
>
> "Horizontal scalability" - see above
>
> "clusters of servers" - unless they share a common data set (which
> cannot happen via file sharing) they will not be useful as a logical
> unit.  Multiple slaves all replicating from the same master does match
> this requirement.
>
>> The application may have not bee

Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-02 Thread shawn green

Hello Wes,

On 4/2/2012 4:05 PM, Wes Modes wrote:

Thanks Shawn and Karen, for the suggestions, even given my vague
requirements.

To clarify some of my requirements.

*Application:  *We are using an open-source application called Omeka,
which is a "free, flexible, and open source web-publishing platform for
the display of library, museum, archives, and scholarly collections and
exhibitions."  Without getting into how free (or scalable) free software
really is, we can view it as one aspect we cannot change, having been
written into the grant requirements we received for the project.
Experienced Omeka developers and our own developer have suggested
that/it is not feasible to separate database writes from reads in the
application/ (given time and resources).



That's a shame. Sounds like you are back to one big server or several 
smaller servers with in-program sharding.



*SAN: *The SAN is a Dell EqualLogic 6100 which has redundant everything,
including multiple NICs, controllers, and power.  So we are less
concerned about the SAN being a SPoF.  On the other hand, if we have a
single big MySQL server that fails, we could bring up another copy of it
via VMWare, but until the server came up, the application would be dead
in the water.  If the database is corrupted, service will be interrupted
for a considerable time.



Again, each MySQL instance needs its own copy of the data. Having only 
one big powerful disk system means that each instance you fire up must 
both share spindles and networking to access its data. Just like a 
freeway at rush hour, you may find the traffic into and out of this one 
device crawling to a halt exactly when you don't want it to.



*High Availability:*  It sounds like there is some debate over how to
provide HA best, but do people really disagree on the desired results?
Without getting into the many meanings of this buzz word, here's what we
mean: /We desire to maintain high availability of service, allowing a
small fraction of users to experience outage for only seconds at a
time.  We desire to provide this through horizontal scaling, redundancy,
failover planning, and external monitoring.  /



"Small fraction of users" - this implies data sharding. Multiple MySQL 
instances each with enough data to operate independently for one slice 
of your most important data and an application smart enough to know 
which shard to go to for each slice of data.
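[Editor's note: the "application smart enough to know which shard to go to"
can be sketched as a deterministic key-to-shard map; the hostnames below are
placeholders.]

```python
import hashlib

SHARDS = ["db1.example.edu", "db2.example.edu", "db3.example.edu"]

def shard_for(key):
    """Deterministically map a record key to one MySQL instance."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Every lookup for the same key lands on the same shard.
print(shard_for("item:42"))
```

The catch, as noted above, is that this routing has to live in the
application, which is exactly what Omeka does not provide.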


"For a few seconds at a time" - you do not want a shared disk. Should 
the active MySQL die, its data will be in an inconsistent state. Once 
you fire up the passive daemon it will need to perform a recovery 
restart. This down time is more than likely not going to take only a few 
seconds. The more data you have, the longer the checks will take.  An 
independent copy maintained by a slave instance provides a logically 
consistent copy of the master's data as it will only replicate complete 
transactions.


"horizontal scaling" - one master, multiple slaves. This requires the 
separation of writes and reads.
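[Editor's note: concretely, attaching a slave to a master in MySQL 5.x-era
replication looks like this on each slave; the host, credentials, and binlog
coordinates are placeholders, with the real coordinates taken from SHOW
MASTER STATUS on the master.]

```sql
CHANGE MASTER TO
    MASTER_HOST='master.example.edu',
    MASTER_USER='repl',
    MASTER_PASSWORD='repl_password',
    MASTER_LOG_FILE='mysql-bin.000001',
    MASTER_LOG_POS=4;
START SLAVE;
```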




*Scalability:  *Again, seems like there are lots of applications and
implementation, but people agree on the general concept.  Here's what we
mean for this project:  /We desire to  scale our services so that a
usage surge does not cause unavailability of the services for some
users.  We prefer to horizontally increase scalability using
load-balancing strategies to treat clusters of servers as single logical
units./



"Horizontal scalability" - see above

"clusters of servers" - unless they share a common data set (which 
cannot happen via file sharing) they will not be useful as a logical 
unit.  Multiple slaves all replicating from the same master does match 
this requirement.



The application may not have been designed with great scalability in
mind, but if multiple application instances are accessing multiple
database servers treated as one logical unit, that may not be too
relevant.



"logical unit" - see above
"multiple database servers" - remember! each instance of MySQL requires 
its own set of data. No sharing is allowed.



I am responsible for creating an architecture upon which this project
will run.  I am not responsible for redesigning the application.   So
far, no one has suggested anything that approached meeting our
requirements, even our vague ones.  Perhaps I am asking the wrong list?



No, you are asking the correct list. You are just not believing our 
answers. The application design is your stumbling block. It is written 
to function only with one large database (SPOF). Until you can fix that 
(as painful as it may be), you cannot achieve your availability goals.




Does anyone have any experience with MySQL Multi-Master Replication?
Perhaps that should be a separate post.



Replication rings are possible but you must design your application to 
take special care to NOT update the same row in multiple nodes of the 
ring at the same time. This is even harder to design and code for than 
spli

Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-02 Thread Rick James
DRBD, SAN, etc.  Sure, they are highly redundant.  Sure they are 
reliable.  But they do not handle the building being in a 
flood/earthquake/tornado/etc.  If you want HA, you have to start with 
having two (or more) copies of all the data sitting in geographically 
distinct flood plains, etc.  If you have 3 copies of the data, why 
bother with redundant anything in each of the 3 sites?  HA within a 
device costs a lot of extra money.


You cannot get more than a few 9s, regardless of how you architect it.

Let's look at the design from a different point of view...
* How much data?
* How much traffic?
* It sounds like much of the data is images or videos; correct?  One 
could argue that those do not need to be in a database, other than a URL 
to them.  Then you can scatter copies around the globe, and use Akamai 
(or similar load balancing service) to not only find a viable copy, but 
find the nearest copy, thereby improving the "user experience"!
* You must put effort into splitting reads from writes.  Otherwise you 
are asking for extra vulnerability from hackers.  Yes, hackers will be a 
big issue, and you have not mentioned it yet.


You may need to scale via Sharding.  (Without some of the questions 
above answered, I cannot predict.)  If you do need sharding, you must 
incorporate it into the software design _early_.




Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-04-02 Thread Wes Modes
Thanks Shawn and Karen, for the suggestions, even given my vague
requirements. 

To clarify some of my requirements. 

*Application:  *We are using an open-source application called Omeka,
which is a "free, flexible, and open source web-publishing platform for
the display of library, museum, archives, and scholarly collections and
exhibitions."  Without getting into how free (or scalable) free software
really is, we can view it as one aspect we cannot change, having been
written into the grant requirements we received for the project. 
Experienced Omeka developers and our own developer have suggested
that /it is not feasible to separate database writes from reads in the
application/ (given time and resources).

*SAN: *The SAN is a Dell EqualLogic 6100 which has redundant everything,
including multiple NICs, controllers, and power.  So we are less
concerned about the SAN being a SPoF.  On the other hand, if we have a
single big MySQL server that fails, we could bring up another copy of it
via VMWare, but until the server came up, the application would be dead
in the water.  If the database is corrupted, service will be interrupted
for a considerable time.

*High Availability:*  It sounds like there is some debate over how to
provide HA best, but do people really disagree on the desired results? 
Without getting into the many meanings of this buzz word, here's what we
mean: /We desire to maintain high availability of service, allowing a
small fraction of users to experience outage for only seconds at a
time.  We desire to provide this through horizontal scaling, redundancy,
failover planning, and external monitoring./

*Scalability:  *Again, it seems like there are lots of applications and
implementations, but people agree on the general concept.  Here's what we
mean for this project:  /We desire to scale our services so that a
usage surge does not cause unavailability of the services for some
users.  We prefer to horizontally increase scalability using
load-balancing strategies to treat clusters of servers as single logical
units./ 

The application may not have been designed with great scalability in
mind, but if multiple application instances are accessing multiple
database servers treated as one logical unit, that may not be too
relevant. 

I am responsible for creating an architecture upon which this project
will run.  I am not responsible for redesigning the application.   So
far, no one has suggested anything that approached meeting our
requirements, even our vague ones.  Perhaps I am asking the wrong list? 

Does anyone have any experience with MySQL Multi-Master Replication? 
Perhaps that should be a separate post.

Wes

On 3/30/2012 3:56 PM, shawn green wrote:
> Hello Wes,
>
> On 3/29/2012 9:23 PM, Wes Modes wrote:
>> First, thank you in advance for good solid suggestions you can offer. I
>> suppose someone has already asked this, but perhaps you will view it as
>> a fun challenge to meet my many criteria with your suggested MySQL
>> architecture.
>>
>> I am working at a University on a high-profile database driven project
>> that we expect to be slammed within the first few months. Since this is
>> a new project and one that we expect to be popular, we don't know what
>> kind of usage to expect, but we want to be prepared. Therefore, we are
>> building in extra capacity.
>>
>> Our top goals are scalability and high availability, provided we hope
>> through multiple MySQL nodes and VMWare functionality. I've been
>> surprised that there are not more MySQL architects trying to meet these
>> high-level goals using virtualization and shared storage (or at least
>> they do not seem to be writing about it).
>>
>> I've looked at replication, multi-mastering, DRBD, clustering,
>> partitioning, and sharding.
>>
>> Here's what we got, and some of our constraints:
>>
>> * We are concerned that One Big Database instance won't be enough to
>> handle all of the queries, plus it is a single point of failure.
>> Therefore, multiple nodes are desirable.
>>
>> * With the primary application that will be using the database, writes
>> and reads cannot be split off from each other. This limitation alone,
>> rules out replication, MMM, and a few other solutions.
>>
>> * We do not expect to be especially write-heavy.
>>
>> * We have shared storage in the form of an iSCSI SAN. We'd like to
>> leverage the shared storage, if possible.
>>
>> * We have VMWare HA which already monitors hosts and brings them up
>> within minutes elsewhere if we lose a host. So some of the suggested HA
>> solutions are redundant.
>>
>> * We expect to have another instance of our system running in the Amazon
>> cloud for the first few months while the traffic is high, so we may take
>> advantage of RDS, though an exact duplicate of our local system will
>> save us development work.
>>
>> Thanks for any advice you can give.
>>
>> Wes Modes
>>
>
> As the others have already pointed out, your specifications are rather
> vague and appear to be only h

Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-03-30 Thread shawn green

Hello Wes,

On 3/29/2012 9:23 PM, Wes Modes wrote:

First, thank you in advance for good solid suggestions you can offer. I
suppose someone has already asked this, but perhaps you will view it as
a fun challenge to meet my many criteria with your suggested MySQL
architecture.

I am working at a University on a high-profile database driven project
that we expect to be slammed within the first few months. Since this is
a new project and one that we expect to be popular, we don't know what
kind of usage to expect, but we want to be prepared. Therefore, we are
building in extra capacity.

Our top goals are scalability and high availability, provided we hope
through multiple MySQL nodes and VMWare functionality. I've been
surprised that there are not more MySQL architects trying to meet these
high-level goals using virtualization and shared storage (or at least
they do not seem to be writing about it).

I've looked at replication, multi-mastering, DRBD, clustering,
partitioning, and sharding.

Here's what we got, and some of our constraints:

* We are concerned that One Big Database instance won't be enough to
handle all of the queries, plus it is a single point of failure.
Therefore, multiple nodes are desirable.

* With the primary application that will be using the database, writes
and reads cannot be split off from each other. This limitation alone,
rules out replication, MMM, and a few other solutions.

* We do not expect to be especially write-heavy.

* We have shared storage in the form of an iSCSI SAN. We'd like to
leverage the shared storage, if possible.

* We have VMWare HA which already monitors hosts and brings them up
within minutes elsewhere if we lose a host. So some of the suggested HA
solutions are redundant.

* We expect to have another instance of our system running in the Amazon
cloud for the first few months while the traffic is high, so we may take
advantage of RDS, though an exact duplicate of our local system will
save us development work.

Thanks for any advice you can give.

Wes Modes



As the others have already pointed out, your specifications are rather 
vague and appear to be only hardware-focused. I can state with some 
certainty, through my six years of direct MySQL support experience, that 
the majority of the problems related to performance are NOT hardware 
related.  The majority of the problems I need to resolve are related to 
bad queries, bad table design, and bad application design.


For example, you state:
> * With the primary application that will be using the database, writes
> and reads cannot be split off from each other.

This is already a red flag that your application is not designed for scale.

Also, you say:
> * We are concerned that One Big Database instance won't be enough to
> handle all of the queries, plus it is a single point of failure.
then you say:
> * We have shared storage in the form of an iSCSI SAN. We'd like to
> leverage the shared storage, if possible.

This is another red flag in that your 'shared storage' once again 
becomes a single point of failure.  Also, MySQL instances do not share 
files. Period. Each MySQL instance needs its own, completely 
independent, set of files and folders. Please take the hint that 
Replication should be in your very near future.


Here are some solid suggestions:

a) modularize your code so that WRITES can go to a MASTER server and 
READS can be distributed over a large number of SLAVE servers.
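A rough sketch of what (a) could look like at the application layer. This is a naive illustration, not Omeka code: the DSNs are hypothetical, the verb check ignores transactions and stored routines, and a real deployment would also need connection pooling and an awareness of replication lag (a client that just wrote may need to read from the master to see its own write).

```python
import itertools

class QueryRouter:
    """Naive read/write splitter: sends writes to the master and
    distributes reads round-robin over the slaves."""

    WRITE_VERBS = ("insert", "update", "delete", "replace",
                   "create", "alter", "drop", "truncate")

    def __init__(self, master_dsn, slave_dsns):
        self.master_dsn = master_dsn
        self._slaves = itertools.cycle(slave_dsns)

    def route(self, sql):
        # First word of the statement decides where it goes.
        verb = sql.lstrip().split(None, 1)[0].lower()
        if verb in self.WRITE_VERBS:
            return self.master_dsn
        return next(self._slaves)

router = QueryRouter("mysql://master:3306",
                     ["mysql://slave1:3306", "mysql://slave2:3306"])
print(router.route("UPDATE items SET title = 'x' WHERE id = 1"))  # master
print(router.route("SELECT * FROM items"))                        # a slave
```

The same routing decision can instead be made outside the application by a proxy layer, which is what makes suggestion (a) workable even when the application itself cannot easily be modified.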


b) modularize your data so that you can easily subdivide subsets of it 
to different master-slave trees. For example, students whose last name 
starts with A-D are assigned to machines in group1, E-K in group2, ... etc.
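The last-name scheme in (b) can be sketched as a small routing function. The ranges below are illustrative only; in practice you would pick boundaries that spread your actual data evenly across groups, since last-name initials are far from uniformly distributed.

```python
# Map a last name to a master-slave group, following the
# A-D -> group1, E-K -> group2 scheme described above.
SHARD_RANGES = [("a", "d", "group1"), ("e", "k", "group2"),
                ("l", "r", "group3"), ("s", "z", "group4")]

def shard_for(last_name):
    """Return the server group responsible for this last name."""
    initial = last_name.strip().lower()[0]
    for lo, hi, group in SHARD_RANGES:
        if lo <= initial <= hi:
            return group
    raise ValueError("no shard for initial %r" % initial)

print(shard_for("Abgarian"))  # group1
print(shard_for("Modes"))     # group3
```

The reason to build this in _early_ is that every query path in the application must know which group to ask; retrofitting that onto code that assumes one database is the expensive part, not the function itself.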


c) Optimize your queries to avoid subqueries and to take full advantage 
of indexes.
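As an illustration of (c), here is one common rewrite, using hypothetical tables items(id, title, collection_id) and collections(id, public). MySQL versions current when this thread was written often re-executed an IN (SELECT ...) subquery per outer row, whereas the equivalent JOIN lets the optimizer use indexes on both tables:

```python
# Subquery form -- can be executed once per row of `items`
# on older MySQL optimizers:
subquery_form = """
SELECT i.id, i.title
FROM items i
WHERE i.collection_id IN (SELECT c.id FROM collections c WHERE c.public = 1)
"""

# Equivalent JOIN form -- a single join the optimizer can plan with
# indexes on collections.id and items.collection_id:
join_form = """
SELECT i.id, i.title
FROM items i
JOIN collections c ON c.id = i.collection_id
WHERE c.public = 1
"""
```

Always confirm with EXPLAIN that the rewritten query actually uses the indexes you expect.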


d) Build indexes wisely - A well-crafted multi-column index can take up 
less space and be more functional than multiple single-column indexes.
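To make (d) concrete with a hypothetical query pattern — looking up items by collection and status, sorted by date added — compare three single-column indexes against one composite index:

```python
# Three single-column indexes: MySQL will typically pick just one of
# them per query, and none helps the ORDER BY:
single_column = [
    "CREATE INDEX idx_collection ON items (collection_id)",
    "CREATE INDEX idx_status     ON items (status)",
    "CREATE INDEX idx_added      ON items (added)",
]

# One composite index serves the whole WHERE clause and the ORDER BY,
# and (via the leftmost-prefix rule) also covers queries filtering on
# collection_id alone, or on (collection_id, status):
composite = (
    "CREATE INDEX idx_coll_status_added "
    "ON items (collection_id, status, added)"
)
```

Note the leftmost-prefix caveat: the composite index does NOT help a query that filters on status or added alone, so column order should follow your most common query shapes.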


e) Test, test, and retest - Until you can push your system to the 
choking point in the lab, you will have no idea of how much traffic it 
will be able to handle in the field.


Good luck!
--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql



Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-03-30 Thread Karen Abgarian
Hi, 

First, it is kind of funny to advise on something that is unknown.   The devil 
of such systems is in the details.   A small detail might cancel the whole big idea 
of using, say, sharding, clustering, etc.   So any discussion of this will be 
quite general and can only be applied to your project with a lot of validation. 

Second, different folks have very different things in mind when they say magic 
words like 'scalability' or 'high availability'.For example, 'high 
availability' could mean such things as:

- the way to bring back resource online shall be available at the time the user 
(application) discovers the failure and can take reasonable time to be made 
available; 
- the backup resource shall be available at the time the user (application) 
discovers the failure and requires no preparation; 
- the user (application) shall not be dependent on the detection and resolution 
mechanisms of the underlying technology stack. 

Having said that, the idea of providing HA with VMWare and node reboots qualifies 
this as a project with "eventual" availability as described in the first 
statement above.   It is qualified as eventual because the backup system is not 
verified as being available, and VMWare might fail to bring the node 
online, which is not controlled by any software. 

The 'scalability', for example, could also have a number of different meanings: 

- the system shall allow placing new growth data on the additional nodes;  
- the system shall dynamically place the new growth data on the additional 
nodes; 
- the system shall allow redistributing access to the data among the available 
nodes; 
- the system shall load balance access to the data among the available nodes; 
- the system shall load balance the access and data across all available nodes 
and keep all nodes equally loaded. 

Now, to the question of one big instance and multiple instances.   People 
usually get stuck with one big instance if they just can't break it into 
smaller instances.   They usually can't break into multiple instances if such 
instances have interdependencies which cannot be removed by changing the 
app design (or the design cannot easily be changed).   Such interdependencies may 
randomly alter your scalability and availability.   For example, if you have 
some critical data on just one node out of dozen, the availability of the whole 
system will be impacted if that node is down.   

….

I ran out of time.   But on these subjects, the chatting could go on for 
years.   You may want to explain more clearly what you are trying to do.  It could 
make the discussion more focused.

Peace,
Karen Abgarian.


On Mar 29, 2012, at 6:23 PM, Wes Modes wrote:

> First, thank you in advance for good solid suggestions you can offer. I
> suppose someone has already asked this, but perhaps you will view it as
> a fun challenge to meet my many criteria with your suggested MySQL
> architecture.
> 
> I am working at a University on a high-profile database driven project
> that we expect to be slammed within the first few months. Since this is
> a new project and one that we expect to be popular, we don't know what
> kind of usage to expect, but we want to be prepared. Therefore, we are
> building in extra capacity.
> 
> Our top goals are scalability and high availability, provided we hope
> through multiple MySQL nodes and VMWare functionality. I've been
> surprised that there are not more MySQL architects trying to meet these
> high-level goals using virtualization and shared storage (or at least
> they do not seem to be writing about it).
> 
> I've looked at replication, multi-mastering, DRBD, clustering,
> partitioning, and sharding.
> 
> Here's what we got, and some of our constraints:
> 
> * We are concerned that One Big Database instance won't be enough to
> handle all of the queries, plus it is a single point of failure.
> Therefore, multiple nodes are desirable.
> 
> * With the primary application that will be using the database, writes
> and reads cannot be split off from each other. This limitation alone,
> rules out replication, MMM, and a few other solutions.
> 
> * We do not expect to be especially write-heavy.
> 
> * We have shared storage in the form of an iSCSI SAN. We'd like to
> leverage the shared storage, if possible.
> 
> * We have VMWare HA which already monitors hosts and brings them up
> within minutes elsewhere if we lose a host. So some of the suggested HA
> solutions are redundant.
> 
> * We expect to have another instance of our system running in the Amazon
> cloud for the first few months while the traffic is high, so we may take
> advantage of RDS, though an exact duplicate of our local system will
> save us development work.
> 
> Thanks for any advice you can give.
> 
> Wes Modes
> 
> -- 
> Wes Modes
> Systems Designer, Developer, and Administrator
> University Library ITS
> University of California, Santa Cruz
> 
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/

Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

2012-03-30 Thread Rick James

Caution:  You are not going to like my answers.

> and VMWare
> shared storage
Why?  Seems like scalability should plan on having dedicated hardware.

> replication
The best choice
> multi-mastering
Dual-Master gives good HA
> DRBD
Partially solves one subset of HA; don't bother with it; set your sights 
higher.

> clustering
maybe
> PARTITION
Only if you have really big tables
> Sharding
Only if you are writing faster than a single server can handle -- I say 
that even for Dual-Master setups.


> With the primary application that will be using the database, writes 
and reads cannot be split off from each other.
That is probably false.  Yes, it will take some deeper thought to 
understand how to split them, or split them in certain situations.


> We do not expect to be especially write-heavy.
Skip Sharding, use readonly slaves.

> We have shared storage
Single-point-of-failure --> there goes your HA.

> We have VMWare HA which already monitors hosts and brings them up 
within minutes elsewhere if we lose a host.
That is thinking at the hardware level.  But will it really work for 
Dual-Master?


> another instance of our system running in the Amazon cloud
Bother.  More complexity.


On 3/29/12 6:23 PM, Wes Modes wrote:

First, thank you in advance for good solid suggestions you can offer. I
suppose someone has already asked this, but perhaps you will view it as
a fun challenge to meet my many criteria with your suggested MySQL
architecture.

I am working at a University on a high-profile database driven project
that we expect to be slammed within the first few months. Since this is
a new project and one that we expect to be popular, we don't know what
kind of usage to expect, but we want to be prepared. Therefore, we are
building in extra capacity.

Our top goals are scalability and high availability, provided we hope
through multiple MySQL nodes and VMWare functionality. I've been
surprised that there are not more MySQL architects trying to meet these
high-level goals using virtualization and shared storage (or at least
they do not seem to be writing about it).

I've looked at replication, multi-mastering, DRBD, clustering,
partitioning, and sharding.

Here's what we got, and some of our constraints:

* We are concerned that One Big Database instance won't be enough to
handle all of the queries, plus it is a single point of failure.
Therefore, multiple nodes are desirable.

* With the primary application that will be using the database, writes
and reads cannot be split off from each other. This limitation alone,
rules out replication, MMM, and a few other solutions.

* We do not expect to be especially write-heavy.

* We have shared storage in the form of an iSCSI SAN. We'd like to
leverage the shared storage, if possible.

* We have VMWare HA which already monitors hosts and brings them up
within minutes elsewhere if we lose a host. So some of the suggested HA
solutions are redundant.

* We expect to have another instance of our system running in the Amazon
cloud for the first few months while the traffic is high, so we may take
advantage of RDS, though an exact duplicate of our local system will
save us development work.

Thanks for any advice you can give.

Wes Modes



--
Rick James - MySQL Geek

