Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Thanks, Ian.

W.

On 4/4/2012 4:02 AM, Ian wrote:
> On 04/04/2012 01:11, Wes Modes wrote:
>> Ian, what is the best place to ask specific questions about
>> mysql-proxy?
>>
>> In general, here are my questions:
>>
>> 1. Does the proxy sit on a separate server and route all MySQL
>>    requests, or is it installed on each of the MySQL nodes to
>>    re-shuffle MySQL requests to the appropriate place?
>>
>> 2. Can multiple proxies be run in concert to provide redundancy and
>>    scalability, as well as to eliminate SPoFs and bottlenecks?
>>
>> 3. In 2007, when RW splitting was new, there were a few problems and
>>    limitations. What is the current status of development of this
>>    important feature?
>
> Hi Wes,
>
> I'm afraid I can't help you with your questions, as I have never
> actually used MySQL Proxy! I was aware of its existence, so I brought
> it up in case it fitted your needs.
>
> I sympathize with your position and hope you find the answers you
> need.
>
> Regards
>
> Ian

--
Wes Modes
Systems Designer, Developer, and Administrator
University Library ITS
University of California, Santa Cruz

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
On 4/3/2012 3:04 AM, Ian wrote:
> Hi Wes,
>
> If you can't alter the application to split reads and writes, why not
> let MySQL Proxy do it for you?
>
> http://forge.mysql.com/wiki/MySQL_Proxy
>
> Combine this with haproxy and you could build a multi-master
> environment with each master having any number of slaves. Set MySQL
> Proxy to send writes to the masters and reads to the slaves.
>
> Regards
>
> Ian

Ian, what is the best place to ask specific questions about
mysql-proxy?

In general, here are my questions:

1. Does the proxy sit on a separate server and route all MySQL
   requests, or is it installed on each of the MySQL nodes to
   re-shuffle MySQL requests to the appropriate place?

2. Can multiple proxies be run in concert to provide redundancy and
   scalability, as well as to eliminate SPoFs and bottlenecks?

3. In 2007, when RW splitting was new, there were a few problems and
   limitations. What is the current status of development of this
   important feature?

Thanks! And again, I appreciate the brainstorming that many have done
here to find a solution that fits my requirements.

Wes

--
Wes Modes
Systems Designer, Developer, and Administrator
University Library ITS
University of California, Santa Cruz
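[Editorial note: MySQL Proxy's read/write splitting is implemented with a Lua script (rw-splitting.lua), but the routing decision it makes can be sketched language-agnostically. The sketch below is illustrative only, not MySQL Proxy's actual code; the hostnames and the mutable round-robin counter are assumptions for the example.]

```python
import re

# Statements that must go to the master; everything else may be served
# by a slave. (A real proxy also has to handle transactions, session
# state, prepared statements, etc.)
WRITE_PATTERN = re.compile(
    r"^\s*(INSERT|UPDATE|DELETE|REPLACE|CREATE|ALTER|DROP|TRUNCATE|"
    r"GRANT|REVOKE|LOCK|SET|BEGIN|START|COMMIT|ROLLBACK)\b",
    re.IGNORECASE,
)

def route(statement: str, master: str, slaves: list, counter: list) -> str:
    """Return the backend a statement should be sent to.

    Writes (and transaction control) go to the master; reads are
    round-robined across the slaves via the mutable counter.
    """
    if WRITE_PATTERN.match(statement):
        return master
    backend = slaves[counter[0] % len(slaves)]
    counter[0] += 1
    return backend

counter = [0]
slaves = ["slave1:3306", "slave2:3306"]
print(route("SELECT * FROM items", "master:3306", slaves, counter))   # slave1:3306
print(route("UPDATE items SET x=1", "master:3306", slaves, counter))  # master:3306
```

Note that this naive classification is exactly what the thread goes on to question: a bare regex cannot know that a SELECT immediately after a write may need master-fresh data.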
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Am I right in seeing that if you can split reads and writes without the
application having to be replication-aware, one does not need multiple
masters? One can simply have standard MySQL replication, yes? For us,
we were only interested in multiple masters so that we could read from
or write to any of the available MySQL nodes.

W.

On 4/3/2012 3:04 AM, Ian wrote:
> Hi Wes,
>
> If you can't alter the application to split reads and writes, why not
> let MySQL Proxy do it for you?
>
> http://forge.mysql.com/wiki/MySQL_Proxy
>
> Combine this with haproxy and you could build a multi-master
> environment with each master having any number of slaves. Set MySQL
> Proxy to send writes to the masters and reads to the slaves.
>
> Regards
>
> Ian

--
Wes Modes
Systems Designer, Developer, and Administrator
University Library ITS
University of California, Santa Cruz
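[Editorial note: for readers who do go the multi-master route discussed here, one standard precaution (independent of topology) is staggering auto-increment values so that two writable masters never generate the same primary key. A sketch of the relevant my.cnf settings for a hypothetical two-master pair; server names and IDs are placeholders:]

```ini
# Node 1 (my.cnf) - generates ids 1, 3, 5, ...
[mysqld]
server-id                = 1
log-bin                  = mysql-bin
auto_increment_increment = 2
auto_increment_offset    = 1

# Node 2 (my.cnf) - generates ids 2, 4, 6, ...
[mysqld]
server-id                = 2
log-bin                  = mysql-bin
auto_increment_increment = 2
auto_increment_offset    = 2
```

This prevents key collisions on inserts, but as noted later in the thread, it does nothing to prevent two nodes updating the same existing row concurrently.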
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
----- Original Message -----
> From: "Ian"
>
> with each master having any number of slaves. Set MySQL Proxy to
> send writes to the masters and reads to the slaves.

Yes, except when you have replication delays (asynchronous, remember?):
as someone else recently posted, your application writes a value and
consequently can't find it again. Boom goes the software :-)

I'm sure proxy has its uses, but I remain extremely sceptical about
both the performance and the trustworthiness of the read/write
splitting scheme as long as the application isn't aware of it. Now, if
you could somehow cough up a *context*, say, by having proxy pass all
requests from a connection that has done a write to the master, that
would help, but...

 * it would yield less scaling - the master must again serve some
   selects
 * with replication delays, you still risk updating the master based on
   old data from a not-up-to-date slave
 * in a web context, connections are not permanent; to keep track of a
   single stream you'd need the proxy to be aware of something like a
   browser session or whatnot

No, if you *really* want to do horizontal scaling, have a look at
SchoonerSQL, Clustrix and the like. They're third-party, but they
(claim to) have proper multi-master replication. I've not actually
looked at any of them in depth myself - we're in an environment where
we have a lot of smaller clusters.

--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel
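[Editorial note: the "writes a value, and consequently can't find it again" failure mode described above can be made concrete with a toy simulation. This is not real MySQL replication code, just a minimal illustration of asynchronous lag; all class and variable names are invented for the example.]

```python
class Master:
    """Toy primary: applies writes immediately and appends them to a log."""
    def __init__(self):
        self.data = {}
        self.log = []          # replication stream, applied by slaves later

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

class LaggingSlave:
    """Toy replica that stays a fixed number of events behind the master."""
    def __init__(self, master, lag_events=1):
        self.master = master
        self.lag = lag_events
        self.applied = 0
        self.data = {}

    def read(self, key):
        # Apply everything except the most recent `lag` events.
        while self.applied < len(self.master.log) - self.lag:
            k, v = self.master.log[self.applied]
            self.data[k] = v
            self.applied += 1
        return self.data.get(key)

master = Master()
slave = LaggingSlave(master)

master.write("answer", 42)
print(slave.read("answer"))        # None - the write hasn't replicated yet

# "Pinning" the session to the master after a write avoids the stale read:
print(master.data.get("answer"))   # 42
```

This is precisely why the poster wants a *context*: only something that remembers "this session just wrote" can know to bypass the slaves.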
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
On 03/04/2012 00:47, Wes Modes wrote:
> Thanks again for sharing your knowledge. I do believe the answers
> I've been receiving, but since I have requirements that I cannot
> easily alter, I'm also gently pushing my expert advisers here to look
> beyond their own preferences and direct experience.
>
> RE: Shared storage. I can easily let go of the preference to take
> advantage of shared storage. I understand that duplicated databases
> are the essence of database redundancy. You make good points.
>
> In terms of the acceptability of a small fraction of users being
> temporarily unable to access services: rather than sharding, which
> again requires more control over the application than we have, I was
> envisioning that this would be the fraction of users who hit the one
> peer MySQL server that is temporarily unavailable due to h/w or s/w
> failure or DB corruption while its failover is powered up.
>
> Does MySQL Cluster seem like it will address my requirements,
> allowing us to horizontally scale a number of MySQL nodes as peers
> without separating reads and writes, or slaves and masters?
>
> Wes

Hi Wes,

If you can't alter the application to split reads and writes, why not
let MySQL Proxy do it for you?

http://forge.mysql.com/wiki/MySQL_Proxy

Combine this with haproxy and you could build a multi-master
environment with each master having any number of slaves. Set MySQL
Proxy to send writes to the masters and reads to the slaves.

Regards

Ian
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Why not use a load balancer such as HAProxy as a way of maintaining
high availability on your MySQL nodes?

http://blogs.reliablepenguin.com/2011/03/31/mysql-load-balancing-with-haproxy
http://linuxadminzone.com/how-to-install-setup-and-config-haproxy-loadbalancer-for-content-switching/
http://agiletesting.blogspot.com/2010/10/mysql-load-balancing-with-haproxy.html

I've done this a few times over the years and (depending on your setup)
HAProxy can be dead simple to set up and rock solid in terms of
reliability.

Regards,

tim

----- Original Message -----
From: "Wes Modes"
To: mysql@lists.mysql.com
Sent: Monday, April 2, 2012 7:47:18 PM
Subject: Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted

Thanks again for sharing your knowledge. I do believe the answers I've
been receiving, but since I have requirements that I cannot easily
alter, I'm also gently pushing my expert advisers here to look beyond
their own preferences and direct experience.

RE: Shared storage. I can easily let go of the preference to take
advantage of shared storage. I understand that duplicated databases are
the essence of database redundancy. You make good points.

In terms of the acceptability of a small fraction of users being
temporarily unable to access services: rather than sharding, which
again requires more control over the application than we have, I was
envisioning that this would be the fraction of users who hit the one
peer MySQL server that is temporarily unavailable due to h/w or s/w
failure or DB corruption while its failover is powered up.

Does MySQL Cluster seem like it will address my requirements, allowing
us to horizontally scale a number of MySQL nodes as peers without
separating reads and writes, or slaves and masters?

Wes
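[Editorial note: the HAProxy setup suggested above can be sketched as a TCP-mode listener in front of several MySQL backends. The addresses, server names, and check-user name below are placeholders, not from the thread; `option mysql-check` also requires creating a corresponding user on each MySQL node so the health probe can log in.]

```
listen mysql-cluster
    bind 0.0.0.0:3306
    mode tcp
    balance roundrobin
    option mysql-check user haproxy_check
    server db1 10.0.0.11:3306 check
    server db2 10.0.0.12:3306 check
    server db3 10.0.0.13:3306 check
```

HAProxy will stop routing to any backend whose health check fails, which covers the "node temporarily unavailable" case Wes describes; it does not, by itself, solve read/write splitting or replication consistency.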
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Thanks again for sharing your knowledge. I do believe the answers I've
been receiving, but since I have requirements that I cannot easily
alter, I'm also gently pushing my expert advisers here to look beyond
their own preferences and direct experience.

RE: Shared storage. I can easily let go of the preference to take
advantage of shared storage. I understand that duplicated databases are
the essence of database redundancy. You make good points.

In terms of the acceptability of a small fraction of users being
temporarily unable to access services: rather than sharding, which
again requires more control over the application than we have, I was
envisioning that this would be the fraction of users who hit the one
peer MySQL server that is temporarily unavailable due to h/w or s/w
failure or DB corruption while its failover is powered up.

Does MySQL Cluster seem like it will address my requirements, allowing
us to horizontally scale a number of MySQL nodes as peers without
separating reads and writes, or slaves and masters?

Wes
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Hello Wes,

On 4/2/2012 4:05 PM, Wes Modes wrote:
> Thanks Shawn and Karen, for the suggestions, even given my vague
> requirements.
>
> To clarify some of my requirements.
>
> *Application:* We are using an open-source application called Omeka,
> which is a "free, flexible, and open source web-publishing platform
> for the display of library, museum, archives, and scholarly
> collections and exhibitions." Without getting into how free (or
> scalable) free software really is, we can view it as one aspect we
> cannot change, having been written into the grant requirements we
> received for the project. Experienced Omeka developers and our own
> developer have suggested that /it is not feasible to separate
> database writes from reads in the application/ (given time and
> resources).

That's a shame. Sounds like you are back to one big server or several
smaller servers with in-program sharding.

> *SAN:* The SAN is a Dell EqualLogic 6100 which has redundant
> everything, including multiple NICs, controllers, and power. So we
> are less concerned about the SAN being a SPoF. On the other hand, if
> we have a single big MySQL server that fails, we could bring up
> another copy of it via VMware, but until the server came up, the
> application would be dead in the water. If the database is corrupted,
> service will be interrupted for a considerable time.

Again, each MySQL instance needs its own copy of the data. Having only
one big powerful disk system means that each instance you fire up must
share both spindles and networking to access its data. Just like a
freeway at rush hour, you may find the traffic into and out of this one
device crawling to a halt exactly when you don't want it to.

> *High Availability:* It sounds like there is some debate over how
> best to provide HA, but do people really disagree on the desired
> results? Without getting into the many meanings of this buzzword,
> here's what we mean: /We desire to maintain high availability of
> service, allowing a small fraction of users to experience outages of
> only seconds at a time. We desire to provide this through horizontal
> scaling, redundancy, failover planning, and external monitoring./

"Small fraction of users" - this implies data sharding. Multiple MySQL
instances, each with enough data to operate independently for one slice
of your most important data, and an application smart enough to know
which shard to go to for each slice of data.

"For a few seconds at a time" - you do not want a shared disk. Should
the active MySQL die, its data will be in an inconsistent state. Once
you fire up the passive daemon, it will need to perform a recovery
restart. This downtime is more than likely not going to take only a few
seconds. The more data you have, the longer the checks will take. An
independent copy maintained by a slave instance provides a logically
consistent copy of the master's data, as it will only replicate
complete transactions.

"horizontal scaling" - one master, multiple slaves. This requires the
separation of writes and reads.

> *Scalability:* Again, it seems like there are lots of applications
> and implementations, but people agree on the general concept. Here's
> what we mean for this project: /We desire to scale our services so
> that a usage surge does not cause unavailability of the services for
> some users. We prefer to horizontally increase scalability using
> load-balancing strategies to treat clusters of servers as single
> logical units./

"Horizontal scalability" - see above.

"clusters of servers" - unless they share a common data set (which
cannot happen via file sharing) they will not be useful as a logical
unit. Multiple slaves all replicating from the same master does match
this requirement.

> The application may not have been designed with great scalability in
> mind, but if multiple application instances are accessing multiple
> database servers treated as one logical unit, that may not be too
> relevant.

"logical unit" - see above.

"multiple database servers" - remember! Each instance of MySQL requires
its own set of data. No sharing is allowed.

> I am responsible for creating an architecture upon which this project
> will run. I am not responsible for redesigning the application.
>
> So far, no one has suggested anything that approached meeting our
> requirements, even our vague ones. Perhaps I am asking the wrong
> list?

No, you are asking the correct list. You are just not believing our
answers. The application design is your stumbling block. It is written
to function only with one large database (SPOF). Until you can fix that
(as painful as it may be), you cannot achieve your availability goals.

> Does anyone have any experience with MySQL Multi-Master Replication?
> Perhaps that should be a separate post.

Replication rings are possible, but you must design your application to
take special care NOT to update the same row in multiple nodes of the
ring at the same time. This is even harder to design and code for than
splitting reads from writes.
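[Editorial note: Shawn's "application smart enough to know which shard to go to" can be illustrated with a deterministic shard-routing function. The shard names and key format below are invented for the example; real sharding also needs resharding and rebalancing strategies that this sketch ignores.]

```python
import hashlib

# Hypothetical shard map: each shard is an independent MySQL instance
# holding the complete data for its slice of the key space.
SHARDS = ["db-shard0:3306", "db-shard1:3306", "db-shard2:3306"]

def shard_for(key: str) -> str:
    """Pick a shard deterministically from a record key.

    Using a stable hash (not Python's per-process-randomized hash())
    guarantees that every application instance maps the same key to the
    same shard.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]

# Every lookup for a given collection item lands on the same backend:
assert shard_for("item:1234") == shard_for("item:1234")
print(shard_for("item:1234"))
```

The cost, as the thread notes, is that this routing logic must live in (or in front of) the application from early in its design, which is exactly the change Wes says Omeka cannot absorb.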
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
DRBD, SAN, etc. Sure, they are highly redundant. Sure they are reliable. But they do not handle the building being in a flood/earthquake/tornado/etc. If you want HA, you have to start with having two (or more) copies of all the data sitting in geographically distinct flood plains, etc. If you have 3 copies of the data, why bother with redundant anything in each of the 3 sites? HA within a device costs a lot of extra money. You cannot get more than a few 9s, regardless of how you architect it. Let's look at the design from a different point of view... * How much data? * How much traffic? * It sounds like much of the data is images or videos; correct? One could argue that those do not need to be in a database, other than a URL to them. Then you can scatter copies around the globe, and use Akamai (or similar load balancing service) to not only find a viable copy, but find the nearest copy, thereby improving the "user experience"! * You must put effort into splitting reads from writes. Otherwise you are asking for extra vulnerability from hackers. Yes, hackers will be a big issue, and you have not mentioned it yet. You may need to scale via Sharding. (Without some of the questions above answered, I cannot predict.) If you do need sharding, you must incorporate it into the software design _early_. On 4/2/12 1:05 PM, Wes Modes wrote: Thanks Shawn and Karen, for the suggestions, even given my vague requirements. To clarify some of my requirements. *Application: *We are using an open-source application called Omeka, which is a "free, flexible, and open source web-publishing platform for the display of library, museum, archives, and scholarly collections and exhibitions." Without getting into how free (or scalable) free software really is, we can view it as one aspect we cannot change, having been written into the grant requirements we received for the project. 
Experienced Omeka developers and our own developer have suggested that/it is not feasible to separate database writes from reads in the application/ (given time and resources). *SAN: *The SAN is a Dell EqualLogic 6100 which has redundant everything, including multiple NICs, controllers, and power. So we are less concerned about the SAN being a SPoF. On the other hand, if we have a single big MySQL server that fails, we could bring up another copy of it via VMWare, but until the server came up, the application would be dead in the water. If the database is corrupted, service will be interrupted for a considerable time. *High Availability:* It sounds like there is some debate over how to provide HA best, but do people really disagree on the desired results? Without getting into the many meanings of this buzz word, here's what we mean: /We desire to maintain high availability of service, allowing a small fraction of users to experience outage for only seconds at a time. We desire to provide this through horizontal scaling, redundancy, failover planning, and external monitoring. / *Scalability: *Again, seems like there are lots of applications and implementation, but people agree on the general concept. Here's what we mean for this project: /We desire to scale our services so that a usage surge does not cause unavailability of the services for some users. We prefer to horizontally increase scalability using load-balancing strategies to treat clusters of servers as single logical units./ The application may have not been designed with great scalability in mind, but if multiple application instances are accessing multiple database servers treated as one logical unit, that may not be too relevant. I am responsible for creating an architecture upon which this project will run. I am not responsible for redesigning the application. So far, no one has suggested anything that approached meeting our requirements, even our vague ones. Perhaps I am asking the wrong list? 
Does anyone have any experience with MySQL Multi-Master Replication? Perhaps that should be a separate post.

Wes

On 3/30/2012 3:56 PM, shawn green wrote:

Hello Wes,

On 3/29/2012 9:23 PM, Wes Modes wrote:

First, thank you in advance for good solid suggestions you can offer. I suppose someone has already asked this, but perhaps you will view it as a fun challenge to meet my many criteria with your suggested MySQL architecture.

I am working at a university on a high-profile, database-driven project that we expect to be slammed within the first few months. Since this is a new project and one that we expect to be popular, we don't know what kind of usage to expect, but we want to be prepared. Therefore, we are building in extra capacity.

Our top goals are scalability and high availability, provided, we hope, through multiple MySQL nodes and VMWare functionality. I've been surprised that there are not more MySQL architects trying to meet these high-level goals using virtualization and shared storage (or at least they do not seem to be writing about it).
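[Editor's sketch] Wes's requirement of "treating clusters of servers as single logical units" can be approximated at the client with a simple failover wrapper: try the peers until one accepts a connection. This is illustrative only; the host names and the `connect` callback are invented, and a real deployment would pass a driver call such as `MySQLdb.connect`.

```python
import random

# Hypothetical peer list -- names invented for illustration.
PEERS = ["db1.example.edu", "db2.example.edu", "db3.example.edu"]

def connect_any(peers, connect):
    """Treat a cluster of peers as one logical unit: try them in random
    order and return the first connection that succeeds.

    `connect` stands in for a real driver call; it should raise
    ConnectionError when a peer is down.
    """
    last_error = ConnectionError("no peers configured")
    for host in random.sample(peers, len(peers)):
        try:
            return connect(host)
        except ConnectionError as err:
            last_error = err
    raise last_error
```

The random order spreads load across the peers; a smarter version would demote hosts that fail repeatedly, which is roughly what haproxy's health checks automate.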
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Thanks Shawn and Karen, for the suggestions, even given my vague requirements. To clarify some of my requirements:

*Application:* We are using an open-source application called Omeka, which is a "free, flexible, and open source web-publishing platform for the display of library, museum, archives, and scholarly collections and exhibitions." Without getting into how free (or scalable) free software really is, we can view it as one aspect we cannot change, having been written into the grant requirements we received for the project. Experienced Omeka developers and our own developer have suggested that it is not feasible to separate database writes from reads in the application (given time and resources).

*SAN:* The SAN is a Dell EqualLogic 6100, which has redundant everything, including multiple NICs, controllers, and power. So we are less concerned about the SAN being a SPoF. On the other hand, if we have a single big MySQL server that fails, we could bring up another copy of it via VMWare, but until that server came up, the application would be dead in the water. And if the database is corrupted, service will be interrupted for a considerable time.

*High Availability:* It sounds like there is some debate over how best to provide HA, but do people really disagree on the desired results? Without getting into the many meanings of this buzzword, here's what we mean: we desire to maintain high availability of service, allowing a small fraction of users to experience outages of only seconds at a time. We desire to provide this through horizontal scaling, redundancy, failover planning, and external monitoring.

*Scalability:* Again, it seems like there are lots of applications and implementations, but people agree on the general concept. Here's what we mean for this project: we desire to scale our services so that a usage surge does not cause unavailability of the services for some users.
We prefer to scale horizontally, using load-balancing strategies to treat clusters of servers as single logical units.

The application may not have been designed with great scalability in mind, but if multiple application instances are accessing multiple database servers treated as one logical unit, that may not be too relevant. I am responsible for creating an architecture upon which this project will run; I am not responsible for redesigning the application.

So far, no one has suggested anything that approaches meeting our requirements, even our vague ones. Perhaps I am asking the wrong list? Does anyone have any experience with MySQL Multi-Master Replication? Perhaps that should be a separate post.

Wes

On 3/30/2012 3:56 PM, shawn green wrote:
> Hello Wes,
>
> On 3/29/2012 9:23 PM, Wes Modes wrote:
>> First, thank you in advance for good solid suggestions you can offer. I
>> suppose someone has already asked this, but perhaps you will view it as
>> a fun challenge to meet my many criteria with your suggested MySQL
>> architecture.
>>
>> I am working at a university on a high-profile, database-driven project
>> that we expect to be slammed within the first few months. Since this is
>> a new project and one that we expect to be popular, we don't know what
>> kind of usage to expect, but we want to be prepared. Therefore, we are
>> building in extra capacity.
>>
>> Our top goals are scalability and high availability, provided, we hope,
>> through multiple MySQL nodes and VMWare functionality. I've been
>> surprised that there are not more MySQL architects trying to meet these
>> high-level goals using virtualization and shared storage (or at least
>> they do not seem to be writing about it).
>>
>> I've looked at replication, multi-mastering, DRBD, clustering,
>> partitioning, and sharding.
>> Here's what we got, and some of our constraints:
>>
>> * We are concerned that One Big Database instance won't be enough to
>> handle all of the queries, plus it is a single point of failure.
>> Therefore, multiple nodes are desirable.
>>
>> * With the primary application that will be using the database, writes
>> and reads cannot be split off from each other. This limitation alone
>> rules out replication, MMM, and a few other solutions.
>>
>> * We do not expect to be especially write-heavy.
>>
>> * We have shared storage in the form of an iSCSI SAN. We'd like to
>> leverage the shared storage, if possible.
>>
>> * We have VMWare HA, which already monitors hosts and brings them up
>> within minutes elsewhere if we lose a host. So some of the suggested HA
>> solutions are redundant.
>>
>> * We expect to have another instance of our system running in the Amazon
>> cloud for the first few months while the traffic is high, so we may take
>> advantage of RDS, though an exact duplicate of our local system will
>> save us development work.
>>
>> Thanks for any advice you can give.
>>
>> Wes Modes
>
> As the others have already pointed out, your specifications are rather
> vague and appear to be only hardware-focused.
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Hello Wes,

On 3/29/2012 9:23 PM, Wes Modes wrote:

First, thank you in advance for good solid suggestions you can offer. I suppose someone has already asked this, but perhaps you will view it as a fun challenge to meet my many criteria with your suggested MySQL architecture.

I am working at a university on a high-profile, database-driven project that we expect to be slammed within the first few months. Since this is a new project and one that we expect to be popular, we don't know what kind of usage to expect, but we want to be prepared. Therefore, we are building in extra capacity.

Our top goals are scalability and high availability, provided, we hope, through multiple MySQL nodes and VMWare functionality. I've been surprised that there are not more MySQL architects trying to meet these high-level goals using virtualization and shared storage (or at least they do not seem to be writing about it).

I've looked at replication, multi-mastering, DRBD, clustering, partitioning, and sharding.

Here's what we got, and some of our constraints:

* We are concerned that One Big Database instance won't be enough to handle all of the queries, plus it is a single point of failure. Therefore, multiple nodes are desirable.

* With the primary application that will be using the database, writes and reads cannot be split off from each other. This limitation alone rules out replication, MMM, and a few other solutions.

* We do not expect to be especially write-heavy.

* We have shared storage in the form of an iSCSI SAN. We'd like to leverage the shared storage, if possible.

* We have VMWare HA, which already monitors hosts and brings them up within minutes elsewhere if we lose a host. So some of the suggested HA solutions are redundant.

* We expect to have another instance of our system running in the Amazon cloud for the first few months while the traffic is high, so we may take advantage of RDS, though an exact duplicate of our local system will save us development work.
Thanks for any advice you can give.

Wes Modes

As the others have already pointed out, your specifications are rather vague and appear to be only hardware-focused. I can state with some certainty, through my six years of direct MySQL support experience, that the majority of the problems related to performance are NOT hardware-related. The majority of the problems I need to resolve are related to bad queries, bad table design, and bad application design.

For example, you state:
> * With the primary application that will be using the database, writes
> and reads cannot be split off from each other.
This is already a red flag that your application is not designed for scale.

Also, you say:
> * We are concerned that One Big Database instance won't be enough to
> handle all of the queries, plus it is a single point of failure.
then you say:
> * We have shared storage in the form of an iSCSI SAN. We'd like to
> leverage the shared storage, if possible.
This is another red flag, in that your 'shared storage' once again becomes a single point of failure. Also, MySQL instances do not share files. Period. Each MySQL instance needs its own, completely independent set of files and folders. Please take the hint that replication should be in your very near future.

Here are some solid suggestions:

a) Modularize your code so that WRITES can go to a MASTER server and READS can be distributed over a large number of SLAVE servers.

b) Modularize your data so that you can easily subdivide subsets of it to different master-slave trees. For example, students whose last names start with A-D are assigned to machines in group1, E-K in group2, etc.

c) Optimize your queries to avoid subqueries and to take full advantage of indexes.

d) Build indexes wisely: a well-crafted multi-column index can take up less space and be more functional than multiple single-column indexes.
e) Test, test, and retest: until you can push your system to the choking point in the lab, you will have no idea how much traffic it will be able to handle in the field.

Good luck!

--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
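[Editor's sketch] Shawn's suggestion (a), the read/write split, can be sketched as a small statement router: writes go to the master, reads are spread round-robin over the slaves. The endpoint strings are placeholders, and a real implementation would hand back connections rather than names; it would also have to handle transactions and replication lag, the same issues MySQL Proxy's rw-splitting faces.

```python
import itertools

class ReadWriteRouter:
    """Route writes to the master; spread reads round-robin over slaves."""

    WRITE_VERBS = {"insert", "update", "delete", "replace",
                   "create", "alter", "drop", "truncate"}

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def endpoint_for(self, sql):
        # Classify the statement by its leading verb; anything that is
        # not a known write verb is treated as a read.
        verb = sql.lstrip().split(None, 1)[0].lower()
        return self.master if verb in self.WRITE_VERBS else next(self._slaves)
```

For example, `ReadWriteRouter("master", ["slave1", "slave2"])` sends `INSERT` statements to `master` and alternates `SELECT` statements between `slave1` and `slave2`. The point of suggestion (a) is that the application has to be modular enough that this classification can happen in one place.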
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Hi,

First, it is kind of funny to advise on something that is unknown. The devil of such systems is in the details. A small detail might cancel the whole big idea of using, say, sharding, clustering, etc. So any discussion of this will be quite general and can only be applied to your project with a lot of validation.

Second, different folks have very different things in mind when they say magic words like 'scalability' or 'high availability'. For example, 'high availability' could mean such things as:

- the way to bring the resource back online shall be available at the time the user (application) discovers the failure, and may take a reasonable time to complete;
- the backup resource shall be available at the time the user (application) discovers the failure and shall require no preparation;
- the user (application) shall not be dependent on the detection and resolution mechanisms of the underlying technology stack.

Having said that, the idea of providing HA with VMWare and node reboots qualifies this as a project with "eventual" availability, as described in the first statement above. It is qualified as eventual because the backup system is not verified as being available, and VMWare might fail to bring the node online, which is not controlled by any software.

'Scalability' could also have a number of different meanings:

- the system shall allow placing new growth data on additional nodes;
- the system shall dynamically place new growth data on additional nodes;
- the system shall allow redistributing access to the data among the available nodes;
- the system shall load-balance access to the data among the available nodes;
- the system shall load-balance the access and data across all available nodes and keep all nodes equally loaded.

Now, to the question of one big instance versus multiple instances.
People usually get stuck with one big instance if they just can't break it into smaller instances. They usually can't break it into multiple instances if such instances would have interdependencies which cannot be removed by changing the app design (or the design cannot easily be changed). Such interdependencies may randomly alter your scalability and availability. For example, if you have some critical data on just one node out of a dozen, the availability of the whole system will be impacted if that node is down.

... I ran out of time. But on these subjects, the chatting could go on for years. You may want to explain more clearly what you are trying to do; it could make the discussion more focused.

Peace,
Karen Abgarian.

On Mar 29, 2012, at 6:23 PM, Wes Modes wrote:

> First, thank you in advance for good solid suggestions you can offer. I
> suppose someone has already asked this, but perhaps you will view it as
> a fun challenge to meet my many criteria with your suggested MySQL
> architecture.
>
> I am working at a university on a high-profile, database-driven project
> that we expect to be slammed within the first few months. Since this is
> a new project and one that we expect to be popular, we don't know what
> kind of usage to expect, but we want to be prepared. Therefore, we are
> building in extra capacity.
>
> Our top goals are scalability and high availability, provided, we hope,
> through multiple MySQL nodes and VMWare functionality. I've been
> surprised that there are not more MySQL architects trying to meet these
> high-level goals using virtualization and shared storage (or at least
> they do not seem to be writing about it).
>
> I've looked at replication, multi-mastering, DRBD, clustering,
> partitioning, and sharding.
>
> Here's what we got, and some of our constraints:
>
> * We are concerned that One Big Database instance won't be enough to
> handle all of the queries, plus it is a single point of failure.
> Therefore, multiple nodes are desirable.
> * With the primary application that will be using the database, writes
> and reads cannot be split off from each other. This limitation alone
> rules out replication, MMM, and a few other solutions.
>
> * We do not expect to be especially write-heavy.
>
> * We have shared storage in the form of an iSCSI SAN. We'd like to
> leverage the shared storage, if possible.
>
> * We have VMWare HA, which already monitors hosts and brings them up
> within minutes elsewhere if we lose a host. So some of the suggested HA
> solutions are redundant.
>
> * We expect to have another instance of our system running in the Amazon
> cloud for the first few months while the traffic is high, so we may take
> advantage of RDS, though an exact duplicate of our local system will
> save us development work.
>
> Thanks for any advice you can give.
>
> Wes Modes
>
> --
> Wes Modes
> Systems Designer, Developer, and Administrator
> University Library ITS
> University of California, Santa Cruz
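[Editor's sketch] The shared-nothing split Karen alludes to, and which Shawn illustrated earlier with "last names A-D in group1, E-K in group2", amounts to a deterministic shard map: each record's key decides its node, and no shard depends on another. The boundaries and group names below are invented for illustration.

```python
import bisect

# Upper bounds (exclusive) for each shard of the last-name space:
# A-D -> group1, E-K -> group2, L-R -> group3, S-Z -> group4.
SHARD_BOUNDS = ["E", "L", "S"]
SHARD_NAMES = ["group1", "group2", "group3", "group4"]

def shard_for(last_name):
    """Map a last name to its shard group, deterministically."""
    initial = last_name.strip().upper()[:1]
    return SHARD_NAMES[bisect.bisect_right(SHARD_BOUNDS, initial)]
```

Because the mapping is a pure function of the key, every application instance computes the same answer with no coordination; the trade-off, as the thread notes, is that any query spanning shards (or any "critical data" living on only one shard) reintroduces exactly the interdependency Karen warns about.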
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion Wanted
Caution: you are not going to like my answers.

> and VMWare
> shared storage
Why? Seems like scalability should plan on having dedicated hardware.

> replication
The best choice.

> multi-mastering
Dual-Master gives good HA.

> DRBD
Partially solves one subset of HA; don't bother with it; set your sights higher.

> clustering
Maybe.

> PARTITION
Only if you have really big tables.

> Sharding
Only if you are writing faster than a single server can handle -- and I say that even for Dual-Master setups.

> With the primary application that will be using the database, writes and reads cannot be split off from each other.
That is probably false. Yes, it will take some deeper thought to understand how to split them, or how to split them in certain situations.

> We do not expect to be especially write-heavy.
Skip sharding; use readonly slaves.

> We have shared storage
Single point of failure --> there goes your HA.

> We have VMWare HA which already monitors hosts and brings them up within minutes elsewhere if we lose a host.
That is thinking at the hardware level. But will it really work for Dual-Master?

> another instance of our system running in the Amazon cloud
Bother. More complexity.

On 3/29/12 6:23 PM, Wes Modes wrote:

First, thank you in advance for good solid suggestions you can offer. I suppose someone has already asked this, but perhaps you will view it as a fun challenge to meet my many criteria with your suggested MySQL architecture.

I am working at a university on a high-profile, database-driven project that we expect to be slammed within the first few months. Since this is a new project and one that we expect to be popular, we don't know what kind of usage to expect, but we want to be prepared. Therefore, we are building in extra capacity.

Our top goals are scalability and high availability, provided, we hope, through multiple MySQL nodes and VMWare functionality.
I've been surprised that there are not more MySQL architects trying to meet these high-level goals using virtualization and shared storage (or at least they do not seem to be writing about it).

I've looked at replication, multi-mastering, DRBD, clustering, partitioning, and sharding.

Here's what we got, and some of our constraints:

* We are concerned that One Big Database instance won't be enough to handle all of the queries, plus it is a single point of failure. Therefore, multiple nodes are desirable.

* With the primary application that will be using the database, writes and reads cannot be split off from each other. This limitation alone rules out replication, MMM, and a few other solutions.

* We do not expect to be especially write-heavy.

* We have shared storage in the form of an iSCSI SAN. We'd like to leverage the shared storage, if possible.

* We have VMWare HA, which already monitors hosts and brings them up within minutes elsewhere if we lose a host. So some of the suggested HA solutions are redundant.

* We expect to have another instance of our system running in the Amazon cloud for the first few months while the traffic is high, so we may take advantage of RDS, though an exact duplicate of our local system will save us development work.

Thanks for any advice you can give.

Wes Modes

--
Rick James - MySQL Geek
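[Editor's sketch] A footnote on Rick's "use readonly slaves" advice: MySQL replication is asynchronous, so read routing usually checks slave lag first. In this illustrative sketch the `lag_of` callback stands in for querying `Seconds_Behind_Master` from `SHOW SLAVE STATUS` (returning `None` when the slave thread is stopped or unreachable); the names and threshold are invented.

```python
def pick_read_host(master, slaves, lag_of, max_lag=5):
    """Prefer the freshest slave within max_lag seconds of the master;
    fall back to reading from the master when no slave qualifies, so
    badly stale data is never served."""
    fresh = [s for s in slaves
             if lag_of(s) is not None and lag_of(s) <= max_lag]
    return min(fresh, key=lag_of) if fresh else master
```

Falling back to the master keeps reads correct at the cost of load; the opposite choice (serve stale reads) is also defensible for a mostly-read exhibition site like the one discussed here, and which trade-off to make is exactly the kind of requirement the thread keeps asking Wes to pin down.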