Which replication solution should I choose?
Hi All :) I am creating an environment based on about 15 hardware nodes. 6 of them will be for mysql databases. They will be divided into three pairs (3x2 nodes). Nodes in every pair will be configured with replication. I done similar configuration about an year ago. I used than: - percona 5.5 with standard asynchronous replication; - Multi Master Replication Manager for MySQL (http://mysql-mmm.org/) for automatic assignment/failover of read and write IP addresses and checking the status of replication (mmm provides status scripts); - Pacemaker to create a cluster in which scripts for Multi Master Replication Manager were used. My question is: I heard that there are other options than standard asynchronous replication which sometimes was problematic (for example there was one slave thread only in percona 5.5 and there was a possibility that slave node will fall behind master node in high load situations). I also am thinking which MySQL fork is the best option if I plan to use replication. What can you propose based on your experience? BR, Rafal.
Re: Which replication solution should I choose?
Hi Rafael, On Wed, Oct 29, 2014 at 10:15 AM, Rafał Radecki radecki.ra...@gmail.com wrote: I am creating an environment based on about 15 hardware nodes. 6 of them will be for mysql databases. They will be divided into three pairs (3x2 nodes). Nodes in every pair will be configured with replication. I done similar configuration about an year ago. I used than: - percona 5.5 with standard asynchronous replication; - Multi Master Replication Manager for MySQL (http://mysql-mmm.org/ https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/df42c03e188abb8c80b981666b82e7ff?ytl=http%3A%2F%2Fmysql-mmm.org%2F) for automatic assignment/failover of read and write IP addresses and checking the status of replication (mmm provides status scripts); While it works fine for many situations, I wouldn't recommend using it for any new setups. There hasn't been any updates for years and there are a decent number of edge cases where MMM fails miserably. There are better alternatives out there these days. My question is: I heard that there are other options than standard asynchronous replication which sometimes was problematic (for example there was one slave thread only in percona 5.5 and there was a possibility that slave node will fall behind master node in high load situations). I also am thinking which MySQL fork is the best option if I plan to use replication. What can you propose based on your experience? If you want to stick with standard replication, either MariaDB 10 or Percona 5.6 will do just fine. Instead of MMM you can google for MHA for instance. I would recommend taking a look at Galera though, which takes a different approach but with some nice benefits. The one thing is that with Galera your data generally lives on 3 or more servers (2 is possible, but not recommended). If that is a good idea to you, then Galera would be my personal preference. -- Best regards, Walter Heck CEO / Founder OlinData https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/fd1ca40450db4c95f61e02cfe4940db2?ytl=http%3A%2F%2Folindata.com%2F%3Fsrc%3Dwh_gapp - Open Source Training Consulting Check out our upcoming trainings https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/63ebe1eaf25f15c25c5b43a8b2954a8d?ytl=http%3A%2F%2Folindata.com%2Ftraining%2Fupcoming
Re: Which replication solution should I choose?
Thanks for the info, Walter. I checked some basic info about Galera and it looks very promising. Can you tell what is preferable for loadbalancing of requests? HAProxy/Galera loadbalancer or maybe something else? Can you tell me also how much does MHA differ from MMM? Functionality looks quite similar. BR, Rafal. 2014-10-29 11:38 GMT+01:00 Heck, Walter walterh...@olindata.com: Hi Rafael, On Wed, Oct 29, 2014 at 10:15 AM, Rafał Radecki radecki.ra...@gmail.com wrote: I am creating an environment based on about 15 hardware nodes. 6 of them will be for mysql databases. They will be divided into three pairs (3x2 nodes). Nodes in every pair will be configured with replication. I done similar configuration about an year ago. I used than: - percona 5.5 with standard asynchronous replication; - Multi Master Replication Manager for MySQL (http://mysql-mmm.org/ https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/df42c03e188abb8c80b981666b82e7ff?ytl=http%3A%2F%2Fmysql-mmm.org%2F) for automatic assignment/failover of read and write IP addresses and checking the status of replication (mmm provides status scripts); While it works fine for many situations, I wouldn't recommend using it for any new setups. There hasn't been any updates for years and there are a decent number of edge cases where MMM fails miserably. There are better alternatives out there these days. My question is: I heard that there are other options than standard asynchronous replication which sometimes was problematic (for example there was one slave thread only in percona 5.5 and there was a possibility that slave node will fall behind master node in high load situations). I also am thinking which MySQL fork is the best option if I plan to use replication. What can you propose based on your experience? If you want to stick with standard replication, either MariaDB 10 or Percona 5.6 will do just fine. Instead of MMM you can google for MHA for instance. I would recommend taking a look at Galera though, which takes a different approach but with some nice benefits. The one thing is that with Galera your data generally lives on 3 or more servers (2 is possible, but not recommended). If that is a good idea to you, then Galera would be my personal preference. -- Best regards, Walter Heck CEO / Founder OlinData https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/fd1ca40450db4c95f61e02cfe4940db2?ytl=http%3A%2F%2Folindata.com%2F%3Fsrc%3Dwh_gapp - Open Source Training Consulting Check out our upcoming trainings https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/63ebe1eaf25f15c25c5b43a8b2954a8d?ytl=http%3A%2F%2Folindata.com%2Ftraining%2Fupcoming
Re: Which replication solution should I choose?
Hi Rafal, On Wed, Oct 29, 2014 at 3:16 PM, Rafał Radecki radecki.ra...@gmail.com wrote: Thanks for the info, Walter. I checked some basic info about Galera and it looks very promising. Can you tell what is preferable for loadbalancing of requests? HAProxy/Galera loadbalancer or maybe something else? We use haProxy for our clients, I blogged about it here: http://www.olindata.com/blog/2014/04/managing-percona-xtradb-cluster-puppet If you happen to be at the Percona Live conference next week in London I'm speaking about Galera in a high performance setup as well. Can you tell me also how much does MHA differ from MMM? Functionality looks quite similar. The principles are very similar, the implementation is quite different. MHA is actively maintained as far as I know. Haven't used it myself though. -- Best regards, Walter Heck CEO / Founder OlinData https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/8fb00d467529a08502c00ecbdee7bfca/48dba89e9de7b75805f230f49ef482e2?ytl=http%3A%2F%2Folindata.com%2F%3Fsrc%3Dwh_gapp - Open Source Training Consulting Check out our upcoming trainings https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/8fb00d467529a08502c00ecbdee7bfca/0ebf153e4161f5822e92cb8f14a23661?ytl=http%3A%2F%2Folindata.com%2Ftraining%2Fupcoming
Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
I'm going to establish a large database (ISAM) of simple structure, say the table with only 3 columns: - VARCHAR(80) - VARCHAR(40) - DATE The number of rows can be quite large, about hundred million or so. The first column, containing actual information, will contain unique values, unlike the two others - but the two others shall be used for data selection (and I'll index them). Now to the point: considering, that the second column shall contain about 100-200 different labels - so in average many of such labels can be repeated one million times (or even more) - will it speed-up the selection done with something like ...WHERE label='xyz' AND date='2013-02-25' (and maybe the insertion as well?), if instead of textual labels I'll use SMALLINT as simple integer code for each different label (and in my program a translation table)? If so - will I have further performance gain, if instead of human-readable DATE for the third column I'll use Unix time put into INTEGER type column, not DATE-type? And the final question: even, if so - is it worthy? I mean: will the supposed performance gain be significant (e.g. 2-3 times faster selection) - and not, say, just 5% faster (only possible to detect by using benchmarking tools)? Thanks in advance for your opinions. -- Zbig -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql