Which replication solution should I choose?

2014-10-29 Thread Rafał Radecki
Hi All :)

I am creating an environment based on about 15 hardware nodes. 6 of them
will be for mysql databases. They will be divided into three pairs (3x2
nodes). Nodes in every pair will be configured with replication.

I done similar configuration about an year ago. I used than:
- percona 5.5 with standard asynchronous replication;
- Multi Master Replication Manager for MySQL (http://mysql-mmm.org/) for
automatic assignment/failover of read and write IP addresses and checking
the status of replication (mmm provides status scripts);
- Pacemaker to create a cluster in which scripts for Multi Master
Replication Manager were used.

My question is: I heard that there are other options than standard
asynchronous replication which sometimes was problematic (for example there
was one slave thread only in percona 5.5 and there was a possibility that
slave node will fall behind master node in high load situations). I also am
thinking which MySQL fork is the best option if I plan to use replication.
What can you propose based on your experience?

BR,
Rafal.


Re: Which replication solution should I choose?

2014-10-29 Thread Heck, Walter
Hi Rafael,
On Wed, Oct 29, 2014 at 10:15 AM, Rafał Radecki radecki.ra...@gmail.com
wrote:

 I am creating an environment based on about 15 hardware nodes. 6 of them
 will be for mysql databases. They will be divided into three pairs (3x2
 nodes). Nodes in every pair will be configured with replication.

 I done similar configuration about an year ago. I used than:
 - percona 5.5 with standard asynchronous replication;
 - Multi Master Replication Manager for MySQL (http://mysql-mmm.org/
 https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/df42c03e188abb8c80b981666b82e7ff?ytl=http%3A%2F%2Fmysql-mmm.org%2F)
 for
 automatic assignment/failover of read and write IP addresses and checking
 the status of replication (mmm provides status scripts);

While it works fine for many situations, I wouldn't recommend using it for
any new setups. There hasn't been any updates for years and there are a
decent number of edge cases where MMM fails miserably. There are better
alternatives out there these days.


 My question is: I heard that there are other options than standard
 asynchronous replication which sometimes was problematic (for example there
 was one slave thread only in percona 5.5 and there was a possibility that
 slave node will fall behind master node in high load situations). I also am
 thinking which MySQL fork is the best option if I plan to use replication.
 What can you propose based on your experience?

If you want to stick with standard replication, either MariaDB 10 or
Percona 5.6 will do just fine. Instead of MMM you can google for MHA for
instance.

I would recommend taking a look at Galera though, which takes a different
approach but with some nice benefits. The one thing is that with Galera
your data generally lives on 3 or more servers (2 is possible, but not
recommended). If that is a good idea to you, then Galera would be my
personal preference.


-- 
Best regards,

Walter Heck
CEO / Founder OlinData
https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/fd1ca40450db4c95f61e02cfe4940db2?ytl=http%3A%2F%2Folindata.com%2F%3Fsrc%3Dwh_gapp
- Open Source Training  Consulting

Check out our upcoming trainings
https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/63ebe1eaf25f15c25c5b43a8b2954a8d?ytl=http%3A%2F%2Folindata.com%2Ftraining%2Fupcoming


Re: Which replication solution should I choose?

2014-10-29 Thread Rafał Radecki
Thanks for the info, Walter.
I checked some basic info about Galera and it looks very promising. Can you
tell what is preferable for loadbalancing of requests? HAProxy/Galera
loadbalancer or maybe something else?

Can you tell me also how much does MHA differ from MMM? Functionality looks
quite similar.

BR,
Rafal.

2014-10-29 11:38 GMT+01:00 Heck, Walter walterh...@olindata.com:

 Hi Rafael,
 On Wed, Oct 29, 2014 at 10:15 AM, Rafał Radecki radecki.ra...@gmail.com
 wrote:

 I am creating an environment based on about 15 hardware nodes. 6 of them
 will be for mysql databases. They will be divided into three pairs (3x2
 nodes). Nodes in every pair will be configured with replication.

 I done similar configuration about an year ago. I used than:
 - percona 5.5 with standard asynchronous replication;
 - Multi Master Replication Manager for MySQL (http://mysql-mmm.org/
 https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/df42c03e188abb8c80b981666b82e7ff?ytl=http%3A%2F%2Fmysql-mmm.org%2F)
 for
 automatic assignment/failover of read and write IP addresses and checking
 the status of replication (mmm provides status scripts);

 While it works fine for many situations, I wouldn't recommend using it for
 any new setups. There hasn't been any updates for years and there are a
 decent number of edge cases where MMM fails miserably. There are better
 alternatives out there these days.


 My question is: I heard that there are other options than standard
 asynchronous replication which sometimes was problematic (for example
 there
 was one slave thread only in percona 5.5 and there was a possibility that
 slave node will fall behind master node in high load situations). I also
 am
 thinking which MySQL fork is the best option if I plan to use replication.
 What can you propose based on your experience?

 If you want to stick with standard replication, either MariaDB 10 or
 Percona 5.6 will do just fine. Instead of MMM you can google for MHA for
 instance.

 I would recommend taking a look at Galera though, which takes a different
 approach but with some nice benefits. The one thing is that with Galera
 your data generally lives on 3 or more servers (2 is possible, but not
 recommended). If that is a good idea to you, then Galera would be my
 personal preference.


 --
 Best regards,

 Walter Heck
 CEO / Founder OlinData
 https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/fd1ca40450db4c95f61e02cfe4940db2?ytl=http%3A%2F%2Folindata.com%2F%3Fsrc%3Dwh_gapp
 - Open Source Training  Consulting

 Check out our upcoming trainings
 https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/407ea34fcfb29332bbc93cdb34f13e4f/63ebe1eaf25f15c25c5b43a8b2954a8d?ytl=http%3A%2F%2Folindata.com%2Ftraining%2Fupcoming




Re: Which replication solution should I choose?

2014-10-29 Thread Heck, Walter
Hi Rafal,
On Wed, Oct 29, 2014 at 3:16 PM, Rafał Radecki radecki.ra...@gmail.com
wrote:

 Thanks for the info, Walter.
 I checked some basic info about Galera and it looks very promising. Can
 you tell what is preferable for loadbalancing of requests? HAProxy/Galera
 loadbalancer or maybe something else?

We use haProxy for our clients, I blogged about it here:
http://www.olindata.com/blog/2014/04/managing-percona-xtradb-cluster-puppet

If you happen to be at the Percona Live conference next week in London I'm
speaking about Galera in a high performance setup as well.


 Can you tell me also how much does MHA differ from MMM? Functionality
 looks quite similar.

The principles are very similar, the implementation is quite different. MHA
is actively maintained as far as I know. Haven't used it myself though.

-- 
Best regards,

Walter Heck
CEO / Founder OlinData
https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/8fb00d467529a08502c00ecbdee7bfca/48dba89e9de7b75805f230f49ef482e2?ytl=http%3A%2F%2Folindata.com%2F%3Fsrc%3Dwh_gapp
- Open Source Training  Consulting

Check out our upcoming trainings
https://t.yesware.com/tl/2627942011ccc3835f76c6e9ba4c0af91f3d3722/8fb00d467529a08502c00ecbdee7bfca/0ebf153e4161f5822e92cb8f14a23661?ytl=http%3A%2F%2Folindata.com%2Ftraining%2Fupcoming


Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-10-29 Thread Zbigniew
I'm going to establish a large database (ISAM) of simple structure,
say the table with only 3 columns:

- VARCHAR(80)
- VARCHAR(40)
- DATE

The number of rows can be quite large, about hundred million or so.
The first column, containing actual information, will contain unique
values, unlike the two others - but the two others shall be used for
data selection (and I'll index them).

Now to the point: considering, that the second column shall contain
about 100-200 different labels - so in average many of such labels
can be repeated one million times (or even more) - will it speed-up
the selection done with something like ...WHERE label='xyz' AND
date='2013-02-25'  (and maybe the insertion as well?), if instead of
textual labels I'll use SMALLINT as simple integer code for each
different label (and in my program a translation table)?

If so - will I have further performance gain, if instead of
human-readable DATE for the third column I'll use Unix time put into
INTEGER type column, not DATE-type?

And the final question: even, if so - is it worthy? I mean: will the
supposed performance gain be significant (e.g. 2-3 times faster
selection) - and not, say, just 5% faster (only possible to detect by
using benchmarking tools)?

Thanks in advance for your opinions.
-- 
Zbig

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql