Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
Hi Eric,

>> At least with Maatkit, you get transparency. We make a concerted
>> effort to update the RISKS section of each tool with each release,
>> so there is full disclosure.
>
> Fair enough, but I still found the warnings a little too scary. A more
> complete explanation of the exact nature of the bugs and the exact
> circumstances under which I should be concerned about triggering them
> would have increased my comfort level.

I've made a note to review these, because the ones I checked have kind of drifted from their original purity.

I updated the RISKS section for mk-table-sync the other day. I checked it and agreed with you -- it didn't distinguish between cases where there is actually a risk, or cases where the tool would just refuse to work (which isn't a risk IMO). And it sounded ambiguously scary in a don't-blame-us, we're-avoiding-your-eyes kind of way because of passive voice. You can see my changes here:

http://code.google.com/p/maatkit/source/detail?r=5269

I think that's a pretty realistic balanced statement of risk: you are playing with a powerful tool, so learn how to use it first.

Thanks for the feedback! BTW, there's also a Maatkit mailing list that I watch closely:

http://groups.google.com/group/maatkit-discuss

- Baron

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org
RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
Hi Baron,

> I'm the primary author of Maatkit.

Awkward... :-)

> What can I say -- you could go buy a commercial off-the-shelf tool
> and believe the song and dance they feed you about the tool
> being perfect.

There's not a single commercial software solution in our toolbox. We're big fans of CentOS, LVS, heartbeat, ldirectord, tomcat, MySQL, Xen, pureFTP, and more. We've been happy with the performance and reliability of all of our FOSS tools. I'm definitely not a Kool-aid drinker when it comes to commercial product marketing.

> At least with Maatkit, you get transparency. We make a concerted
> effort to update the RISKS section of each tool with each release,
> so there is full disclosure.

Fair enough, but I still found the warnings a little too scary. A more complete explanation of the exact nature of the bugs and the exact circumstances under which I should be concerned about triggering them would have increased my comfort level.

> I think Maatkit is by far the best solution for live master-slave sync
> in most real-world situations.

We'll give it another look.

--
Eric Robinson

Disclaimer - December 9, 2009
This email and any files transmitted with it are confidential and intended solely for Baron Schwartz,Gavin Towey,Tom Worster,my...@lists.mysql.com. If you are not the named addressee you should not disseminate, distribute, copy or alter this email. Any views or opinions presented in this email are solely those of the author and might not represent those of . Warning: Although has taken reasonable precautions to ensure no viruses are present in this email, the company cannot accept responsibility for any loss or damage arising from the use of this email or attachments.
This disclaimer was added by Policy Patrol: http://www.policypatrol.com/
Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
Eric,

>> There are ways to resync data that don't involve all
>> this as well: Maatkit has some tools
>
> I've looked with great interest at Maatkit, but their tools are replete
> with warnings about dangers, bugs, and crashes. They certainly do not
> inspire confidence.

I'm the primary author of Maatkit. What can I say -- you could go buy a commercial off-the-shelf tool and believe the song and dance they feed you about the tool being perfect. At least with Maatkit, you get transparency. We make a concerted effort to update the RISKS section of each tool with each release, so there is full disclosure.

I think Maatkit is by far the best solution for live master-slave sync in most real-world situations.

- Baron
RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
>> I would never have any confidence that the replication
>> is solid enough to use the slave server for backup purposes.
>
> I agree completely there. That's the other reason I like filesystem
> snapshots: they allow you to take a backup from the master
> relatively painlessly.

I've thought of using snapshots. Offhand, I can't remember the reason that I decided they would not work for us. It'll come to me...

--
Eric Robinson
December 4, 2009
RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
> I would never have any confidence that the replication is solid
> enough to use the slave server for backup purposes.

I agree completely there. That's the other reason I like filesystem snapshots: they allow you to take a backup from the master relatively painlessly.

-----Original Message-----
From: Robinson, Eric [mailto:eric.robin...@psmnv.com]
Sent: Friday, December 04, 2009 1:24 PM
To: Gavin Towey; Tom Worster; mysql@lists.mysql.com
Subject: RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

> I would say that it's very important to know why data
> is getting out of sync between your master and slave.

Ultimately, I agree. But since it's a canned application, getting to that point might be hard, and once it is resolved, new issues might arise. I would never have any confidence that the replication is solid enough to use the slave server for backup purposes. (Which, by the way, is the real reason I'm doing this. In the middle of the night, when there are few users on the system, I want to back up the slave, but first I want to make sure I have a 100% reliable copy of the data.)

> There are ways to resync data that don't involve all
> this as well: Maatkit has some tools

I've looked with great interest at Maatkit, but their tools are replete with warnings about dangers, bugs, and crashes. They certainly do not inspire confidence.

--
Eric Robinson

This message contains confidential information and is intended only for the individual named. If you are not the named addressee, you are notified that reviewing, disseminating, disclosing, copying or distributing this e-mail is strictly prohibited. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any loss or damage caused by viruses or errors or omissions in the contents of this message, which arise as a result of e-mail transmission. [FriendFinder Networks, Inc., 220 Humbolt court, Sunnyvale, CA 94089, USA, FriendFinder.com
RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
> I would say that it's very important to know why data
> is getting out of sync between your master and slave.

Ultimately, I agree. But since it's a canned application, getting to that point might be hard, and once it is resolved, new issues might arise. I would never have any confidence that the replication is solid enough to use the slave server for backup purposes. (Which, by the way, is the real reason I'm doing this. In the middle of the night, when there are few users on the system, I want to back up the slave, but first I want to make sure I have a 100% reliable copy of the data.)

> There are ways to resync data that don't involve all
> this as well: Maatkit has some tools

I've looked with great interest at Maatkit, but their tools are replete with warnings about dangers, bugs, and crashes. They certainly do not inspire confidence.

--
Eric Robinson
December 4, 2009
Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
On 12/4/09 3:14 PM, "Gavin Towey" wrote:

> I would say that it's very important to know why data is getting out of sync
> between your master and slave. Fixing those root causes would eliminate the
> need for this.

i very much agree. the only instances of slaves getting out of whack that i've experienced was when i screwed something up administratively.

> There are cases where non-deterministic queries will produce
> different results, but that's what row based replication is supposed to solve
> =)

16.3.1 lists some interesting cases to consider:
http://dev.mysql.com/doc/refman/5.0/en/replication-features.html

> There are ways to resync data that don't involve all this as well: Maatkit
> has some tools that compare data between servers, and can fix them with
> queries. No stopping the slave or locking the master necessary. I've used
> them in production with good results.

thanks for the pointer. looks handy.
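
For readers who want to try the Maatkit route mentioned above, here is a minimal sketch of how a cautious mk-table-sync invocation might be assembled. It only builds the argument list and runs nothing. The helper name mk_table_sync_argv and the DSN h=slave1 are hypothetical, and the flags (--print, --execute, --sync-to-master) are taken from the Maatkit documentation as I understand it, so check them and the RISKS section against your installed version first.

```python
def mk_table_sync_argv(slave_dsn, execute=False):
    """Build an argv list for mk-table-sync (nothing is executed here).

    slave_dsn is a Maatkit-style DSN such as "h=slave1" (hypothetical host).
    With execute=False the tool is asked only to print the statements it
    would run, which is the safer first step on a production system.
    """
    mode = "--execute" if execute else "--print"
    # --sync-to-master treats the given DSN as a slave and syncs it
    # to whatever master it is replicating from.
    return ["mk-table-sync", mode, "--sync-to-master", slave_dsn]


if __name__ == "__main__":
    print(" ".join(mk_table_sync_argv("h=slave1")))
```

Reviewing the --print output before switching to --execute fits the "learn how to use it first" advice given elsewhere in this thread.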
RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
I think he's trying to say that this method wouldn't work for innodb, unless you copied files from an LVM snapshot, or something similar.

I would say that it's very important to know why data is getting out of sync between your master and slave. Fixing those root causes would eliminate the need for this. There are cases where non-deterministic queries will produce different results, but that's what row based replication is supposed to solve =)

There are ways to resync data that don't involve all this as well: Maatkit has some tools that compare data between servers, and can fix them with queries. No stopping the slave or locking the master necessary. I've used them in production with good results.

Regards,
Gavin Towey

-----Original Message-----
From: Robinson, Eric [mailto:eric.robin...@psmnv.com]
Sent: Friday, December 04, 2009 9:00 AM
To: Tom Worster; mysql@lists.mysql.com
Subject: RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

> (1) innodb?

It's an off-the-shelf application that uses MyISAM tables. It is possible to convert to innodb, but I have not been sold on innodb in terms of its performance characteristics for this particular application. Maybe I've been reading the wrong stuff. Do you have general thoughts on the differences with respect to performance?

> (2) why delete slave logs when you can
> restart the slave with --skip-slave and
> then use CHANGE MASTER TO?

Well... I guess mainly because I didn't know about that option! I thought I needed to "fake out" mysql on this, but it sounds like I can just do 'flush tables with read lock;reset master;' on the master and 'change master to...;' on the slave. So cool. Thanks for the input!

--
Eric Robinson
Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
On 12/4/09 11:59 AM, "Robinson, Eric" wrote:

>> (2) why delete slave logs when you can
>> restart the slave with --skip-slave and
>> then use CHANGE MASTER TO?
>
> Well... I guess mainly because I didn't know about that option! I
> thought I needed to "fake out" mysql on this, but it sounds like I can
> just do 'flush tables with read lock;reset master;' on the master and
> 'change master to...;' on the slave. So cool. Thanks for the input!

16.1.1 is probably my favorite chapter of the manual. 16.1.1.8 is particularly worth a read.

http://dev.mysql.com/doc/refman/5.0/en/replication-howto-existingdata.html
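
As a rough illustration of the 'change master to...;' step discussed above, the sketch below formats the statement from the File and Position columns that SHOW MASTER STATUS reports on the master. The helper name change_master_sql and the sample values are hypothetical. The binary log name and starting position vary by server configuration and version, so read them from the master rather than assuming them. The statement is meant to be run on the slave while its replication threads are stopped (the full name of the startup option mentioned above is --skip-slave-start), followed by START SLAVE.

```python
def change_master_sql(log_file, log_pos, host=None):
    """Format a CHANGE MASTER TO statement for the slave.

    log_file and log_pos should come from SHOW MASTER STATUS on the master,
    taken while the read lock is still held. host is optional; leave it
    out to keep the master host the slave already has configured.
    """
    if not isinstance(log_pos, int) or log_pos < 0:
        raise ValueError("log_pos must be a non-negative integer")
    for value in (log_file, host or ""):
        # Refuse quoting characters instead of trying to escape them.
        if "'" in value or "\\" in value:
            raise ValueError("unexpected character in %r" % value)
    parts = []
    if host is not None:
        parts.append("MASTER_HOST='%s'" % host)
    parts.append("MASTER_LOG_FILE='%s'" % log_file)
    parts.append("MASTER_LOG_POS=%d" % log_pos)
    return "CHANGE MASTER TO " + ", ".join(parts) + ";"


if __name__ == "__main__":
    # Sample values only; use what your own master reports.
    print(change_master_sql("mysql-bin.000001", 98))
```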
RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
> (1) innodb?

It's an off-the-shelf application that uses MyISAM tables. It is possible to convert to innodb, but I have not been sold on innodb in terms of its performance characteristics for this particular application. Maybe I've been reading the wrong stuff. Do you have general thoughts on the differences with respect to performance?

> (2) why delete slave logs when you can
> restart the slave with --skip-slave and
> then use CHANGE MASTER TO?

Well... I guess mainly because I didn't know about that option! I thought I needed to "fake out" mysql on this, but it sounds like I can just do 'flush tables with read lock;reset master;' on the master and 'change master to...;' on the slave. So cool. Thanks for the input!

--
Eric Robinson
December 4, 2009
Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)
i have two questions.

(1) innodb?

(2) why delete slave logs when you can restart the slave with --skip-slave and then use CHANGE MASTER TO?

tom

On 12/4/09 6:34 AM, "Robinson, Eric" wrote:

> Let's face it, sometimes the master and slave get out of sync, even when
> 'show slave status' and 'show master status' indicate that all is well.
> And sometimes it is not feasible to wait until after production hours to
> resync them. We've been working on a method to do an emergency
> hot-resync during production hours with little or no user downtime. What
> do you guys think of this approach? It's only for Linux, though...
>
> 1. Shut down the slave and remove its replication logs (master.info and
> *relay* files).
>
> 2. Do an initial rsync of the master to the slave. Using rsync's
> bit-differential algorithm, this quickly copies most of the changed data
> and can safely be done against a live database. This initial rsync is
> done before the next step to minimize the time during which the tables
> will be read-locked.
>
> 3. Do a 'flush tables with read lock;reset master' on the master server.
> At this point, user apps may freeze briefly during inserts or updates.
>
> 4. Do a second rsync, which goes very fast because very little data has
> changed between steps 2 and 3.
>
> 5. Unlock the master tables.
>
> 6. Restart the slave.
>
> When you're done, you have a 100% binary duplicate of the master
> database on the slave, with no worries that some queries got missed
> somewhere. The master was never stopped and users were not severely
> impacted. (Mileage may vary, of course.)
>
> We've tried this a few times and it has seemed to work well in most
> cases. We had one case where the slave SQL thread did not want to
> restart afterwards and we had to do the whole thing again, only we
> stopped the master the second time. Not yet sure what that was all
> about, but I think it may have been a race issue of some kind. We're
> still exploring it.
>
> Anyway, comments would be appreciated.
>
> --
> Eric Robinson
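
The six steps in the quoted post can be laid out as an ordered plan. The sketch below only builds and prints that plan and executes nothing. The function name resync_plan, the data directory /var/lib/mysql, the host name slave1, and the rsync -a flag are all assumptions made for illustration, since the post does not say which paths or rsync options were used. One detail worth making explicit: the global read lock lasts only as long as the client session that took it stays open, so the lock, the second rsync, and the unlock have to be driven while a single master session is held open.

```python
def resync_plan(datadir, slave_host):
    """Return the hot-resync steps as (where, action) pairs, in order.

    "master-session" marks statements that must all be issued through the
    same open connection to the master, because the read lock is released
    as soon as that connection closes.
    """
    src = datadir.rstrip("/") + "/"
    # rsync -a is an assumed option set; the original post names no flags.
    rsync = "rsync -a %s %s:%s" % (src, slave_host, src)
    return [
        ("slave", "stop mysqld; remove master.info and *relay* files"),
        ("master", rsync),                       # step 2: bulk copy, no lock
        ("master-session", "FLUSH TABLES WITH READ LOCK;"),
        ("master-session", "RESET MASTER;"),     # step 3
        ("master", rsync),                       # step 4: short catch-up copy
        ("master-session", "UNLOCK TABLES;"),    # step 5
        ("slave", "start mysqld"),               # step 6
    ]


if __name__ == "__main__":
    for number, (where, action) in enumerate(resync_plan("/var/lib/mysql", "slave1"), 1):
        print("%d. [%s] %s" % (number, where, action))
```

As noted elsewhere in the thread, copying files this way suits MyISAM tables; it would not work for innodb unless the files came from an LVM snapshot or something similar.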