Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-10 Thread Baron Schwartz
Hi Eric,

>> At least with Maatkit, you get transparency.  We make a concerted
>> effort to update the RISKS section of each tool with each release, so
>> there is full disclosure.
>
> Fair enough, but I still found the warnings a little too scary. A more
> complete explanation of the exact nature of the bugs and the exact
> circumstances under which I should be concerned about triggering them
> would have increased my comfort level.

I've made a note to review these, because the ones I checked have
drifted from their original purity.  I updated the RISKS section
for mk-table-sync the other day.  I checked it and agreed with you:
it didn't distinguish between cases where there is actually a risk and
cases where the tool would just refuse to work (which isn't a risk
IMO).  And, because of the passive voice, it sounded ambiguously scary
in a don't-blame-us, we're-avoiding-your-eyes kind of way.  You
can see my changes here:
http://code.google.com/p/maatkit/source/detail?r=5269
I think that's a pretty realistic, balanced statement of risk: you are
playing with a powerful tool, so learn how to use it first.

Thanks for the feedback!  BTW, there's also a Maatkit mailing list
that I watch closely: http://groups.google.com/group/maatkit-discuss

- Baron

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org



RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-09 Thread Robinson, Eric
Hi Baron,

> I'm the primary author of Maatkit.  

Awkward... :-)

> What can I say -- you could go buy a commercial off-the-shelf tool 
> and believe the song and dance they feed you about the tool 
> being perfect.  

There's not a single commercial software solution in our toolbox. We're
big fans of CentOS, LVS, heartbeat, ldirectord, tomcat, MySQL, Xen,
pureFTP, and more. We've been happy with the performance and reliability
of all of our FOSS tools. I'm definitely not a Kool-aid drinker when it
comes to commercial product marketing.

> At least with Maatkit, you get transparency.  We make a concerted
> effort to update the RISKS section of each tool with each release, so
> there is full disclosure.

Fair enough, but I still found the warnings a little too scary. A more
complete explanation of the exact nature of the bugs and the exact
circumstances under which I should be concerned about triggering them
would have increased my comfort level.  

> I think Maatkit is by far the best solution for live master-slave sync
> in most real-world situations.

We'll give it another look.

--
Eric Robinson



Disclaimer - December 9, 2009 
This email and any files transmitted with it are confidential and intended 
solely for Baron Schwartz,Gavin Towey,Tom Worster,my...@lists.mysql.com. If you 
are not the named addressee you should not disseminate, distribute, copy or 
alter this email. Any views or opinions presented in this email are solely 
those of the author and might not represent those of . Warning: Although  has 
taken reasonable precautions to ensure no viruses are present in this email, 
the company cannot accept responsibility for any loss or damage arising from 
the use of this email or attachments. 
This disclaimer was added by Policy Patrol: http://www.policypatrol.com/




Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-08 Thread Baron Schwartz
Eric,

>> There are ways to resync data that don't involve all
>> this as well:  Maatkit has some tools
>
> I've looked with great interest at Maatkit, but their tools are replete
> with warnings about dangers, bugs, and crashes. They certainly do not
> inspire confidence.

I'm the primary author of Maatkit.  What can I say -- you could go buy
a commercial off-the-shelf tool and believe the song and dance they
feed you about the tool being perfect.  At least with Maatkit, you get
transparency.  We make a concerted effort to update the RISKS section
of each tool with each release, so there is full disclosure.

I think Maatkit is by far the best solution for live master-slave sync
in most real-world situations.

- Baron




RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-04 Thread Robinson, Eric
>> I would never have any confidence that the replication 
>> is solid enough to use the slave server for backup purposes.

> I agree completely there.  That's the other reason I like filesystem
> snapshots: they allow you to take a backup from
> the master relatively painlessly.

I've thought of using snapshots. Offhand, I can't remember the reason
I decided they would not work for us. It'll come to me...

--
Eric Robinson






RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-04 Thread Gavin Towey
> I would never have any confidence that the replication is solid
> enough to use the slave server for backup purposes.

I agree completely there.  That's the other reason I like filesystem snapshots:
they allow you to take a backup from the master relatively painlessly.


This message contains confidential information and is intended only for the 
individual named.  If you are not the named addressee, you are notified that 
reviewing, disseminating, disclosing, copying or distributing this e-mail is 
strictly prohibited.  Please notify the sender immediately by e-mail if you 
have received this e-mail by mistake and delete this e-mail from your system. 
E-mail transmission cannot be guaranteed to be secure or error-free as 
information could be intercepted, corrupted, lost, destroyed, arrive late or 
incomplete, or contain viruses. The sender therefore does not accept liability 
for any loss or damage caused by viruses or errors or omissions in the contents 
of this message, which arise as a result of e-mail transmission. [FriendFinder 
Networks, Inc., 220 Humbolt court, Sunnyvale, CA 94089, USA, FriendFinder.com




RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-04 Thread Robinson, Eric
> I would say that it's very important to know why data 
> is getting out of sync between your master and slave. 

Ultimately, I agree. But since it's a canned application, getting to
that point might be hard, and once it is resolved, new issues might
arise. I would never have any confidence that the replication is solid
enough to use the slave server for backup purposes. (Which, by the way,
is the real reason I'm doing this. In the middle of the night, when
there are few users on the system, I want to back up the slave, but first
I want to make sure I have a 100% reliable copy of the data.)

> There are ways to resync data that don't involve all 
> this as well:  Maatkit has some tools

I've looked with great interest at Maatkit, but their tools are replete
with warnings about dangers, bugs, and crashes. They certainly do not
inspire confidence. 

--
Eric Robinson 







Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-04 Thread Tom Worster
On 12/4/09 3:14 PM, "Gavin Towey"  wrote:

> I would say that it's very important to know why data is getting out of sync
> between your master and slave.  Fixing those root causes would eliminate the
> need for this.

i very much agree. the only instances of slaves getting out of whack that
i've experienced were when i screwed something up administratively.

> There are cases where non-deterministic queries will produce
> different results, but that's what row based replication is supposed to solve
> =)

16.3.1 lists some interesting cases to consider:

http://dev.mysql.com/doc/refman/5.0/en/replication-features.html


> There are ways to resync data that don't involve all this as well:  Maatkit
> has some tools that compare data between servers, and can fix them with
> queries.  No stopping the slave or locking the master necessary.  I've used
> them in production with good results.

thanks for the pointer. looks handy.






RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-04 Thread Gavin Towey
I think he's trying to say that this method wouldn't work for innodb, unless 
you copied files from an LVM snapshot, or something similar.

I would say that it's very important to know why data is getting out of sync 
between your master and slave.  Fixing those root causes would eliminate the 
need for this.  There are cases where non-deterministic queries will produce 
different results, but that's what row based replication is supposed to solve =)
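
A minimal sketch of why a non-deterministic statement can diverge under statement-based replication while row-based replication stays in sync. It is a toy model, not MySQL code: the two helper functions and the use of a UUID as the non-deterministic expression are assumptions made for illustration.

```python
import uuid


def apply_statement(keys):
    """Statement-based model: each server re-evaluates the expression itself.

    Stands in for something like UPDATE t SET token = UUID(); every server
    that runs the statement computes its own values.
    """
    return {pk: str(uuid.uuid4()) for pk in keys}


def apply_row_events(master_rows):
    """Row-based model: the slave receives the master's resulting rows verbatim."""
    return dict(master_rows)


keys = [1, 2, 3]

master = apply_statement(keys)                 # master executes the statement
slave_statement_based = apply_statement(keys)  # slave re-executes the same statement
slave_row_based = apply_row_events(master)     # slave applies the master's row images

print("statement-based in sync:", master == slave_statement_based)
print("row-based in sync:", master == slave_row_based)
```

In this model the statement-based slave ends up with different values because it evaluated the expression independently, whereas the row-based slave copies exactly what the master produced.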

There are ways to resync data that don't involve all this as well:  Maatkit has 
some tools that compare data between servers, and can fix them with queries.  
No stopping the slave or locking the master necessary.  I've used them in 
production with good results.
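
The general idea of "compare data between servers and fix it with queries" can be sketched as follows. This is only an illustration of chunked checksum comparison; it is not Maatkit's actual algorithm or interface, and the function names, the chunking by primary key, and the sample rows are all hypothetical.

```python
import hashlib


def chunk_checksums(rows, chunk_size):
    """Map each chunk number to an MD5 checksum of the rows whose key falls in it."""
    chunks = {}
    for pk in sorted(rows):
        digest = chunks.setdefault(pk // chunk_size, hashlib.md5())
        digest.update(repr((pk, rows[pk])).encode())
    return {number: digest.hexdigest() for number, digest in chunks.items()}


def fixup_actions(master, slave, chunk_size=2):
    """Return (action, key, value) tuples that would make the slave match the master.

    Only chunks whose checksums differ are inspected row by row.
    """
    master_sums = chunk_checksums(master, chunk_size)
    slave_sums = chunk_checksums(slave, chunk_size)
    actions = []
    for number in sorted(set(master_sums) | set(slave_sums)):
        if master_sums.get(number) == slave_sums.get(number):
            continue  # chunk is identical on both sides, skip it
        for pk in sorted(set(master) | set(slave)):
            if pk // chunk_size != number:
                continue
            if pk not in master:
                actions.append(("DELETE", pk, None))
            elif slave.get(pk) != master[pk]:
                actions.append(("REPLACE", pk, master[pk]))
    return actions


# Hypothetical table contents keyed by primary key.
master = {1: "alice", 2: "bob", 3: "carol", 4: "dave"}
slave = {1: "alice", 2: "bob", 3: "carl", 5: "eve"}

for action in fixup_actions(master, slave):
    print(action)
```

Comparing checksums per chunk keeps the amount of data exchanged small; only the chunks that disagree need a row-level comparison, and neither server has to be stopped for it.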

Regards,
Gavin Towey








Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-04 Thread Tom Worster
On 12/4/09 11:59 AM, "Robinson, Eric"  wrote:

>> (2) why delete slave logs when you can
>> restart the slave with --skip-slave-start and
>> then use CHANGE MASTER TO?
> 
> Well... I guess mainly because I didn't know about that option! I
> thought I needed to "fake out" mysql on this, but it sounds like I can
> just do 'flush tables with read lock;reset master;' on the master and
> 'change master to...;' on the slave. So cool. Thanks for the input!

16.1.1 is probably my favorite chapter of the manual. 16.1.1.8 is
particularly worth a read.

http://dev.mysql.com/doc/refman/5.0/en/replication-howto-existingdata.html






RE: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-04 Thread Robinson, Eric
> (1) innodb? 

It's an off-the-shelf application that uses MyISAM tables. It is
possible to convert to InnoDB, but I have not been sold on InnoDB in
terms of its performance characteristics for this particular
application. Maybe I've been reading the wrong stuff. Do you have
general thoughts on the differences with respect to performance?

> (2) why delete slave logs when you can
> restart the slave with --skip-slave-start and
> then use CHANGE MASTER TO?

Well... I guess mainly because I didn't know about that option! I
thought I needed to "fake out" mysql on this, but it sounds like I can
just do 'flush tables with read lock;reset master;' on the master and
'change master to...;' on the slave. So cool. Thanks for the input!
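
A minimal sketch of the statements involved, assuming the binary log coordinates are read from SHOW MASTER STATUS on the master while the read lock is still held. The helper name, host, user, password, log file name, and position below are all hypothetical placeholders, not values from this thread.

```python
def change_master_sql(host, user, password, log_file, log_pos):
    """Build a CHANGE MASTER TO statement from replication coordinates."""
    def quote(value):
        # Escape backslashes and single quotes for a SQL string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST={quote(host)}, "
        f"MASTER_USER={quote(user)}, "
        f"MASTER_PASSWORD={quote(password)}, "
        f"MASTER_LOG_FILE={quote(log_file)}, "
        f"MASTER_LOG_POS={int(log_pos)};"
    )


# Statements run on the master, all in one session so the lock stays held.
master_statements = [
    "FLUSH TABLES WITH READ LOCK;",
    "RESET MASTER;",
    "SHOW MASTER STATUS;",  # note the File and Position columns
]

# Statements run on the slave once the data has been copied.
slave_statements = [
    # Placeholder coordinates; use the values SHOW MASTER STATUS reported.
    change_master_sql("master.example.com", "repl", "secret", "mysql-bin.000001", 98),
    "START SLAVE;",
]

for statement in master_statements + slave_statements:
    print(statement)
```

The read lock taken by FLUSH TABLES WITH READ LOCK is released when the session that took it ends, so the master statements need to run in a connection that stays open until the copy is finished.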

--
Eric Robinson






Re: Here's an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (Much)

2009-12-04 Thread Tom Worster
i have two questions. (1) innodb? (2) why delete slave logs when you can
restart the slave with --skip-slave-start and then use CHANGE MASTER TO?

tom

On 12/4/09 6:34 AM, "Robinson, Eric"  wrote:

>  
> Let's face it, sometimes the master and slave get out of sync, even when
> 'show slave status' and 'show master status' indicate that all is well.
> And sometimes it is not feasible to wait until after production hours to
> resync them. We've been working on a method to do an emergency
> hot-resync during production hours with little or no user downtime. What
> do you guys think of this approach? It's only for Linux, though...
> 
> 1. Shut down the slave and remove its replication logs (master.info and
> *relay* files).
> 
> 2. Do an initial rsync of the master to the slave. Using rsync's
> delta-transfer algorithm, this quickly copies most of the changed data
> and can safely be done against a live database. This initial rsync is
> done before the next step to minimize the time during which the tables
> will be read-locked.
> 
> 3. Do a 'flush tables with read lock;reset master' on the master server.
> At this point, user apps may freeze briefly during inserts or updates.
> 
> 4. Do a second rsync, which goes very fast because very little data has
> changed between steps 2 and 3.
> 
> 5. Unlock the master tables.
> 
> 6. Restart the slave.
> 
> When you're done, you have a 100% binary duplicate of the master
> database on the slave, with no worries that some queries got missed
> somewhere. The master was never stopped and users were not severely
> impacted. (Mileage may vary, of course.)
> 
> We've tried this a few times and it has seemed to work well in most
> cases. We had one case where the slave SQL thread did not want to
> restart afterwards and we had to do the whole thing again, only we
> stopped the master the second time. Not yet sure what that was all
> about, but I think it may have been a race issue of some kind. We're
> still exploring it.
> 
> Anyway, comments would be appreciated.
> 
> --
> Eric Robinson
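
The six steps quoted above can be written down as an ordered dry-run plan. This is a sketch under stated assumptions, not a tested procedure: the function name, host name, and data directory paths are hypothetical, the plain `rsync -a` invocation is an assumption about how the copy would be run, and the plan is only printed, never executed.

```python
def resync_plan(master_datadir, slave_host, slave_datadir):
    """Return the quoted procedure as an ordered list of (where, action) steps."""
    rsync = f"rsync -a {master_datadir}/ {slave_host}:{slave_datadir}/"
    return [
        ("slave shell", "stop mysqld; remove master.info and the relay log files"),
        ("master shell", rsync),  # first pass, tables not locked
        ("master sql", "FLUSH TABLES WITH READ LOCK; RESET MASTER;"),
        ("master shell", rsync),  # second pass, run while the lock is held
        ("master sql", "UNLOCK TABLES;"),
        ("slave shell", "start mysqld"),
    ]


# Hypothetical host and paths for illustration.
plan = resync_plan("/var/lib/mysql", "slave.example.com", "/var/lib/mysql")
for number, (where, action) in enumerate(plan, start=1):
    print(f"{number}. [{where}] {action}")
```

The lock and unlock steps have to be issued from the same open connection, because the read lock is released as soon as the session that took it ends. As noted elsewhere in the thread, copying files this way applies to MyISAM tables; InnoDB would need something like an LVM snapshot instead.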


