Re: SOLVED: Problem with *very* slow replication, FreeBSD 6.2
On Sat, 3 Nov 2007, bob b wrote:

> Good to hear that you found the problem. The only remaining puzzle is why the replica reported that it was up to date when it was several binlogs behind. Possibly the replica was always caught up with the last entry from the very slow link. Perhaps you should report this as a bug? The replication mechanism should be able to check the last binlog being written on the master and report that difference?
>
> Bob Bankay (from home)

The reporting confusion is due to the fact that the "seconds behind master" figure is based on the relay logs and how long it will take to catch up.

For example, I had replication shut down for 45 minutes while feeding millions of writes into the master. On slave restart the binlog dump started, and it went fast. As the relay log grew, so did seconds behind master. Once the relay log was up to date, the "seconds behind master" was based on the execution rate and backlog (something like 12 minutes and counting down).

So, a slave is down for 8 hrs. It comes online and pulls the binlog in 120 seconds. The "seconds behind master" does not reflect 8 hrs, but how many seconds (at the current processing rate) before the slave finishes the relay logs.

The "seconds behind master" value is really "seconds until currency with the relay logs" and should prolly be documented as such. It would be nice if there was a way for the slave to find the actual current master position and compare it with the local state though.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
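The comparison wished for above can be approximated by hand from the output of SHOW MASTER STATUS (run on the master) and SHOW SLAVE STATUS (run on the slave). What follows is only a rough sketch, not anything the server does itself. It assumes the two result rows have already been fetched into plain dictionaries, that binlog names end in a numeric suffix, and that every binlog is close to the configured roll size, which is not guaranteed since logs can rotate early. The function names are made up for illustration.

```python
def binlog_index(name):
    """Return the numeric suffix of a binlog name, e.g. 'mysql-bin.000650' -> 650."""
    return int(name.rsplit(".", 1)[1])


def io_thread_backlog(master_status, slave_status, max_binlog_size=256 * 1024 * 1024):
    """Estimate how far the slave's binlog feed is behind the master.

    master_status: row from SHOW MASTER STATUS (keys 'File', 'Position').
    slave_status:  row from SHOW SLAVE STATUS (keys 'Master_Log_File',
                   'Read_Master_Log_Pos').
    max_binlog_size: assumed roll size; 256M matches the setup in this thread.

    Returns (files_behind, approx_bytes_behind). The byte figure is only an
    estimate because it assumes every skipped binlog is exactly full.
    """
    files_behind = binlog_index(master_status["File"]) - binlog_index(
        slave_status["Master_Log_File"]
    )
    offset = int(master_status["Position"]) - int(slave_status["Read_Master_Log_Pos"])
    return files_behind, files_behind * max_binlog_size + offset


if __name__ == "__main__":
    # Illustrative values only: master on binlog 650, feed stuck on binlog 625.
    master = {"File": "mysql-bin.000650", "Position": 1000}
    slave = {"Master_Log_File": "mysql-bin.000625", "Read_Master_Log_Pos": 400}
    print(io_thread_backlog(master, slave))
```

A non-zero result here while the slave reports zero seconds behind would point at the binlog feed rather than at relay log execution.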
SOLVED: Problem with *very* slow replication, FreeBSD 6.2
An update for those actually paying attention.

I have been fighting unusual performance issues with replication between FreeBSD 6.2 machines. The unusual part is that while replication would never top 10 writes per second (even while the master was taking hundreds of writes per second), the slave always reported zero seconds behind. This is on servers with less than 1% CPU used.

The actual problem was not with writing the binlog, or the slave SQL thread, but the actual transfer of the binlog across the network. After days of running, the slave would be many gigabytes behind the master.

While debugging I tried many things including updating from 5.1.19 to 5.1.22, rebuilding with WITH_PROC_SCOPE_PTH=yes, and even rebuilding using linuxthreads. None of this worked.

The problem was rfc1323... Window scaling *SHOULD* have improved performance given that this is a jumbo frame GigE network. For reasons I don't understand, with rfc1323 enabled the data transfer rate for replication is limited to about 200 Kbyte/sec (I do not see the same slowdown for http or scp transfers).

To verify I rebuilt both systems back to default (native threads), re-inited the Master<->Master replication loop, shut down one of the servers and inserted several million records on the live system (about 1.8 Gbyte of binlog). On restarting the second system it read the binlog into the relay log at 20 - 25 Mbyte/sec. The seconds behind master value showed sane values, and it processed the relay-log backlog at about 6600 writes/sec until finished.

Further testing included 3,000 inserts/sec to each of the servers (6,000/sec total) with the master/master replication loop active. During a run of 10,000,000 inserts to each server, replication was never more than 2 seconds behind.

On Tue, 30 Oct 2007, Christopher E. Brown wrote:

> On Thu, 25 Oct 2007, [EMAIL PROTECTED] wrote:
>
>> Not sure that I get the whole picture. We have been running replication since about 4.0 and we have been through several upgrades and are now at 5.0.27. The 'show slave status' always gives us an accurate reflection of where it is at which is usually 0 seconds behind. Occasionally, it falls behind if the master is really busy (>2200 q/s with about 70% being updates/deletes/inserts). At those times the slave tops out at about 1200 q/s of which most are db mods of some kind and some selects since we have reports running against the replica and it will fall behind temporarily.
>>
>> Can you send show slave status and show master status as well and typical mytop outputs for master and slave? That might let me be able to provide more help.
>>
>> Bob
>
> Unfortunately I had to tear down replication as it was causing problems with the master. (The master will not delete binlogs that a slave is still loading; when the slave is 40 files behind, disk gets short.)
>
> CPU load was near zero on both systems (98% idle or better). Disk load is minimal. The slave is always up to date with relay file processing and reporting zero seconds behind. In short, everything looks fine.
>
> What happens is that the master -> slave binlog feed runs very slow (no more than about 10 writes/sec). So, after a few days the slave is still reporting zero seconds behind, and it is zero seconds behind the relay log. The problem is that while the master is currently writing binlog 650, the slave is actually zero seconds behind the feed, but the binlog feed has fallen 20 - 30 files behind (our binlog rolls at 256M).
>
> Since there is no load issue, I expect there is a timing or trigger issue with the master side proc doing the binlog dump, or the slave side receiving it.
>
> I can stop/start replication and/or reload both servers, it still holds. I see the replication restart, with the slave running zero seconds behind the relay log, the binlog feed starts up right where it left off but the feed only runs at about 10 writes a second.
>
> Are you running native or LinuxThreads? This is smelling like a threading issue to me (we are running FreeBSD 6.2 with native threading and 5.1.19).
>
> The exact same setup was pre-built on Linux systems (2.6.x Slackware) before being built out on the production systems (FreeBSD 6.2). During the testing 1000 writes/sec were no problem (small/simple table, fits in memory). When I forced a backlog of approx 2GB by shutting down the slave, on restart the binlog -> relay log feed ran at over 25MB/sec until caught up.
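To put the transfer rates reported in this message side by side, here is a small arithmetic sketch. It uses only the figures quoted above (about 1.8 Gbyte of binlog, about 200 Kbyte/sec with rfc1323 enabled, 20 - 25 Mbyte/sec afterwards) and assumes decimal units, which the original message does not specify. The function name is made up for illustration.

```python
def transfer_seconds(size_bytes, rate_bytes_per_sec):
    """Time needed to move size_bytes at a steady rate_bytes_per_sec."""
    return size_bytes / rate_bytes_per_sec


# Figures from the message, taken as decimal units (an assumption).
BACKLOG = 1_800_000_000   # about 1.8 Gbyte of binlog
SLOW_RATE = 200_000       # about 200 Kbyte/sec with rfc1323 enabled
FAST_RATE_LOW = 20_000_000   # 20 Mbyte/sec
FAST_RATE_HIGH = 25_000_000  # 25 Mbyte/sec

if __name__ == "__main__":
    slow = transfer_seconds(BACKLOG, SLOW_RATE)
    print(f"at 200 Kbyte/sec: {slow:.0f} s ({slow / 3600:.1f} h)")
    print(f"at 20 Mbyte/sec:  {transfer_seconds(BACKLOG, FAST_RATE_LOW):.0f} s")
    print(f"at 25 Mbyte/sec:  {transfer_seconds(BACKLOG, FAST_RATE_HIGH):.0f} s")
```

Under these assumptions the same backlog takes hours at the throttled rate and under two minutes at the observed healthy rate, which fits a feed that keeps falling further behind a busy master.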
Re: remote connect crash
On Thu, 9 Jan 2003, Dmitry V. Sokolov wrote:

> Good day,
> could you help me to solve this problem?
>
> MySQL server segmentation faults when remote mysql client
> tries to connect on source and binary distributions. Local
> client connect does not cause any problems whatsoever.

The server dies when the connecting host's IP fails a reverse lookup. According to a note I received this morning it was fixed last night in the source tree, and 4.0.9 is being built for release.

--
I route, therefore you are.

Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
Re: Re[2]: mysql 4.0.8- crash on TCP connection
On Thu, 9 Jan 2003, Gelu Gogancea wrote:

> Functions gethostby* ,from glibc, work directly with the /etc/hosts file.If
> this functions didn't find an entry for the client, will be crashed.
> I try to find in the Andrew e-mail if he has installed the glibc 2.2.x but i
> don't see nothing about it.What i see is, he use 2.95.x which is declared by
> MySQL like unstable.In this context can be a coincidence what is happened.
> Also i don't find difference in MYSQL daemon source code(hostname.cc)
> between 4.0.7 and 4.0.8.
> Regards,
>
> Gelu

No, the glibc gethostby* will walk the lookup order defined in the resolver configuration (the hosts line in /etc/nsswitch.conf), normally files, then dns. A non-find in /etc/hosts followed by an NXDOMAIN from DNS results in a negative return from the gethostby* call. *This should never cause a crash*; it is not a failure in the resolver code, it is a negative result.

As to gcc, 2.95.3 is fine and stable. The notes you mention refer to gcc 2.96, an *unofficial* gcc release, a heavily patched monster released by RedHat and (for a while) used in a lot of places.

--
I route, therefore you are.
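As an illustration of the point that a failed reverse lookup is an ordinary negative result for the caller to handle, here is a minimal sketch in Python. It is not the mysqld code. It relies on the standard library's socket.gethostbyaddr, which signals a failed lookup by raising an exception; the function name and the resolver parameter are made up here so the behavior can be exercised without a network.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the host name for ip, or None when the lookup has no answer.

    resolver is a hypothetical injection point used for testing; by default
    the system resolver is used via socket.gethostbyaddr.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # No reverse entry: a negative result, not a fatal error.
        return None


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address; whether it resolves depends on
    # the local resolver setup, so no particular output is claimed here.
    name = reverse_lookup("192.0.2.1")
    print(name if name is not None else "no reverse entry, connection continues")
```

A server following this pattern would fall back to matching the client by IP address when the name is unavailable.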
Re: Re[2]: mysql 4.0.8- crash on TCP connection
On Thu, 9 Jan 2003, Gelu Gogancea wrote:

> Hi,
> This is a glibc problem.In this case you can start mysql daemon with option
> "--skip-name-resolve" and in this situation is no need to add the IP address
> of every client in hosts file.The disadvantage is that the client can not
> connect to the server using host alias.
> Regards,
>
> Gelu

Could you clarify "This is a glibc problem"?

This is a known, standard glibc 2.2.5 against which every other piece of software functions correctly, even when receiving null returns on reverse lookups, yet 4.0.8 (both precompiled binary and locally built) crashes on a null return. Especially when (according to other reports) 4.0.7 functions correctly.

--
I route, therefore you are.
Re: mysql 4.0.8- crash on TCP connection
On Thu, 9 Jan 2003, Gelu Gogancea wrote:

> Hi,
> What OS you use ?
>
> Regards,
>
> Gelu
> _
> G.NET SOFTWARE COMPANY
>
> Permanent e-mail address : [EMAIL PROTECTED]
>                            [EMAIL PROTECTED]
> - Original Message -
> From: "Andrew Sitnikov" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Wednesday, January 08, 2003 8:14 PM
> Subject: mysql 4.0.8- crash on TCP connection
>
>
> > Hello mysql,
> >
> > I try use 4.0.8 (max & standard)in our production box,
> > and it was crash every TCP connection, For 4.0.7 (standard) i has over
> > 20 days uptime.

This sounds like what I just submitted a bug report for.

Connections to 4.0.8 (compiled locally or binary distro) cause a server crash if the IP of the client is not resolvable in DNS. One can add an entry to the hosts file for certain IPs to stop this; however, this still leaves the fact that ANY IP that can connect to the server can crash it if there is no reverse entry.

--
I route, therefore you are.
Possible bug, remote mysqld crash and DoS
Description:
MySQL 4.0.8, both compiled by me and the official release version, crashes whenever it receives a network connection from a system without a DNS entry. Connecting from a system that resolves on reverse (either from DNS or a local hosts file entry) works fine.

This system is a Slackware 8.1 install with all current updates. I do not know if this is a mysqld internal thing or some interaction with the system resolver in glibc 2.2.5, as unfortunately even a statically compiled glibc binary uses the system resolver.

This of course concerns me; there is a large potential for remote DoS here.

How-To-Repeat:
Install 4.0.8, run the install db script and fire it up. Attempt to connect from a host that will not reverse resolve. Even a telnet to port 3306 crashed the daemon.

The dump from mysqld is included at the bottom of the message.

Fix:
Unknown

Submitter-Id:  [EMAIL PROTECTED]
Originator:
Organization:
MySQL support: none
Synopsis:
Severity:      serious
Priority:      high
Category:      mysql
Class:         sw-bug
Release:       mysql-4.0.8-gamma-standard (Official MySQL-standard binary)

C compiler:    2.95.3
C++ compiler:  2.95.3

Environment:
System: Linux inlet 2.4.20 #2 Tue Dec 24 08:59:29 AKST 2002 i686 unknown
Architecture: i686

Some paths: /usr/bin/perl /usr/bin/make /usr/bin/gmake /usr/bin/gcc /usr/bin/cc
GCC: Reading specs from /usr/lib/gcc-lib/i386-slackware-linux/2.95.3/specs
gcc version 2.95.3 20010315 (release)
Compilation info: CC='gcc' CFLAGS='-O2 -mcpu=pentiumpro' CXX='gcc' CXXFLAGS='-O2 -mcpu=pentiumpro -felide-constructors' LDFLAGS='' ASFLAGS=''
LIBC:
lrwxrwxrwx 1 root root       13 Dec 23 18:45 /lib/libc.so.6 -> libc-2.2.5.so
-rwxr-xr-x 1 root root  1237712 Jul 30 14:15 /lib/libc-2.2.5.so
-rw-r--r-- 1 root root 24984184 Jul 30 12:55 /usr/lib/libc.a
-rw-r--r-- 1 root root      178 Jul 30 12:56 /usr/lib/libc.so
Configure command: ./configure '--prefix=/usr/local/mysql' '--with-comment=Official MySQL-standard binary' '--with-extra-charsets=complex' '--with-server-suffix=-standard' '--enable-thread-safe-client' '--enable-local-infile' '--enable-assembler' '--disable-shared' '--with-client-ldflags=-all-static' '--with-mysqld-ldflags=-all-static' '--with-innodb' 'CFLAGS=-O2 -mcpu=pentiumpro' 'CXXFLAGS=-O2 -mcpu=pentiumpro -felide-constructors' 'CXX=gcc'

mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.

key_buffer_size=33554432
read_buffer_size=131072
sort_buffer_size=1048568
max_used_connections=0
max_connections=100
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 147967 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x8717a50
Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong...
Cannot determine thread, fp=0xbfe7f608, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x806f3bb
0x8269928
0x807724c
0x8077665
0x82670dc
0x829c67a
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://www.mysql.com/doc/U/s/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved stack trace is much more helpful in diagnosing the problem, so please do resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil) is invalid pointer
thd->thread_id=1

Successfully dumped variables, if you ran with --log, take a look at the details of what thread 1 did to cause the crash. In some cases of really bad corruption, the values shown above may be invalid.

The manual page at http://www.mysql.com/doc/C/r/Crashing.html contains information that should help you find out what is causing the crash.
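The memory estimate printed in the dump above can be checked by hand. The sketch below simply evaluates the formula exactly as the dump states it, with the values from the dump; the function name is made up for illustration, and dividing by 1024 to get K is an assumption about how the figure was produced.

```python
def max_memory_kbytes(key_buffer_size, read_buffer_size, sort_buffer_size, max_connections):
    """Evaluate key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections.

    Inputs are in bytes; the result is in whole kilobytes (1 K taken as
    1024 bytes, an assumption).
    """
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections
    return total_bytes // 1024


if __name__ == "__main__":
    # Values copied from the crash dump in the bug report.
    print(max_memory_kbytes(33554432, 131072, 1048568, 100), "K")
```

With the values from the dump this agrees with the 147967 K figure that mysqld reported.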