Hi,

My company runs a moderately large and loaded MySQL replication network across four Solaris machines. While upgrading from a fairly old 3.23.4x installation to 3.23.52 we've encountered a problem with replication and binlog rotation. One of the machines is simultaneously slave to one server and master to two others; on that machine, if a "RESET MASTER" or "PURGE MASTER LOGS" is executed while its slave thread is performing a query, mysqld reproducibly dies.

Looking in the mysql list archives, I see one other report of something similar, but no detailed bug reports or resolution. A gdb backtrace, details of the machine and the mysqld's configuration follow:

MySQL 3.23.52, compiled for debugging purposes with

# CC=gcc CFLAGS="-O2" CXX=gcc CXXFLAGS="-O2 -felide-constructors \
  -fno-exceptions -fno-rtti" ./configure --prefix=/usr/local/mysql \
  --with-debug=full --with-extra-charsets=complex --with-innodb

using gcc 2.95.3

The machine the crash was reproduced on is a 4-processor Sun E4500 with 2G of memory and Solaris 8; it was originally observed on a different E4500 with 8 processors and 8G of memory, running the stock 3.23.52 MySQL-MAX binary for Sparc Solaris 8.

# uname -a
SunOS testdb 5.8 Generic_108528-16 sun4u sparc SUNW,Ultra-Enterprise

(I've cut off the end of the query text in the backtrace for company data security, but the query itself does not seem to be the issue - the crash happens equally on other inserts and updates on other tables)

# gdb /usr/local/mysql/libexec/mysqld core
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.8"...
Core was generated by `/usr/local/mysql/libexec/mysqld --basedir=/usr/local/mysql --datadir=/database/'.
Reading symbols from /usr/lib/libdl.so.1...done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /usr/lib/libpthread.so.1...done.
Loaded symbols for /usr/lib/libpthread.so.1
Reading symbols from /usr/lib/libthread.so.1...done.
Loaded symbols for /usr/lib/libthread.so.1
Reading symbols from /usr/lib/libz.so...done.
Loaded symbols for /usr/lib/libz.so
Reading symbols from /usr/lib/libcrypt_i.so.1...done.
Loaded symbols for /usr/lib/libcrypt_i.so.1
Reading symbols from /usr/lib/libgen.so.1...done.
Loaded symbols for /usr/lib/libgen.so.1
Reading symbols from /usr/lib/libsocket.so.1...done.
Loaded symbols for /usr/lib/libsocket.so.1
Reading symbols from /usr/lib/libnsl.so.1...done.
Loaded symbols for /usr/lib/libnsl.so.1
Reading symbols from /usr/lib/libm.so.1...done.
Loaded symbols for /usr/lib/libm.so.1
Reading symbols from /usr/lib/libc.so.1...done.
Loaded symbols for /usr/lib/libc.so.1
Reading symbols from /usr/lib/libmp.so.2...done.
Loaded symbols for /usr/lib/libmp.so.2
Reading symbols from /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1...done.
Loaded symbols for /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
Reading symbols from /usr/lib/nss_files.so.1...done.
Loaded symbols for /usr/lib/nss_files.so.1
#0  0xff339794 in __sigprocmask () from /usr/lib/libthread.so.1
(gdb) backtrace full
#0  0xff339794 in __sigprocmask () from /usr/lib/libthread.so.1
No symbol table info available.
#1  0xff32e9a8 in _resetsig () from /usr/lib/libthread.so.1
No symbol table info available.
#2  0xff32e148 in _sigon () from /usr/lib/libthread.so.1
No symbol table info available.
#3  0xff331188 in _thrp_kill () from /usr/lib/libthread.so.1
No symbol table info available.
#4  0xbbd78 in write_core (sig=11) at stacktrace.c:220
No locals.
#5  0x4415c in handle_segfault (sig=11) at mysqld.cc:1326
No locals.
#6  0xff33b838 in __sighndlr () from /usr/lib/libthread.so.1
No symbol table info available.
#7  <signal handler called>
No symbol table info available.
#8  0xff240620 in memcpy ()
   from /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
No symbol table info available.
#9  0x7d160 in Log_event::write_header (this=0xfe7718d0, file=0x333990)
    at log_event.cc:66
        buf = "·â\232=\002ê\003\000\000\002\001\000"
        pos = 0xfe771201 "\002\001"
        tmp = -25749296
#10 0x7d054 in Log_event::write (this=0xfe7718d0, file=0x333990)
    at log_event.cc:50
No locals.
#11 0x7d038 in Query_log_event::write (this=0xfe7718d0, file=0x333990)
    at log_event.cc:45
No locals.
#12 0x7bf14 in MYSQL_LOG::write (this=0x333908, event_info=0xfe7718d0)
    at log.cc:726
        thd = (THD *) 0x393738
        file = (IO_CACHE *) 0x333990
        error = true
        should_rotate = false
#13 0x74a24 in mysql_insert (thd=0x393738, table_list=0x333800,
    fields=@0x32d400, values_list=@0x3938d0, duplic=DUP_ERROR,
    lock_type=TL_WRITE_CONCURRENT_INSERT) at sql_insert.cc:270
        qinfo = {<Log_event> = {when = 1033560759, exec_time = 5565,
            valid_exec_time = 1, server_id = 1002, _vptr. = 0x32d7c0},
          data_buf = 0x0, query = 0x83c0a5f "insert into active_sessions ("...,
          db = 0x83c0a50 "some_db", q_len = 219, db_len = 14, error_code = 0,
          thread_id = 6, thd = 0x393738, cache_stmt = false}
        error = 0
        log_on = false
        using_transactions = false
        value_count = 3331072
        save_time_stamp = 0
        counter = 0
        id = 0
        info = {records = 1, deleted = 0, copied = 1, error = 858994224,
          handle_duplicates = DUP_ERROR, escape_char = 3203072}
        table = (TABLE *) 0x8420180
        its = {<base_list_iterator> = {list = 0x3938d0, el = 0x840b994,
            prev = 0x840b994, current = 0x0}, <No data fields>}
        values = (List_item *) 0x0
        query = 0xfe7718d0 "=\232â·"
        _db_func_ = 0x32d400 ""
        _db_file_ = 0xfe771888 "Rows matched: 1 Changed: 1 Warnings: 0"
        _db_level_ = 138460304
        _db_framep_ = (char **) 0x393738
#14 0x4d1a0 in mysql_execute_command () at sql_parse.cc:1581
        res = 0
        thd = (THD *) 0x393738
        lex = (LEX *) 0x393858
        tables = (TABLE_LIST *) 0x840b658
        _db_func_ = 0xfe771937 ""
        _db_file_ = 0xfe771a08 ""
        _db_level_ = 7805556
        _db_framep_ = (char **) 0x36300036
#15 0x4eed4 in mysql_parse (thd=0x393738,
    inBuf=0x83c0a5f "insert into active_sessions ("..., length=3749976)
    at sql_parse.cc:2364
        _db_func_ = 0x0
        _db_file_ = 0x3d9b0683 <Address 0x3d9b0683 out of bounds>
        _db_level_ = 0
        _db_framep_ = (char **) 0x103
        lex = (LEX *) 0x393858
#16 0xb6348 in exec_event (thd=0x393738, net=0xdb, mi=0x3396b0, event_len=258)
    at slave.cc:987
        q_len = 219
        expected_error = 0
        actual_error = 0
        type_code = 3346432
        ev = (class Log_event *) 0x83ddf50
        llbuff = "\000\000\000\t\000\000\000f\000\000\000\003\000\000\001\022\000\000\000\001\000"
#17 0xb7538 in handle_slave (arg=0x30ac00) at slave.cc:1444
        suppress_warnings = false
        event_len = 258
        thd = (THD *) 0x393738
        mysql = (MYSQL *) 0x83a00f0
        llbuff = "780826502", '\000' <repeats 12 times>
        retried_once = false
        last_failed_pos = 0
        _db_func_ = 0x0
        _db_file_ = 0x0
        _db_level_ = 0
        _db_framep_ = (char **) 0x0
(gdb)

# cat /etc/my.cnf
[mysqld]
server-id = 2002
master-host = 192.168.x.x
master-user = replication
master-password = replication
master-port = 3306
log-bin
log-slave-updates
skip-slave-start
skip-locking
skip-innodb
skip-bdb
core-file
datadir = /database/data
log-slow-queries = /database/data/slow-queries.log
set-variable = open_files_limit=30000
set-variable = max_connections=1400
set-variable = max_user_connections=1200
set-variable = connect_timeout=20
set-variable = key_buffer=128M
set-variable = sort_buffer=16M
set-variable = record_buffer=16M
set-variable = table_cache=4000
set-variable = back_log=300
set-variable = join_buffer=8M
set-variable = wait_timeout=1800
set-variable = max_allowed_packet=100M

Tail of error log:

021002 14:45:05 Slave thread initialized
021002 14:45:05 Slave: connected to master '[EMAIL PROTECTED]:3306', replication started in log 'yyy-bin.017' at position 780826502
mysqld got signal 11;
This could be because you hit a bug.
It is also possible that this binary or one of the libraries it was linked
agaist is corrupt, improperly built, or misconfigured. This error can also be
caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail

key_buffer_size=134213632
record_buffer=16773120
sort_buffer=16777180
max_used_connections=0
max_connections=1400
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (record_buffer + sort_buffer)*max_connections = 4057578 K
bytes of memory
Hope that's ok, if not, decrease some variables in the equation

021002 14:45:31 mysqld restarted
/usr/local/mysql/libexec/mysqld: ready for connections
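
A side note on the memory estimate in the error log above: evaluating the printed formula with the printed values gives roughly 46000618 K, not 4057578 K. The logged figure matches what you would get if the byte total wrapped at 32 bits. That the server computes the sum in a 32-bit unsigned integer is an assumption here, not something checked against the source, and `estimate_kbytes` and `wrap_bits` are illustrative names only. A minimal sketch of the arithmetic:

```python
# Values copied from the error log above (all in bytes).
key_buffer_size = 134213632
record_buffer = 16773120
sort_buffer = 16777180
max_connections = 1400


def estimate_kbytes(wrap_bits=None):
    """Evaluate key_buffer_size + (record_buffer + sort_buffer) * max_connections.

    wrap_bits is a hypothetical knob: when set, the byte total is reduced
    modulo 2**wrap_bits to mimic fixed-width unsigned arithmetic.
    """
    total = key_buffer_size + (record_buffer + sort_buffer) * max_connections
    if wrap_bits is not None:
        total %= 2 ** wrap_bits
    return total // 1024  # bytes -> K, truncated


print("full precision:", estimate_kbytes(), "K")
print("wrapped at 32 bits:", estimate_kbytes(wrap_bits=32), "K")
```

With max_connections=1400 and 16M sort and record buffers, the worst-case figure is therefore well beyond the physical memory of either machine described above. This is probably unrelated to the crash itself, which happens with a single connected thread.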