Hi,

My company runs a moderately large, heavily loaded MySQL replication
network across four Solaris machines. While upgrading from a fairly old
3.23.4x installation to 3.23.52, we encountered a problem with
replication and binlog rotation.

One of the machines is simultaneously a slave to one server and a master
to two others. On that machine, if "RESET MASTER" or "PURGE MASTER LOGS"
is executed while the slave thread is executing a query, mysqld
reproducibly dies with signal 11.


Looking in the MySQL list archives, I see one other report of something
similar, but no detailed bug report or resolution.


A gdb backtrace, details of the machine, and mysqld's configuration
follow:


MySQL 3.23.52, compiled for debugging purposes with

# CC=gcc CFLAGS="-O2" CXX=gcc CXXFLAGS="-O2 -felide-constructors \
   -fno-exceptions -fno-rtti" ./configure --prefix=/usr/local/mysql \
   --with-debug=full --with-extra-charsets=complex --with-innodb

using gcc 2.95.3

The machine the crash was reproduced on is a 4-processor Sun E4500 with
2G of memory and Solaris 8; it was originally observed on a different
E4500 with 8 processors and 8G of memory, running the stock 3.23.52
MySQL-MAX binary for Sparc Solaris 8.


# uname -a
SunOS testdb 5.8 Generic_108528-16 sun4u sparc SUNW,Ultra-Enterprise


(I've cut off the end of the query text in the backtrace to protect
company data, but the query itself does not seem to be the issue: the
crash happens equally on other inserts and updates on other tables.)

# gdb /usr/local/mysql/libexec/mysqld core
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.8"...
Core was generated by `/usr/local/mysql/libexec/mysqld --basedir=/usr/local/mysql --datadir=/database/'.
Reading symbols from /usr/lib/libdl.so.1...done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /usr/lib/libpthread.so.1...done.
Loaded symbols for /usr/lib/libpthread.so.1
Reading symbols from /usr/lib/libthread.so.1...done.
Loaded symbols for /usr/lib/libthread.so.1
Reading symbols from /usr/lib/libz.so...done.
Loaded symbols for /usr/lib/libz.so
Reading symbols from /usr/lib/libcrypt_i.so.1...done.
Loaded symbols for /usr/lib/libcrypt_i.so.1
Reading symbols from /usr/lib/libgen.so.1...done.
Loaded symbols for /usr/lib/libgen.so.1
Reading symbols from /usr/lib/libsocket.so.1...done.
Loaded symbols for /usr/lib/libsocket.so.1
Reading symbols from /usr/lib/libnsl.so.1...done.
Loaded symbols for /usr/lib/libnsl.so.1
Reading symbols from /usr/lib/libm.so.1...done.
Loaded symbols for /usr/lib/libm.so.1
Reading symbols from /usr/lib/libc.so.1...done.
Loaded symbols for /usr/lib/libc.so.1
Reading symbols from /usr/lib/libmp.so.2...done.
Loaded symbols for /usr/lib/libmp.so.2
Reading symbols from /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1...done.
Loaded symbols for /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
Reading symbols from /usr/lib/nss_files.so.1...done.
Loaded symbols for /usr/lib/nss_files.so.1
#0  0xff339794 in __sigprocmask () from /usr/lib/libthread.so.1
(gdb) backtrace full
#0  0xff339794 in __sigprocmask () from /usr/lib/libthread.so.1
No symbol table info available.
#1  0xff32e9a8 in _resetsig () from /usr/lib/libthread.so.1
No symbol table info available.
#2  0xff32e148 in _sigon () from /usr/lib/libthread.so.1
No symbol table info available.
#3  0xff331188 in _thrp_kill () from /usr/lib/libthread.so.1
No symbol table info available.
#4  0xbbd78 in write_core (sig=11) at stacktrace.c:220
No locals.
#5  0x4415c in handle_segfault (sig=11) at mysqld.cc:1326
No locals.
#6  0xff33b838 in __sighndlr () from /usr/lib/libthread.so.1
No symbol table info available.
#7  <signal handler called>
No symbol table info available.
#8  0xff240620 in memcpy ()
   from /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
No symbol table info available.
#9  0x7d160 in Log_event::write_header (this=0xfe7718d0, file=0x333990)
    at log_event.cc:66
        buf = "·â\232=\002ê\003\000\000\002\001\000"
        pos = 0xfe771201 "\002\001"
        tmp = -25749296
#10 0x7d054 in Log_event::write (this=0xfe7718d0, file=0x333990)
    at log_event.cc:50
No locals.
#11 0x7d038 in Query_log_event::write (this=0xfe7718d0, file=0x333990)
    at log_event.cc:45
No locals.
#12 0x7bf14 in MYSQL_LOG::write (this=0x333908, event_info=0xfe7718d0)
    at log.cc:726
        thd = (THD *) 0x393738
        file = (IO_CACHE *) 0x333990
        error = true
        should_rotate = false
#13 0x74a24 in mysql_insert (thd=0x393738, table_list=0x333800, 
    fields=@0x32d400, values_list=@0x3938d0, duplic=DUP_ERROR, 
    lock_type=TL_WRITE_CONCURRENT_INSERT) at sql_insert.cc:270
        qinfo = {<Log_event> = {when = 1033560759, exec_time = 5565, 
    valid_exec_time = 1, server_id = 1002, _vptr. = 0x32d7c0}, data_buf = 0x0, 
  query = 0x83c0a5f "insert into active_sessions ("..., 
  db = 0x83c0a50 "some_db", q_len = 219, db_len = 14, error_code = 0, 
  thread_id = 6, thd = 0x393738, cache_stmt = false}
        error = 0
        log_on = false
        using_transactions = false
        value_count = 3331072
        save_time_stamp = 0
        counter = 0
        id = 0
        info = {records = 1, deleted = 0, copied = 1, error = 858994224, 
  handle_duplicates = DUP_ERROR, escape_char = 3203072}
        table = (TABLE *) 0x8420180
        its = {<base_list_iterator> = {list = 0x3938d0, el = 0x840b994, 
    prev = 0x840b994, current = 0x0}, <No data fields>}
        values = (List_item *) 0x0
        query = 0xfe7718d0 "=\232â·"
        _db_func_ = 0x32d400 ""
        _db_file_ = 0xfe771888 "Rows matched: 1  Changed: 1  Warnings: 0"
        _db_level_ = 138460304
        _db_framep_ = (char **) 0x393738
#14 0x4d1a0 in mysql_execute_command () at sql_parse.cc:1581
        res = 0
        thd = (THD *) 0x393738
        lex = (LEX *) 0x393858
        tables = (TABLE_LIST *) 0x840b658
        _db_func_ = 0xfe771937 ""
        _db_file_ = 0xfe771a08 ""
        _db_level_ = 7805556
        _db_framep_ = (char **) 0x36300036
#15 0x4eed4 in mysql_parse (thd=0x393738, 
    inBuf=0x83c0a5f "insert into active_sessions ("..., 
    length=3749976) at sql_parse.cc:2364
        _db_func_ = 0x0
        _db_file_ = 0x3d9b0683 <Address 0x3d9b0683 out of bounds>
        _db_level_ = 0
        _db_framep_ = (char **) 0x103
        lex = (LEX *) 0x393858
#16 0xb6348 in exec_event (thd=0x393738, net=0xdb, mi=0x3396b0, event_len=258)
    at slave.cc:987
        q_len = 219
        expected_error = 0
        actual_error = 0
        type_code = 3346432
        ev = (class Log_event *) 0x83ddf50
        llbuff = "\000\000\000\t\000\000\000f\000\000\000\003\000\000\001\022\000\000\000\001\000"
#17 0xb7538 in handle_slave (arg=0x30ac00) at slave.cc:1444
        suppress_warnings = false
        event_len = 258
        thd = (THD *) 0x393738
        mysql = (MYSQL *) 0x83a00f0
        llbuff = "780826502", '\000' <repeats 12 times>
        retried_once = false
        last_failed_pos = 0
        _db_func_ = 0x0
        _db_file_ = 0x0
        _db_level_ = 0
        _db_framep_ = (char **) 0x0
(gdb)


# cat /etc/my.cnf

[mysqld]
server-id = 2002
master-host = 192.168.x.x
master-user = replication
master-password = replication
master-port = 3306
log-bin
log-slave-updates
skip-slave-start
skip-locking
skip-innodb
skip-bdb
core-file
datadir = /database/data
log-slow-queries = /database/data/slow-queries.log
set-variable = open_files_limit=30000
set-variable = max_connections=1400
set-variable = max_user_connections=1200
set-variable = connect_timeout=20
set-variable = key_buffer=128M
set-variable = sort_buffer=16M
set-variable = record_buffer=16M
set-variable = table_cache=4000
set-variable = back_log=300
set-variable = join_buffer=8M
set-variable = wait_timeout=1800
set-variable = max_allowed_packet=100M


Tail of error log:

021002 14:45:05  Slave thread initialized
021002 14:45:05  Slave: connected to master '[EMAIL PROTECTED]:3306',  replication started in log 'yyy-bin.017' at position 780826502
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked agaist is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail

key_buffer_size=134213632
record_buffer=16773120
sort_buffer=16777180
max_used_connections=0
max_connections=1400
threads_connected=1
It is possible that mysqld could use up to 
key_buffer_size + (record_buffer + sort_buffer)*max_connections = 4057578 K
bytes of memory
Hope that's ok, if not, decrease some variables in the equation

021002 14:45:31  mysqld restarted
/usr/local/mysql/libexec/mysqld: ready for connections
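
A side note on the memory estimate in the error log above: the printed
total of 4057578 K does not match the listed values if the formula is
evaluated without overflow. The Python sketch below redoes the
arithmetic. It assumes (this is a guess from the numbers, not something
confirmed against the MySQL source) that this build computes the sum in
a 32-bit unsigned integer, so the result wraps around modulo 2**32.

```python
# Values copied from the error log above.
key_buffer_size = 134213632
record_buffer = 16773120
sort_buffer = 16777180
max_connections = 1400

# The formula exactly as the log states it, with no overflow.
total_bytes = key_buffer_size + (record_buffer + sort_buffer) * max_connections

# ASSUMPTION: the server keeps this total in a 32-bit unsigned integer,
# so only the low 32 bits survive before the conversion to K.
wrapped_bytes = total_bytes % 2**32

print("full total, K:   ", total_bytes // 1024)
print("wrapped total, K:", wrapped_bytes // 1024)
```

The wrapped figure comes out to the 4057578 K shown in the log, while
the unwrapped total is roughly eleven times larger. If the assumption
holds, the logged estimate understates the configured worst case; this
looks like a separate matter from the signal 11.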



