Re: MySQL 3.23.58 seg faults occasionally

Michael Bacarella Tue, 27 Jan 2004 08:24:24 -0800

> I guess the table is corrupt and its row contains a wrong trx id. InnoDB
> tries to retrieve an earlier version of the row, but the roll ptr in the row
> is garbage too, and it contains a nonexistent rollback segment id.
>
> "
> InnoDB: Error: trying to return an SQL NULL field in a non-null innoDB:
> column! Table name
> secret/tablename
> "
> 
> The above is also a clear indication of table corruption. The NULL bit in an
> InnoDB column is set, though the field should never be NULL.


Dropping the table and restoring it from a dump seems to be the ticket.

> You should upgrade to MySQL-4.0.17, which has better data page checksum
> checking. Do you get corruption with the same application on earlier Linux
> kernels, or could there be something wrong with Linux-2.4.23?

We've never experienced table corruption like this on previous kernels.
The biggest change we've introduced (besides Linux 2.4.23) is MD
instead of hardware RAID.

A MySQL 4 migration is in the works.

> /**********************************************************************
> Copies an undo record to heap. This function can be called if we know that
> the undo log record exists. */
> 
> trx_undo_rec_t*
> trx_undo_get_undo_rec_low(
> /*======================*/
>                                         /* out, own: copy of the record */
>         dulint          roll_ptr,       /* in: roll pointer to record */
>         mem_heap_t*     heap)           /* in: memory heap where copied */
> {
>         trx_undo_rec_t* undo_rec;
>         ulint           rseg_id;
>         ulint           page_no;
>         ulint           offset;
>         page_t*         undo_page;
>         trx_rseg_t*     rseg;
>         ibool           is_insert;
>         mtr_t           mtr;
> 
>         trx_undo_decode_roll_ptr(roll_ptr, &is_insert, &rseg_id, &page_no,
>                                                                 &offset);
>         rseg = trx_rseg_get_on_id(rseg_id);
> 
> ...
> 
> /**********************************************************************
> Looks for a rollback segment, based on the rollback segment id. */
> 
> trx_rseg_t*
> trx_rseg_get_on_id(
> /*===============*/
>                         /* out: rollback segment */
>         ulint   id)     /* in: rollback segment id */
> {
>         trx_rseg_t*     rseg;
> 
>         rseg = UT_LIST_GET_FIRST(trx_sys->rseg_list);
>         ut_ad(rseg);
> 
>         while (rseg->id != id) {
>                 rseg = UT_LIST_GET_NEXT(rseg_list, rseg);
>                 ut_ad(rseg);
>         }
> 
>         return(rseg);
> }
> 
> (gdb) disassemble trx_rseg_get_on_id
> Dump of assembler code for function trx_rseg_get_on_id:
> 0x81fa930 <trx_rseg_get_on_id>: push   %ebp
> 0x81fa931 <trx_rseg_get_on_id+1>:       mov    0x83f3190,%eax
> 0x81fa936 <trx_rseg_get_on_id+6>:       mov    0x24(%eax),%eax
> 0x81fa939 <trx_rseg_get_on_id+9>:       mov    %esp,%ebp
> 0x81fa93b <trx_rseg_get_on_id+11>:      mov    0x8(%ebp),%edx
> 0x81fa93e <trx_rseg_get_on_id+14>:      cmp    %edx,(%eax)
> 0x81fa940 <trx_rseg_get_on_id+16>:
>     je     0x81fa94c <trx_rseg_get_on_id+28>
> 0x81fa942 <trx_rseg_get_on_id+18>:      mov    0xa0(%eax),%eax
> 0x81fa948 <trx_rseg_get_on_id+24>:      cmp    %edx,(%eax)
> 0x81fa94a <trx_rseg_get_on_id+26>:
>     jne    0x81fa942 <trx_rseg_get_on_id+18>
> 0x81fa94c <trx_rseg_get_on_id+28>:      mov    %ebp,%esp
> 0x81fa94e <trx_rseg_get_on_id+30>:      pop    %ebp
> 0x81fa94f <trx_rseg_get_on_id+31>:      ret
> End of assembler dump.
> (gdb)
> 
> ................
> List:MySQL General Discussion« Previous MessageNext Message »
> From:Michael BacarellaDate:January 26 2004 6:21pm
> Subject:Re: MySQL 3.23.58 seg faults occasionally
> 
> > > the crashes below happen in independent areas of code. The 2 first are
> > > inside InnoDB, and the third inside MySQL. This looks like random thread
> > > crashes, or random memory corruption.
> >
> > My colleague tells me that the third stack trace (the one inside MySQL)
> > is from a RESET MASTER.  This server used to be a slave that was promoted
> > to master.  My apologies for not making this distinction.
> >
> > The ones inside InnoDB have been triggered by querying disk-resident
> > low use tables.
> 
> We can get SIGSEGV almost every hour (from an hourly maintenance script).
> This is the stack trace:
> 
> 0x80c23d5 handle_segfault__Fi + 425
> 0x40022f54 _end + 935506260
> 0x823d4a8 trx_rseg_get_on_id + 24
> 0x823952d trx_undo_get_undo_rec_low + 45
> 0x823977d trx_undo_get_undo_rec + 49
> 0x82399c0 trx_undo_prev_version_build + 548
> 0x81f3f35 row_vers_build_for_consistent_read + 641
> 0x81e9d5e row_sel_build_prev_vers_for_mysql + 226
> 0x81ecda4 row_search_for_mysql + 6580
> 0x8111152 general_fetch__11ha_innobasePcUiUi + 322
> 0x8111373 rnd_next__11ha_innobasePc + 83
> 0x8103da6 rr_sequential__FP14st_read_record + 150
> 0x80e74d9 sub_select__FP4JOINP13st_join_tableb + 341
> 0x80e7190 do_select__FP4JOINPt4List1Z4ItemP8st_tableP9Procedure + 412
> 0x80df9d7
> mysql_select__FP3THDP13st_table_listRt4List1Z4ItemP4ItemP8st_orderT4T3T4UiP1
> 3select_result
> + 4167
> 0x80c8aba mysql_execute_command__Fv + 806
> 0x80cbd88 mysql_parse__FP3THDPcUi + 72
> 0x80c7c74 do_command__FP3THD + 1324
> 
> Attempting to convert the table that caused the crash to MyISAM
> resulted in:
> 
> InnoDB: Error: trying to return an SQL NULL field in a non-null innoDB:
> column! Table name
> secret/tablename
> 
> Also, duplicate keys.  We're deleting the table and recreating it.
> 
> Any thoughts?
> 
> 
> > > You are running a relatively new Linux kernel, 2.4.23. Did the crashes
> start
> > > when you upgraded Linux?
> >
> > Well, maybe.  This is a new machine.  The previous machine was Red Hat
> 8.0's
> > default kernel, which is 2.4.18 + RH patches.  We were also running
> 3.23.56
> > which didn't seg fault like this.
> >
> > >From the log, MySQL says:
> >
> > > It is possible that mysqld could use up to
> > > key_buffer_size + (record_buffer + sort_buffer)*max_connections =
> 1884024 K
> > > bytes of memory
> >
> > I never remember if the above calculation is usable or not.
> >
> > basedir=/usr/local/mysql/
> > user=mysql
> > server-id=3
> > log-bin=/dblog/dbms3-bin
> >
> > log-slow-queries = /usr/local/mysql/data/dbms3.slow_queries
> > set-variable = long_query_time=5
> >
> > innodb_data_home_dir =
> > innodb_data_file_path = /dev/md0:76319Mraw
> >
> > set-variable = innodb_buffer_pool_size=800M
> > set-variable = innodb_additional_mem_pool_size=50M
> >
> > innodb_log_group_home_dir = /dblog
> > innodb_log_arch_dir = /dblog
> > innodb_log_archive=0
> >
> > set-variable = innodb_log_files_in_group=3
> > set-variable = innodb_log_file_size=1.5G
> > set-variable = innodb_log_buffer_size=8M
> >
> > innodb_flush_log_at_trx_commit=0
> > set-variable = innodb_file_io_threads=4
> > set-variable = innodb_lock_wait_timeout=50
> >
> > set-variable = wait_timeout=360
> > set-variable = tmp_table_size=8000000
> > set-variable = max_connections=415
> > set-variable = table_cache=256
> > set-variable = sort_buffer=4M
> > set-variable = key_buffer=128M
> > set-variable = thread_cache_size=415
> > set-variable = max_connect_errors=50000
> > set-variable = thread_stack=256k
> 
> -- 
> Michael Bacarella                24/7 phone: 1-646-641-8662
> Netgraft Corporation                   http://netgraft.com/
> 
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

-- 
Michael Bacarella                24/7 phone: 1-646-641-8662
Netgraft Corporation                   http://netgraft.com/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: MySQL 3.23.58 seg faults occasionally

Reply via email to