Re: Strange server crashes with large table and myisamchk

gerald_clark Fri, 02 Jul 2004 06:10:56 -0700

It is telling you that your hard drive is failing.
Replace it.

Hanno Fietz wrote:

Hello everybody,
I'm experiencing problems with a MySQL 4.0.15 server running on a SuSE Linux 8.2 box with 512 MB RAM, a one-point-something GHz CPU and a 40 GB IDE hard disk.

We have a database with some administrative tables and one large data table (now ~ 30 M rows, ~ 1 GB index file and ~ 800 MB data file) that we insert new rows into on a per-minute basis. The read/write ratio is probably around 1 : 2 or 1 : 3. To achieve good performance despite the size of the table, we run "myisamchk -r" and "myisamchk -R 1" every night as part of the backup routine. The server is taken down for that purpose.

For the last two weeks now, we are getting these syslog messages when running the optimization:

Jul 2 03:10:28 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:10:28 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316864
Jul 2 03:10:28 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 316864
Jul 2 03:10:28 t56 kernel: klogd 1.4.1, ---------- state change ----------
Jul 2 03:10:30 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:10:30 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316872
Jul 2 03:10:30 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 316872
Jul 2 03:10:32 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:10:32 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316880
Jul 2 03:10:32 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 316880
Jul 2 03:10:33 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:10:33 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316888
Jul 2 03:10:33 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 316888
Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316896
Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 316896
Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316904
Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 316904
Jul 2 03:10:39 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:10:39 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316912
Jul 2 03:10:39 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 316912
Jul 2 03:12:17 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:12:17 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=159072, sector=46592
Jul 2 03:12:17 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 46592
Jul 2 03:12:19 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:12:19 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=159072, sector=46600
Jul 2 03:12:19 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 46600
Jul 2 03:13:14 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:13:14 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=285328, sector=172864
Jul 2 03:13:14 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 172864
Jul 2 03:13:16 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jul 2 03:13:16 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=285328, sector=172872
Jul 2 03:13:16 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 172872
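
One way to see what these messages have in common is to tally them by the reported LBAsect value. The following is only a rough sketch in Python, not part of the original setup: the regular expression is written against the message format shown above, the function name count_failing_lbas is made up for illustration, and SAMPLE is an abbreviated copy of four of the lines rather than a real log file.

```python
import re
from collections import Counter

# Abbreviated sample taken from the syslog excerpt (assumption: the real
# input would be the full syslog file, read into one string).
SAMPLE = """\
Jul 2 03:10:28 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316864
Jul 2 03:10:30 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=429367, sector=316872
Jul 2 03:12:17 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=159072, sector=46592
Jul 2 03:13:14 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=285328, sector=172864
"""

# Matches only the "error=" lines; the "status=" and "end_request" lines are skipped.
ERROR_RE = re.compile(
    r"(\w+): dma_intr: error=0x[0-9a-fA-F]+ \{ (\w+) \}, LBAsect=(\d+), sector=(\d+)"
)


def count_failing_lbas(log_text):
    """Return a Counter mapping (device, LBAsect) to the number of errors logged."""
    counts = Counter()
    for device, _error, lba, _sector in ERROR_RE.findall(log_text):
        counts[(device, int(lba))] += 1
    return counts


if __name__ == "__main__":
    for (device, lba), n in count_failing_lbas(SAMPLE).most_common():
        print(f"{device} LBA {lba}: {n} error(s)")
```

In the excerpt above, all seven errors logged between 03:10:28 and 03:10:39 report LBAsect=429367, and the later ones repeat 159072 and 285328 twice each. Errors that keep returning to the same few addresses would be consistent with bad sectors on the disk, although the tally alone does not prove it.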

Occasionally (not always!!), the MySQL server won't come up again after optimization, and sometimes myisamchk even leaves the table corrupted and has to be run again. To make it even more confusing: sometimes I get server crashes during shutdown, due to signal 11 (SEGV). I included a resolved stack dump below:
0x8071f64 handle_segfault + 420
0x82916c8 pthread_sighandler + 184
0x8188a9f btr_search_drop_page_hash_index + 5359
0x8188e1a btr_search_drop_page_hash_when_freed + 138
0x81dbbea fseg_free_extent + 746
0x81dc7fa fseg_free_step + 2458
0x815c3ba btr_free_but_not_root + 122
0x8100efe dict_drop_index_tree + 94
0x814969a row_upd_clust_step + 538
0x81499fa row_upd + 106
0x8149c62 row_upd_step + 322
0x811c7be que_run_threads + 334
0x8136132 row_drop_table_for_mysql + 2114
0x80cf4ce delete_table__11ha_innobasePCc + 270
0x80c5c8c ha_delete_table__F7db_typePCc + 60
0x80d3bf1 mysql_rm_table_part2__FP3THDP13st_table_listbT2 + 497
0x80d38c1 mysql_rm_table__FP3THDP13st_table_listc + 177
0x807e6f1 mysql_execute_command__Fv + 8561
0x8080565 mysql_parse__FP3THDPcUi + 149
0x807bac3 dispatch_command__F19enum_server_commandP3THDPcUi + 1443
0x807b50e do_command__FP3THD + 158
0x807acfe handle_one_connection + 638
0x828ee7c pthread_start_thread + 220
0x82c258a thread_start + 4
Server crashes like that (caught signal 11) have recently been observed during normal operations as well, also preceded by hard disk errors in the syslog:

Jun 30 14:06:55 t56 kernel: hda: dma_intr: status=0x51 { DriveReadySeekComplete Error }
Jun 30 14:06:55 t56 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=186887, sector=74432
Jun 30 14:06:55 t56 kernel: end_request: I/O error, dev 03:02 (hda), sector 74432

The server restarted itself after that and wrote error messages to the logfile. Again, I include the stack trace:
0x8071f64 handle_segfault + 420
0x82916c8 pthread_sighandler + 184
0x82aad07 vfprintf + 6295
0x82b1645 vsprintf + 85
0x823928b ut_sprintf + 27
0x8226406 sync_array_cell_print + 166
0x8226ea4 sync_array_print_long_waits + 116
0x80f99d8 srv_error_monitor_thread + 88
0x828ee7c pthread_start_thread + 220
0x82c258a thread_start + 4
I have googled the syslog messages and worked my way through several forums but can't really pinpoint the problem. It seems there are some problems with our hard disk, which could mean that it is damaged (bad blocks etc.), but what I can't see is why this is so closely related to the optimization / backup script. There definitely is a strong correlation between running myisamchk and getting I/O errors (we do get hard disk errors outside the backup process, but very rarely), but I just don't know whether myisamchk causes the problem or whether it is simply more prone to suffer from disk trouble than other processes.

Any help would be appreciated. Some questions I have:
- How do I read the resolved stack trace? There are function calls (probably youngest first), OK, but what does that " + xxx" at the end of each line mean?
- Do the function calls executed just before pthread_sighandler have something in common?
- Is there a way to get more output out of myisamchk apart from -v?
Hanno Fietz
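
On the " + xxx" question: in resolved stack dumps of this kind, the number is normally the byte offset from the start of the named function to the address in the first column, and the frames are listed innermost first. The Python sketch below only illustrates that lookup, assuming this convention applies here. The SYMBOLS table and the resolve function are hypothetical; the start addresses were obtained by subtracting the offsets shown in the trace (for example 0x8071f64 - 420 = 0x8071dc0) and were not read from a real symbol file.

```python
import bisect

# Hypothetical symbol table: (start address, function name), sorted by address.
# Start addresses are back-computed from the trace as address minus offset.
SYMBOLS = [
    (0x8071DC0, "handle_segfault"),
    (0x807AA80, "handle_one_connection"),
    (0x8291610, "pthread_sighandler"),
]


def resolve(address, symbols=SYMBOLS):
    """Map an address to 'address name + offset' using the nearest preceding symbol."""
    starts = [start for start, _name in symbols]
    # Index of the last symbol whose start address is <= address.
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return f"{address:#x} (unknown)"
    start, name = symbols[i]
    return f"{address:#x} {name} + {address - start}"


if __name__ == "__main__":
    for addr in (0x8071F64, 0x82916C8, 0x807ACFE):
        print(resolve(addr))
```

Read this way, "handle_segfault + 420" means the recorded address lies 420 bytes past the beginning of handle_segfault. The offset says where inside the function execution was, not how long the call took or how often it ran.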


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
