~75% performance drop upgrading from MySQL 4.0 to 5.0
Here are the timings:

  64-bit 5.0, single thread: 1:00:12.17 total (~76% slower)
  64-bit 4.1, single thread: 41:38.07 total (~20% slower)
  64-bit 4.0, single thread: 34:50.23 total

I have been trying to get a stable configuration for 64-bit MySQL on Ubuntu for the past 6-8 months, and have developed a number of stress tests to try to isolate problems. I've been able to successfully discover several bugs in MySQL (or libc) running in 64-bit mode. I've recently started running these tests against the 4.1 and 5.0 versions of MySQL, and have discovered very different performance characteristics compared to MySQL 4.0.

For this test, the harness imports one of our customer databases, records the import time, drops the database, and restarts. I've run this test against the 4.0, 4.1 and 5.0 binaries, and found a *dramatic* performance drop on the 4.1 and 5.0 lines compared to the 4.0 line when executing this test. The tests and configurations are identical. The test may be lopsided (insert + ALTER TABLE ... ENABLE KEYS), however it is something that we actually do daily in our production environment. If we were to consider upgrading our production environment from 4.0 to 5.0, a 75% drop in performance would not be acceptable for these types of operations.

I ran these tests for a minimum of 24 hours each (although there was very little variance between the first and the last run). Each installation was a fresh MySQL binary installation, with newly built (i.e. blank) grant tables. The machine is a 2-CPU 2.4GHz Opteron with 4GB of memory and a 4-disk RAID 0+1 array, running Ubuntu Breezy.

I ran these tests against the 32-bit version of each of those releases and saw similar timings, except that on average they were 11% slower than their corresponding 64-bit versions. I also ran the tests with two threads (operating on separate databases), and found similar performance hits on MySQL 4.1 and 5.0, plus an overall 17% drop in performance compared to single-threaded operation.

All my.cnf files were identical between the versions, except that I changed the datadir and language settings (to point to the appropriate directories). I have attached the my.cnf file. I am using MyISAM tables for all tests. The test harness is very simple:

#!/bin/zsh
while true; do
    echo create database $1 | /var/lib/mysql/mysql-standard-5.0.17-linux-x86_64-glibc23/bin/mysql -u root
    time zcat TEST_DATABASE.mysql.gz | /var/lib/mysql/mysql-standard-5.0.17-linux-x86_64-glibc23/bin/mysql -u root $1
    echo drop database $1 | /var/lib/mysql/mysql-standard-5.0.17-linux-x86_64-glibc23/bin/mysql -u root
    date
done

I thought perhaps MySQL 5.0 might be doing a better job of determining index cardinalities than 4.0, and as a result there would be no need to run an optimize after the insert. I modified the test to import the database and then optimize each of the tables. I saw the same test execution time (MySQL 5.0 still ~75% slower), and MySQL 4.0 was able to optimize the tables 20% faster.
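The optimize variant is the same loop with an optimize pass bolted on; roughly the following sketch (the binary path, dump file, and $1 database name are the same placeholders as in the harness above, not the literal script we ran):

#!/bin/zsh
# Same loop as above, plus an OPTIMIZE TABLE pass over every imported table.
# Binary path, dump file and the $1 database name are placeholders.
MYSQL=/var/lib/mysql/mysql-standard-5.0.17-linux-x86_64-glibc23/bin/mysql
while true; do
    echo create database $1 | $MYSQL -u root
    time zcat TEST_DATABASE.mysql.gz | $MYSQL -u root $1
    # time the optimize pass separately from the import
    time ( for t in $(echo show tables | $MYSQL -N -u root $1); do
        echo "optimize table $t" | $MYSQL -u root $1
    done )
    echo drop database $1 | $MYSQL -u root
    date
done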
Although I cannot provide our customer data, I can show you an example of what the import file looks like:

CREATE TABLE ABCState (
  userID int(11) NOT NULL default '0',
  roleID int(11) NOT NULL default '0',
  aiFilter int(11) default NULL,
  PRIMARY KEY (userID,roleID)
) TYPE=MyISAM;

--
-- Dumping data for table `ABCState`
--

/*!40000 ALTER TABLE ABCState DISABLE KEYS */;
LOCK TABLES ABCState WRITE;
INSERT INTO ABCState VALUES (15,4,8),(20,4,8),(21,4,8),(22,4,8),(19,5,8),(40,4,8),(42,4,8),(38,4,8),(39,4,8),(43,4,8),(33,4,8),(27,4,8),(28,4,8),(26,4,8),(25,4,8),(34,4,8),(32,4,8),(35,4,1),(31,4,8),(24,4,8),(36,4,8),(37,4,8),(30,4,8),(29,4,8),(47,6,8),(66,6,8),(64,6,8),(46,6,8),(67,10,8),(60,6,8),(68,10,8),(69,10,8),(44,5,8),(70,10,8),(71,10,8),(72,5,8),(73,10,8),(77,6,8),(76,6,8),(75,4,8),(74,4,8),(78,4,8),(76,4,8),(79,4,8),(77,4,8),(18,5,8),(82,4,8),(11,5,8),(81,5,8),(84,4,8),(85,4,8),(106,4,8),(107,4,8),(31,6,8),(108,4,8),(108,6,8),(110,6,8),(109,6,8),(111,4,8),(6,5,8),(35,6,8),(36,6,8),(112,6,8),(113,6,8),(114,6,8),(115,4,8),(116,4,8),(120,6,8),(16,5,8),(121,6,8),(118,4,8),(119,4,8),(117,4,8),(126,4,8),(127,4,8),(128,4,8),(129,4,8),(130,4,8),(123,4,8),(126,6,8),(133,4,8),(132,4,8),(132,6,8),(134,4,8),(135,10,8),(38,6,8),(136,10,8),(110,4,8),(145,4,8),(137,4,8),(144,4,8),(141,4,8),(140,4,8),(146,4,8),(142,4,8),(148,4,8),(147,4,8),(138,4,8),(149,4,8),(141,6,8),(146,6,8),(142,6,8),(144,6,8),(147,10,8),(75,10,8),(150,4,8),(137,10,8),(156,10,8),(157,4,8),(158,4,8),(159,4,8),(160,4,8),(113,4,8),(165,4,8);
UNLOCK TABLES;
/*!40000 ALTER TABLE ABCState ENABLE KEYS */;

The dump was generated with mysqldump -l --add-locks --extended-insert --quick --all --disable-keys. There are millions of records; the compressed gz file is 500 megs. There is a good mix of varchar, char, and blob fields across 100+ tables.

Does anybody have any idea why 5.0 is performing so poorly? Since I am not using any of the new 4.1 or 5.0 features, I would not expect this kind of difference.
RE: AMD64 Linux MySQL Error 1041 Out of Memory
On Thu, 2005-06-16 at 21:06, Richard Dale wrote:
> So we have recently started stress testing MySQL on an Opteron dual CPU machine
> running Ubuntu Hoary. We are using the 64-bit GCC 4.0.24-standard binary from
> MySQL. The stress test that I'm currently running on it involves inserting a
> large database (from a mysqldump) ...
>
> Which linux kernel are you running?

2.6.10, specifically the ubuntu compile: 2.6.10-5-amd64-k8-smp

> Why not consider MySQL 4.1? It's now a stable production release.

Unfortunately, when we tried to run our application on MySQL 4.1 we were bitten by a number of MySQL 4.1 features. MySQL 4.1 now silently returns Long objects instead of Integers for most functions (COUNT), causing class cast exceptions in Java... Also, a number of our queries which used to compare binary and non-binary fields no longer cast automatically, requiring some additional SQL which our database abstraction layer is not aware it needs. A lengthy porting process ensued... and has been completed, but our application is still quite a ways from being fully certified to run on MySQL 4.1. :( The MySQL 4.1 upgrade was 10x nastier than our 3.23-to-4.0 upgrade.

> Also, consider visiting: http://hashmysql.org/index.php?title=Opteron_HOWTO
> We would welcome a section on there about Ubuntu and x86-64.

Yes, I will be updating that wiki once I've got it running stably. I ran into some libc issues with Ubuntu already, as seen here: http://bugzilla.ubuntu.com/show_bug.cgi?id=11730 However, I've compiled my own replacement libc with patches applied to fix the issues, which I will make available to all. Except for this memory problem, the machine is ready for our burn-in testing.

-Joe
RE: AMD64 Linux MySQL Error 1041 Out of Memory
On Thu, 2005-06-16 at 21:06, Richard Dale wrote:
> So we have recently started stress testing MySQL on an Opteron dual CPU machine
> running Ubuntu Hoary. We are using the 64-bit GCC 4.0.24-standard binary from
> MySQL. The stress test that I'm currently running on it involves inserting a
> large database (from a mysqldump) ...

So I don't think this has anything to do with memory consumption anymore. I've been able to get the problem to repeat with mysqld's memory footprint at about 800+ megs (545M resident). I was also able to get it to happen with the stock Ubuntu mysqld my.cnf, which uses ridiculously conservative memory settings (the key cache is so small it takes 4x as long to run the test). With an identical configuration to my normal setup, running the 32-bit binary doesn't appear to trigger the problem. I'll try 64-bit with LinuxThreads next.

Somebody asked for a show processlist and show status... Here's the data from pretty close to when it happened:

+---------------------------+------------+
| Variable_name | Value |
+---------------------------+------------+
| Aborted_clients | 4 |
| Aborted_connects | 0 |
| Bytes_received | 3638957151 |
| Bytes_sent | 18427266 |
| Com_admin_commands | 5 |
| Com_alter_table | 10996 |
| Com_analyze | 0 |
| Com_backup_table | 0 |
| Com_begin | 0 |
| Com_change_db | 9 |
| Com_change_master | 0 |
| Com_check | 193 |
| Com_commit | 0 |
| Com_create_db | 57 |
| Com_create_function | 0 |
| Com_create_index | 0 |
| Com_create_table | 5505 |
| Com_delete | 0 |
| Com_delete_multi | 0 |
| Com_drop_db | 57 |
| Com_drop_function | 0 |
| Com_drop_index | 0 |
| Com_drop_table | 0 |
| Com_flush | 3 |
| Com_grant | 0 |
| Com_ha_close | 0 |
| Com_ha_open | 0 |
| Com_ha_read | 0 |
| Com_insert | 334612 |
| Com_insert_select | 0 |
| Com_kill | 0 |
| Com_load | 0 |
| Com_load_master_data | 0 |
| Com_load_master_table | 0 |
| Com_lock_tables | 5505 |
| Com_optimize | 0 |
| Com_purge | 0 |
| Com_rename_table | 0 |
| Com_repair | 0 |
| Com_replace | 0 |
| Com_replace_select | 0 |
| Com_reset | 0 |
| Com_restore_table | 0 |
| Com_revoke | 0 |
| Com_rollback | 0 |
| Com_savepoint | 0 |
| Com_select | 1 |
| Com_set_option | 0 |
| Com_show_binlog_events | 0 |
| Com_show_binlogs | 0 |
| Com_show_create | 0 |
| Com_show_databases | 2 |
| Com_show_fields | 0 |
| Com_show_grants | 0 |
| Com_show_keys | 0 |
| Com_show_logs | 0 |
| Com_show_master_status | 0 |
| Com_show_new_master | 0 |
| Com_show_open_tables | 0 |
| Com_show_processlist | 2 |
| Com_show_slave_hosts | 0 |
| Com_show_slave_status | 0 |
| Com_show_status | 1 |
| Com_show_innodb_status | 0 |
| Com_show_tables | 9 |
| Com_show_variables | 0 |
| Com_slave_start | 0 |
| Com_slave_stop | 0 |
| Com_truncate | 0 |
| Com_unlock_tables | 5491 |
| Com_update | 0 |
| Com_update_multi | 0 |
| Connections | 182 |
| Created_tmp_disk_tables | 0 |
| Created_tmp_tables | 0 |
| Created_tmp_files | 43 |
| Delayed_insert_threads | 0 |
| Delayed_writes | 0 |
| Delayed_errors | 0 |
| Flush_commands | 1 |
| Handler_commit | 0 |
| Handler_delete | 0 |
| Handler_read_first | 1 |
| Handler_read_key | 0 |
| Handler_read_next | 0 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 9 |
| Handler_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 617450860 |
| Key_blocks_used | 478697 |
AMD64 Linux MySQL Error 1041 Out of Memory
So we have recently started stress testing MySQL on an Opteron dual CPU machine running Ubuntu Hoary. We are using the 64-bit GCC 4.0.24-standard binary from MySQL. The stress test that I'm currently running on it involves inserting a large database (from a mysqldump) from three separate windows (so three imports running simultaneously). The database dump is about 3.7 gigs uncompressed, or 580 megs compressed. It was dumped using the following dump parameters: --add-locks --extended-insert --quick --lock-tables --all --disable-keys

Each window has a script that creates a database, imports the data, dumps the database, and repeats. After about 12 cycles (each takes about an hour), mysql starts spewing these errors:

ERROR 1041 at line 195: Out of memory; Check if mysqld or some other process uses all available memory. If not you may have to use 'ulimit' to allow mysqld to use more memory or you can add more swap space

I, unfortunately, have not been at the server when this actually happens; however, when I come in in the morning top is reporting mysqld taking up between 2.7 gigs and 3.2 gigs of memory. I have had a vmstat running all night, and at no point saw the system run out of swap space (it did slowly use about 60 megs of the 2 gigs of swap over the course of the 15 hours or so, though).

Obviously checking ulimit was my first stop; however, I believe mysqld does its own setuid... and I'm not sure it uses PAM to get its initial ulimits. Either way, I do this:

su mysql -s /bin/sh
sh-3.00$ ulimit -a
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) unlimited
file size             (blocks, -f) unlimited
max locked memory     (kbytes, -l) unlimited
max memory size       (kbytes, -m) unlimited
open files                    (-n) 8192
pipe size          (512 bytes, -p) 8
stack size            (kbytes, -s) 8192
cpu time             (seconds, -t) unlimited
max user processes            (-u) unlimited
virtual memory        (kbytes, -v) unlimited

which implies it should be able to alloc quite a bit of memory without problem.

So my first question: it appears to be conking out at around 4 gigs of memory. Is there some reason why mysqld can't allocate more than 4 gigs? I confirmed I *am* running the 64-bit binary:

$ file /usr/sbin/mysqld
/usr/sbin/mysqld: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.0, dynamically linked (uses shared libs), stripped

This system has 4 gigs of memory in it, so if it had tried to allocate 4 gigs it would have had to hit swap harder than 60 megs. It seems like mysqld is hitting the 32-bit allocation limit, but that doesn't particularly make sense to me. Anybody have their mysqld allocating 4 gigs? Anything else I can try here?

The second thing is, I have no idea why mysql is taking up so much memory. For the first 5 runs or so, mysql only allocates about 800 megs. Sometime during the night is when it jumps up in memory. I don't really understand why, if it didn't need 3+ gigs of memory after the first 5 complete runs (x3 of course, since there are 3 running in parallel), it would suddenly need more later.

Either way, let's do some math. MySQL is 2.7 gigs this morning, which is about half a gig less than yesterday morning.

  2.7 gigs total
  key buffer:       512M
  tmp table:        128M
  sort buffer size: 512M
  join buffer size: 512M
  query cache:      256M

Key buffer = 512M; I could see that possibly not being returned, so let's assume 512M there. There are no threads connected at the moment because I have shut down the test, so tmp tables should take up 0, but let's say it didn't return 3x128M (384M).
Sort buffer size is 512M; it may well have used that for the ALTER TABLE ... ENABLE KEYS and never returned it, so 512M there. Join buffer size: not a single select query was ever run, so 0M. Query cache: 0M. Plus 3x16M max packet. So I see 512M + 384M + 512M + 48M = ~1.4 gigs. I have no idea why mysql is using this much memory... especially after it successfully performs 5 cycles with considerably less. Any ideas?

Here's the my.cnf:

[mysqld]
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir         = /var/lib/mysql
tmpdir          = /tmp
language        = /usr/share/mysql/english
skip-external-locking
key_buffer      = 512M
tmp_table_size  = 128M
max_connections = 2000
max_connect_errors = 9
table_cache     = 1024
myisam_max_sort_file_size = 2048M
myisam_sort_buffer_size   = 512M
join_buffer_size   = 512M
sort_buffer        = 512M
max_allowed_packet = 16M
thread_stack       = 128K
query_cache_limit  = 1M
query_cache_size   = 256M
query_cache_type   = 1
skip-innodb

and a show variables from a running server after a night's testing:

mysql> show variables;
4.0.23 - error 127 from table handler during many parallel inserts
We have been very successfully running MySQL in a production environment, one way or another, for the past 6 years. We have recently run into what I believe is a thread race condition while writing to and then reading from a MyISAM table. The server we are experiencing this problem on is a very stable environment; very rarely has anything changed on this machine in the past 4 years (other than mysql upgrades and security updates). The machine is running Debian Woody (stable); I have included the libraries from mysqlbug and the kernel version at the end of this email. The machine is a Dell PowerEdge 6450, 4-processor XEON 700/2MB, running local hardware RAID with an LSI controller. I include the hardware configuration because it's possible the thread interaction problem may lie closer to the hardware level (or compiler), since the box is a 4-CPU machine with fairly massive L2 caches (even by today's standards) on each chip that need to be kept synchronized. There have been firmware upgrades for this machine in the past to fix 'cpu synchronization' issues; however, I do not know if we are running these fixes or not.

We are running the MySQL 4.0.23 binary distribution (so it should be statically linked anyway). I've looked in the changelog for 4.0.24 and 4.0.25 and I don't see any updates that might resolve our issue. Unfortunately I have no solid test case for this issue, and it only occurs under times of heavy stress. The problem has manifested itself both of the past two times a customer has been doing 'massive' batch configuration changes to our system. They do these changes once a month, and the past two months this issue has occurred. The issue manifests itself in a very particular way, and has been practically identical both times. I am hoping somebody can give me a direction to take this: opening a MySQL support case, mysqldumping and reloading the table, looking at the firmware updates...

The problem is we get an error 127 from the table handler when doing a select:

General error, message from server: Got error 127 from table handler -- SQLQuery was:
select DISTINCT Activity.activityID,Activity.processID,Activity.activityName, [] from Activity where ( Activity.processID='147008' AND Activity.activityName='VIMforQuiz' )

This, of course, crashes out our XML processing, and the worker thread ends up stopping. What is happening when this occurs is that there are at least 3-4 active threads running on our application server processing tens of thousands of XML files and making major changes to the database. Each thread is essentially performing the same duty, just with different data. These threads are primarily database bound. Replication is active on this server, and there are 3 replication servers pulling the updates from it. None of their tables are corrupted by this (by virtue of check table).

Looking in the MySQL binlog for the time period when this occurs (it, of course, does not include selects) shows about 200-300 inserts/updates going on during the 1-second period when the issue happens. There happen to be 3 separate threads inserting records into the Activity table during that second (they must just have happened to converge). I am guessing that the record the above select query is looking for had just been inserted (I found the insert for that record during that same 1 second as the crash).
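For reference, pulling that insert out of the binlog was nothing more sophisticated than something along these lines (a sketch; the datadir path and binlog number are placeholders, not the actual file names):

# placeholder path and log number: point it at whichever binlog covers that second
mysqlbinlog /var/lib/mysql/mysql-bin.NNN | grep -n -B 2 "INSERT INTO Activity"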
Performing a check table reports everything is kosher:

mysql> check table Activity;
+--------------+-------+----------+----------+
| Table        | Op    | Msg_type | Msg_text |
+--------------+-------+----------+----------+
| abc.Activity | check | status   | OK       |
+--------------+-------+----------+----------+
1 row in set (5 min 36.80 sec)

Last month I did a repair table as well, and that also reported everything was fine, but obviously didn't fix anything. Once this crash occurs, that worker thread will quit... but all other threads will continue to pound away on this and other tables, and will work perfectly fine with no errors.

The table is a MyISAM Dynamic table, as shown by this show table status row (the 10512548 column is Data_free):

| Activity | MyISAM | Dynamic | 4681274 | 105 | 505218432 | 4294967295 | 166466560 | 10512548 | 4960114 | 2004-10-14 14:04:25 | 2005-05-24 00:57:34 | 2005-05-24 00:57:10 | | |

The free space probably was more like zero when the incident occurred; all of the commands I have run here were several hours after the incident. perror 127 says that the record file is crashed. What will cause MySQL to return such an error during a lookup? Clearly the table *itself* is not marked as crashed, so I suspect what must be happening is that the SELECT statement is being allowed to read from the table while an insert is still writing... It presumably should be locked out during that operation...
All Queries Hanging/Freeze: Trying to perform Select 1
Hi,

We have recently upgraded to MySQL 4.0 (binary) from 3.23.50, and we are seeing *all* MySQL queries freeze up every morning for about 25 seconds. I don't believe this is the standard run-of-the-mill MyISAM locking problem people tend to run into, as queries in separate databases hang, as well as SELECT 1 queries.

Here's the situation: a few jobs run overnight that do a humongous number of inserts into a single table in two separate databases. (We'll call this 'table A', as it is named the same in both databases.) This nightly job is run for only 2 databases out of maybe 100. We use batch inserts to insert these rows. When these jobs finish, all is calm (all the users are asleep). All the servers are pretty much entirely idle, and there should be no system cronjobs running at this point. Then, sometime between 10 minutes and an hour later, all the queries going to the mysql server freeze for ~25 seconds. During this time, vmstat 1 reports that one full cpu is cranking 100% user time (this is a 4-way box, so 25% user time is reported). We have no cronjobs running at this time, and we have disabled nightly optimization since this problem started occurring.

Last night, I set up a 'show processlist' python script and captured the results every 5 seconds. I have approximately 5 processlist captures from the server in this state. They look like this (this is from the first capture):

(31531L, 'ABC', 'xxx:48319', 'ABC', 'Query', '1', None, 'select 1')
(31532L, 'ABC', 'xxx:48320', 'ABC', 'Sleep', '21', '', None)
(31533L, 'BCD', 'xxx:48322', 'BCD', 'Query', '0', None, 'select 1')
(31567L, 'CDE', 'xxx:48489', 'CDE', 'Query', '0', None, 'select 1')
(31610L, 'DEF', 'xxx:48622', 'DEF', 'Query', '0', None, 'select 1')
(31611L, 'EFG', 'xxx:48626', 'EFG', 'Query', '1', None, 'select 1')
(31666L, 'HIJ', 'yyy:54831', 'HIJ', 'Query', '0', None, 'select count(GlobalPreferences.primaryKey) from GlobalPreferences')
(32016L, 'JKL', 'zzz:50352', 'JKL', 'Query', '2', 'update', 'INSERT INTO Response (responseSetID,questionID,userID,hasRule,isNA,creationDate,versionNumber,foilID')
[...]

There are approximately 600 db connections total, and at this point only a small fraction are hanging. You will note there is exactly one insert being run at this time; I believe this is related to the trigger for this condition. This insert, however, is *NOT* into 'Table A'. The table this insert is going into should have seen a lot of select activity overnight, but no inserts. You can see the select 1's hanging, and a few select counts hanging. Both of those queries are diagnostic queries run by the application to make sure the database is healthy.
The hang time in the above list is small, but if we look at my last capture:

(31895L, 'ABC', 'yyy:55654', 'ABC', 'Query', '16', None, 'select 1')
(31898L, 'BCD', 'yyy:55658', 'BCD', 'Query', '21', None, 'select 1')
(31948L, 'CDE', 'yyy:56017', 'CDE', 'Query', '21', None, 'select 1')
(32016L, 'JKL', 'zzz:50352', 'JKL', 'Query', '28', 'update', 'INSERT INTO Response (responseSetID,questionID,userID,hasRule,isNA,creationDate,versionNumber,foilID')
(31793L, 'DEF', 'yyy:55176', 'DEF', 'Query', '25', None, 'select count(GlobalPreferences.primaryKey) from GlobalPreferences')
(31794L, 'DEF', 'yyy:55177', 'DEF', 'Query', '15', None, 'select count(GlobalPreferences.primaryKey) from GlobalPreferences')
(31795L, 'DEF', 'yyy:55178', 'DEF', 'Query', '4', None, 'select count(GlobalPreferences.primaryKey) from GlobalPreferences')

The select 1's and counts range from hanging 4 seconds to 25 seconds, and the insert hangs for 28 seconds. Based on the increase in the number of hanging requests, I do not believe *any* requests have completed since that insert started (although my show processlist seems to go through!). The mysql.err log is empty; the mysql-slow.log shows all the queries above, including the insert.

A few things to note: vmstat shows very little I/O during this period. It shows slightly less I/O than a 'standard' idle state, presumably because none of the mysql queries go through during this period. I/O is not frozen, however, as I do see disk writes during this period.

Here's the server config (my.cnf):

[mysqld]
skip-locking
server-id       = 4
skip-innodb
set-variable    = key_buffer=800M
set-variable    = tmp_table_size=1024M
set-variable    = max_allowed_packet=16M
set-variable    = thread_stack=128K
set-variable    = max_connections=2000
set-variable    = max_connect_errors=9
set-variable    = table_cache=1024
set-variable    = myisam_max_sort_file_size=4096
set-variable    = myisam_sort_buffer_size=512M
set-variable    = join_buffer_size=512M
set-variable    = sort_buffer=512M
query_cache_size = 512M

The system has 4 gigs of memory (recently reduced from 8 gigs to try to work around a separate problem), and we're running the 2.4.26 linux kernel on a quad-CPU x86 machine.

Any thoughts? What kind of internal locks might be generated? Is there another command I can run to get the status of those queries that are hanging?
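For what it's worth, the capture loop is nothing fancy; the python script boils down to the equivalent of this (a sketch, with the credentials and log file name as placeholders):

# dump the full process list every 5 seconds with a timestamp (placeholder credentials)
while true; do
    date
    echo "show full processlist" | mysql -u root -pXXXX
    sleep 5
done >> processlist.log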
Re: All Queries Hanging/Freeze: Trying to perform Select 1
> Any thoughts? What kind of internal locks might be generated? Is there
> another command I can run to get the status of those queries that are
> hanging? It shows 'None' for the state (NULL), I've never seen a

Aha... so I had a brainstorm while driving to lunch. Is it possible this is related to the query cache? Obviously we did not have a query cache before we upgraded to 4.0. Overnight, the heavy job that runs inserts millions of records into Table A, but it also does hundreds of thousands of selects from Table B. Table B is the table we're seeing the insert into that appears to be freezing all the queries on the server (including 'select 1').

Presumably, if this is the very first insert into this table, the query cache must be flushed for that table. I am assuming the query cache is nearly full of queries against that table (as the system has been idle since this job finished). Our query cache is 512M max. Can it really take 25 seconds to clear the query cache of all those queries? Would that hang all the queries in the manner we are seeing (including 'select 1')? Would the insert be in the 'update' state while clearing the query cache? I can update my script to also pull back 'show status' from the server tonight and see what the query cache is doing at the same time.

If it does turn out to be a query cache lock, what course of action can we take? Periodic 'reset query cache' throughout the night? Is a query cache of 512M beyond the recommended size? I wouldn't have expected it to take 25 seconds to clear out a single table's query cache... is this possible?

Thanks,
-Joe
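P.S. For tonight's run, the extra capture alongside each processlist snapshot will be along these lines (just a sketch; credentials are placeholders):

# grab the query cache counters next to each processlist snapshot (placeholder credentials)
echo "show status like 'Qcache%'" | mysql -u root -pXXXX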
MySQL 4.0.20 (binary) Relay Log Purging not working
We just upgraded to MySQL 4.0.20 (binary), and we use replication to a slave running the same. We're having some difficulty deleting the relay log files the slave creates. I'm fairly confused, as the manual clearly says we shouldn't be having this problem. According to the MySQL manual:

  Relay logs have the same format as binary logs, so you can use mysqlbinlog to read them.
  A relay log is automatically deleted by the SQL thread as soon as it has executed all its
  events and no longer needs it. There is no explicit mechanism for deleting relay logs,
  because the SQL thread takes care of doing so. However, from MySQL 4.0.14, FLUSH LOGS
  rotates relay logs, which will influence when the SQL thread deletes them.

So, I've run FLUSH LOGS on the slave, and it's created the next log file in the sequence. But the old ones still stay around indefinitely, even after issuing stop slave/start slave. I cannot find a command similar to PURGE MASTER LOGS for the slave to purge logs on the slave. Show slave status shows the slave's Relay_log_space increasing steadily even after rotation. The Exec_master_log_pos is in sync with the master, meaning the old relay files are no longer needed.

There is a feature in 4.1.1:

  --relay-log-purge={0|1}
  Disables or enables automatic purging of relay logs as soon as they are not needed any more.
  The default value is 1 (enabled). This is a global variable that can be changed dynamically
  with SET GLOBAL relay_log_purge. This option is available as of MySQL 4.1.1.

Does that mean that log purging doesn't exist until 4.1.1, or that the 'option to turn it off' isn't available until 4.1.1? If log purging isn't available until the next major mysql release, how can I delete these log files now? Even if I delete them off disk, they are still referenced in the 'relay-bin.index' file, and presumably I'm not allowed to update that file on a live slave... Ideas?

-Joe
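P.S. This is the sort of thing I'm watching to confirm nothing is being purged (just a sketch; the datadir path and credentials are placeholders):

# relay logs on disk keep accumulating (placeholder datadir path)
ls -lh /var/lib/mysql/relay-bin.*
# ...and Relay_log_space keeps growing even though Exec_master_log_pos is caught up
mysql -u root -pXXXX -e "show slave status\G" | grep -E "Relay_log_space|Exec_master_log_pos"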
Bug #1858: Repeatable on 3.23.57
A week ago I posted that we were having horrible problems with MySQL's replication failing and reconnecting (after years of reliable operation). I still haven't come up with a solution to the problems; it just looks like the master keeps tearing down the replication connection, and the slave keeps reconnecting. Setting up more slaves, they exhibit the exact same problems. As noted in my previous emails, after about 4-6 errors like this:

040707  2:24:44  Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
040707  2:24:44  Slave: Failed reading log event, reconnecting to retry, log 'mysql-bin.178' position 13468249
040707  2:24:44  Slave: reconnected to master '[EMAIL PROTECTED]:3306', replication resumed in log 'mysql-bin.178' at position 13468249
040707  2:39:57  Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
040707  2:39:57  Slave: Failed reading log event, reconnecting to retry, log 'mysql-bin.178' position 25712380
040707  2:39:57  Slave: reconnected to master '[EMAIL PROTECTED]:3306', replication resumed in log 'mysql-bin.178' at position 25712380

the replication would completely stop with this error:

040702  4:13:51  Slave thread exiting, replication stopped in log 'mysql-bin.167' at position 108374685

Well, it turns out the slave disconnecting and reconnecting does not appear to be related to the slave stopping (odd, since they both became problems around the same time). It looks like at 4am our optimize database script fires off on the SLAVE, and the slave thread dies a little while after. I have turned off nightly optimization, and for the past 5 days we have not seen the slave stop. So I believe we are running into this bug: http://bugs.mysql.com/bug.php?id=1858 on the 3.23.57/3.23.58 line of code. The bug was already fixed on the 4.0.x line, but presumably a patch needs to be made to the 3.23.x line as well.

As for the rest of my replication problems... who knows :)

Thanks,
-Joe
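P.S. For context, the nightly optimization job on the slave boils down to something like this (a sketch, not the literal script; credentials are placeholders):

# 4am cron job, roughly: optimize every table in every database
mysqlcheck --optimize --all-databases -u root -pXXXX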
Replication problem on 3.23.57 / 3.23.58
We have suddenly run into a replication problem between our master and our slave databases. It had been running without a problem for at least several months since our last upgrade, and for several years before that. I've tried a number of debugging techniques, and now I'm hoping somebody here can make sense of it.

On the slave I get error messages like:

040630  2:43:52  Slave: reconnected to master '[EMAIL PROTECTED]:3306', replication resumed in log 'mysql-bin.163' at position 37919441

It does that several times between 2:20am and 4:30am, and every few nights the slave gives up:

040630  4:15:34  Slave thread exiting, replication stopped in log 'mysql-bin.163' at position 99512496

No logs are recorded on the master. This 2am-to-5am window coincides with a lot of mysql update/insert traffic on the master, in parallel across a lot of different connections. Master and slave are running linux; the master is running the 3.23.57 binary and the slave is running the 3.23.58 binary. I checked the changelog on mysql's site, and I don't see why the version difference would be a problem (but I am open to upgrading the master!). Both master and slave have max packet set at 16M.

So my initial thought was hardware (since everything was great before). I feel like I've almost eliminated that by doing the following:

  Changed ethernet boards on both master and slave (to a totally different brand of hardware, even)
  Changed cables
  Used different switch ports
  Turned off auto-negotiate on the switch port

Then I:

  Swapped the slave hardware for completely different hardware (removed the raid drives, stuffed them into a similar server)
  Upgraded the kernels on both machines to 2.4.26

I also set up a ping overnight to both the master and slave servers looking for packet loss: none. I wrote a TCP-based ping between both of those servers and our administrative server; it kept passing data back and forth and recorded the total time. No slowdown or network errors during the time periods MySQL has communication problems.

So then I set up a second slave. This box runs Debian (the other slave ran Red Hat), and it is running the exact same mysql binary that was running on the other slave. This slave ran fine for 3 days (which led me to make some false assumptions that it was working fine). But now this slave breaks too. Interestingly, both slaves will break at *approximately* the same time (about a minute apart; both machines are ntp-synced to the second, so it's not *EXACTLY* the same time), and both will be pretty *close* in the binlog, but not at the exact same spot. I have looked in the binlog at the various queries it's crapped out on, and they're nothing interesting. I've looked at about 10 of them; some are very simple updates, some are long complex inserts.

A few nights I have run SLAVE STOP on the slave and waited until the morning to run the slave forward through the logs. When I run the slave forward in the morning, I get the same exact slave disconnects. This is the first time I've seen the slave disconnect outside the 2:30-4am window. So clearly this seems to be a mysql problem... something in the binlog that's causing it to disconnect. But why do the two slaves not disconnect in the same *EXACT* place?

So, before putting the axe through the mysql server, I busted out my packet sniffer. I sniffed the packets from both sides of the connection until it failed. From my reading, it looks like the master is tearing down the connection.
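(For anyone who wants to reproduce the capture, something along these lines does it; the interface and peer hostname are placeholders:)

# capture full packets on the replication port, run on each side (placeholder interface/host)
tcpdump -i eth0 -s 0 -w repl-capture.pcap host slave-host and port 3306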
It basically looks like this:

  Master sends lots of data
  Master sends a PSH ACK packet
  Master sends lots of data
  Master sends a PSH ACK packet
  Master sends lots of data
  Master sends a PSH ACK FIN packet
  Slave sends an ACK FIN packet (with slave ACKs in there too)

So, it looks like the master is tearing down the connection. Anybody have any thoughts? I can upgrade the master to 3.23.58, but I don't see anything in the mysql changelog that implies that will help. (Bringing down the master server requires much dancing around and appeasing the customers due to the outage.) My other thought is going to mysql 4.0.x, but again, I don't generally like doing things just because they might help. We have had a plan to upgrade to 4.0.x for some time (we certified our software on it), but we don't have an urgent need to budget the resources required to do it.

Is it possible there is some sort of race in the mysql binlog writing? One of the reasons why this might only have started cropping up now is that the 2:30-4am slot has been getting progressively busier and busier, with a huge number of parallel inserts/updates (4-CPU box). The rest of the day the traffic isn't even close to that time period, and replication works like a rockstar.

Settings on the master:

set-variable    = key_buffer=1024M
set-variable    = tmp_table_size=1024M
set-variable    = max_allowed_packet=16M
set-variable    = thread_stack=128K
set-variable    = max_connections=2000
set-variable    = max_connect_errors=9
Re: Replication problem on 3.23.57 / 3.23.58
> ... somebody here can make sense of it. On the slave I get error messages like:
>
> 040630  2:43:52  Slave: reconnected to master '[EMAIL PROTECTED]:3306', replication resumed in log 'mysql-bin.163' at position 37919441
>
> It does that several times between 2:20am and 4:30am, and every few

Ah... right... I'm not-so-smart. I didn't actually post the entire error log message:

040701 14:56:05  Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
040701 14:56:05  Slave: Failed reading log event, reconnecting to retry, log 'mysql-bin.165' position 14792980
040701 14:56:05  Slave: reconnected to master '[EMAIL PROTECTED]:3306', replication resumed in log 'mysql-bin.165' at position 14792980
040701 15:02:13  Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
040701 15:02:13  Slave: Failed reading log event, reconnecting to retry, log 'mysql-bin.165' position 25852745
040701 15:02:13  Slave: reconnected to master '[EMAIL PROTECTED]:3306', replication resumed in log 'mysql-bin.165' at position 25852745
040701 15:09:14  Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
040701 15:09:14  Slave: Failed reading log event, reconnecting to retry, log 'mysql-bin.165' position 37090598
040701 15:09:14  Slave: reconnected to master '[EMAIL PROTECTED]:3306', replication resumed in log 'mysql-bin.165' at position 37090598
040701 15:15:28  Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
040701 15:15:28  Slave: Failed reading log event, reconnecting to retry, log 'mysql-bin.165' position 47700336
040701 15:15:28  Slave: reconnected to master '[EMAIL PROTECTED]:3306', replication resumed in log 'mysql-bin.165' at position 47700336

Those are the messages that come out of the slave. I'm currently running a test where I back up the entire slave, run it forward through the logs, restore the slave back exactly to its previous point, and run it through the logs again, to see if it dies in the same places or if it's somewhat random.

-Joe
MySQL 3.23.43 Binary Distribution not stable on Domain Socket over 1000 threads
We've just switched from compiling our own MySQL to using your binary distribution. This is the first time we've run MySQL in production from one of your binary builds. We had to switch to a binary distribution because we were running into the 1000-threads issue with the glibc library we were linking against. We've now had the server running for about a week, and we've just run into a little problem which seems like it's a bug in the mysql and mysqladmin clients, or a problem with the mysqld server.

We currently have ~1117 clients connected to our production database. All of these connections are TCP/IP-based connections; none of them are on the local domain socket on the db server. Our DB server was taking a pounding, so I logged into the db server, connected as root, and tried to list the connected processes:

db-01:~$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 49923 to server version: 3.23.43-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> show processlist;
ERROR 2013: Lost connection to MySQL server during query

Ooops! The thread died! Tried again, doesn't work. Tried mysqladmin -u root -p processlist, same deal! Crashed thread! Nothing appears in the mysql error log.

I tried running some queries on one of our databases from the db server using the mysql client. If I do a select * on a table with only a few rows (seems like fewer than 100), the query goes through fine. If I do it on a table with more than about 100 rows, the thread crashes:

mysql> select * from Course;
[...]
74 rows in set (0.00 sec)

mysql> select * from Section;
ERROR 2013: Lost connection to MySQL server during query

mysql> select count(*) from Section;
ERROR 2006: MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    49927
Current database: TT_fcgi

+----------+
| count(*) |
+----------+
|      318 |
+----------+
1 row in set (0.11 sec)

mysql>

Is there something horrifically wrong with mysql when it goes over 1000 connections or something? I swear I was able to do a show processlist from the db server yesterday (but we had fewer than 1000 connections yesterday). I *am* able to get a process list if I connect through TCP/IP:

db-01:~$ mysql -u joe -h 127.0.0.1 -p

So it looks like it just affects the domain socket connections, and TCP/IP is okay. Fortunately all of our apps use TCP/IP, so it's not a big deal for us. But I bet somebody out there -does- care! :) The crashing threads seem to have no effect at all on any of our TCP/IP connections running queries at the same time.

Thanks,
-Joe
InnoDB Autoincrement deadlock
I'm having some problems with tables using auto-incrementing primary keys when I use InnoDB. I've searched through the documentation on MySQL's and Innobase's websites and haven't been able to find anything saying this is a limitation of InnoDB, so I will assume this is a bug (or unintentional feature).

Basically, if you start two transactions and insert a record from the first (letting the auto-incrementing key be assigned automatically, and not committing), it will work. If the second transaction does the same, it blocks. Inserting another record from the first transaction then returns table handler error 100. This makes both transactions kind of useless for working with this table, since one's frozen and the other returns an error.

I'm sure what's happening is that the second transaction is being blocked until the first commits, so mysql knows what auto-increment number to issue the second insert. Problem is, this makes no sense. It would mean effectively only 1 transaction could be in use at a time if you are going to be inserting into a table with an auto-incrementing key. And since the existence of a second transaction causes the first to fail too (e.g. table error 100), it would be entirely unsafe to allow more than 1 transaction on the table. I really couldn't care less whether my auto-incrementing keys are handed out sequentially, or whether there are numbers missing due to rolled-back transactions. (If I cared, I would put a timestamp on the record.) Is there some way to get this baby going? Maybe an "I don't care about sequential auto_increment keys" variable?

Here are some screen dumps:

mysql> create table vroomm (i int auto_increment, j int, primary key(i)) type=innodb;
Query OK, 0 rows affected (0.03 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into vroomm (j) values(2);
Query OK, 1 row affected (0.00 sec)

-- (second DB transaction tries to insert into vroomm)
mysql> insert into vroomm (j) values(4);
ERROR 1030: Got error 100 from table handler

I ran into this problem when I tried converting one of our existing applications to run under InnoDB. About 3 minutes after I started it, it deadlocked; this was the cause. Two transactions were trying to create new DB records with auto-incrementing keys. The entire application froze seconds later when all the other threads tried to insert new DB records too. Ooops.

Let me know what you guys think: whether this is a bug or a feature, and how I can get it working.

Thanks,
-Joe
Re: Why Heap Tables don't support auto_increment columns
> I'm currently working on a system for managing HTTP sessions and would like to
> use a heap table for storing the session information. Session information is
> only of temporary interest, but I still need a unique ID to reference some
> other tables storing data for the session. All these tables are HEAP tables,
> so there is no problem if the ID generated by the auto_increment column is not
> unique when the database server is restarted.

There are a couple of problems with that to begin with. First, you're storing state in a HEAP table; if the mysql server gets reset, all the sessions are lost. It would be tough to think you want to write your system so your database server cannot ever be restarted, especially when writing the data to a MyISAM table instead would probably cost only a trivial amount of speed.

Also, if you used an auto_increment column, you would be handing out session IDs sequentially. It would be trivial for somebody to usurp somebody else's session ID by simply subtracting or adding 1 to their own.

-Joe
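P.S. If you do want the database to hand out the identifier, it's easy enough to generate something non-guessable instead of a sequence; for example (just a sketch, not tied to your schema; credentials are placeholders):

# generate a non-sequential, hard-to-guess session token (placeholder credentials)
echo "SELECT MD5(CONCAT(NOW(), RAND()))" | mysql -u root -pXXXX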