Re: 5.0.18-max-log as a slave of a 4.1.13-standard-log master problem - slave hangs
Kishore Jalleda wrote:
> Hi, you may be having issues with the byte order on the Opterons and the P4s. This was asked earlier on the list, and here's what Jimmy from MySQL had to say

Kishore,

Thanks for the suggestion, but all x86 machines have the same byte order... and as I wrote, it's not a cluster problem but a replication problem :(

By the way, I just started the mysql-tests and they hang, too:

    db5:/usr/local/mysql/mysql-test# ./mysql-test-run
    Installing Test Databases
    Removing Stale Files
    Installing Master Databases
    running ../bin/mysqld --no-defaults --bootstrap --skip-grant-tables --basedir=.. --datadir=mysql-test/var/master-data --skip-innodb --skip-ndbcluster --skip-bdb
    Installing Master Databases 1
    running ../bin/mysqld --no-defaults --bootstrap --skip-grant-tables --basedir=.. --datadir=mysql-test/var/master-data1 --skip-innodb --skip-ndbcluster --skip-bdb
    Installing Slave Databases
    running ../bin/mysqld --no-defaults --bootstrap --skip-grant-tables --basedir=.. --datadir=mysql-test/var/slave-data --skip-innodb --skip-ndbcluster --skip-bdb
    Manager disabled, skipping manager start.
    Starting ndbcluster
    Starting ndbd
    Starting ndbd
    Waiting for started...
    NDBT_ProgramExit: 0 - OK
    Connected to Management Server at: localhost:9350
    Cluster Configuration
    ---------------------
    [ndbd(NDB)] 2 node(s)
    id=1@127.0.0.1 (Version: 5.0.18, Nodegroup: 0, Master)
    id=2@127.0.0.1 (Version: 5.0.18, Nodegroup: 0)
    [ndb_mgmd(MGM)] 1 node(s)
    id=3@127.0.0.1 (Version: 5.0.18)
    [mysqld(API)] 4 node(s)
    id=4 (not connected, accepting connect from any host)
    id=5 (not connected, accepting connect from any host)
    id=6 (not connected, accepting connect from any host)
    id=7 (not connected, accepting connect from any host)
    Loading Standard Test Databases
    Starting Tests

    TEST            RESULT
    ------------------------
    alias           [ pass ]
    alter_table     [ pass ]
    analyse         [ pass ]
    analyze         [ pass ]
    ansi            [ pass ]
    archive         [ pass ]
    archive_gis     [ pass ]

Now nothing happens, and the CPU load is at 0. Any ideas?

Jan

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
Re: 5.0.18-max-log as a slave of a 4.1.13-standard-log master problem - slave hangs
Could you post the error log entries from the slave and the binlog at the point where the slave hangs?

Also, just to make sure: d2 and d3 replicate from d1 without any problems, right? And as per your message, d4 and d5 work well if no replication is enabled at all, so essentially it is only the replication that is causing the hang, right? (Could you clarify this for us?)

Finally, if that is the case and you feel that there is no error on your side (assuming you have exhausted all the possibilities trying to isolate the problem) but the slave still hangs, then you might want to open a bug report:

http://dev.mysql.com/doc/refman/5.0/en/replication-bugs.html

Hope this helps.

Kishore Jalleda

On 2/8/06, Jan Kirchhoff [EMAIL PROTECTED] wrote:
> Thanks for the suggestion, but all x86 have the same byte order... and as I wrote its not a cluster problem but a replication problem :(
> btw: I just started the mysql-tests and it hangs, too:
> [...]
> now nothing happens, cpuload is at 0 - any ideas?
> Jan
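One way to find "the binlog where the slave hangs" is to read the last executed master position from the slave's status output and hand it to mysqlbinlog. The following is only a sketch under assumptions: the function name binlog_command is made up for illustration, it expects the text of SHOW SLAVE STATUS\G on standard input, and it relies on the Relay_Master_Log_File and Exec_Master_Log_Pos fields (matched case-insensitively, since the capitalisation differs between server versions).

```shell
#!/bin/sh
# Sketch: turn `SHOW SLAVE STATUS\G` output (read from stdin) into the
# mysqlbinlog command that dumps the master's binlog starting at the last
# position the slave SQL thread has executed.
# binlog_command is a hypothetical helper name, not part of MySQL.
binlog_command() {
    awk -F': *' '
        {
            key = tolower($1)            # field names differ in case by version
            gsub(/^[ \t]+/, "", key)     # \G output is right-aligned with spaces
        }
        key == "relay_master_log_file" { file = $2 }
        key == "exec_master_log_pos"   { pos  = $2 }
        END {
            if (file == "" || pos == "") exit 1   # fields not found
            printf "mysqlbinlog --start-position=%s %s\n", pos, file
        }'
}
```

On the slave this could be used as `mysql -e 'SHOW SLAVE STATUS\G' | binlog_command`. The printed command has to be run on the master, in the directory that holds its binary logs, because the file name refers to the master's binlog.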
Re: 5.0.18-max-log as a slave of a 4.1.13-standard-log master problem - slave hangs
A neverending story. I thought it worked (without having any idea what the problem had been), but it broke down again after a few hours.

My current setup is:

- A P4 production server (Server1) running Debian Linux, 2.4 kernel, MySQL 4.1.13-standard-log. This server is replicating to several other production servers.
- Two new dual-Opteron servers (Server2 + Server3) with 6GB RAM each, 3ware SATA RAID, custom kernel 2.6.15.1 SMP, MySQL 5.0.18-max-log.

Server2 is replicating from Server1 with a few Replicate_Ignore_DB/Replicate_Wild_Ignore_Table rules. I had problems getting this server running at first, since it always hung on replicated queries (different ones) and the only thing that helped was to kill -9 the mysqld. At some point it suddenly worked, and it has been running for almost a week now, having replicated at least 20-30GB so far.

Server3 was supposed to become a slave of the first one, but it shows the same problems I had with Server2 at first: it starts to replicate and some query hangs after a few minutes. These are not complicated mass inserts (those 1-5MB mass inserts work without trouble), but simple queries like

    insert into table (a,b,c) values (1,2,3)

or

    update table set a=1 where b=2

I tried kernels 2.6.8 and 2.6.15, SMP and non-SMP (Debian kernels and self-compiled), the official mysql-max and mysql-standard binaries, and a self-compiled MySQL 5.0.18. I disabled InnoDB and Cluster, put all variables back to their standard values and played around with lots of settings.

lspci and the output of /proc/cpuinfo are the same on both servers. I have exactly the same BIOS settings on both servers (I was going nuts comparing these BIOS screens with a KVM in a loud server room). Both servers have exactly the same Debian packages installed. lsmod shows the same on both systems.

I have had trouble with MySQL replication in 3.2x and 4.x over the last years, but I always got everything working, and it worked well without bigger trouble once it was up and running. But this time I have no clue what else to try.

I currently have no other server that is powerful enough to handle all the updates being replicated, so I cannot test 5.0.18 on some other CPU. I'll probably try to get my workstation (P4 3GHz, 1GB RAM) running as a slave, hoping the IDE disk is fast enough, but whether that works or not, I don't know what to change or try on my new servers.

Any ideas, anybody?

Thanks
Jan

Jan Kirchhoff wrote:
> I thought I found the reason for my problems with the change in join-behaviour in mysql 5, but Iwas wrong :( there is more trouble :(
> my replications hangs with simple queries like insert into table (a,b,c) values (1,2,3) on a myisam-table. It just hangs forever with no cpu-load on the slave.
> [...]
> what else could I try?
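To catch the moment a replicated statement stops making progress, the processlist could be polled and filtered for replication threads that are executing a statement with a large Time value. This is a heuristic sketch, not a definitive check: the function name slave_statements_over and the default limit of 600 seconds are assumptions, the input is expected to be the tab-separated output of `mysql -B -e 'SHOW FULL PROCESSLIST'`, and for the slave SQL thread the Time column can also be large simply because the slave is still catching up.

```shell
#!/bin/sh
# Sketch: list replication threads ("system user") that are currently
# executing a statement and whose Time column exceeds a limit in seconds.
# slave_statements_over is a hypothetical helper name.
# Input columns (tab-separated): Id User Host db Command Time State Info
slave_statements_over() {
    limit=${1:-600}    # assumed default threshold, adjust as needed
    awk -F'\t' -v limit="$limit" '
        # Replication threads run as "system user"; threads that are idle or
        # only waiting for the master have Info = NULL and are skipped.
        $2 == "system user" && $8 != "" && $8 != "NULL" && ($6 + 0) > (limit + 0) {
            printf "thread %s, time %s s: %s\n", $1, $6, $8
        }'
}
```

A call such as `mysql -B -e 'SHOW FULL PROCESSLIST' | slave_statements_over 600` from cron, together with a look at the CPU load, would at least record which statement was running when the slave stalled.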
Re: 5.0.18-max-log as a slave of a 4.1.13-standard-log master problem - slave hangs
Hi,

You may be having issues with the byte order on the Opterons and the P4s. This was asked earlier on the list, and here's what Jimmy from MySQL had to say:

    All machines used in the cluster must have the same architecture; that is, all machines hosting nodes must be either big-endian or little-endian, and you cannot use a mixture of both. For example, you cannot have a management node running on a PPC which directs a data node that is running on an x86 machine. This restriction does not apply to machines simply running mysql or other clients that may be accessing the cluster's SQL nodes.

    http://mysql.osuosl.org/doc/refman/5.0/en/mysql-cluster-limitations.html

So make sure both the Opterons and the P4s are running with the same byte order.

Kishore Jalleda

On 2/7/06, Jan Kirchhoff [EMAIL PROTECTED] wrote:
> A neverending story. I thought it worked (without having an idea what has been the problem), but it broke down again after a few hours.
> [...]
> any ideas anybody? thanks Jan
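The byte order of each host can be checked directly rather than inferred from the CPU model. A minimal sketch, assuming a POSIX shell with printf and od available; the function name byte_order is illustrative:

```shell
#!/bin/sh
# Sketch: report the byte order of the machine this runs on.
byte_order() {
    # Emit the two bytes 0x01 0x00 and let od read them back as one
    # unsigned 16-bit integer in the host's native byte order.
    value=$(printf '\001\000' | od -An -tu2 | tr -d ' \n')
    case "$value" in
        1)   echo "little-endian" ;;   # low byte first, e.g. x86 and x86_64
        256) echo "big-endian" ;;      # high byte first, e.g. PPC
        *)   echo "unknown" ;;
    esac
}

byte_order
```

Running this on the P4 and Opteron machines should print the same result on each, since both are x86-family CPUs.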
Re: 5.0.18-max-log as a slave of a 4.1.13-standard-log master problem - slave hangs
I thought I had found the reason for my problems in the change in join behaviour in MySQL 5, but I was wrong :( There is more trouble :(

My replication hangs on simple queries like

    insert into table (a,b,c) values (1,2,3)

on a MyISAM table. It just hangs forever with no CPU load on the slave. I have to kill and restart MySQL with the following commands:

    killall -9 mysqld;sleep 2;mysqladmin shutdown;sleep 5;/etc/init.d/mysql start;sleep 2;mysql -e 'slave start'

I can find the changed row in the table, so the query was processed correctly. Then it runs again for some time and hangs again on some other simple insert.

I disabled InnoDB and Cluster and took all my variables out of my.cnf except

    max_allowed_packet = 16M

which I need for the replication to work, and I have no clue what the reason for my problem is.

What else could I try?
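The restart one-liner above can be wrapped so that it is previewed before it is actually run. This is only a sketch: the helper names run and restart_hung_slave and the DRY_RUN variable are made up for illustration, and the command sequence is copied unchanged from the message. Note that killing mysqld with -9 may leave MyISAM tables in need of a check or repair.

```shell
#!/bin/sh
# Sketch: the restart sequence from the message, with a dry-run switch.
# DRY_RUN, run and restart_hung_slave are hypothetical names.
# With DRY_RUN=1 (the default here) commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restart_hung_slave() {
    run killall -9 mysqld          # the hung server does not react to a normal stop
    run sleep 2
    run mysqladmin shutdown        # kept from the original sequence
    run sleep 5
    run /etc/init.d/mysql start
    run sleep 2
    run mysql -e 'slave start'     # resume replication
}
```

Calling `restart_hung_slave` prints the seven steps; `DRY_RUN=0 restart_hung_slave` would execute them. As in the original one-liner, a failing step does not stop the steps after it.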
5.0.18-max-log as a slave of a 4.1.13-standard-log master problem - slave hangs
I've been trying to get my new MySQL 5.0.18 servers running as slaves of our production systems, to check whether all our applications work fine with MySQL 5 and to do some tests and tuning on the new servers.

The old servers are all P4s with 3GB RAM, running Debian Linux with a 2.4 kernel and the official MySQL 4.1.13-standard-log binaries: d1 is the master, d2 and d3 are slaves.

My new servers are dual Opterons with 6GB RAM, running Debian Linux with a 2.6.15 SMP kernel and the official MySQL 5.0.18-max-log binary. Their names are d4 and d5. I am currently trying to get d4 running as a slave of d1; d5 should later become a slave of d4.

The old servers only have MyISAM and MEMORY tables; InnoDB is disabled. The new ones had InnoDB and MySQL Cluster enabled (data nodes running on the same servers, management node running on d3), since I wanted to do some testing with the different engines, but I disabled both temporarily without any change in this weird problem:

No matter whether I copy /var/lib/mysql from d1 (and dump the contents of the memory tables) while a flush tables with read lock is active, copy that to d4 and do a change master to ... on d4 afterwards, or whether I do a mysqldump --master-data=1: the replication runs for maybe a minute or two and then hangs. show slave status says everything is OK, but a replicated replace hangs in the processlist and nothing happens. The CPU load goes down to zero.

Even after 2 hours nothing has changed. A slave stop hangs, too; when I kill the replicated replace process nothing happens; and I can't stop the MySQL server and have to kill it with killall -9 mysqld in the shell :(

At first I thought this was a problem with a temporary table, but after reloading a new dump a few times I had the same problem with really simple inserts/updates. For example: a new dump, everything works for a few minutes, then this query hangs:

    | 4 | system user | | nachrichten | Connect | 11164 | update | replace into nachrichten.x_symbole (symbol,syscode,nachrichten_id) values('KUN','de','99949')

(taken directly from show processlist)

Info about the simple table:

    CREATE TABLE `x_symbole` (
      `symbol` char(20) NOT NULL default '',
      `syscode` char(6) NOT NULL default '',
      `nachrichten_id` int(11) NOT NULL default '0',
      PRIMARY KEY (`symbol`,`syscode`,`nachrichten_id`),
      KEY `nachrichten_id` (`nachrichten_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1

I have to kill the mysqld with killall -9 mysqld, do a mysqladmin shutdown again, then restart MySQL and issue the query in the mysql shell: it works! Then I issue a start slave, everything works again for a minute or two, and then it hangs on some different query.

I'm going nuts with this! I have spent so much time on this problem without getting any further, and I have absolutely no idea what the problem is. There is nothing in the error log. Can anybody suggest something that might help? I have no idea what's wrong!
regards
Jan

d4:

mysql> show variables;
+--------------------------+------------------------------------------------------------------------+
| Variable_name            | Value                                                                  |
+--------------------------+------------------------------------------------------------------------+
| auto_increment_increment | 1                                                                      |
| auto_increment_offset    | 1                                                                      |
| automatic_sp_privileges  | ON                                                                     |
| back_log                 | 50                                                                     |
| basedir                  | /usr/local/mysql-max-5.0.18-linux-x86_64-glibc23/                      |
| binlog_cache_size        | 32768                                                                  |
| bulk_insert_buffer_size  | 15728640                                                               |
| character_set_client     | latin1                                                                 |
| character_set_connection | latin1                                                                 |
| character_set_database   | latin1                                                                 |
| character_set_results    | latin1                                                                 |
| character_set_server     | latin1                                                                 |
| character_set_system     | utf8                                                                   |
| character_sets_dir       | /usr/local/mysql-max-5.0.18-linux-x86_64-glibc23/share/mysql/charsets/ |
| collation_connection     | latin1_swedish_ci