We have a Dell Poweredge 1750 with dual 3GHz Xeon's and 4GB of memory. It is connected to a EMC CX600 through a Brocade Silkworm 2800 and QLogic HBA running qla2300 v6.07.
Anyways, we are attempting to migrate to the 4 tree from 3.23.56. When we began doing so I started running some sql-bench tests for performance benchmarking. I immediately noticed far worse performance on the mysql 4 tree in comparision to the 3 tree for specific tests primarily the create and insert. We initially started troubleshooting by isolating the hardware. We moved our datadir back to two internal scsi 10k drives in a hardware raid0 config. I then tested 4.0.18 on the internal storage and had no problems. Performance was smoking fast. I then reverted back to the san storage (raid5 btw) to verify again and received horrible performance. Initially I thought it may be due to the additional overhead of writing parity to the raid5. However, I then decided to downgrade to a mysql 4 version and test against the raid5 san storage. I downgraded to 4.0.13 and tested. Performance was again fast. I then climbed the tree revision by revision. I tested 4.0.14, 4.0.15, 4.0.16 and they were all fast performance. I then got to 4.0.17 and performance fell in the toilet again. I have since tested 4.0.17 and 4.0.18 and both show a significant performance degradation over any versions prior to 4.0.17. This ONLY occurs on the external raid5 storage for these versions. All internal storage is lighting fast. I have changed a few kernel parameters for the vm including: # Modified the min readahead vm.min-readahead = 128 # Modified the max readahead vm.max-readahead = 512 # Modified bdflush parameters vm.bdflush = 5 2048 512 512 500 1000 20 10 0 I have also modified a few mysql parameters as seen below: key_buffer = 1600M sort_buffer_size = 8M All parameters were identical through ALL tests and nothing else was running at the time of the testing. The only thing that changed was the mysql version. Can anyone offer some insight as to what may have changed between 4.0.16 and 4.0.17 that would cause this performance issue on the external raid array? I looked at the change log for 4.0.17 but didn't see anything that immediately stuck out as a possibility. Here are the test results for create and here is some vmstat output: MYSQL 4.0.16 [EMAIL PROTECTED] sql-bench]# /usr/bin/perl ./test-create --server=mysql -cmp=mysql --user=root --password='pa$$word' --log > "output/create-mysql-Linux_2.4.21_9.ELsmp_i686-cmp-mysql" 2>&1 [EMAIL PROTECTED] sql-bench]# more output/create-mysql-Linux_2.4.21_9.ELsmp_i686-cmp-mysql Testing server 'MySQL 4.0.16 log' at 2004-07-15 13:47:42 Testing the speed of creating and dropping tables Testing with 10000 tables and 10000 loop count Testing create of tables Time for create_MANY_tables (10000): 46 wallclock secs ( 0.78 usr 0.12 sys + 0.00 cusr 0.00 csys = 0.90 CPU) Accessing tables Time to select_group_when_MANY_tables (10000): 6 wallclock secs ( 0.68 usr 0.24 sys + 0.00 cusr 0.00 csys = 0.92 CPU) Testing drop Time for drop_table_when_MANY_tables (10000): 1 wallclock secs ( 0.17 usr 0.16 sys + 0.00 cusr 0.00 csys = 0.33 CPU) Testing create+drop Time for create+drop (10000): 4 wallclock secs ( 0.69 usr 0.28 sys + 0.00 cusr 0.00 csys = 0.97 CPU) Time for create_key+drop (10000): 5 wallclock secs ( 0.92 usr 0.29 sys + 0.00 cusr 0.00 csys = 1.21 CPU) Total time: 62 wallclock secs ( 3.24 usr 1.09 sys + 0.00 cusr 0.00 csys = 4.33 CPU) [EMAIL PROTECTED] mysql-4.0.16]# vmstat 1 procs memory swap io system cpu r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 2708360 35144 879168 0 0 0 0 113 12 0 0 100 0 0 0 0 2696128 35148 889676 0 0 0 0 116 5376 7 7 86 0 1 0 0 2679752 35148 907548 0 0 0 0 113 8845 7 21 72 0 1 0 0 2669044 35148 918256 0 0 0 16384 217 5395 2 23 74 0 0 0 0 2661044 35148 926256 0 0 0 8192 166 3998 2 23 74 0 0 0 0 2654380 35148 932920 0 0 0 0 114 3294 2 24 74 0 0 0 0 2648748 35148 938552 0 0 0 8192 176 2856 2 24 74 0 0 0 0 2643888 35148 943412 0 0 0 8189 207 2490 3 22 75 0 0 0 0 2639304 35148 947996 0 0 0 0 115 2281 1 23 75 0 procs memory swap io system cpu r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 2559228 35148 1028072 0 0 0 0 118 949 2 24 74 0 0 0 0 2557368 35148 1029932 0 0 0 0 117 933 2 24 74 0 0 0 0 2681956 35160 888536 0 0 0 0 115 12260 5 5 89 0 0 0 0 2681956 35160 888536 0 0 0 0 113 12 0 0 100 0 0 0 0 2681956 35160 888536 0 0 0 0 124 54 0 0 100 0 MYSQL 4.0.17 [EMAIL PROTECTED] root]# more /usr/local/mysql/sql-bench/output/create-mysql-Linux_2.4.21_9.ELsmp_i686-cmp-mysql Testing server 'MySQL 4.0.17 log' at 2004-07-15 13:50:24 Testing the speed of creating and dropping tables Testing with 10000 tables and 10000 loop count Testing create of tables Time for create_MANY_tables (10000): 400 wallclock secs ( 1.27 usr 0.32 sys + 0.00 cusr 0.00 csys = 1.59 CPU) Accessing tables Time to select_group_when_MANY_tables (10000): 6 wallclock secs ( 0.87 usr 0.38 sys + 0.00 cusr 0.00 csys = 1.25 CPU) Testing drop Time for drop_table_when_MANY_tables (10000): 1 wallclock secs ( 0.23 usr 0.14 sys + 0.00 cusr 0.00 csys = 0.37 CPU) Testing create+drop Time for create+drop (10000): 400 wallclock secs ( 1.16 usr 0.27 sys + 0.00 cusr 0.00 csys = 1.43 CPU) Time for create_key+drop (10000): 400 wallclock secs ( 0.65 usr 0.22 sys + 0.00 cusr 0.00 csys = 0.87 CPU) Total time: 1207 wallclock secs ( 4.18 usr 1.33 sys + 0.00 cusr 0.00 csys = 5.51 CPU) 1 0 0 2666096 35224 897208 0 0 0 412 164 636 0 0 50 50 0 0 0 2666096 35224 897220 0 0 0 400 163 617 0 0 50 50 0 0 0 2666096 35224 897232 0 0 0 404 164 622 0 0 50 50 0 0 0 2666096 35224 897244 0 0 0 400 163 614 0 0 50 50 0 0 0 2666096 35224 897256 0 0 0 412 166 638 0 0 50 50 0 0 0 2666096 35224 897268 0 0 0 400 163 612 0 0 50 50 0 0 0 2669412 35224 897368 0 0 0 0 113 14 0 0 100 0 0 0 0 2669412 35224 897368 0 0 0 0 113 18 0 0 100 0 0 0 0 2669412 35224 897368 0 0 0 0 113 14 0 0 100 0 As you can see resource patterns have changed drastically. Our block activity, context switching and io wait time have all significantly changed between versions. It's almost like 4.0.17 is performing direct io of some sort bypassing filesystem buffers. I have tested throughput to the raid5 (4+1) array using both dd and tiobench running multiple threads. In both cases the throughout to the array has been more than sufficient. I have also strace'd the create script and the function calls are identical however the creates occur at a much faster rate in 4.0.16 compared to 4.0.17. This is also displayed through the sql-bench results. So the obvious question is what changed between 4.0.16 and 4.0.17 that would cause these results? Thx, Ian. The information transmitted in this email is intended only for the person(s) or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this email in error, please contact the sender and permanently delete the email from any computer. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]