Re: MySQL not using optimum disk throughput.
On Sat, 2005-05-07 at 08:18, Greg Whalin wrote:

> Hi Peter,
>
> As for reporting bugs ...
> http://bugs.mysql.com/bug.php?id=7437
> http://bugs.mysql.com/bug.php?id=10437
>
> We have found Opteron w/ Mysql to be an extremely buggy platform, especially under linux 2.6, but granted, we are running Fedora. Perhaps we will try Suse, but I feel I have heard similar reports (from Friendster) about their use of Suse 2.6 and Opterons being similarly slow.

Well, if I'm not mistaken, Friendster had been running into some bugs with the Linux kernel, but they were not directly Opteron related.

> We are currently running MyIsam tables, but plan on switching to Innodb in the next month or two btw, so our performance problems are w/ MyIsam.

Do you still have the problem? I've seen you're using FC1, which is rather old. I have not heard of much success with this version on Opteron.

Also, did you run mysql-test on your MySQL server? Does it pass at all? If it does not, it is likely that your build is broken or incompatible with your system.

There are unfortunately two problems, one affecting self-compiled binaries and one affecting our binaries. Self-compiled binaries could be affected by GLIBC bugs and compiler bugs, which we saw a lot of when the platform had just appeared. Our static RPM may however have another problem: Opteron distributions are not 100% binary compatible for statically linked binaries, e.g. a binary compiled on SuSE SLES is known to crash on RH AS in some cases.

We have great adoption of the Opteron platform among our customers with a great success rate, so I'm quite surprised by the extent of the problems you're having.

--
Peter Zaitsev, Senior Performance Engineer
MySQL AB, www.mysql.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
RE: MySQL not using optimum disk throughput.
I've added a fair bit of information on the Opteron HOWTO Wiki at:
http://hashmysql.org/index.php?title=Opteron_HOWTO
for using Fedora Core 3 with x86-64.

In my performance testing, I found that with so much RAM, everything was coming from RAM anyway. RAID10 seemed to be the most stable, and writeback caching increased the speed by about 15%, although I don't think I was really able to max out the IO anyway.

Best regards,
Richard Dale.
Norgate Investor Services - Premium quality Stock, Futures and Foreign Exchange Data for markets in Australia, Asia, Canada, Europe, UK, USA - www.premiumdata.net
Re: MySQL not using optimum disk throughput.
Greg Whalin wrote:

> I suspect this is an OS issue. Our Opteron's were completing large data update queries aprox 2-3 times slower than our Xeons when running under 2.6. After a switch to 2.4, Opteron's are faster than the Xeons. I mentioned NPTL being shut off (LD_ASSUME_KERNEL=2.4.19 in init script). When we left NPTL running, we saw almost instant deadlocks just watching replication catching up (no other site traffic directed to the machine). This is in 2.4 btw, so this is the backported NPTL kernels from Fedora. I somewhat suspect NPTL being a problem in 2.6 as well due to impressions I get from sifting through mysql's bug tracking system. The IO scheduler was also an obvious culprit.

Another point I wanted to note: what version of glibc were you running? We were running Debian with glibc 2.3.2 (libc6-i686-2.3.2) and were running into deadlocks with another piece of code. 2.3.2 has a number of known issues, and we had to migrate to an experimental 2.3.4 build.

I've been considering moving our databases to 2.3.4, but they weren't having any problems. It might be that Opteron is raising these issues more than Xeon.

FYI...

--
Use Rojo (RSS/Atom aggregator)! - visit http://rojo.com.
See irc.freenode.net #rojo if you want to chat.
Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
Kevin A. Burton, Location - San Francisco, CA
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
Re: MySQL not using optimum disk throughput.
Kevin Burton wrote:

> Greg Whalin wrote:
>
>> I suspect this is an OS issue. Our Opteron's were completing large data update queries aprox 2-3 times slower than our Xeons when running under 2.6. After a switch to 2.4, Opteron's are faster than the Xeons. I mentioned NPTL being shut off (LD_ASSUME_KERNEL=2.4.19 in init script). When we left NPTL running, we saw almost instant deadlocks just watching replication catching up (no other site traffic directed to the machine). This is in 2.4 btw, so this is the backported NPTL kernels from Fedora. I somewhat suspect NPTL being a problem in 2.6 as well due to impressions I get from sifting through mysql's bug tracking system. The IO scheduler was also an obvious culprit.
>
> Another point I wanted to note. What version of glibc were you running. We were running Debian with glibc 2.3.2 (libc6-i686-2.3.2) and were running into deadlocks with another piece of code. 2.3.2 has a number of known issues and we had to migrate to an experimental 2.3.4 build.
>
> I've been considering moving our databases to 2.3.4 but they weren't having any problems. It might be that opteron is raising these issue more than Xeon.
>
> FYI...

We are currently running glibc 2.3.2 (Fedora Core 1) on our Opterons. When we were still running Linux 2.6, we were on glibc 2.3.3 (Fedora Core 2).

Greg
Re: MySQL not using optimum disk throughput.
Greg Whalin wrote:

> We are currently running 2.3.2 (Fedora Core 1) on our Opterons. When we were still running linux 2.6, we were on 2.3.3 (Fedora Core 2).

Yeah... we were being bitten by 2.3.2's NPTL implementation for MONTHS before I heard a rumor that the Internet Archive had moved to 2.3.4. This literally solved all my problems, so I'd recommend upgrading to 2.3.4 if you notice this type of stuff again.

Kevin
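For anyone wanting to check which glibc and thread library a given box is actually running before and after such an upgrade, here is a minimal sketch in Python. It assumes a glibc-based Linux where `os.confstr` exposes the `CS_GNU_LIBC_VERSION` and `CS_GNU_LIBPTHREAD_VERSION` keys; the helper names and the 2.3.4 threshold (taken from the recommendation above) are illustrative only, not anything MySQL or glibc ships.

```python
import os
import re


def parse_version(text):
    """Return the first dotted version in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no dotted version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def threading_report(libc, pthread, minimum=(2, 3, 4)):
    """Summarise a glibc version string and a thread-library string.

    `minimum` is an assumption: the glibc release recommended in the thread.
    """
    libc_version = parse_version(libc)
    return {
        "libc_version": libc_version,
        "uses_nptl": pthread.upper().startswith("NPTL"),
        "meets_minimum": libc_version >= minimum,
    }


def probe_host():
    """Query the running system; returns None where the glibc keys are absent."""
    try:
        libc = os.confstr("CS_GNU_LIBC_VERSION")
        pthread = os.confstr("CS_GNU_LIBPTHREAD_VERSION")
    except (AttributeError, ValueError, OSError):
        return None
    if not libc or not pthread:
        return None
    return threading_report(libc, pthread)


if __name__ == "__main__":
    print(probe_host())
```

Running with `LD_ASSUME_KERNEL=2.4.19` set, as described earlier in the thread, should make the thread-library string report LinuxThreads rather than NPTL on systems that still ship both.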
Re: MySQL not using optimum disk throughput.
Kevin Burton wrote:

> Greg Whalin wrote:
>
>> We are currently running 2.3.2 (Fedora Core 1) on our Opterons. When we were still running linux 2.6, we were on 2.3.3 (Fedora Core 2).
>
> Yeah... we were being bitten by 2.3.2's NPTL implementation for MONTHs before I heard a rumor that the Internet Archive moved to 2.3.4. This literally solved all my problems so I'd recommend upgrading to 2.3.4 if you notice this type of stuff again.
>
> Kevin

Curious, were you seeing deadlocks in Sun's JVM w/ Tomcat? We were forced to run Tomcat w/ NPTL off due to deadlocks under glibc 2.3.2+NPTL. Under FC2, the JVM runs fine w/ NPTL, though glibc is now 2.3.3. We have had no NPTL issues w/ the x86 version of mysql, but the x86-64 version was a definite, almost immediate deadlock (w/ 2.3.2).
Re: MySQL not using optimum disk throughput.
Greg Whalin wrote:

> Curious, were you seeing deadlocks in Suns JVM w/ Tomcat?

Never with Tomcat, but we might have a different number of threads. But it *was* with Java...

> We were forced to run Tomcat w/ NPTL off due to deadlocks under glibc 2.3.2+NPTL.

Yup.. that's the problem we had. But we have too many threads, so LinuxThreads fell down. It sounds like upgrading your glibc would fix this issue.

> Under FC2, the JVM runs fine w/ NPTL, though glibc is now 2.3.3.

I think this particular bug was fixed in 2.3.3, but there are other interesting bugs fixed in 2.3.4, so we went that route. That, and Debian has no 2.3.3 build.

> We have had no NPTL issues w/ the x86 version of mysql, but the x86-64 definite almost immediate deadlock (w/ 2.3.2).

Yeah.. these should be related. I mean, it's a race condition, so the processor or scheduler might affect it. So... this might be another Opteron issue that we've solved :)

I'd be interested in finding out if the switch fixes this issue..

Kevin
Re: MySQL not using optimum disk throughput.
On Fri, 2005-05-06 at 22:16, John David Duncan wrote:

> > And no performance diff. Note that you're benchmarks only show a 20M addition overhead. We're about 60x too slow for these drives so I'm not sure what could be going on here :-/
>
> I know of a site that encountered a similar performance issue: The OS was reading in a lot more data from the disk than the database really needed. The culprit turned out to be the stripe size on a 4-disk RAID. By reducing the stripe size from 768K to 32K, they obtained a 200% increase in mysql throughput.

Hi,

This is actually an interesting point, as we typically recommend large stripes with MySQL (RAID 10 is best). This may sound like a contradiction, but it is not. You need a large stripe size (256-1024K+) but a small RAID controller cache line (16K for InnoDB tables).

The thing is, by default many RAID controllers set cache line size = stripe size, and some may not even allow you to change it. If that is the case, MySQL will have to read a lot of unnecessary data, which will kill performance.

--
Peter Zaitsev, Senior Performance Engineer
Come to hear my talk at MySQL UC 2005 http://www.mysqluc.com/
MySQL AB, www.mysql.com
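To put rough numbers on the cache line point, here is a small back-of-the-envelope sketch in Python. It assumes the simplest possible model, which is my assumption and not something stated in the thread: every random 16K page read pulls in whole, aligned cache lines, and none of the extra data is reused later. The function names are hypothetical.

```python
import math

INNODB_PAGE_KB = 16  # page size quoted in the message above


def kb_transferred(cache_line_kb, request_kb=INNODB_PAGE_KB):
    """KB moved from disk to serve one aligned request, in whole cache lines."""
    lines = math.ceil(request_kb / cache_line_kb)
    return lines * cache_line_kb


def read_amplification(cache_line_kb, request_kb=INNODB_PAGE_KB):
    """How many times more data is read than the database asked for."""
    return kb_transferred(cache_line_kb, request_kb) / request_kb


if __name__ == "__main__":
    # 768K and 32K are the stripe sizes from the quoted anecdote,
    # treated here as cache line sizes (cache line = stripe size).
    for cache_line_kb in (768, 256, 32, 16):
        factor = read_amplification(cache_line_kb)
        print(f"cache line {cache_line_kb:>4}K -> {factor:g}x the data requested")
```

Under this model a 768K cache line reads 48 times more data than a 16K page needs, which is in line with the "lot of unnecessary data" described above; real controllers with read-ahead and caching will behave less starkly.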
Re: MySQL not using optimum disk throughput.
On Fri, 2005-05-06 at 19:01, Greg Whalin wrote:

> What drives are you using? For SCSI RAID, you definitly want deadline scheduler. That said, even after the switch to deadline, we saw our Opteron's running way slow (compared to older slower Xeons). Whatever the problem is, we fought it for quite a while (though difficult to test too much w/ production dbs) and ended up rolling back to 2.4.

One more thing to try, if you have smart RAID, would be the noop scheduler, to let the hardware do the job. The smart optimizations the OS does to reduce head movement may not make sense for RAID. In practice, however, I've seen close results.

Also, which storage engine are you using? One of the things which changed in 2.6 for some hardware configurations is fsync() performance. Under 2.4 it was faked in some cases, so it was instant. This explains, for example, why people moving from IDE devices to much faster SCSI devices may observe performance degradation (IDE with 2.4 typically has a fake fsync).

In general we have very positive feedback from using Opterons with MySQL at this point. Sometimes it takes time to make it work right, especially when they were new, but when it works, it flies. Practically the same applies to EM64T. It is very good to now have two inexpensive 64-bit platforms available.

We're getting some feedback about problems on some Fedora Core versions. Well, this is a bleeding-edge distribution, so I'm anything but surprised. SuSE, in both SLES and Professional variants, seems to work very well with Opterons, as does recent RH EL.

Speaking of MySQL problems: if you have any MySQL issues on Opterons, please report them as bugs and we'll troubleshoot them.

> Kevin Burton wrote:
>
> Kevin Burton wrote:
>
> Greg Whalin wrote:
>
> Deadline was much faster. Using sysbench:
>
> test:
> sysbench --num-threads=16 --test=fileio --file-total-size=20G --file-test-mode=rndrw run
>
> So... FYI. I rebooted with elevator=deadline as a kernel param.
>
> db2:~# cat /sys/block/sda/queue/scheduler
> noop anticipatory [deadline] cfq
>
> (which I assume means I'm now running deadline. Is there any other way to find out?)
>
> And no performance diff. Note that you're benchmarks only show a 20M addition overhead. We're about 60x too slow for these drives so I'm not sure what could be going on here :-/
>
> Kevin

--
Peter Zaitsev, Senior Performance Engineer
Come to hear my talk at MySQL UC 2005 http://www.mysqluc.com/
MySQL AB, www.mysql.com
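On the quoted question of how to tell which scheduler is active: in the sysfs file shown above, the bracketed entry is the one currently in use. A minimal Python sketch that reads it programmatically follows; the function names are my own, and the default device name `sda` is simply the one from the quoted output.

```python
import re


def available_schedulers(text):
    """All scheduler names listed in the sysfs file, brackets stripped."""
    return [name.strip("[]") for name in text.split()]


def active_scheduler(text):
    """The bracketed (currently active) scheduler, or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_scheduler_file(device="sda"):
    """Read the raw sysfs scheduler line for a block device (Linux 2.6+)."""
    with open(f"/sys/block/{device}/queue/scheduler") as handle:
        return handle.read()


if __name__ == "__main__":
    raw = read_scheduler_file("sda")
    print("available:", available_schedulers(raw))
    print("active:   ", active_scheduler(raw))
```

Fed the line from the quoted message, `active_scheduler` returns `deadline`, which matches the assumption made there.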
Re: MySQL not using optimum disk throughput.
Hi Peter,

As for reporting bugs ...
http://bugs.mysql.com/bug.php?id=7437
http://bugs.mysql.com/bug.php?id=10437

We have found Opteron w/ Mysql to be an extremely buggy platform, especially under linux 2.6, but granted, we are running Fedora. Perhaps we will try Suse, but I feel I have heard similar reports (from Friendster) about their use of Suse 2.6 and Opterons being similarly slow.

We are currently running MyIsam tables, but plan on switching to Innodb in the next month or two btw, so our performance problems are w/ MyIsam.

Greg
Re: MySQL not using optimum disk throughput.
On Fri, 6 May 2005, Kevin Burton wrote:

> For the record... on a loaded system what type of IO do you guys see? Anywhere near full disk capacity? I'm curious to see what type of IO people are seeing on a production/loaded mysql box.

Mostly Linux in this thread so far, so I figured I'd throw some FreeBSD in the mix. Our latest build, which so far has worked out great, is MySQL 4.0.24 with linuxthreads on FreeBSD 4.10-R.

1) 15k RPM SCSI in RAID-10 configuration:

Threads: 61  Questions: 192440153  Slow queries: 1600  Opens: 361204  Flush tables: 1  Open tables: 128  Queries per second avg: 199.496

   tty           da0               fd0             pass0               cpu
tin  tout   KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us  ni  sy  in  id
  0     1   0.00    0  0.00  0.00    0  0.00  0.00    0  0.00   0   1   3   0  95
  0    19  16.00    1  0.02  0.00    0  0.00  0.00    0  0.00   0   0   1   0  98
  0    19   0.00    0  0.00  0.00    0  0.00  0.00    0  0.00   0   7   9   1  84
  0    19  15.57   14  0.21  0.00    0  0.00  0.00    0  0.00   0   4   4   1  91
  0    19   0.00    0  0.00  0.00    0  0.00  0.00    0  0.00   0   0   1   0  98
  0    19  16.00    3  0.05  0.00    0  0.00  0.00    0  0.00   0   1   3   1  95
  0    19  16.00    7  0.11  0.00    0  0.00  0.00    0  0.00   0   0   3   0  97
  0    19  16.00    5  0.08  0.00    0  0.00  0.00    0  0.00   0   1   4   0  95
  0    19  16.00    1  0.02  0.00    0  0.00  0.00    0  0.00   0   1   3   0  96
  0    19  16.00    7  0.11  0.00    0  0.00  0.00    0  0.00   0   1   1   0  98

It does spike up to 15MB/s and 400tps, but that's pretty rare. For the most part it stays below .20MB/s.

2) 15k RPM SCSI (single disk, no raid)

Threads: 427  Questions: 929834784  Slow queries: 99  Opens: 421800  Flush tables: 2  Open tables: 128  Queries per second avg: 467.845

   tty           da0               fd0             pass0               cpu
tin  tout   KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us  ni  sy  in  id
  0     1   0.00    0  0.00  0.00    0  0.00  0.00    0  0.00   2   2  15   2  78
  0    38  28.00    4  0.11  0.00    0  0.00  0.00    0  0.00   2   1  15   1  81
  0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   2   1  17   2  78
  0    38   5.50    4  0.02  0.00    0  0.00  0.00    0  0.00   0   2  12   3  82
  0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   3   3  16   1  76
  0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   4   2  18   1  75
  0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   1   2  13   4  80
  0    38  22.67    3  0.07  0.00    0  0.00  0.00    0  0.00   2   3  13   2  79
  0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   3   0  18   1  77
  0    38   0.00    0  0.00  0.00    0  0.00  0.00    0  0.00   0   2  14   1  83

The last setup was barely able to handle the load with pthreads, as it spikes to 700 q/s daily, but with linuxthreads it's not even that loaded anymore. Both of these systems are dual P4s.

Atle
-
Flying Crocodile Inc, Unix Systems Administrator
Re: MySQL not using optimum disk throughput.
Atle Veka wrote:

> On Fri, 6 May 2005, Kevin Burton wrote:
>
>> For the record... no a loaded system what type of IO do you guys see? Anywhere near full disk capacity? I'm curious to see what type of IO people are seeing on a production/loaded mysql box.
>
> Mostly Linux in this thread so far, so I figured I'd throw some FreeBSD in the mix. Our latest build which so far has worked out great, is MySQL 4.0.24 with linuxthreads on FreeBSD 4.10-R.

It looks like you're saying here that a single disk is FASTER than your RAID 10 setup. Correct? Which is interesting. I'm wondering if this is a RAID config issue. It just seems to make a LOT more sense that RAID 1 or 10 would be faster than a single disk.

Kevin
Re: MySQL not using optimum disk throughput.
On Sat, 7 May 2005, Kevin Burton wrote:

> It looks like you're saying here that a single disk is FASTER than your RAID 10 setup. Correct? Which is interesting. I'm wondering if this is a RAID config issue. It just seems to make a LOT more sense that RAID 1 or 10 would be faster than a single disk.

I wouldn't say that; those were just examples of setups that we have. They handle very different queries. :) While the RAID example has fewer queries and at times higher disk IO, it handles complex SELECT queries along with INSERT/UPDATEs; 99% of the queries on the non-RAID setup are simple UPDATEs.

Atle
-
Flying Crocodile Inc, Unix Systems Administrator
Re: MySQL not using optimum disk throughput.
In the last episode (May 07), Atle Veka said:

> On Fri, 6 May 2005, Kevin Burton wrote:
>
>> For the record... no a loaded system what type of IO do you guys see? Anywhere near full disk capacity? I'm curious to see what type of IO people are seeing on a production/loaded mysql box.
>
> Mostly Linux in this thread so far, so I figured I'd throw some FreeBSD in the mix. Our latest build which so far has worked out great, is MySQL 4.0.24 with linuxthreads on FreeBSD 4.10-R.
>
> 1) 15k RPM SCSI in RAID-10 configuration:
>
> Threads: 61  Questions: 192440153  Slow queries: 1600  Opens: 361204  Flush tables: 1  Open tables: 128  Queries per second avg: 199.496
>
>    tty           da0               fd0             pass0               cpu
> tin  tout   KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us  ni  sy  in  id
>   0    19  16.00    3  0.05  0.00    0  0.00  0.00    0  0.00   0   1   3   1  95
>   0    19  16.00    7  0.11  0.00    0  0.00  0.00    0  0.00   0   0   3   0  97
>   0    19  16.00    5  0.08  0.00    0  0.00  0.00    0  0.00   0   1   4   0  95
>   0    19  16.00    1  0.02  0.00    0  0.00  0.00    0  0.00   0   1   3   0  96
>   0    19  16.00    7  0.11  0.00    0  0.00  0.00    0  0.00   0   1   1   0  98
>
> 2) 15k RPM SCSI (single disk, no raid)
>
> Threads: 427  Questions: 929834784  Slow queries: 99  Opens: 421800  Flush tables: 2  Open tables: 128  Queries per second avg: 467.845
>
>    tty           da0               fd0             pass0               cpu
> tin  tout   KB/t  tps  MB/s  KB/t  tps  MB/s  KB/t  tps  MB/s  us  ni  sy  in  id
>   0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   3   3  16   1  76
>   0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   4   2  18   1  75
>   0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   1   2  13   4  80
>   0    38  22.67    3  0.07  0.00    0  0.00  0.00    0  0.00   2   3  13   2  79
>   0    38  64.00    1  0.06  0.00    0  0.00  0.00    0  0.00   3   0  18   1  77
>   0    38   0.00    0  0.00  0.00    0  0.00  0.00    0  0.00   0   2  14   1  83

My guess is that your tables all fit in RAM, as you have basically zero disk accesses. The original poster was seeing heavy disk I/O.

--
Dan Nelson
[EMAIL PROTECTED]
RE: MySQL not using optimum disk throughput.
What kernel are you running? If you're running 2.6.x, use the deadline scheduler or downgrade to 2.4.23aavm. 2.6.[0-9] has major problems with the IO scheduler since the process scheduler is very fast now.

DVP
Dathan Vance Pattishall http://www.friendster.com

-----Original Message-----
From: Kevin Burton [mailto:[EMAIL PROTECTED]
Sent: Friday, May 06, 2005 1:58 PM
To: mysql@lists.mysql.com
Subject: MySQL not using optimum disk throughput.

We have a few DBs which aren't using disk IO to optimum capacity. They're running at a load of 1.5 or so with a high workload of pending queries. When I do iostat I'm not noticing much IO:

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s   rkB/s   wkB/s  avgrq-sz  avgqu-sz   await  svctm  %util
sda        0.00   13.73  128.43  252.94  1027.45  1695.10  513.73  847.55      7.14     90.13  285.00   2.53  96.57

This is only seeing about 500k - 1M per second throughput. When I run bonnie++ on these drives they're showing 20M-40M throughput. Which is really strange.

Most of our queries are single INSERTS/DELETES. I could probably rewrite these to become batch operations, but I think I'd still end up seeing the above iostat results, just with higher throughput, so I'd like to get to the bottom of this before moving forward.

I ran OPTIMIZE TABLE on all tables but nothing. The boxes aren't paging. They're running on a RAID5 disk on XFS. Could it be that the disks are having to do a number of HEAD seeks since we have large tables?

Kevin A. Burton
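The iostat line in the original post can be unpacked with a little arithmetic. The sketch below, in Python, uses only the figures quoted above; the function name is hypothetical, 512-byte sectors are assumed, and the utilisation estimate relies on the usual relationship between request rate and `svctm` as reported by iostat of that era.

```python
SECTOR_BYTES = 512  # assumption: iostat sectors are 512 bytes


def summarize(r_per_s, w_per_s, rkb_per_s, wkb_per_s, svctm_ms):
    """Derive request size, throughput and estimated utilisation from one iostat row."""
    requests = r_per_s + w_per_s
    kb = rkb_per_s + wkb_per_s
    avg_kb = kb / requests
    return {
        "requests_per_s": requests,
        "mb_per_s": kb / 1024,
        "avg_request_kb": avg_kb,
        "avg_request_sectors": avg_kb * 1024 / SECTOR_BYTES,
        # busy time per second = requests * service time
        "estimated_util_pct": requests * svctm_ms / 1000 * 100,
    }


# Values copied from the sda row above.
SDA = summarize(r_per_s=128.43, w_per_s=252.94,
                rkb_per_s=513.73, wkb_per_s=847.55, svctm_ms=2.53)

if __name__ == "__main__":
    for key, value in SDA.items():
        print(f"{key:>22}: {value:.2f}")
```

The derived request size comes out at about 7.14 sectors (roughly 3.6 KB), matching the `avgrq-sz` column, and the estimated utilisation lands close to the reported 96.57%. In other words, the device is nearly saturated by about 380 small requests per second while moving only around 1.3 MB/s, which is consistent with the seek-bound explanation suggested at the end of the post rather than with a sequential-throughput limit like the one bonnie++ measures.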
Re: MySQL not using optimum disk throughput.
We have seen the exact same thing here. We used the deadline scheduler and saw an immediate improvement. However, we still saw much worse performance on our Opteron's (compared to our older Xeon boxes). We ended up rolling back to Fedora Core 1 2.4.22-1.2199.nptlsmp kernel and shut down NPTL and now our Opteron's are much much faster than our Xeons.

The thing I find strange about this is that our experience (@ Meetup) seems to match that of Friendsters (I know of a few other high traffic sites that have mentioned similar issues), in that Mysql on Opteron and Linux 2.6 is not a good solution. Yet, Mysql recommends exactly this config and in fact, does not seem to even support (via support contract) a 2.4 solution for Opteron + Mysql.

Greg
Re: MySQL not using optimum disk throughput.
Greg Whalin wrote:

> We have seen the exact same thing here. We used the deadline scheduler and saw an immediate improvement. However, we still saw much worse performance on our Opteron's (compared to our older Xeon boxes). We ended up rolling back to Fedora Core 1 2.4.22-1.2199.nptlsmp kernel and shut down NPTL and now our Opteron's are much much faster than our Xeons.

Sweet... I'm going to take a look at that! Two votes for the deadline scheduler. I'm an NPTL fan, but I'm not sure our DB boxes need it, as they don't use THAT many threads.

> The thing I find strange about this is that our experience (@ Meetup) seems to match that of Friendsters (I know of a few other high traffic sites that have mentioned similar issues), in that Mysql on Opteron and Linux 2.6 is not a good solution. Yet, Mysql recommends exactly this config and in fact, does not seem to even support (via support contract) a 2.4 solution for Opteron + Mysql.

Wow... what's the consensus on Opteron here then? It seems to be a clear winner since you can give the mysql process more memory for caching. Is it an OS issue, since few of the distributions seem to support Opteron (well)?

Kevin
Re: MySQL not using optimum disk throughput.
Kevin Burton wrote:

> Greg Whalin wrote:
>
>> We have seen the exact same thing here. We used the deadline scheduler and saw an immediate improvement. However, we still saw much worse performance on our Opteron's (compared to our older Xeon boxes). We ended up rolling back to Fedora Core 1 2.4.22-1.2199.nptlsmp kernel and shut down NPTL and now our Opteron's are much much faster than our Xeons.
>
> Sweet... I'm going to take a look at that! Two votes for the deadline scheduler. Though I'm an NPTL fan but I'm not sure our DB boxes need this as they don't use THAT many threads.

Deadline was much faster. Using sysbench:

test:
sysbench --num-threads=16 --test=fileio --file-total-size=20G --file-test-mode=rndrw run

results:
2.6.10-1.14_FC2smp on dual Opteron 248s w/ 4GB RAM

default scheduler (anticipatory):

Operations performed: 6004 Read, 3996 Write, 12800 Other = 22800 Total
Read 93.812Mb Written 62.438Mb Total transferred 156.25Mb (2.9186Mb/sec)
186.79 Requests/sec executed

Test execution summary:
    total time: 53.5363s
    total number of events: 10000
    total time taken by event execution: 376.0398
    per-request statistics:
        min: 0.0000s
        avg: 0.0376s
        max: 18446744073709.4961s
        approx. 95 percentile: 0.1106s

Threads fairness:
    distribution: 70.15/87.92
    execution: 88.48/93.88

deadline scheduler:

Operations performed: 6006 Read, 3994 Write, 12800 Other = 22800 Total
Read 93.844Mb Written 62.406Mb Total transferred 156.25Mb (4.4464Mb/sec)
284.57 Requests/sec executed

Test execution summary:
    total time: 35.1411s
    total number of events: 10000
    total time taken by event execution: 289.2953
    per-request statistics:
        min: 0.0000s
        avg: 0.0289s
        max: 0.3520s
        approx. 95 percentile: 0.0870s

Threads fairness:
    distribution: 84.92/92.89
    execution: 90.52/96.58

The 2.4 scheduler showed similar results to deadline under 2.6.
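To quantify "much faster", the two sysbench runs above can be compared directly. This is a small Python sketch using only the figures reported in the results; the dictionary layout and function name are my own.

```python
# Figures copied from the two sysbench fileio runs above.
RUNS = {
    "anticipatory": {
        "mb_per_s": 2.9186,        # "Mb/sec" as printed by sysbench
        "requests_per_s": 186.79,
        "total_time_s": 53.5363,
    },
    "deadline": {
        "mb_per_s": 4.4464,
        "requests_per_s": 284.57,
        "total_time_s": 35.1411,
    },
}


def speedup(baseline, candidate):
    """Ratios greater than 1.0 mean the candidate run was faster."""
    return {
        "throughput": candidate["mb_per_s"] / baseline["mb_per_s"],
        "requests": candidate["requests_per_s"] / baseline["requests_per_s"],
        "wall_time": baseline["total_time_s"] / candidate["total_time_s"],
    }


if __name__ == "__main__":
    for metric, ratio in speedup(RUNS["anticipatory"], RUNS["deadline"]).items():
        print(f"{metric:>10}: {ratio:.2f}x")
```

All three ratios agree at roughly 1.5x, so on this random read/write test the deadline scheduler gave about a 50% improvement over anticipatory. That is a clear gain, though nowhere near enough on its own to explain the 2-3x gap against the Xeons described in the same message.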
The thing I find strange about this is that our experience (@ Meetup) seems to match that of Friendsters (I know of a few other high traffic sites that have mentioned similar issues), in that Mysql on Opteron and Linux 2.6 is not a good solution. Yet, Mysql recommends exactly this config and in fact, does not seem to even support (via support contract) a 2.4 solution for Opteron + Mysql. Wow... whats the consensus on Opteron here then? It seems to be a clear winner since you can give the mysql process more memory for caching. Is it an OS issue since few of the distributions seem to support Opteron (well). I suspect this is an OS issue. Our Opteron's were completing large data update queries aprox 2-3 times slower than our Xeons when running under 2.6. After a switch to 2.4, Opteron's are faster than the Xeons. I mentioned NPTL being shut off (LD_ASSUME_KERNEL=2.4.19 in init script). When we left NPTL running, we saw almost instant deadlocks just watching replication catching up (no other site traffic directed to the machine). This is in 2.4 btw, so this is the backported NPTL kernels from Fedora. I somewhat suspect NPTL being a problem in 2.6 as well due to impressions I get from sifting through mysql's bug tracking system. The IO scheduler was also an obvious culprit. Other issues I have noticed w/ Opteron ver of mysql ... - Under 2.6, if we took the db offline and ran myisamchk on a table w/ fulltext indexes, and then started back up again, the table would nearly instantly crash (upon first writes to it). Running repair table would seg fault. Shutting down to run myisamchk would only cause the table to crash again upon 1st write. Only solution ... alter table tablename engine=myisam; Then the table would run fine. We have since dropped all fulltext indexes and moved to Lucene (much more flexible and way faster anyhow). - Under 2.4 (just happened to me tonight and this is a scary one), we routinely archive and cleanup large tables w/ seldom used old data. 
After doing a DELETE FROM table WHERE ctime '2005-05-01', we would see a select count(*) show around 160k rows remaining (from the 1st of the month). I would call repair table on the table, and the remaining rows would be deleted. Repair would make mention of dropping the row count from 165k to 0. Yikes! This happened on both Opterons and did not happen on the Xeons (thank god ... was able to save the data).

At any rate, I am 100% confident in saying that MySQL (w/ the myisam table engine ... not tried innodb yet) on linux on Opterons is not yet stable or speedy, though we usually only see problems under large data cleanups (moving, deleting, repairing, etc).

Greg
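As a rough check on the two sysbench runs quoted in this message, the figures can be compared directly. This is only a sketch: the numbers are copied from the post, while the names `runs`, `rps_gain` and `time_ratio` are made up here for illustration.

```python
# Figures copied from the two sysbench fileio runs quoted in the post.
runs = {
    "anticipatory": {"requests_per_sec": 186.79, "total_time_s": 53.5363},
    "deadline": {"requests_per_sec": 284.57, "total_time_s": 35.1411},
}

# Ratio of request rates: how many times more requests/sec deadline executed.
rps_gain = (
    runs["deadline"]["requests_per_sec"]
    / runs["anticipatory"]["requests_per_sec"]
)

# Ratio of wall-clock times: deadline's run time as a fraction of anticipatory's.
time_ratio = (
    runs["deadline"]["total_time_s"] / runs["anticipatory"]["total_time_s"]
)

print(f"deadline requests/sec: {rps_gain:.2f}x anticipatory")
print(f"deadline wall time: {time_ratio:.0%} of anticipatory")
```

Both ratios describe the same synthetic random read/write test, so they say nothing by themselves about the slower Opteron query times reported under 2.6.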
Re: MySQL not using optimum disk throughput.
Greg Whalin wrote:

Deadline was much faster. Using sysbench:

test: sysbench --num-threads=16 --test=fileio --file-total-size=20G --file-test-mode=rndrw run

Wow... what version of sysbench are you running? It's giving me strange errors:

  sysbench v0.3.4: multi-threaded system evaluation benchmark

  Running the test with following options:
  Number of threads: 16

  Extra file open flags: 0
  128 files, 160Mb each
  20Gb total file size
  Block size 16Kb
  Number of random requests for random IO: 1
  Read/Write ratio for combined random IO test: 1.50
  Periodic FSYNC enabled, calling fsync() each 100 requests.
  Calling fsync() at the end of test, Enabled.
  Using synchronous I/O mode
  Doing random r/w test
  Threads started!
  FATAL: Failed to read file! file: 90 pos: 14761984 errno = 0 (Success)
  FATAL: Failed to read file! file: 103 pos: 161398784 errno = 0 (Success)
  FATAL: Failed to read file! file: 75 pos: 79413248 errno = 0 (Success)
  FATAL: Failed to read file! file: 79 pos: 67207168 errno = 0 (Success)
  FATAL: Failed to read file! file: 108 pos: 64028672 errno = 0 (Success)
  FATAL: Failed to read file! file: 53 pos: 96157696 errno = 0 (Success)
  FATAL: Failed to read file! file: 88 pos: 137068544 errno = 0 (Success)

--
Use Rojo (RSS/Atom aggregator)! - visit http://rojo.com. See irc.freenode.net #rojo if you want to chat.
Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
Kevin A. Burton, Location - San Francisco, CA
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
Re: MySQL not using optimum disk throughput.
Kevin Burton wrote:

Greg Whalin wrote:

Deadline was much faster. Using sysbench:

test: sysbench --num-threads=16 --test=fileio --file-total-size=20G --file-test-mode=rndrw run

So... FYI. I rebooted with elevator=deadline as a kernel param.

  db2:~# cat /sys/block/sda/queue/scheduler
  noop anticipatory [deadline] cfq

(which I assume means I'm now running deadline. Is there any other way to find out?)

And no performance diff. Note that your benchmarks only show a 20M additional overhead. We're about 60x too slow for these drives, so I'm not sure what could be going on here :-/

Kevin
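On the question of how to confirm the active scheduler: the kernel marks the scheduler in use with square brackets in that sysfs file, so the output above does indicate deadline. A minimal sketch of reading it programmatically follows. The helper names `active_scheduler` and `read_active_scheduler` are made up for illustration, and the device name `sda` is simply the one from the post.

```python
import re
from pathlib import Path


def active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_active_scheduler(device="sda"):
    """Read the sysfs scheduler file for a block device (Linux only)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())


# The line from the post, used here so the example runs anywhere.
sample = "noop anticipatory [deadline] cfq"
print(active_scheduler(sample))  # prints "deadline"
```

Calling `read_active_scheduler("sda")` on the box itself would read the live value. The parsing is kept separate so it can be checked without access to /sys.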
Re: MySQL not using optimum disk throughput.
What drives are you using? For SCSI RAID, you definitely want the deadline scheduler. That said, even after the switch to deadline, we saw our Opterons running way slow (compared to older, slower Xeons). Whatever the problem is, we fought it for quite a while (though it is difficult to test too much w/ production dbs) and ended up rolling back to 2.4.
Re: MySQL not using optimum disk throughput.
Greg Whalin wrote:

What drives are you using? For SCSI RAID, you definitely want the deadline scheduler. That said, even after the switch to deadline, we saw our Opterons running way slow (compared to older, slower Xeons). Whatever the problem is, we fought it for quite a while (though it is difficult to test too much w/ production dbs) and ended up rolling back to 2.4.

Ug.. I don't want to roll back to 2.4... 2.6 has so many nice features we depend on.

We're using SCSI RAID5 on XEON of course.

I think it's time to rule out some things. I'm going to migrate to RAID1... just to verify... then try reviewing our kernel options.. maybe disabling NPTL... maybe try another filesystem...

Not fun.

For the record... on a loaded system, what type of IO do you guys see? Anywhere near full disk capacity? I'm curious to see what type of IO people are seeing on a production/loaded mysql box.

Kevin
Re: MySQL not using optimum disk throughput.
In the last episode (May 06), Kevin Burton said:

We have a few DBs which aren't using disk IO to optimum capacity. They're running at a load of 1.5 or so with a high workload of pending queries. When I do iostat I'm not noticing much IO:

  Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s   rkB/s   wkB/s  avgrq-sz  avgqu-sz   await  svctm  %util
  sda        0.00   13.73  128.43  252.94  1027.45  1695.10  513.73  847.55      7.14     90.13  285.00   2.53  96.57

This is only seeing about 500k - 1M per second throughput. When I run bonnie++ on these drives they're showing 20M-40M throughput. They're running on a RAID5 disk on XFS.

An OLTP database is not a system that requires throughput. It requires lots of random access. MB/sec doesn't matter a bit. Instead, take a look at the r/s and w/s columns. You're doing ~380 IOs/sec, which sounds like maybe a 3-disk set? Each disk you add to the set should give you another 120 or so IOs per second. When you max out the number of drives in your case, you will realize why drive manufacturers sell 15K rpm disks: an array of 15k drives will give you double the transaction rate (250 IO/s instead of 120) of the same number of 7200 rpm drives :)

--
Dan Nelson
[EMAIL PROTECTED]
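The arithmetic behind that estimate can be sketched from the iostat line quoted above. The per-disk figure of 120 IOs/sec is the rule of thumb given in the reply, not a measured value, and the variable names are made up for illustration.

```python
# Values copied from the iostat line for sda quoted in the post.
reads_per_sec = 128.43
writes_per_sec = 252.94
read_kb_per_sec = 513.73
write_kb_per_sec = 847.55

# Rule of thumb from the reply: roughly 120 random IOs/sec per 7200 rpm disk.
IOPS_PER_DISK = 120

total_iops = reads_per_sec + writes_per_sec
total_kb_per_sec = read_kb_per_sec + write_kb_per_sec

# Average size of one request: small requests mean low MB/sec even when busy.
kb_per_io = total_kb_per_sec / total_iops

# How many such disks it would take to deliver the observed IO rate.
disks_equiv = total_iops / IOPS_PER_DISK

print(f"total IOs/sec: {total_iops:.2f}")
print(f"average request size: {kb_per_io:.2f} kB")
print(f"equivalent disks at {IOPS_PER_DISK} IOs/sec each: {disks_equiv:.1f}")
```

The average request size agrees with the avgrq-sz column (7.14 sectors of 512 bytes), which is why the drive can sit at 96.57 %util while moving only about 1.3 MB/sec.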
Re: MySQL not using optimum disk throughput.
And no performance diff. Note that your benchmarks only show a 20M additional overhead. We're about 60x too slow for these drives, so I'm not sure what could be going on here :-/

I know of a site that encountered a similar performance issue: the OS was reading in a lot more data from the disk than the database really needed. The culprit turned out to be the stripe size on a 4-disk RAID. By reducing the stripe size from 768K to 32K, they obtained a 200% increase in mysql throughput.

- JD