On Mon, 2009-06-29 at 16:32 +0100, John Haxby wrote:
> That's a fairly busy system but the iostat output doesn't look to me
> like something that's I/O bound: the average wait times and queue size
> just don't look like something that's in trouble or even working all
> that hard.
>
> Am I missing something here?

I think what you're missing is the IO being spread across multiple
paths. It's difficult to read with dm-12, dm-13, and dm-14 mixed
together; however, if you separate them you'll see a pattern like this:

dm-12  0.00  0.00  350.00   880.00  56176.00   7040.00   51.40   7.16   5.82  0.69  85.40
dm-12  0.00  0.00  582.00   108.00  89688.00    864.00  131.23   2.54   3.67  0.98  67.80
dm-12  0.00  0.00  368.32   645.54  56839.60   5164.36   61.16   5.17   5.10  0.85  85.94
dm-12  0.00  0.00  402.00  1520.00  59472.00  12160.00   37.27   9.22   4.80  0.50  95.20
dm-12  0.00  0.00  387.00   100.00  61496.00    800.00  127.92   3.73   7.61  1.54  75.00
dm-12  0.00  0.00  486.00   444.00  75848.00   3552.00   85.38   4.64   4.93  0.82  76.10
dm-12  0.00  0.00  373.00  1488.00  57416.00  11904.00   37.25  11.70   6.32  0.51  95.60
dm-12  0.00  0.00  408.00   185.00  61504.00   1480.00  106.21   2.88   4.86  1.26  74.60

After that, dm-12 will be quiet, but then dm-13 wakes up:

dm-13  0.00  0.00  288.00     0.00  41264.00      0.00  143.28   2.04   6.94  2.61  75.20
dm-13  0.00  0.00  468.00    56.00  68424.00    448.00  131.44   2.00   3.85  1.40  73.40
dm-13  0.00  0.00  526.00   292.00  77128.00   2336.00   97.14   5.61   6.88  0.89  72.50
dm-13  0.00  0.00  514.00   216.00  73808.00   1728.00  103.47   5.87   7.88  0.96  70.00
dm-13  0.00  0.00  548.51    83.17  76879.21    665.35  122.76   2.87   4.71  1.09  69.11
dm-13  0.00  0.00  508.00   136.00  72520.00   1088.00  114.30   6.08   9.43  1.07  68.60
dm-13  0.00  0.00  300.00     3.00  44840.00     24.00  148.07   4.09  13.53  2.76  83.50
dm-13  0.00  0.00  392.00   172.00  55608.00   1376.00  101.04   8.69  14.97  1.41  79.40

And finally, dm-14:

dm-14  0.00  0.00  427.00     4.00  63816.00     32.00  148.14   1.29   2.99  1.08  46.50
dm-14  0.00  0.00  510.00     0.00  73032.00      0.00  143.20   1.91   3.75  1.45  73.90
dm-14  0.00  0.00  171.00     0.00  24600.00      0.00  143.86   2.15  12.59  5.48  93.70
dm-14  0.00  0.00  459.00     0.00  66368.00      0.00  144.59   2.96   6.46  1.67  76.80
dm-14  0.00  0.00  549.50     0.00  77132.67      0.00  140.37   2.46   4.47  1.27  69.60
dm-14  0.00  0.00  570.00     3.00  82000.00     24.00  143.15   2.26   3.94  1.23  70.70
dm-14  0.00  0.00  404.00     0.00  58680.00      0.00  145.25   1.82   4.51  2.02  81.50
dm-14  0.00  0.00  459.00     0.00  66368.00      0.00  144.59   2.44   5.32  1.64  75.40

Then it wraps around to dm-12 and the pattern continues for about 90
seconds. I'm assuming that's the 90 seconds of the full table scan.
He's pretty IO bound during that part.

He could set the multipath rr_min_io parameter lower to more evenly
balance the IO across the paths, but I think he's already pretty much
maxing out the IOPS his array has, based on the graph his SAN admin
provided.

In the end, the RHEL5 box appears to be doing what it can with the IOPS
available.

Later,
Tom
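
P.S. For anyone who wants to do the per-device separation without sorting rows by hand, here is a rough Python sketch. It assumes the rows come from `iostat -x` with the usual RHEL5 column order (rrqm/s, wrqm/s, r/s, w/s, rsec/s, wsec/s, avgrq-sz, avgqu-sz, await, svctm, %util); that column order is my assumption, so check it against your own header line. The names `split_by_device` and `summarize` are just illustrative, and the sample data is the first few rows from above.

```python
from collections import defaultdict

# Assumed column order of `iostat -x` device rows (verify against your header).
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]

SAMPLE = """\
dm-12 0.00 0.00 350.00 880.00 56176.00 7040.00 51.40 7.16 5.82 0.69 85.40
dm-12 0.00 0.00 582.00 108.00 89688.00 864.00 131.23 2.54 3.67 0.98 67.80
dm-13 0.00 0.00 288.00 0.00 41264.00 0.00 143.28 2.04 6.94 2.61 75.20
dm-14 0.00 0.00 427.00 4.00 63816.00 32.00 148.14 1.29 2.99 1.08 46.50
"""


def split_by_device(lines):
    """Group iostat device rows by device name, preserving sample order."""
    by_dev = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue  # blank lines, CPU stats, anything with a different shape
        try:
            values = [float(f) for f in fields[1:]]
        except ValueError:
            continue  # header line ("Device: rrqm/s ...") is not numeric
        by_dev[fields[0]].append(dict(zip(COLUMNS, values)))
    return dict(by_dev)


def summarize(rows):
    """Average IOPS (r/s + w/s), await, and %util over one device's samples."""
    n = len(rows)
    return {
        "samples": n,
        "iops": sum(r["r/s"] + r["w/s"] for r in rows) / n,
        "await": sum(r["await"] for r in rows) / n,
        "util": sum(r["%util"] for r in rows) / n,
    }


if __name__ == "__main__":
    for dev, rows in sorted(split_by_device(SAMPLE.splitlines()).items()):
        s = summarize(rows)
        print(f"{dev}: n={s['samples']} iops={s['iops']:.1f} "
              f"await={s['await']:.2f} util={s['util']:.1f}")
```

Feed it the full capture (for example `open("iostat.log")`, a hypothetical file name) instead of `SAMPLE` to get per-path averages. Note that averaging hides the burst pattern, so it is still worth eyeballing the grouped rows in order.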
