Hi,

On Thu, 2011-02-24 at 12:56 +0000, Alan Brown wrote:
> Steven Whitehouse wrote:
>
> > That doesn't sound like it is related to a DLM issue. 150 entries is not
> > a lot.
>
> It isn't, but when the machine's being hammered by requests in other
> filesystems, things can get very slow, very quickly.
>
Depending on the exact mix of I/O, that is expected behaviour. That is
why it is so important to look at what can be done at the application
layer to mitigate such problems.

> > What do you mean by "access" in this case? Just looking up a
> > single file in the directory, or create/delete files or an ls -l
> > (implying stats to each file) or what exactly?
>
> ls -l and creation/deletion.
>
As soon as you mix creation/deletion on one node with accesses (of
whatever kind) from other nodes, you run this risk. Obviously you
wouldn't be using a cluster filesystem if you didn't intend to have this
kind of access from time to time, but anything that can be done at the
application level to help improve locality will pay big dividends
compared with any tuning that can be done at the fs/dlm level.

At first sight at least, this does not appear to be a dlm related
problem, so we need to be careful not to confuse two different issues,
even if the symptoms may appear the same. Thanks for the iostat reports,
I'll have a more detailed look and get back to you,

Steve.

>
> > Again, figuring out the exact workload should help us get to the bottom
> > of what is going on here. How are you measuring the delays reported
> > above? Is it the syscall service time, for example?
>
> iostats.
>
> This is a typical example.
> Note the huge difference between await (total
> time in ms) and svctm (scsi command response time in ms)
>
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.62    0.00    2.12   27.03    0.00   70.22
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await    svctm    %util
> dm-6         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-7         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-8         0.00     0.00    16.00     1.00     8.00     0.50     1.00     0.20    11.47    11.47    19.50
> dm-9         0.00     0.00    45.00    22.00   180.00    88.00     8.00    13.45   351.07     6.19    41.50
> dm-10        0.00     0.00    61.00    30.00   244.00   120.00     8.00    22.19   307.56    10.19    92.70
> dm-11        0.00     0.00    53.00     5.00   212.00    20.00     8.00   128.12  4332.69    17.26   100.10
> dm-12        0.00     0.00    54.00     0.00   216.00     0.00     8.00   100.28  2951.09    18.54   100.10
> dm-13        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-14        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
>
> For comparison here's a FC-attached box running Ext4 (on a slower set of
> arrays)
>
>
> Device:    rrqm/s   wrqm/s      r/s      w/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await    svctm    %util
> dm-2         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-3         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-4         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-5         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-6         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-7         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-8         0.00     0.00   247.00     0.00   988.00     0.00     8.00    19.17    92.43     3.28    81.10
> dm-9         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-10        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-11        0.00     0.00    51.00    14.00 21548.00    56.00   664.74     1.93    29.91     4.14    26.90
> dm-12        0.00     0.00    54.00   424.00 25800.00  1696.00   115.05     6.07    12.69     0.60    28.50
> dm-13        0.00     0.00    55.00     0.00 23928.00     0.00   870.11     2.50    45.49     5.38    29.60
> dm-14        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
> dm-15        0.00     0.00   247.00     0.00   988.00     0.00     8.00    19.17    92.44     3.29    81.20
> dm-16        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
>
>
> --
> Linux-cluster mailing list
> Linux-cluster@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
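
As a rough way to put numbers on the await/svctm gap in the iostat
output quoted above, here is a minimal Python sketch. It assumes the
eleven-column extended device layout shown in this thread, and it
treats await minus svctm as an approximation of the time a request
spends queued before the device services it. The function names are
hypothetical and not part of iostat or any other tool; the two sample
rows are copied from the first report above.

```python
# Column order assumed from the iostat header quoted in this thread.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rkB/s", "wkB/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]


def parse_row(line):
    """Split one device row into (device_name, {column: value})."""
    fields = line.split()
    if len(fields) != len(COLUMNS) + 1:
        raise ValueError("unexpected column count: %r" % line)
    values = dict(zip(COLUMNS, map(float, fields[1:])))
    return fields[0], values


def queue_wait_ms(stats):
    """Approximate per-request queueing time: await minus svctm (ms)."""
    return stats["await"] - stats["svctm"]


def summarize(text):
    """Map each device in the given rows to its approximate queue wait."""
    result = {}
    for line in text.splitlines():
        if line.strip():
            device, stats = parse_row(line)
            result[device] = round(queue_wait_ms(stats), 2)
    return result


# Two rows copied from the first iostat report in this thread.
SAMPLE = """\
dm-9  0.00 0.00 45.00 22.00 180.00 88.00 8.00 13.45 351.07 6.19 41.50
dm-11 0.00 0.00 53.00 5.00 212.00 20.00 8.00 128.12 4332.69 17.26 100.10
"""

if __name__ == "__main__":
    for device, wait in summarize(SAMPLE).items():
        print("%-6s queued for roughly %.2f ms per request" % (device, wait))
```

A large queue wait next to a small svctm suggests requests are mostly
waiting rather than being serviced, but on its own it does not say
where that waiting comes from.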