Hello Jason,

please note that it is also possible to enable quotas using lctl, and that would not be visible using tunefs.lustre. I think the only real option to check whether quotas are enabled is to check whether the quota files exist. For an online filesystem, 'debugfs -c /dev/device' is probably the safest way (there is also a 'secret' way to bind mount the underlying ldiskfs to another directory, but I only use that for test filesystems and never in production, as I have not verified the kernel code path yet).
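The debugfs check could be scripted roughly as follows. This is only a sketch under assumptions: the helper names are made up for illustration, the quota files are assumed to sit in the root directory of the ldiskfs filesystem with names starting with "lquota", and debugfs is called with -c (catastrophic mode, which also opens the device read-only) and -R (run a single request).

```python
import subprocess
from typing import List

# Assumption: quota file names start with "lquota" (e.g. lquota_v2.user,
# lquota_v2.group). Other quota format versions may use different names.
QUOTA_PREFIX = "lquota"


def find_quota_files(listing: str) -> List[str]:
    """Return the sorted, de-duplicated lquota* names found in a directory listing.

    Works on any whitespace-separated listing, e.g. the output of
    'ls -l' on a mounted ldiskfs or of the debugfs 'ls -l' request.
    """
    names = set()
    for line in listing.splitlines():
        for token in line.split():
            if token.startswith(QUOTA_PREFIX):
                names.add(token)
    return sorted(names)


def list_root_with_debugfs(device: str) -> str:
    """List the root directory of `device` via debugfs without mounting it.

    -c opens the filesystem in catastrophic mode (read-only),
    -R executes one debugfs request and exits.
    """
    result = subprocess.run(
        ["debugfs", "-c", "-R", "ls -l /", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical usage on an OSS (device path is just an example):
#   found = find_quota_files(list_root_with_debugfs("/dev/mapper/ost_demofs_2"))
#   print("quota files:", ", ".join(found) if found else "none found")
```

An empty result would suggest that quotas were never enabled on that target, but I would still confirm it by hand on the OSTs that reported the slow messages.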
Either way, you should check for lquota files, such as

r...@rhel5-nfs@phys-oss0:~# mount -t ldiskfs /dev/mapper/ost_demofs_2 /mnt
r...@rhel5-nfs@phys-oss0:~# ll /mnt
[...]
-rw-r--r-- 1 root root 7168 Oct 23 09:48 lquota_v2.group
-rw-r--r-- 1 root root 71680 Oct 23 09:48 lquota_v2.user

(Of course, you should check that on those OSTs which have reported the slow quota messages.)

I just poked around a bit in the code, and above the fsfilt_check_slow() check there is also a loop that calls filter_range_is_mapped(). This function calls fs_bmap(), and when that eventually goes down to ext3, it might get a bit slow if another thread modifies that file (check out linux/fs/inode.c):

/*
 * bmap() is special. It gets used by applications such as lilo and by
 * the swapper to find the on-disk block of a specific piece of data.
 *
 * Naturally, this is dangerous if the block concerned is still in the
 * journal. If somebody makes a swapfile on an ext3 data-journaling
 * filesystem and enables swap, then they may get a nasty shock when the
 * data getting swapped to that swapfile suddenly gets overwritten by
 * the original zero's written out previously to the journal and
 * awaiting writeback in the kernel's buffer cache.
 *
 * So, if we see any bmap calls here on a modified, data-journaled file,
 * take extra steps to flush any blocks which might be in the cache.
 */

I don't know, though, whether it can happen that several threads write to the same file. But if it happens, it gets slow. I wonder if a possible swap file is worth the effort here... In fact, the reason for calling filter_range_is_mapped() certainly does not require a journal flush in that loop. I will check myself next week whether journal flushes are ever made due to that, and open a Lustre bugzilla then. Avoiding all of that should not be difficult.

Cheers,
Bernd

On Saturday, October 23, 2010, Jason Hill wrote:
> Kevin/Dave/(and Dave from DDN):
>
> Thanks for your replies.
> From tunefs.lustre --dryrun it is very apparent
> that we are not running quotas.
>
> Thanks for your assistance.
>
> > That message, from lustre/obdfilter/filter_io_26.c, is the result of the
> > thread taking 35 second
> > from when it entered filter_commitrw_write() until after it called
> > lquota_chkquota() to check the quota.
> >
> > However, it is certainly plausible that the thread was delayed because
> > of something other than quotas,
> > such as an allocation (eg, it could have been stuck in filter_iobuf_get).
> >
> > Kevin
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

--
Bernd Schubert
DataDirect Networks