Re: [Lustre-discuss] 1.8 quotas

Bernd Schubert Sat, 23 Oct 2010 01:40:06 -0700

Hello Jason,

please note that it is also possible to enable quotas using lctl, and that 
would not be visible using tunefs.lustre. I think the only real way to 
check whether quotas are enabled is to check whether the quota files exist. 
For an online filesystem 'debugfs -c /dev/device' is probably the safest way 
(there is also a 'secret' way to bind mount the underlying ldiskfs to another 
directory, but I only use that for test filesystems and never in production, 
as I have not verified the kernel code path yet).

Either way, you should check for lquota files, such as

r...@rhel5-nfs@phys-oss0:~# mount -t ldiskfs /dev/mapper/ost_demofs_2 /mnt

r...@rhel5-nfs@phys-oss0:~# ll /mnt
[...]
-rw-r--r-- 1 root root  7168 Oct 23 09:48 lquota_v2.group
-rw-r--r-- 1 root root 71680 Oct 23 09:48 lquota_v2.user

(Of course, you should check this on those OSTs that have reported the slow 
quota messages.)
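
For an online OST, the same check can be sketched with debugfs instead of a 
mount. This is only a rough sketch: the helper names are made up for 
illustration, the device path is the one from the example above, and it 
assumes the quota files sit in the root directory of the ldiskfs filesystem, 
as in the listing.

```shell
# Hypothetical helper: succeeds if the listing read from stdin names an
# lquota user or group file (e.g. lquota_v2.user, as in the listing above).
has_lquota_files() {
    grep -Eq 'lquota[^ ]*\.(user|group)'
}

# Hypothetical wrapper: list the root directory of the given device without
# mounting it. 'debugfs -c' opens the device in catastrophic mode, which
# implies read-only; '-R' runs a single request and exits.
check_ost_quota() {
    debugfs -c -R 'ls -l /' "$1" 2>/dev/null | has_lquota_files
}
```

Usage would be something like 
'check_ost_quota /dev/mapper/ost_demofs_2 && echo "quota files present"'.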

I just poked around a bit in the code, and above the fsfilt_check_slow() check 
there is also a loop that calls filter_range_is_mapped(). This function 
calls fs_bmap(), and when that eventually goes down to ext3 it might get a 
bit slow if another thread is modifying that file (check out 
linux/fs/ext3/inode.c):

/* 
 * bmap() is special.  It gets used by applications such as lilo and by
 * the swapper to find the on-disk block of a specific piece of data.
 *
 * Naturally, this is dangerous if the block concerned is still in the
 * journal.  If somebody makes a swapfile on an ext3 data-journaling
 * filesystem and enables swap, then they may get a nasty shock when the
 * data getting swapped to that swapfile suddenly gets overwritten by
 * the original zero's written out previously to the journal and
 * awaiting writeback in the kernel's buffer cache. 
 *
 * So, if we see any bmap calls here on a modified, data-journaled file,
 * take extra steps to flush any blocks which might be in the cache. 
 */

I don't know, though, whether several threads can write to the same 
file. But if that happens, it gets slow. I wonder if a possible swap file is 
worth the effort here... In fact, the purpose of calling 
filter_range_is_mapped() certainly does not require a journal flush in that 
loop. Next week I will check whether journal flushes are ever triggered by 
that, and open a Lustre bugzilla if so. Avoiding all of that should not be 
difficult.

Cheers,
Bernd

On Saturday, October 23, 2010, Jason Hill wrote:
> Kevin/Dave/(and Dave from DDN):
> 
> Thanks for your replies. From tunefs.lustre --dryrun it is very apparent
> that we are not running quotas.
> 
> Thanks for your assistance.
> 
> > That message, from lustre/obdfilter/filter_io_26.c, is the result of the
> > thread taking 35 seconds
> > from when it entered filter_commitrw_write() until after it called
> > lquota_chkquota() to check the quota.
> > 
> > However, it is certainly plausible that the thread was delayed because
> > of something other than quotas,
> > such as an allocation (eg, it could have been stuck in filter_iobuf_get).
> > 
> > Kevin
> 
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

-- 
Bernd Schubert
DataDirect Networks