On Wed, 2020-11-11 at 15:39 +0000, Luis Henriques wrote:
> When doing a rename across quota realms, there's a corner case that isn't
> handled correctly.  Here's a testcase:
> 
>   mkdir files limit
>   truncate files/file -s 10G
>   setfattr limit -n ceph.quota.max_bytes -v 1000000
>   mv files limit/
> 
> The above will succeed because ftruncate(2) won't result in an immediate
> notification of the MDSs with the new file size, and thus the quota realms
> stats won't be updated.
> 
> This patch forces a sync with the MDS every time there's an ATTR_SIZE that
> sets a new i_size, even if we have Fx caps.
> 
> Cc: sta...@vger.kernel.org
> Fixes: dffdcd71458e ("ceph: allow rename operation under different quota 
> realms")
> URL: https://tracker.ceph.com/issues/36593
> Signed-off-by: Luis Henriques <lhenriq...@suse.de>
> ---
>  fs/ceph/inode.c | 11 ++---------
>  1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index 526faf4778ce..30e3f240ac96 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -2136,15 +2136,8 @@ int __ceph_setattr(struct inode *inode, struct iattr 
> *attr)
>       if (ia_valid & ATTR_SIZE) {
>               dout("setattr %p size %lld -> %lld\n", inode,
>                    inode->i_size, attr->ia_size);
> -             if ((issued & CEPH_CAP_FILE_EXCL) &&
> -                 attr->ia_size > inode->i_size) {
> -                     i_size_write(inode, attr->ia_size);
> -                     inode->i_blocks = calc_inode_blocks(attr->ia_size);
> -                     ci->i_reported_size = attr->ia_size;
> -                     dirtied |= CEPH_CAP_FILE_EXCL;
> -                     ia_valid |= ATTR_MTIME;
> -             } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 ||
> -                        attr->ia_size != inode->i_size) {
> +             if ((issued & (CEPH_CAP_FILE_EXCL|CEPH_CAP_FILE_SHARED)) ||
> +                 (attr->ia_size != inode->i_size)) {
>                       req->r_args.setattr.size = cpu_to_le64(attr->ia_size);
>                       req->r_args.setattr.old_size =
>                               cpu_to_le64(inode->i_size);

Hmm...this makes truncates more expensive when we have caps. I'd rather
not do that if we can help it.

What about instead having the client mimic a fsync when there is a
rename across quota realms? If we can't tell that reliably then we could
also just do an effective fsync ahead of any cross-directory rename?
-- 
Jeff Layton <jlay...@kernel.org>

Reply via email to