On Wed, 2020-11-11 at 15:39 +0000, Luis Henriques wrote: > When doing a rename across quota realms, there's a corner case that isn't > handled correctly. Here's a testcase: > > mkdir files limit > truncate files/file -s 10G > setfattr limit -n ceph.quota.max_bytes -v 1000000 > mv files limit/ > > The above will succeed because ftruncate(2) won't result in an immediate > notification of the MDSs with the new file size, and thus the quota realms > stats won't be updated. > > This patch forces a sync with the MDS every time there's an ATTR_SIZE that > sets a new i_size, even if we have Fx caps. > > Cc: sta...@vger.kernel.org > Fixes: dffdcd71458e ("ceph: allow rename operation under different quota > realms") > URL: https://tracker.ceph.com/issues/36593 > Signed-off-by: Luis Henriques <lhenriq...@suse.de> > --- > fs/ceph/inode.c | 11 ++--------- > 1 file changed, 2 insertions(+), 9 deletions(-) > > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c > index 526faf4778ce..30e3f240ac96 100644 > --- a/fs/ceph/inode.c > +++ b/fs/ceph/inode.c > @@ -2136,15 +2136,8 @@ int __ceph_setattr(struct inode *inode, struct iattr > *attr) > if (ia_valid & ATTR_SIZE) { > dout("setattr %p size %lld -> %lld\n", inode, > inode->i_size, attr->ia_size); > - if ((issued & CEPH_CAP_FILE_EXCL) && > - attr->ia_size > inode->i_size) { > - i_size_write(inode, attr->ia_size); > - inode->i_blocks = calc_inode_blocks(attr->ia_size); > - ci->i_reported_size = attr->ia_size; > - dirtied |= CEPH_CAP_FILE_EXCL; > - ia_valid |= ATTR_MTIME; > - } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || > - attr->ia_size != inode->i_size) { > + if ((issued & (CEPH_CAP_FILE_EXCL|CEPH_CAP_FILE_SHARED)) || > + (attr->ia_size != inode->i_size)) { > req->r_args.setattr.size = cpu_to_le64(attr->ia_size); > req->r_args.setattr.old_size = > cpu_to_le64(inode->i_size);
Hmm...this makes truncates more expensive when we have caps. I'd rather not do that if we can help it. What about instead having the client mimic a fsync when there is a rename across quota realms? If we can't tell that reliably then we could also just do an effective fsync ahead of any cross-directory rename? -- Jeff Layton <jlay...@kernel.org>