On Mon, Oct 01, 2018 at 01:51:09PM -0600, Andreas Dilger wrote:
> On Oct 1, 2018, at 9:49 AM, Eric Sandeen wrote:
> > Yes, I would expect there to be problems with his modified kernel
> > for a filesystem that supports clone_file_range, because
> > vfs_copy_file_range() will clone if possible,
On Fri, Jan 19, 2018 at 09:36:34AM -0500, Jeff Layton wrote:
> Shrug...we have that problem with the spinlock in place too. The bottom
> line is that reads of this value are not serialized with the increment
> at all.
OK, so this wouldn't even be a new bug.
> I'm not 100% thrilled with this patch
On Tue, Jan 09, 2018 at 09:10:42AM -0500, Jeff Layton wrote:
> From: Jeff Layton
>
> The rationale for taking the i_lock when incrementing this value is
> lost in antiquity. The readers of the field don't take it (at least
> not universally), so my assumption is that it was only done here to
> se
On Tue, Jan 09, 2018 at 09:10:41AM -0500, Jeff Layton wrote:
> --- /dev/null
> +++ b/include/linux/iversion.h
> @@ -0,0 +1,236 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_IVERSION_H
> +#define _LINUX_IVERSION_H
> +
> +#include
> +
> +/*
> + * The change attribute (i_version) is
On Mon, Dec 18, 2017 at 12:22:20PM -0500, Jeff Layton wrote:
> On Mon, 2017-12-18 at 17:34 +0100, Jan Kara wrote:
> > On Mon 18-12-17 10:11:56, Jeff Layton wrote:
> > > static inline bool
> > > inode_maybe_inc_iversion(struct inode *inode, bool force)
> > > {
> > > - atomic64_t *ivp = (atomic64_
On Fri, May 12, 2017 at 08:22:23AM +1000, NeilBrown wrote:
> On Thu, May 11 2017, J. Bruce Fields wrote:
> > +static inline u64 nfsd4_change_attribute(struct inode *inode)
> > +{
> > + u64 chattr;
> > +
> > + chattr = inode->i_ctime.tv_sec << 30
On Fri, May 12, 2017 at 07:01:25AM -0400, Jeff Layton wrote:
> This looks reasonable to me (modulo Jan's comment about casting tv_sec
> to u64).
>
> To be clear, I think this is mostly orthogonal to the changes that I was
> originally proposing, right? I think we can still benefit from only
> bump
On Fri, May 12, 2017 at 10:27:54AM +0200, Jan Kara wrote:
> On Thu 11-05-17 14:59:43, J. Bruce Fields wrote:
> > On Wed, Apr 05, 2017 at 02:14:09PM -0400, J. Bruce Fields wrote:
> > > On Wed, Apr 05, 2017 at 10:05:51AM +0200, Jan Kara wrote:
> > > > 1) Keep i_version
On Wed, Apr 05, 2017 at 02:14:09PM -0400, J. Bruce Fields wrote:
> On Wed, Apr 05, 2017 at 10:05:51AM +0200, Jan Kara wrote:
> > 1) Keep i_version as is, make clients also check for i_ctime.
>
> That would be a protocol revision, which we'd definitely rather avoid.
>
>
On Wed, Apr 05, 2017 at 10:05:51AM +0200, Jan Kara wrote:
> 1) Keep i_version as is, make clients also check for i_ctime.
That would be a protocol revision, which we'd definitely rather avoid.
But can't we accomplish the same by using something like
ctime * (some constant) + i_version
?
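
A minimal sketch of that combination, assuming the constant is 2^30 (echoing the "tv_sec << 30" in the nfsd4_change_attribute() excerpt above). The helper name and the shift macro are hypothetical, and this is userspace C for illustration, not kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the "some constant" is 2^30, i.e. ctime seconds are
 * shifted left by 30 bits before the counter is added. */
#define CHATTR_SHIFT 30

/* Hypothetical helper, not a kernel function: fold ctime seconds and
 * the i_version counter into one 64-bit change attribute. */
static uint64_t combined_change_attr(int64_t ctime_sec, uint64_t i_version)
{
	/* Cast before shifting so the shift is done on an unsigned
	 * 64-bit value; negative timestamps are out of scope here. */
	assert(ctime_sec >= 0);
	return ((uint64_t)ctime_sec << CHATTR_SHIFT) + i_version;
}
```

With this layout, a one-second step in ctime outweighs a counter that has fallen back by anything less than 2^30, so the client still sees a new value without any protocol change.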
On Wed, Apr 05, 2017 at 11:43:32AM +1000, NeilBrown wrote:
> On Tue, Apr 04 2017, J. Bruce Fields wrote:
>
> > On Thu, Mar 30, 2017 at 02:35:32PM -0400, Jeff Layton wrote:
> >> On Thu, 2017-03-30 at 12:12 -0400, J. Bruce Fields wrote:
> >> > On Thu, Mar 30, 2017
On Thu, Mar 30, 2017 at 10:41:37AM +1100, Dave Chinner wrote:
> On Wed, Mar 29, 2017 at 01:54:31PM -0400, Jeff Layton wrote:
> > On Wed, 2017-03-29 at 13:15 +0200, Jan Kara wrote:
> > > On Tue 21-03-17 14:46:53, Jeff Layton wrote:
> > > > On Tue, 2017-03-21 at 14:3
On Thu, Mar 30, 2017 at 02:35:32PM -0400, Jeff Layton wrote:
> On Thu, 2017-03-30 at 12:12 -0400, J. Bruce Fields wrote:
> > On Thu, Mar 30, 2017 at 07:11:48AM -0400, Jeff Layton wrote:
> > > On Thu, 2017-03-30 at 08:47 +0200, Jan Kara wrote:
> > > > Because if abo
On Tue, Apr 04, 2017 at 10:34:14PM +1000, Dave Chinner wrote:
> On Mon, Apr 03, 2017 at 04:00:55PM +0200, Jan Kara wrote:
> > What filesystems can or cannot easily do obviously differs. Ext4 has a
> > recovery flag set in superblock on RW mount/remount and cleared on
> > umount/RO remount.
>
> Eve
On Thu, Mar 30, 2017 at 07:11:48AM -0400, Jeff Layton wrote:
> On Thu, 2017-03-30 at 08:47 +0200, Jan Kara wrote:
> > Hum, so are we fine if i_version just changes (increases) for all inodes
> > after a server crash? If I understand its use right, it would mean
> > invalidation of all client's cach
On Tue, Mar 21, 2017 at 02:46:53PM -0400, Jeff Layton wrote:
> On Tue, 2017-03-21 at 14:30 -0400, J. Bruce Fields wrote:
> > On Tue, Mar 21, 2017 at 01:23:24PM -0400, Jeff Layton wrote:
> > > On Tue, 2017-03-21 at 12:30 -0400, J. Bruce Fields wrote:
> > > > - It
On Tue, Mar 21, 2017 at 01:23:24PM -0400, Jeff Layton wrote:
> On Tue, 2017-03-21 at 12:30 -0400, J. Bruce Fields wrote:
> > - It's durable; the above comparison still works if there were reboots
> > between the two i_version checks.
> > - I don't know how re
On Tue, Mar 21, 2017 at 01:37:04PM -0400, J. Bruce Fields wrote:
> On Tue, Mar 21, 2017 at 01:23:24PM -0400, Jeff Layton wrote:
> > On Tue, 2017-03-21 at 12:30 -0400, J. Bruce Fields wrote:
> > > - NFS doesn't actually require that it increases, but I think it
> > > >
On Tue, Mar 21, 2017 at 01:23:24PM -0400, Jeff Layton wrote:
> On Tue, 2017-03-21 at 12:30 -0400, J. Bruce Fields wrote:
> > - NFS doesn't actually require that it increases, but I think it
> > should. I assume 64 bits means we don't need a discussion of
>
On Tue, Mar 21, 2017 at 06:45:00AM -0700, Christoph Hellwig wrote:
> On Mon, Mar 20, 2017 at 05:43:27PM -0400, J. Bruce Fields wrote:
> > To me, the interesting question is whether this allows us to turn on
> > i_version updates by default on xfs and ext4.
>
> XFS v5 file
On Thu, Dec 22, 2016 at 09:42:04AM -0500, Jeff Layton wrote:
> On Thu, 2016-12-22 at 00:45 -0800, Christoph Hellwig wrote:
> > On Wed, Dec 21, 2016 at 12:03:17PM -0500, Jeff Layton wrote:
> > >
> > > Only btrfs, ext4, and xfs implement it for data changes. Because of
> > > this, these filesystems
On Fri, Mar 03, 2017 at 07:53:57PM -0500, Jeff Layton wrote:
> On Fri, 2017-03-03 at 18:00 -0500, J. Bruce Fields wrote:
> > On Wed, Dec 21, 2016 at 12:03:17PM -0500, Jeff Layton wrote:
> > > tl;dr: I think we can greatly reduce the cost of the inode->i_version
> > >
On Wed, Dec 21, 2016 at 12:03:17PM -0500, Jeff Layton wrote:
> tl;dr: I think we can greatly reduce the cost of the inode->i_version
> counter, by exploiting the fact that we don't need to increment it
> if no one is looking at it. We can also clean up the code to prepare
> to eventually expose thi
The patch ordering is a little annoying as I'd like to be able to
verify the implementation at the same time these new interfaces
are added, but, I don't know, I don't have a better idea.
Anyway, various nits:
On Wed, Dec 21, 2016 at 12:03:28PM -0500, Jeff Layton wrote:
> We already ha
On Thu, Oct 06, 2016 at 05:39:24PM +1100, NeilBrown wrote:
> On Thu, Aug 04 2016, NeilBrown wrote:
>
> >
> >
> > When nfsd calls fh_to_dentry, it expects ESTALE or ENOMEM as errors.
> > In particular it can be tempting to return ENOENT, but this is not
> > handled well by nfsd.
> >
> > Rather than
On Wed, Aug 10, 2016 at 11:56:15AM -0700, Linus Torvalds wrote:
> On Wed, Aug 10, 2016 at 11:46 AM, Josef Bacik wrote:
> >
> > So my naive fix would be something like this
>
> Bruce? Josef's patch looks ObviouslyCorrect(tm) to me now that I look
> at it - all the other callers of fh_compose() als
On Thu, Aug 04, 2016 at 05:47:19AM -0700, Christoph Hellwig wrote:
> On Thu, Aug 04, 2016 at 10:19:06AM +1000, NeilBrown wrote:
> >
> >
> > When nfsd calls fh_to_dentry, it expects ESTALE or ENOMEM as errors.
> > In particular it can be tempting to return ENOENT, but this is not
> > handled well b
On Fri, Jul 22, 2016 at 12:40:26PM +1000, NeilBrown wrote:
> On Fri, Jul 22 2016, J. Bruce Fields wrote:
>
> > On Fri, Jul 22, 2016 at 11:08:17AM +1000, NeilBrown wrote:
> >> On Fri, Jun 10 2016, fdman...@kernel.org wrote:
> >>
> >> > From: Filipe Manana
On Fri, Jul 22, 2016 at 11:08:17AM +1000, NeilBrown wrote:
> On Fri, Jun 10 2016, fdman...@kernel.org wrote:
>
> > From: Filipe Manana
> >
> > When we attempt to read an inode from disk, we end up always returning an
> > -ESTALE error to the caller regardless of the actual failure reason, which
>
On Fri, Apr 01, 2016 at 11:33:00AM +1100, Dave Chinner wrote:
> On Thu, Mar 31, 2016 at 06:34:17PM -0400, J. Bruce Fields wrote:
> > I haven't looked at the code, but I assume a JUKEBOX-returning write to
> > an absent file brings into cache the bits necessary to perform the
On Fri, Apr 01, 2016 at 09:20:23AM +1100, Dave Chinner wrote:
> On Thu, Mar 31, 2016 at 01:47:50PM -0600, Andreas Dilger wrote:
> > On Mar 31, 2016, at 12:08 PM, J. Bruce Fields wrote:
> > >
> > > On Thu, Mar 31, 2016 at 10:18:50PM +1100, Dave Chinner wrote:
> >
On Thu, Mar 31, 2016 at 10:18:50PM +1100, Dave Chinner wrote:
> On Thu, Mar 31, 2016 at 12:54:40AM -0700, Christoph Hellwig wrote:
> > On Thu, Mar 31, 2016 at 12:18:13PM +1100, Dave Chinner wrote:
> > > On Wed, Mar 30, 2016 at 11:27:55AM -0700, Darrick J. Wong wrote:
> > > > Or is it ok that falloc
On Thu, Nov 26, 2015 at 07:50:54PM +0100, Christoph Hellwig wrote:
> This patch set moves the existing btrfs clone ioctls that other file
> systems have started to implement to common code, and allows the NFS
> server to export this functionality to remote systems.
>
> This work is based originally
On Thu, Nov 26, 2015 at 07:50:56PM +0100, Christoph Hellwig wrote:
> Pass a loff_t end for the last byte instead of the 32-bit count
> parameter to allow full file clones even on 32-bit architectures.
> While we're at it also drop the pointless inode argument and simplify
> the read/write selection
On Wed, Oct 28, 2015 at 07:25:10AM +0900, Neil Brown wrote:
>
> If you create a subvolume in btrfs and access it (by name) without
> mounting it, then the subvolume looks like a separate mount to some
> extent, returning a different st_dev to stat(), but it doesn't look like
> a separate mount in
On Mon, Oct 26, 2015 at 12:19:33PM +, Pádraig Brady wrote:
> On 26/10/15 03:39, Christoph Hellwig wrote:
> > On Sat, Oct 24, 2015 at 01:02:21PM +0100, Pádraig Brady wrote:
> >> I'm a bit worried about the sparse expansion and default reflinking
> >> which might preclude cp(1) from using this c
On Sun, Oct 18, 2015 at 11:30:13AM -0700, Christoph Hellwig wrote:
> Just commenting on the man page here as the comment is about semantics.
> All the infrastructure in the patch looks reasonable to me, but this
> is something we need to get right.
>
> > +.B COPY_FR_REFLINK
> > +Create a lightweigh
On Thu, Jun 25, 2015 at 06:12:57PM -0400, Theodore Ts'o wrote:
> On Thu, Jun 25, 2015 at 02:46:44PM -0400, J. Bruce Fields wrote:
> > Looks OK to me. As I say I'd expect i_version_seen == true to end up
> > being the common case in a lot of v4 workloads, so I'm more
On Tue, Jun 23, 2015 at 12:32:41PM -0400, Theodore Ts'o wrote:
> On Thu, Jun 18, 2015 at 04:38:56PM +0200, David Sterba wrote:
> > Moving the discussion to fsdevel.
> >
> > Summary: disabling MS_I_VERSION brings some speedups to btrfs, but the
> > generic 'noiversion' option cannot be used to achi
On Tue, Apr 14, 2015 at 11:22:41AM -0700, Zach Brown wrote:
> On Tue, Apr 14, 2015 at 02:19:11PM -0400, J. Bruce Fields wrote:
> > On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
> > > On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
> > > > On Sat, Ap
On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
> On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
> > On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
> >> Yuck! How the heck do you clean up the mess if that happens? I
> >> guess you're just stuck redoing the copy with
On Mon, Nov 24, 2014 at 06:57:27AM -0500, Theodore Ts'o wrote:
> If we want to be paranoid, we handle i_version updates non-lazily; I
> can see arguments in favor of that.
>
> Ext4 only enables MS_I_VERSION if the user asks for it explicitly, so
> it wouldn't cause me any problems. However, xfs a
On Thu, Feb 20, 2014 at 05:44:14PM -0800, Christoph Hellwig wrote:
> >
> > - return d_obtain_alias(inode);
> > + return d_obtain_alias_root(inode);
>
> Can we call this d_obtain_root or similar, please?
Yes, I like d_obtain_root better, done.
I'll send out the updated series sometime.
--b
From: "J. Bruce Fields"
Minor documentation updates:
- refer to d_obtain_alias rather than d_alloc_anon
- explain when to use d_splice_alias and when
d_materialise_unique.
- cut some details of d_splice_alias/d_materialise_unique
impl
From: "J. Bruce Fields"
There are a few d_obtain_alias callers that are using it to get the
root of a filesystem which may already have an alias somewhere else.
This is not the same as the filehandle-lookup case, and none of them
actually need DCACHE_DISCONNECTED set.
In the btrfs
From: "J. Bruce Fields"
Signed-off-by: J. Bruce Fields
---
fs/dcache.c | 13 +
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c
index 3a1057a..efe3d3b 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -719,8 +719,6 @@ EXPORT_SYMBOL(d
From: "J. Bruce Fields"
The DCACHE_DISCONNECTED flag was intended *only* to mark dentries which
were looked up by filehandle and are currently in the process of being
hooked up to the rest of dcache.
Over time people have also confused it with IS_ROOT, using it to mark
directorie
From: "J. Bruce Fields"
Just a trivial move to locate it near (similar) d_materialise_unique
code and save some forward references in a following patch.
Signed-off-by: J. Bruce Fields
---
fs/dcache.c | 104 ++--
1 file c
From: "J. Bruce Fields"
If we get to this point and discover the dentry is not a root dentry, or
not DCACHE_DISCONNECTED--great, we always prefer that anyway.
Signed-off-by: J. Bruce Fields
---
fs/dcache.c | 9 +++--
1 file changed, 3 insertions(+), 6 deletions(-)
diff
From: "J. Bruce Fields"
Any IS_ROOT() alias should be safe to use; there's nothing special about
DCACHE_DISCONNECTED dentries.
Signed-off-by: J. Bruce Fields
---
fs/dcache.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.
From: "J. Bruce Fields"
d_splice_alias will d_move an IS_ROOT() directory dentry into place if
one exists. This should be safe as long as the dentry remains IS_ROOT,
but I can't see what guarantees that: once we drop the i_lock all we
hold here is the i_mutex on an unrelated p
From: "J. Bruce Fields"
Currently if d_splice_alias finds a directory with an alias that is not
IS_ROOT or not DCACHE_DISCONNECTED, it creates a duplicate directory.
Duplicate directory dentries are unacceptable; it is better just to
error out.
(In the case of a local filesyste
On Fri, Feb 14, 2014 at 09:45:24PM -0500, J. Bruce Fields wrote:
> On Fri, Feb 14, 2014 at 05:40:55PM -0800, Eric W. Biederman wrote:
> > "J. Bruce Fields" writes:
> >
> > > On Fri, Feb 14, 2014 at 01:43:48PM -0500, Josef Bacik wrote:
> > >> A user w
On Fri, Feb 14, 2014 at 05:40:55PM -0800, Eric W. Biederman wrote:
> "J. Bruce Fields" writes:
>
> > On Fri, Feb 14, 2014 at 01:43:48PM -0500, Josef Bacik wrote:
> >> A user was running into errors from an NFS export of a subvolume that had a
> >> defa
On Fri, Feb 14, 2014 at 01:43:48PM -0500, Josef Bacik wrote:
> A user was running into errors from an NFS export of a subvolume that had a
> default subvol set. When we mount a default subvol we will use
> d_obtain_alias() to find an existing dentry for the subvolume in the case that
> the root s
On Wed, May 15, 2013 at 08:21:54PM +, Myklebust, Trond wrote:
> On Wed, 2013-05-15 at 16:19 -0400, J. Bruce Fields wrote:
> > On Tue, May 14, 2013 at 02:15:26PM -0700, Zach Brown wrote:
> > > This crude patch illustrates the simplest plumbing involved in
> > > sup
On Tue, May 14, 2013 at 02:15:26PM -0700, Zach Brown wrote:
> This crude patch illustrates the simplest plumbing involved in
> supporting sys_call_range with the NFS COPY operation that's pending in
> the 4.2 draft spec.
>
> The patch is based on a previous prototype that used the COPY op to
> imp
On Wed, Apr 04, 2012 at 02:16:22PM -0400, Josef Bacik wrote:
> On Wed, Apr 04, 2012 at 09:12:57PM +0300, Kasatkin, Dmitry wrote:
> > On Wed, Apr 4, 2012 at 8:47 PM, Mimi Zohar wrote:
> > > On Wed, 2012-04-04 at 13:43 -0400, Josef Bacik wrote:
> > >> On Wed, Apr 04, 2012 at 08:24:19PM +0300, Kasatk
On Wed, Dec 08, 2010 at 09:41:33PM -0700, Andreas Dilger wrote:
> On 2010-12-08, at 16:07, Neil Brown wrote:
> > On Mon, 6 Dec 2010 11:48:45 -0500 "J. Bruce Fields"
> > wrote:
> >
> >> On Fri, Dec 03, 2010 at 04:01:44PM -0700, Andreas Dilger wrote:
>
On Wed, Dec 08, 2010 at 10:16:29AM -0700, Andreas Dilger wrote:
> On 2010-12-07, at 10:02, Trond Myklebust wrote:
>
> > On Tue, 2010-12-07 at 17:51 +0100, Christoph Hellwig wrote:
> >> It's just as stable as a real dev_t in the times of hotplug and udev.
> >> As long as you don't touch anything in
On Tue, Dec 07, 2010 at 05:52:13PM +0100, hch wrote:
> On Fri, Dec 03, 2010 at 05:45:26PM -0500, J. Bruce Fields wrote:
> > We're using statfs64.fs_fsid for this; I believe that's both stable
> > across reboots and distinguishes between subvolumes, so that's OK.
>
On Fri, Dec 03, 2010 at 04:01:44PM -0700, Andreas Dilger wrote:
> On 2010-12-03, at 15:45, J. Bruce Fields wrote:
> > We're using statfs64.fs_fsid for this; I believe that's both stable
> > across reboots and distinguishes between subvolumes, so that's OK.
>
On Fri, Dec 03, 2010 at 05:29:24PM -0500, Chris Mason wrote:
> Excerpts from Dave Chinner's message of 2010-12-03 17:27:56 -0500:
> > On Fri, Dec 03, 2010 at 04:45:27PM -0500, Josef Bacik wrote:
> > > On Wed, Dec 01, 2010 at 09:21:36AM -0500, Josef Bacik wrote:
> > > > Hello,
> > > >
> > > > Vario
On Fri, Dec 03, 2010 at 04:45:27PM -0500, Josef Bacik wrote:
> So now that I've actually looked at everything, it looks like the semantics
> are all right for subvolumes
>
> 1) readdir - we return the root id in d_ino, which is unique across the fs
Though Michael Vrable pointed out an apparent
On Thu, Dec 02, 2010 at 05:14:53PM +, David Pottage wrote:
> A couple of years ago I was suffering from the problem of different
> files having the same inode number on Netapp servers. On a Netapp
> device if you snapshot a volume then the files in the snapshot have
> the same inode number as t
On Wed, Dec 01, 2010 at 05:52:07PM -0800, Michael Vrable wrote:
> On Wed, Dec 01, 2010 at 03:09:52PM -0500, Josef Bacik wrote:
> >On Wed, Dec 01, 2010 at 03:00:08PM -0500, J. Bruce Fields wrote:
> >>I think you're already fine:
> >>
> >># mkdir TMP
>
On Wed, Dec 01, 2010 at 03:09:52PM -0500, Josef Bacik wrote:
> On Wed, Dec 01, 2010 at 03:00:08PM -0500, J. Bruce Fields wrote:
> > On Wed, Dec 01, 2010 at 02:54:33PM -0500, Josef Bacik wrote:
> > > Oh well crud, I was hoping that I could leave the inode numbers as 256 for
>
On Wed, Dec 01, 2010 at 02:54:33PM -0500, Josef Bacik wrote:
> Oh well crud, I was hoping that I could leave the inode numbers as 256 for
> everything, but I forgot about readdir. So the inode item in the parent would
> have to have a unique inode number that would get spit out in readdir, but
>
On Wed, Dec 01, 2010 at 09:21:36AM -0500, Josef Bacik wrote:
> Hello,
>
> Various people have complained about how BTRFS deals with subvolumes recently,
> specifically the fact that they all have the same inode number, and there's no
> discrete separation from one subvolume to another. Christoph
On Thu, Jun 24, 2010 at 07:31:57AM +1000, Neil Brown wrote:
> On Wed, 23 Jun 2010 14:28:38 -0400
> "J. Bruce Fields" wrote:
>
> > On Thu, Jun 17, 2010 at 02:54:01PM +1000, Neil Brown wrote:
> > >
> > > If you export two subvolumes of a btrfs filesys
On Thu, Jun 17, 2010 at 02:54:01PM +1000, Neil Brown wrote:
>
> If you export two subvolumes of a btrfs filesystem, they will both be
> given the same uuid so lookups will be confused.
> blkid cannot differentiate the two, so we must use the fsid from
> statfs64 to identify the filesystem.
>
> We
On Mon, Jan 05, 2009 at 08:18:58AM -0500, Chris Mason wrote:
> On Mon, 2009-01-05 at 21:07 +1100, Chris Samuel wrote:
> > On Sat, 3 Jan 2009 8:01:04 am Andi Kleen wrote:
> >
> > > When it's in mainline I suspect people will start using it for that.
> >
> > Some people don't even wait for that. ;-