Re: freeBSD nullfs together with nfs and "silly rename"

2010-06-12 Thread Mikolaj Golub

On Sun, 6 Jun 2010 16:44:43 +0200 Leon Meßner wrote:

 LM> Hi,
 LM> I hope this is not the wrong list to ask. Didn't get any answers on
 LM> -questions.

 LM> When you try to do the following inside a nullfs mounted directory,
 LM> where the nullfs origin is itself mounted via nfs you get an error:

 LM> # foo 
 LM> # tail -f foo& 
 LM> # rm -f foo 
 LM> tail: foo: Stale NFS file handle
 LM> # fg

 LM> This is really a problem when running services inside jails and using
 LM> NFS as storage. As of [2] it looks like this problem is known for a
 LM> while. On a normal NFS mount this does not happen as "silly renaming"
 LM> [1] works there (producing nasty little .nfs files).

nfs_sillyrename() is called when vnode's usecount is more then 1. It is
expected that unlink() syscall increases vnode's usecount in namei() and if
the file has been already opened usecount will be more then 1.

But with nullfs layer present the reference counts are held by the upper node,
not the lower (nfs) one, so when unlink() is called it increases usecount of
the upper vnode, not nfs vnode and nfs_sillyrename() is never called.

The strightforward solution looks like to implement null_remove() that will
increase lower vnode's refcount before calling null_bypass() and then
decrement it after the call. See the attached patch (it works for me on both
8-STABLE and CURRENT).

-- 
Mikolaj Golub

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: freeBSD nullfs together with nfs and "silly rename"

2010-06-12 Thread Mikolaj Golub

On Sat, 12 Jun 2010 11:56:10 +0300 Mikolaj Golub wrote to Leon Meßner:

 MG> See the attached patch (it works for me on both 8-STABLE and CURRENT).

Sorry, actually here is the patch.

-- 
Mikolaj Golub

Index: sys/fs/nullfs/null_vnops.c
===
--- sys/fs/nullfs/null_vnops.c	(revision 208960)
+++ sys/fs/nullfs/null_vnops.c	(working copy)
@@ -499,6 +499,23 @@
 }
 
 /*
+ * Increasing refcount of lower vnode is needed at least for the case
+ * when lower FS is NFS to do sillyrename if the file is in use.
+ */
+static int
+null_remove(struct vop_remove_args *ap)
+{
+	int retval;
+	struct vnode *lvp;
+
+	lvp = NULLVPTOLOWERVP(ap->a_vp);
+	VREF(lvp);
+	retval = null_bypass(&ap->a_gen);
+	vrele(lvp);
+	return (retval);
+}
+
+/*
  * We handle this to eliminate null FS to lower FS
  * file moving. Don't know why we don't allow this,
  * possibly we should.
@@ -809,6 +826,7 @@
 	.vop_open =		null_open,
 	.vop_print =		null_print,
 	.vop_reclaim =		null_reclaim,
+	.vop_remove =		null_remove,
 	.vop_rename =		null_rename,
 	.vop_setattr =		null_setattr,
 	.vop_strategy =		VOP_EOPNOTSUPP,
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: freeBSD nullfs together with nfs and "silly rename"

2010-06-12 Thread Kostik Belousov
On Sat, Jun 12, 2010 at 11:56:10AM +0300, Mikolaj Golub wrote:
> 
> On Sun, 6 Jun 2010 16:44:43 +0200 Leon Me??ner wrote:
> 
>  LM> Hi,
>  LM> I hope this is not the wrong list to ask. Didn't get any answers on
>  LM> -questions.
> 
>  LM> When you try to do the following inside a nullfs mounted directory,
>  LM> where the nullfs origin is itself mounted via nfs you get an error:
> 
>  LM> # foo 
>  LM> # tail -f foo& 
>  LM> # rm -f foo 
>  LM> tail: foo: Stale NFS file handle
>  LM> # fg
> 
>  LM> This is really a problem when running services inside jails and using
>  LM> NFS as storage. As of [2] it looks like this problem is known for a
>  LM> while. On a normal NFS mount this does not happen as "silly renaming"
>  LM> [1] works there (producing nasty little .nfs files).
> 
> nfs_sillyrename() is called when vnode's usecount is more then 1. It is
> expected that unlink() syscall increases vnode's usecount in namei() and if
> the file has been already opened usecount will be more then 1.
> 
> But with nullfs layer present the reference counts are held by the upper node,
> not the lower (nfs) one, so when unlink() is called it increases usecount of
> the upper vnode, not nfs vnode and nfs_sillyrename() is never called.
> 
> The strightforward solution looks like to implement null_remove() that will
> increase lower vnode's refcount before calling null_bypass() and then
> decrement it after the call. See the attached patch (it works for me on both
> 8-STABLE and CURRENT).

The upper vnode holds a reference to the lower vnode, as you noted.
Now, with your patch, I believe that _all_ calls to the nfs_remove()
are happen with refcount > 1.


pgpFhd1vszN6Q.pgp
Description: PGP signature


Re: Re: freeBSD nullfs together with nfs and "silly rename"

2010-06-12 Thread Rick Macklem



On Sat, 12 Jun 2010, Kostik Belousov wrote:


On Sat, Jun 12, 2010 at 11:56:10AM +0300, Mikolaj Golub wrote:


On Sun, 6 Jun 2010 16:44:43 +0200 Leon Me??ner wrote:

 LM> Hi,
 LM> I hope this is not the wrong list to ask. Didn't get any answers on
 LM> -questions.

 LM> When you try to do the following inside a nullfs mounted directory,
 LM> where the nullfs origin is itself mounted via nfs you get an error:

 LM> # foo
 LM> # tail -f foo&
 LM> # rm -f foo
 LM> tail: foo: Stale NFS file handle
 LM> # fg

 LM> This is really a problem when running services inside jails and using
 LM> NFS as storage. As of [2] it looks like this problem is known for a
 LM> while. On a normal NFS mount this does not happen as "silly renaming"
 LM> [1] works there (producing nasty little .nfs files).

nfs_sillyrename() is called when vnode's usecount is more then 1. It is
expected that unlink() syscall increases vnode's usecount in namei() and if
the file has been already opened usecount will be more then 1.

But with nullfs layer present the reference counts are held by the upper node,
not the lower (nfs) one, so when unlink() is called it increases usecount of
the upper vnode, not nfs vnode and nfs_sillyrename() is never called.

The strightforward solution looks like to implement null_remove() that will
increase lower vnode's refcount before calling null_bypass() and then
decrement it after the call. See the attached patch (it works for me on both
8-STABLE and CURRENT).


The upper vnode holds a reference to the lower vnode, as you noted.
Now, with your patch, I believe that _all_ calls to the nfs_remove()
are happen with refcount > 1.


I'm not familiar with the nullfs so this might be way off, but would
this patch be ok by any chance?

Index: sys/fs/nullfs/null_vnops.c
===
--- sys/fs/nullfs/null_vnops.c  (revision 208960)
+++ sys/fs/nullfs/null_vnops.c  (working copy)
@@ -499,6 +499,23 @@
 }

 /*
+ * Increasing refcount of lower vnode is needed at least for the case
+ * when lower FS is NFS to do sillyrename if the file is in use.
+ */
+static int
+null_remove(struct vop_remove_args *ap)
+{
+   int retval;
+   struct vnode *lvp;
+
+   if (ap->a_vp->v_usecount > 1) {
+   lvp = NULLVPTOLOWERVP(ap->a_vp);
+   VREF(lvp);
+   } else
+   lvp = NULL;
+   retval = null_bypass(&ap->a_gen);
+   if (lvp != NULL)
+   vrele(lvp);
+   return (retval);
+}
+
+/*
  * We handle this to eliminate null FS to lower FS
  * file moving. Don't know why we don't allow this,
  * possibly we should.
@@ -809,6 +826,7 @@
.vop_open = null_open,
.vop_print =null_print,
.vop_reclaim =  null_reclaim,
+   .vop_remove =   null_remove,
.vop_rename =   null_rename,
.vop_setattr =  null_setattr,
.vop_strategy = VOP_EOPNOTSUPP,
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Re: freeBSD nullfs together with nfs and "silly rename"

2010-06-12 Thread Kostik Belousov
On Sat, Jun 12, 2010 at 11:15:49AM -0400, Rick Macklem wrote:
> 
> 
> On Sat, 12 Jun 2010, Kostik Belousov wrote:
> 
> >On Sat, Jun 12, 2010 at 11:56:10AM +0300, Mikolaj Golub wrote:
> >>
> >>On Sun, 6 Jun 2010 16:44:43 +0200 Leon Me??ner wrote:
> >>
> >> LM> Hi,
> >> LM> I hope this is not the wrong list to ask. Didn't get any answers on
> >> LM> -questions.
> >>
> >> LM> When you try to do the following inside a nullfs mounted directory,
> >> LM> where the nullfs origin is itself mounted via nfs you get an error:
> >>
> >> LM> # foo
> >> LM> # tail -f foo&
> >> LM> # rm -f foo
> >> LM> tail: foo: Stale NFS file handle
> >> LM> # fg
> >>
> >> LM> This is really a problem when running services inside jails and using
> >> LM> NFS as storage. As of [2] it looks like this problem is known for a
> >> LM> while. On a normal NFS mount this does not happen as "silly renaming"
> >> LM> [1] works there (producing nasty little .nfs files).
> >>
> >>nfs_sillyrename() is called when vnode's usecount is more then 1. It is
> >>expected that unlink() syscall increases vnode's usecount in namei() and 
> >>if
> >>the file has been already opened usecount will be more then 1.
> >>
> >>But with nullfs layer present the reference counts are held by the upper 
> >>node,
> >>not the lower (nfs) one, so when unlink() is called it increases usecount 
> >>of
> >>the upper vnode, not nfs vnode and nfs_sillyrename() is never called.
> >>
> >>The strightforward solution looks like to implement null_remove() that 
> >>will
> >>increase lower vnode's refcount before calling null_bypass() and then
> >>decrement it after the call. See the attached patch (it works for me on 
> >>both
> >>8-STABLE and CURRENT).
> >
> >The upper vnode holds a reference to the lower vnode, as you noted.
> >Now, with your patch, I believe that _all_ calls to the nfs_remove()
> >are happen with refcount > 1.
> >
> I'm not familiar with the nullfs so this might be way off, but would
> this patch be ok by any chance?
> 
> Index: sys/fs/nullfs/null_vnops.c
> ===
> --- sys/fs/nullfs/null_vnops.c(revision 208960)
> +++ sys/fs/nullfs/null_vnops.c(working copy)
> @@ -499,6 +499,23 @@
>  }
> 
>  /*
> + * Increasing refcount of lower vnode is needed at least for the case
> + * when lower FS is NFS to do sillyrename if the file is in use.
> + */
> +static int
> +null_remove(struct vop_remove_args *ap)
> +{
> + int retval;
> + struct vnode *lvp;
> +
> + if (ap->a_vp->v_usecount > 1) {
> + lvp = NULLVPTOLOWERVP(ap->a_vp);
> + VREF(lvp);
> + } else
> + lvp = NULL;
> + retval = null_bypass(&ap->a_gen);
> + if (lvp != NULL)
> + vrele(lvp);
> + return (retval);
> +}
> +
> +/*

Yes, I hoped that Mikolaj ends up with something similar :). Please note
that this is racy, since we cannot know why usecount is greater then 1.
This might cause the silly rename to kick in some time where it should
not, but the race is rare.


pgpM94x3NcK0o.pgp
Description: PGP signature


Re: Re: Re: freeBSD nullfs together with nfs and "silly rename"

2010-06-12 Thread Rick Macklem



On Sat, 12 Jun 2010, Kostik Belousov wrote:



Yes, I hoped that Mikolaj ends up with something similar :). Please note
that this is racy, since we cannot know why usecount is greater then 1.
This might cause the silly rename to kick in some time where it should
not, but the race is rare.


I'd say that having silly rename happen once in a while for unlink when
it doesn't have to happen is better than having the file deleted on the
server while it is still open on the client.

rick

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Re: Re: freeBSD nullfs together with nfs and "silly rename"

2010-06-12 Thread Kostik Belousov
On Sat, Jun 12, 2010 at 07:06:11PM -0400, Rick Macklem wrote:
> 
> 
> On Sat, 12 Jun 2010, Kostik Belousov wrote:
> 
> >
> >Yes, I hoped that Mikolaj ends up with something similar :). Please note
> >that this is racy, since we cannot know why usecount is greater then 1.
> >This might cause the silly rename to kick in some time where it should
> >not, but the race is rare.
> >
> I'd say that having silly rename happen once in a while for unlink when
> it doesn't have to happen is better than having the file deleted on the
> server while it is still open on the client.

My note was not an objection, only a note. Also, when committing, please
add a comment explaining what is going on.


pgplm57gi6oDf.pgp
Description: PGP signature


Re: Re: Re: Re: freeBSD nullfs together with nfs and "silly rename"

2010-06-13 Thread Rick Macklem



On Sun, 13 Jun 2010, Kostik Belousov wrote:



My note was not an objection, only a note. Also, when committing, please
add a comment explaining what is going on.


Righto, and my response was just my opinion. I'm assuming Mikolaj is
looking at committing this?

rick

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"