Re: fcntl(F_GETPATH) support
> On Sep 17, 2019, at 9:08 PM, Takashi YAMAMOTO wrote: > > but it would be a super expensive operation for the majority of filesystems, > right? But it will most likely be extremely rare, so *shrug*? -- thorpej
Re: fcntl(F_GETPATH) support
2019年9月16日(月) 3:43 Jason Thorpe : > > > > On Sep 15, 2019, at 11:03 AM, Christos Zoulas > wrote: > > > > I think it is quite reliable because all the file descriptors would be > recently > > opened and therefore be in the cache. One would need to DoS the cache > > cause eviction. If that turns out to be false, we can make the namecache > > reliable, or withdraw it. > > If we had a way to ask the file system for the / a name+parent in the > event we don't find an entry in the name cache, then we don't have to make > the name cache "reliable". > but it would be a super expensive operation for the majority of filesystems, right? > > FWIW, I don't think making the name cache "reliable" is necessarily a good > idea -- it could be abused to consume kernel memory by creating lots of > long-named files / directories. > > In the event that the name really can't be determined (perhaps it's an > open-unlinked file?), then I think it's perfectly reasonable to return > ENOENT for F_GETPATH; callers need to be capable of handling that. > > -- thorpej > >
Re: fcntl(F_GETPATH) support
2019年9月15日(日) 9:06 Christos Zoulas : > In article <2f29ca9a-0ae1-48d3-b3f4-1556912d4...@me.com>, > Jason Thorpe wrote: > > > > > >> On Sep 14, 2019, at 2:52 PM, Kamil Rytarowski wrote: > >> > >> On 14.09.2019 23:34, Christos Zoulas wrote: > >>> Comments? > >>> > >> > >> Question. How does it handle symbolic/hardlinks? > > > >Symbolic links, of course, are simply continuations of the lookup, so > >there's "nothing to see here" with a symbolic link. > > If you can get a file desriptor to a symlink, it will work; I don't think > that we have a way to do this now. > > >I looked at cache_revlookup() and it looks to me like it simply picks > >the first hit it finds in the hash table for the given vnode. > > That is correct. > > >At least one platform that support this API uses "the last name the file > >was looked up with", and will fall back on "whatever the underlying file > >system considers to be the canonical name for the file", which is file > >system-defined, if for some reason there is no entry in the name cache > >(which could, for example, happen if an application is doing an > >"open-by-file-id" type operation and the file has not yet been opened by > >path name). > > It will also fail if the entry has been evicted from the cache, but that > rarely happens with recently opened files. > if we want to provide this API, it should be done in a reliable way. "rarely fails" is not reliable enough, IMO. > > christos > >
Re: fcntl(F_GETPATH) support
In article , Jason Thorpe wrote: > > >> On Sep 17, 2019, at 1:39 PM, Christos Zoulas wrote: >> >> I am not sure though if we should change the current behavior just to make >> F_GETPATH better? Opinions? > >It seems completely logical that we SHOULD fix this. I concur. Specially because some filesystems already do it. These are the only ones that need to be changed: christos Index: fs/msdosfs/msdosfs_vnops.c === RCS file: /cvsroot/src/sys/fs/msdosfs/msdosfs_vnops.c,v retrieving revision 1.98 diff -u -u -r1.98 msdosfs_vnops.c --- fs/msdosfs/msdosfs_vnops.c 26 Apr 2017 03:02:48 - 1.98 +++ fs/msdosfs/msdosfs_vnops.c 17 Sep 2019 22:40:01 - @@ -153,6 +153,8 @@ goto bad; VN_KNOTE(ap->a_dvp, NOTE_WRITE); *ap->a_vpp = DETOV(dep); + cache_enter(ap->a_dvp, *ap->a_vpp, cnp->cn_nameptr, cnp->cn_namelen, + cnp->cn_flags); return (0); bad: Index: fs/tmpfs/tmpfs_subr.c === RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs_subr.c,v retrieving revision 1.104 diff -u -u -r1.104 tmpfs_subr.c --- fs/tmpfs/tmpfs_subr.c 1 Jan 2019 10:06:54 - 1.104 +++ fs/tmpfs/tmpfs_subr.c 17 Sep 2019 22:40:01 - @@ -434,6 +434,7 @@ VOP_UNLOCK(*vpp); + cache_enter(dvp, *vpp, cnp->cn_nameptr, cnp->cn_namelen, cnp->cn_flags); return 0; } Index: fs/udf/udf_subr.c === RCS file: /cvsroot/src/sys/fs/udf/udf_subr.c,v retrieving revision 1.146 diff -u -u -r1.146 udf_subr.c --- fs/udf/udf_subr.c 3 Jun 2019 06:04:20 - 1.146 +++ fs/udf/udf_subr.c 17 Sep 2019 22:40:01 - @@ -5963,6 +5963,7 @@ /* adjust file count */ udf_adjust_filecount(udf_node, 1); + cache_enter(dvp, *vpp, cnp->cn_nameptr, cnp->cn_namelen, cnp->cn_flags); return 0; } Index: ufs/chfs/chfs_vnode.c === RCS file: /cvsroot/src/sys/ufs/chfs/chfs_vnode.c,v retrieving revision 1.15 diff -u -u -r1.15 chfs_vnode.c --- ufs/chfs/chfs_vnode.c 1 Apr 2017 19:35:57 - 1.15 +++ ufs/chfs/chfs_vnode.c 17 Sep 2019 22:40:01 - @@ -310,6 +310,8 @@ VOP_UNLOCK(vp); *vpp = vp; + cache_enter(pdir, *vpp, cnp->cn_nameptr, cnp->cn_namelen, + cnp->cn_flags); return (0); } Index: ufs/ext2fs/ext2fs_vnops.c === RCS file: /cvsroot/src/sys/ufs/ext2fs/ext2fs_vnops.c,v retrieving revision 1.129 diff -u -u -r1.129 ext2fs_vnops.c --- ufs/ext2fs/ext2fs_vnops.c 1 Jan 2019 10:06:55 - 1.129 +++ ufs/ext2fs/ext2fs_vnops.c 17 Sep 2019 22:40:01 - @@ -1045,6 +1045,7 @@ } *vpp = tvp; + cache_enter(dvp, *vpp, cnp->cn_nameptr, cnp->cn_namelen, cnp->cn_flags); return 0; bad: Index: ufs/lfs/lfs_vnops.c === RCS file: /cvsroot/src/sys/ufs/lfs/lfs_vnops.c,v retrieving revision 1.324 diff -u -u -r1.324 lfs_vnops.c --- ufs/lfs/lfs_vnops.c 20 Jun 2019 00:49:11 - 1.324 +++ ufs/lfs/lfs_vnops.c 17 Sep 2019 22:40:01 - @@ -405,6 +405,7 @@ if (error) goto bad; *vpp = tvp; + cache_enter(dvp, *vpp, cnp->cn_nameptr, cnp->cn_namelen, cnp->cn_flags); KASSERT(VOP_ISLOCKED(*vpp) == LK_EXCLUSIVE); return (0); Index: ufs/ufs/ufs_vnops.c === RCS file: /cvsroot/src/sys/ufs/ufs/ufs_vnops.c,v retrieving revision 1.247 diff -u -u -r1.247 ufs_vnops.c --- ufs/ufs/ufs_vnops.c 1 Jul 2019 00:57:06 - 1.247 +++ ufs/ufs/ufs_vnops.c 17 Sep 2019 22:40:01 - @@ -1909,6 +1909,7 @@ if (error) goto bad; *vpp = tvp; + cache_enter(dvp, *vpp, cnp->cn_nameptr, cnp->cn_namelen, cnp->cn_flags); return (0); bad:
Re: fcntl(F_GETPATH) support
> On Sep 17, 2019, at 1:39 PM, Christos Zoulas wrote: > > I am not sure though if we should change the current behavior just to make > F_GETPATH better? Opinions? It seems completely logical that we SHOULD fix this. -- thorpej
Re: fcntl(F_GETPATH) support
In article <4e7e49e0-9c71-1f21-22fc-8ed54590a...@gmx.com>, Kamil Rytarowski wrote: >-=-=-=-=-=- >-=-=-=-=-=- > >On 15.09.2019 20:03, Christos Zoulas wrote: >> I think it is quite reliable because all the file descriptors would be >recently >> opened and therefore be in the cache. One would need to DoS the cache >> cause eviction. If that turns out to be false, we can make the namecache >> reliable, or withdraw it. > >As discussed with Mateusz Guzik offlist, the dentry approach looks >reliable and as a way to go. > >It changes fd -> vnode -> inode to fd -> dentry -> vnode. > >We could switch from catching program name on exec to proper pathname >resolution with KERN_PROC_PATHNAME. > >Certain programs require always correct path to be resolved from >hardlinks, not the last one from the cache. This used to affect LLVM. > >There is also a hole in the current namecache implementation as it >misses entry for newly created files (as informed by Mateusz). Example: > >#include >#include >#include >#include >#include > >int >main(void) >{ >char buf[1024]; >int fd; > >fd = open("/tmp/test", O_RDWR|O_CREAT|O_EXCL,0600); >if (fd == -1) >err(1, "open"); >if (fcntl(fd, F_GETPATH, buf) < 0) >err(1, "fcntl"); >printf("[%s]\n", buf); >} > >For the time being, the current code is still an improvement over the >past state. Well, I did some experiments by adding a cache_enter call in ufs inode creation. This fixes the above case, and makes the cache more logically consistent i.e. we usually create files we are going to access, so we might as well cache them (consider a build scenario and object files, or even binaries that will get installed). This is a one line change per filesystem: Index: ufs_vnops.c === RCS file: /cvsroot/src/sys/ufs/ufs/ufs_vnops.c,v retrieving revision 1.247 diff -u -p -u -r1.247 ufs_vnops.c --- ufs_vnops.c 1 Jul 2019 00:57:06 - 1.247 +++ ufs_vnops.c 17 Sep 2019 20:32:48 - @@ -1909,6 +1909,7 @@ ufs_makeinode(struct vattr *vap, struct if (error) goto bad; *vpp = tvp; + cache_enter(dvp, *vpp, cnp->cn_nameptr, cnp->cn_namelen, cnp->cn_flags); return (0); bad: The above change does not seem to affect build times... no cache build.sh started:Mon Sep 16 13:11:04 EDT 2019 build.sh ended: Mon Sep 16 15:27:41 EDT 2019 2:16:37 cache build.sh started:Mon Sep 16 16:02:18 EDT 2019 build.sh ended: Mon Sep 16 18:18:23 EDT 2019 2:16:15 I don't know what made the builds slower the second day... We should investigate why builds on a freshly booted machines are faster! cache build.sh started:Tue Sep 17 11:05:35 EDT 2019 build.sh ended: Tue Sep 17 13:37:33 EDT 2019 2:31:58 no cache build.sh started:Tue Sep 17 14:01:14 EDT 2019 build.sh ended: Tue Sep 17 16:29:17 EDT 2019 2:28:03 I am not sure though if we should change the current behavior just to make F_GETPATH better? Opinions? christos
Re: fcntl(F_GETPATH) support
On 9/15/19, Jason Thorpe wrote: > > >> On Sep 15, 2019, at 11:03 AM, Christos Zoulas >> wrote: >> >> I think it is quite reliable because all the file descriptors would be >> recently >> opened and therefore be in the cache. One would need to DoS the cache >> cause eviction. If that turns out to be false, we can make the namecache >> reliable, or withdraw it. > > If we had a way to ask the file system for the / a name+parent in the event > we don't find an entry in the name cache, then we don't have to make the > name cache "reliable". See my other e-mail about turning realpath into a syscall. Then you have everything to work with and the code should be trivial unless you want to handle hardlinks. In the most simplistic implementation you do the lookup and then do the reverse lookup of what you found. Things still can get evicted in the time window in-between but that's probably tolerable. > > FWIW, I don't think making the name cache "reliable" is necessarily a good > idea -- it could be abused to consume kernel memory by creating lots of > long-named files / directories. > I don't think it's a material change in terms of memory use. You can already force this scenario by creating files and stating them. Getting rid of namecache entries is not going to save you much compared to all the vnodes which need to be dropped/recycled (which will also remove the nc entries). > In the event that the name really can't be determined (perhaps it's an > open-unlinked file?), then I think it's perfectly reasonable to return > ENOENT for F_GETPATH; callers need to be capable of handling that. > Corner case of the sort is probably not accounted for either and not expected to be ran into -- if you remove the files from under the compiler you are tampering with the build and it is free to fail. On the other hand name eviction is not a problem you expect to run into. I can't easily test right now for sure, but historically BSDs have not been adding newly created files to the namecache and perhaps NetBSD is not doing it either. In which case F_GETPATH on newly created files is guaranteed to fail unless someone looked them up separately. Whether this should be modified is imho a separate discussion, but is relevant to points made earlier about memory use and DoSability. Imho it should be fine to change the kernel to always add these. -- Mateusz Guzik
Re: fcntl(F_GETPATH) support
On 15.09.2019 20:03, Christos Zoulas wrote: > I think it is quite reliable because all the file descriptors would be > recently > opened and therefore be in the cache. One would need to DoS the cache > cause eviction. If that turns out to be false, we can make the namecache > reliable, or withdraw it. As discussed with Mateusz Guzik offlist, the dentry approach looks reliable and as a way to go. It changes fd -> vnode -> inode to fd -> dentry -> vnode. We could switch from catching program name on exec to proper pathname resolution with KERN_PROC_PATHNAME. Certain programs require always correct path to be resolved from hardlinks, not the last one from the cache. This used to affect LLVM. There is also a hole in the current namecache implementation as it misses entry for newly created files (as informed by Mateusz). Example: #include #include #include #include #include int main(void) { char buf[1024]; int fd; fd = open("/tmp/test", O_RDWR|O_CREAT|O_EXCL,0600); if (fd == -1) err(1, "open"); if (fcntl(fd, F_GETPATH, buf) < 0) err(1, "fcntl"); printf("[%s]\n", buf); } For the time being, the current code is still an improvement over the past state. signature.asc Description: OpenPGP digital signature
Re: fcntl(F_GETPATH) support
> On Sep 15, 2019, at 1:17 PM, Mouse wrote: > > [top-posting damage repaired manually] >>> If you can get a file desriptor to a symlink, it will work; I don't >>> think that we have a way to do this now. >> FYI - XNU has O_SYMLINK for this. > > What operations does it support on fds open onto symlinks? fstat(), fchflags(), etc. I don't recall if read() is allowed, but I'm pretty sure write() is not. -- thorpej
Re: fcntl(F_GETPATH) support
>>> If you can get a file desriptor to a symlink, [...] > There should be no way to get a fd for a symlink; if you can, > something bad has happened. At present, perhaps. > Because of the way short symlinks are handled in ffs, I wouldn't be > surprised if managing to get one let you corrupt the fs. Depends on what operations are defined on such a descriptor, and how they are implemented. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: fcntl(F_GETPATH) support
[top-posting damage repaired manually] >> If you can get a file desriptor to a symlink, it will work; I don't >> think that we have a way to do this now. > FYI - XNU has O_SYMLINK for this. What operations does it support on fds open onto symlinks? /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: fcntl(F_GETPATH) support
On Sun, Sep 15, 2019 at 09:53:41AM -0700, Jason Thorpe wrote: > FYI - XNU has O_SYMLINK for this. Not saying we should necessarily add it > to NetBSD, just noting the precedent. > > -- thorpej > Sent from my iPhone. > > > On Sep 14, 2019, at 5:06 PM, Christos Zoulas wrote: > > > > If you can get a file desriptor to a symlink, it will work; I don't think > > that we have a way to do this now. There should be no way to get a fd for a symlink; if you can, something bad has happened. Because of the way short symlinks are handled in ffs, I wouldn't be surprised if managing to get one let you corrupt the fs. -- A. No Q. Should I include quotations after my reply?
Re: fcntl(F_GETPATH) support
> On Sep 15, 2019, at 11:03 AM, Christos Zoulas wrote: > > I think it is quite reliable because all the file descriptors would be > recently > opened and therefore be in the cache. One would need to DoS the cache > cause eviction. If that turns out to be false, we can make the namecache > reliable, or withdraw it. If we had a way to ask the file system for the / a name+parent in the event we don't find an entry in the name cache, then we don't have to make the name cache "reliable". FWIW, I don't think making the name cache "reliable" is necessarily a good idea -- it could be abused to consume kernel memory by creating lots of long-named files / directories. In the event that the name really can't be determined (perhaps it's an open-unlinked file?), then I think it's perfectly reasonable to return ENOENT for F_GETPATH; callers need to be capable of handling that. -- thorpej
Re: fcntl(F_GETPATH) support
On 9/15/19, Mateusz Guzik wrote: > On 9/14/19, Christos Zoulas wrote: >> >> Comments? >> >> +error = vnode_to_path(kpath, MAXPATHLEN, fp->f_vnode, l, l->l_proc); > > What motivates this change? > > I think it is a little problematic in that namecache resolution is not > guaranteed to work. If the consumer does not check for failure or fallback > to something else the user is going to experience "once in the blue moon" > failures. > > For example clang likes to get full paths to names. On Linux it can > trivially do it thanks to dentry cache. On other systems there is a > compilation check for F_GETPATH. If none of this is present it goes for > realpath, which is slover but basically works. If F_GETPATH is present, > it is always used and there is no fallback if it fails. Then you risk > setting yourself up for a weird compilation failure, which may not even > repeat itself after you retry. > > All in all I don't think this should be added unless namecache becomes > reliable. The above reasoning is why I did not add it to FreeBSD. > Now that I wrote this I suspect a kernel-based realpath could do the trick here. Will have to take a closer look. -- Mateusz Guzik
Re: fcntl(F_GETPATH) support
On 9/14/19, Christos Zoulas wrote: > > Comments? > > + error = vnode_to_path(kpath, MAXPATHLEN, fp->f_vnode, l, l->l_proc); What motivates this change? I think it is a little problematic in that namecache resolution is not guaranteed to work. If the consumer does not check for failure or fallback to something else the user is going to experience "once in the blue moon" failures. For example clang likes to get full paths to names. On Linux it can trivially do it thanks to dentry cache. On other systems there is a compilation check for F_GETPATH. If none of this is present it goes for realpath, which is slover but basically works. If F_GETPATH is present, it is always used and there is no fallback if it fails. Then you risk setting yourself up for a weird compilation failure, which may not even repeat itself after you retry. All in all I don't think this should be added unless namecache becomes reliable. The above reasoning is why I did not add it to FreeBSD. -- Mateusz Guzik
Re: fcntl(F_GETPATH) support
> On Sep 15, 2019, at 12:43 PM, Mateusz Guzik wrote: > > On 9/14/19, Christos Zoulas wrote: >> >> Comments? >> >> +error = vnode_to_path(kpath, MAXPATHLEN, fp->f_vnode, l, l->l_proc); > > What motivates this change? > > I think it is a little problematic in that namecache resolution is not > guaranteed to work. If the consumer does not check for failure or fallback > to something else the user is going to experience "once in the blue moon" > failures. > > For example clang likes to get full paths to names. On Linux it can > trivially do it thanks to dentry cache. On other systems there is a > compilation check for F_GETPATH. If none of this is present it goes for > realpath, which is slover but basically works. If F_GETPATH is present, > it is always used and there is no fallback if it fails. Then you risk > setting yourself up for a weird compilation failure, which may not even > repeat itself after you retry. > > All in all I don't think this should be added unless namecache becomes > reliable. The above reasoning is why I did not add it to FreeBSD. > I think it is quite reliable because all the file descriptors would be recently opened and therefore be in the cache. One would need to DoS the cache cause eviction. If that turns out to be false, we can make the namecache reliable, or withdraw it. christos
Re: fcntl(F_GETPATH) support
FYI - XNU has O_SYMLINK for this. Not saying we should necessarily add it to NetBSD, just noting the precedent. -- thorpej Sent from my iPhone. > On Sep 14, 2019, at 5:06 PM, Christos Zoulas wrote: > > If you can get a file desriptor to a symlink, it will work; I don't think > that we have a way to do this now.
Re: fcntl(F_GETPATH) support
In article <2f29ca9a-0ae1-48d3-b3f4-1556912d4...@me.com>, Jason Thorpe wrote: > > >> On Sep 14, 2019, at 2:52 PM, Kamil Rytarowski wrote: >> >> On 14.09.2019 23:34, Christos Zoulas wrote: >>> Comments? >>> >> >> Question. How does it handle symbolic/hardlinks? > >Symbolic links, of course, are simply continuations of the lookup, so >there's "nothing to see here" with a symbolic link. If you can get a file desriptor to a symlink, it will work; I don't think that we have a way to do this now. >I looked at cache_revlookup() and it looks to me like it simply picks >the first hit it finds in the hash table for the given vnode. That is correct. >At least one platform that support this API uses "the last name the file >was looked up with", and will fall back on "whatever the underlying file >system considers to be the canonical name for the file", which is file >system-defined, if for some reason there is no entry in the name cache >(which could, for example, happen if an application is doing an >"open-by-file-id" type operation and the file has not yet been opened by >path name). It will also fail if the entry has been evicted from the cache, but that rarely happens with recently opened files. christos
Re: fcntl(F_GETPATH) support
> On Sep 14, 2019, at 2:52 PM, Kamil Rytarowski wrote: > > On 14.09.2019 23:34, Christos Zoulas wrote: >> Comments? >> > > Question. How does it handle symbolic/hardlinks? Symbolic links, of course, are simply continuations of the lookup, so there's "nothing to see here" with a symbolic link. I looked at cache_revlookup() and it looks to me like it simply picks the first hit it finds in the hash table for the given vnode. At least one platform that support this API uses "the last name the file was looked up with", and will fall back on "whatever the underlying file system considers to be the canonical name for the file", which is file system-defined, if for some reason there is no entry in the name cache (which could, for example, happen if an application is doing an "open-by-file-id" type operation and the file has not yet been opened by path name). -- thorpej
Re: fcntl(F_GETPATH) support
On 14.09.2019 23:34, Christos Zoulas wrote: > Comments? > Question. How does it handle symbolic/hardlinks? > christos > > > Index: kern/sys_descrip.c > === > RCS file: /cvsroot/src/sys/kern/sys_descrip.c,v > retrieving revision 1.34 > diff -u -p -u -r1.34 sys_descrip.c > --- kern/sys_descrip.c26 Aug 2019 10:19:08 - 1.34 > +++ kern/sys_descrip.c14 Sep 2019 21:33:29 - > @@ -315,6 +312,28 @@ do_fcntl_lock(int fd, int cmd, struct fl > return error; > } > > +static int > +do_fcntl_getpath(struct lwp *l, file_t *fp, char *upath) > +{ > + char *kpath; > + int error; > + > + if (fp->f_type != DTYPE_VNODE) > + return EOPNOTSUPP; > + > + kpath = PNBUF_GET(); > + > + error = vnode_to_path(kpath, MAXPATHLEN, fp->f_vnode, l, l->l_proc); > + if (error) > + goto out; > + > + error = copyoutstr(kpath, upath, MAXPATHLEN, NULL); > +out: > + PNBUF_PUT(kpath); > + > + return error; > +} > + > /* > * The file control system call. > */ > @@ -463,6 +482,10 @@ sys_fcntl(struct lwp *l, const struct sy > error = (*fp->f_ops->fo_ioctl)(fp, FIOSETOWN, ); > break; > > + case F_GETPATH: > + error = do_fcntl_getpath(l, fp, SCARG(uap, arg)); > + break; > + > default: > error = EINVAL; > } > Index: sys/fcntl.h > === > RCS file: /cvsroot/src/sys/sys/fcntl.h,v > retrieving revision 1.50 > diff -u -p -u -r1.50 fcntl.h > --- sys/fcntl.h 20 Feb 2018 18:20:05 - 1.50 > +++ sys/fcntl.h 14 Sep 2019 21:33:29 - > @@ -193,6 +195,7 @@ > #define F_DUPFD_CLOEXEC 12 /* close on exec duplicated fd > */ > #define F_GETNOSIGPIPE 13 /* get SIGPIPE disposition */ > #define F_SETNOSIGPIPE 14 /* set SIGPIPE disposition */ > +#define F_GETPATH 15 /* get pathname assosiated with > file */ > #endif > > /* file descriptor flags (F_GETFD, F_SETFD) */ > signature.asc Description: OpenPGP digital signature