Re: Adding linux_link(2) system call, second round
David Holland wrote:

> > It is much more code, since it happens on the client, which sends
> > filesystem operations to lower layers and regains control later using
> > callbacks. Have a look at the sources (xlator/cluster/dht/dht-rename.c)
> > and you will see why it is complex.
>
> Where does that path live? glusterfs source?

Yes, get it from here:
http://download.gluster.com/pub/gluster/glusterfs/3.2/3.2.2/glusterfs-3.2.2.tar.gz

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Tue, Aug 02, 2011 at 04:34:12PM +0000, Emmanuel Dreyfus wrote:
> > As opposed to link/unlink? I still don't see why this would be more
> > than a half-dozen lines of code, if that. By your previous
> > descriptions it already needs to stat the object to check if it's a
> > directory.
>
> It is much more code, since it happens on the client, which sends
> filesystem operations to lower layers and regains control later using
> callbacks. Have a look at the sources (xlator/cluster/dht/dht-rename.c)
> and you will see why it is complex.

Where does that path live? glusterfs source?

--
David A. Holland
dholl...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Tue, Aug 02, 2011 at 04:30:15PM +0000, David Holland wrote:
> As opposed to link/unlink? I still don't see why this would be more
> than a half-dozen lines of code, if that. By your previous
> descriptions it already needs to stat the object to check if it's a
> directory.

It is much more code, since it happens on the client, which sends
filesystem operations to lower layers and regains control later using
callbacks. Have a look at the sources (xlator/cluster/dht/dht-rename.c)
and you will see why it is complex.

--
Emmanuel Dreyfus
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Tue, Aug 02, 2011 at 08:52:56AM +0000, Emmanuel Dreyfus wrote:
> > Sure. But what does it actually do, such that if you have a symlink it
> > doesn't work to copy the symlink instead of hardlink it?
>
> That would probably work for symlinks, since they cannot be updated.
> But this would require heavy changes in the way the code is written.
> Basically a rename on a symlink would become a readlink/symlink/unlink.

As opposed to link/unlink? I still don't see why this would be more
than a half-dozen lines of code, if that. By your previous
descriptions it already needs to stat the object to check if it's a
directory.

--
David A. Holland
dholl...@netbsd.org
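For reference, here is a minimal sketch of the readlink/symlink/unlink
sequence discussed above, in plain POSIX C. The function names
(move_symlink_by_copy, demo_move_symlink) are made up for illustration and
are not glusterfs code. Unlike rename(2) the sequence is not atomic, and the
sketch says nothing about the asynchronous, callback-driven client code that
makes the real change harder.

```c
#define _XOPEN_SOURCE 700	/* for readlink, symlink, lstat, mkdtemp */
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 1024
#endif

/*
 * Hypothetical helper: "rename" a symlink by recreating it under the new
 * name and then removing the old name. Returns 0, or -1 with errno set.
 */
int
move_symlink_by_copy(const char *from, const char *to)
{
	char target[PATH_MAX];
	ssize_t len;

	assert(from != NULL && to != NULL);

	/* readlink(2) does not NUL-terminate the result. */
	len = readlink(from, target, sizeof(target) - 1);
	if (len == -1)
		return -1;
	if ((size_t)len == sizeof(target) - 1) {
		errno = ENAMETOOLONG;	/* target may have been truncated */
		return -1;
	}
	target[len] = '\0';

	if (symlink(target, to) == -1)
		return -1;
	if (unlink(from) == -1) {
		int saved = errno;

		(void)unlink(to);	/* best-effort rollback */
		errno = saved;
		return -1;
	}
	return 0;
}

/* Self-check in a scratch directory under /tmp; returns 0 on success. */
int
demo_move_symlink(void)
{
	char dir[] = "/tmp/symmoveXXXXXX";
	char oldp[64], newp[64], buf[64];
	struct stat st;
	int ok;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(oldp, sizeof(oldp), "%s/old", dir);
	snprintf(newp, sizeof(newp), "%s/new", dir);

	ok = symlink("nonexistent", oldp) == 0 &&	/* dangling symlink */
	    move_symlink_by_copy(oldp, newp) == 0 &&
	    lstat(oldp, &st) == -1 &&			/* old name is gone */
	    readlink(newp, buf, sizeof(buf)) == 11 &&	/* strlen("nonexistent") */
	    memcmp(buf, "nonexistent", 11) == 0;

	(void)unlink(oldp);
	(void)unlink(newp);
	(void)rmdir(dir);
	return ok ? 0 : -1;
}
```

Calling demo_move_symlink() from a main() exercises the helper on a dangling
symlink, which is one of the cases where link(2) on NetBSD fails.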
Re: Adding linux_link(2) system call, second round
On Tue, Aug 02, 2011 at 05:45:56PM +0200, Rhialto wrote:
> Ok, then we also want openat(2), fchmodat(2) (which seems to be misnamed
> and looks more like a chmodat(2)), unlinkat(2), fchownat(2) (same remark
> as fchmodat), etc.

And you forgot fexecve().

I agree we want all of them, but I do not think we want everything at
once.

We have linkat(2), which fixes the problem of hard linking symlinks. This
is a small and harmless change.

And we have these "*at" functions that allow specifying pathnames
relative to a directory specified by a file descriptor. That means
modifying the namei interface: not a challenge, but something a bit more
intrusive.

Therefore I would like to go incrementally, by first supporting this:

    linkat(AT_FDCWD, name1, AT_FDCWD, name2, 0)
    linkat(AT_FDCWD, name1, AT_FDCWD, name2, AT_SYMLINK_FOLLOW)

and returning ENOSYS if fd1 and fd2 have values other than AT_FDCWD.
Then do the full Extended API set 2.

--
Emmanuel Dreyfus
m...@netbsd.org
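The first stage proposed above can be modelled in userland on a system that
already has linkat(2). This is only an illustrative sketch: the wrapper name
linkat_cwd_only is hypothetical, and rejecting unknown flag bits with EINVAL
is an assumption of the sketch, not something stated in the proposal.

```c
#define _XOPEN_SOURCE 700	/* for linkat and the AT_* constants */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical model of the incremental step: only AT_FDCWD is accepted
 * for both descriptors; anything else fails with ENOSYS.
 */
int
linkat_cwd_only(int fd1, const char *name1, int fd2, const char *name2,
    int flag)
{
	assert(name1 != NULL && name2 != NULL);

	if (fd1 != AT_FDCWD || fd2 != AT_FDCWD) {
		errno = ENOSYS;		/* fd-relative lookup not done yet */
		return -1;
	}
	if ((flag & ~AT_SYMLINK_FOLLOW) != 0) {
		errno = EINVAL;		/* assumption: unknown flags rejected */
		return -1;
	}
	/* flag == 0 links the symlink itself; AT_SYMLINK_FOLLOW its target. */
	return linkat(AT_FDCWD, name1, AT_FDCWD, name2, flag);
}
```

With this shape, callers written against the full linkat(2) interface keep
compiling, and only the fd-relative cases fail at run time until namei is
extended.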
Re: Adding linux_link(2) system call, second round
On Tue 02 Aug 2011 at 09:05:27 +0000, Emmanuel Dreyfus wrote:
> On Tue, Aug 02, 2011 at 10:02:39AM +0100, Roland C. Dowdeswell wrote:
> > It looks like linkat(2) is POSIX.1-2008 and is implemented by Linux
> > as well as FreeBSD. It might be the more portable direction to go.
>
> Right, then everything is simple, this is just the matter of
> implementing a standard system call.

Ok, then we also want openat(2), fchmodat(2) (which seems to be misnamed
and looks more like a chmodat(2)), unlinkat(2), fchownat(2) (same remark
as fchmodat), etc.

openat(2) is similar to, but not the same as, the existing function
fhopen(2).

These functions can be found here too:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/.html

FreeBSD seems to have:

    faccessat
    fchmodat
    fchownat
    fstatat
    futimesat
    linkat
    mkdirat
    mkfifoat
    mknodat
    openat
    readlinkat
    renameat
    symlinkat
    unlinkat

-Olaf.
--
___ Olaf 'Rhialto' Seibert  -- There's no point being grown-up if you
\X/ rhialto/at/xs4all.nl    -- can't be childish sometimes. -The 4th Doctor
Re: Adding linux_link(2) system call, second round
On Tue, Aug 02, 2011 at 10:02:39AM +0100, Roland C. Dowdeswell wrote:
> It looks like linkat(2) is POSIX.1-2008 and is implemented by Linux
> as well as FreeBSD. It might be the more portable direction to go.

Right, then everything is simple, this is just the matter of
implementing a standard system call. Here is the specification. I will
change llink to linkat and commit shortly.

http://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html

--
Emmanuel Dreyfus
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Tue, Aug 02, 2011 at 08:52:56AM +0000, Emmanuel Dreyfus wrote:
>
> On Mon, Aug 01, 2011 at 07:20:30PM +0000, David Holland wrote:
> > Sure. But what does it actually do, such that if you have a symlink it
> > doesn't work to copy the symlink instead of hardlink it?
>
> That would probably work for symlinks, since they cannot be updated.
> But this would require heavy changes in the way the code is written.
> Basically a rename on a symlink would become a readlink/symlink/unlink.
> This is not really a portability patch, it is a code rewrite which would
> consume more time than I can afford here. I suspect glusterfs developers
> will have to do it if they want to support something other than Linux, but
> I have no idea when, therefore it is not wise to hold our breath on it.
>
> llink(2) is a simple change, FreeBSD already went there with linkat(2),
> and it makes everything simple.

It looks like linkat(2) is POSIX.1-2008 and is implemented by Linux
as well as FreeBSD. It might be the more portable direction to go.

--
Roland Dowdeswell                      http://Imrryr.ORG/~elric/
Re: Adding linux_link(2) system call, second round
On Mon, Aug 01, 2011 at 07:20:30PM +0000, David Holland wrote:
> Sure. But what does it actually do, such that if you have a symlink it
> doesn't work to copy the symlink instead of hardlink it?

That would probably work for symlinks, since they cannot be updated.
But this would require heavy changes in the way the code is written.
Basically a rename on a symlink would become a readlink/symlink/unlink.
This is not really a portability patch, it is a code rewrite which would
consume more time than I can afford here. I suspect glusterfs developers
will have to do it if they want to support something other than Linux, but
I have no idea when, therefore it is not wise to hold our breath on it.

llink(2) is a simple change, FreeBSD already went there with linkat(2),
and it makes everything simple.

--
Emmanuel Dreyfus
m...@netbsd.org
Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)
On Mon, Aug 01, 2011 at 09:00:36PM +0000, David Holland wrote:
> > Notwithstanding dh's comment, why not pass in all the namei flags.
> >
> > > +	error = namei_simple_user(path, flags, &vp);
>
> Because I gimmicked up the flags for namei_simple specifically to
> disallow that sort of thing :-)

Er, or rather, you can't use | on them. Just passing them would work,
but again, please don't...

--
David A. Holland
dholl...@netbsd.org
Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)
On Mon, Aug 01, 2011 at 09:31:11PM +0100, David Laight wrote:
> > +	if (flags & FOLLOW)
> > +		namei_simple_flags = NSM_FOLLOW_TRYEMULROOT;
> > +	else
> > +		namei_simple_flags = NSM_NOFOLLOW_TRYEMULROOT;
> > +
> > +	error = namei_simple_user(path, namei_simple_flags, &vp);
>
> Notwithstanding dh's comment, why not pass in all the namei flags.
>
> > +	error = namei_simple_user(path, flags, &vp);

Because I gimmicked up the flags for namei_simple specifically to
disallow that sort of thing :-)

--
David A. Holland
dholl...@netbsd.org
Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)
On Mon, Aug 01, 2011 at 09:46:33AM +0000, Emmanuel Dreyfus wrote:
> +	if (flags & FOLLOW)
> +		namei_simple_flags = NSM_FOLLOW_TRYEMULROOT;
> +	else
> +		namei_simple_flags = NSM_NOFOLLOW_TRYEMULROOT;
> +
> +	error = namei_simple_user(path, namei_simple_flags, &vp);

Notwithstanding dh's comment, why not pass in all the namei flags.

> +	error = namei_simple_user(path, flags, &vp);

	David

--
David Laight: da...@l8s.co.uk
Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)
Christos Zoulas wrote:

> Except for the ktruser() call, looks good to me (my personal opinion).

Um, yes, that one was another pending patch I had for later. For now
ktrace does not show symlink(2) targets, which is annoying: sometimes
you cannot tell what is going on.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Mon, Aug 01, 2011 at 04:05:32AM +0200, Emmanuel Dreyfus wrote:
> > You still haven't explained what glusterfs is doing that's so evil or
> > why it can't be fixed by having it copy the symlink when that's the
> > case in question.
>
> glusterfs uses the native filesystem as its storage backend. When you
> rename a filesystem object in the distributed and replicated setup, they
> have to make sure it remains accessible by another client during the
> operation.
>
> Directories are all present on all servers and therefore are just
> treated by a rename(2). Other objects are stored on some server and are
> retrieved using a DHT. When they are renamed, they are treated by a
> distributed link(2)/rename(2)/unlink(2) algorithm. This breaks on NetBSD
> when the object is a symlink to a directory or a symlink to a
> nonexistent target, since you cannot link(2) to such an object.

Sure. But what does it actually do, such that if you have a symlink it
doesn't work to copy the symlink instead of hardlink it?

> The fix is not straightforward, and requires a change in the way glusterfs
> stores symlinks in its distributed and replicated setup.

Why?

--
David A. Holland
dholl...@netbsd.org
Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)
On Mon, Aug 01, 2011 at 09:46:33AM +0000, Emmanuel Dreyfus wrote:
> +	return do_sys_link(l, path, link, FOLLOW, retval);
>
> +	return do_sys_link(l, path, link, NOFOLLOW, retval);

Can you use a boolean argument for that instead of namei flags?

> +.Fn llink
> +is like
> +.Fn link
> +except in the case where the named file is a symbolic link, in which case
> +.Fn llink
> +links on the symbolic link itself, while
> +.Fn link
> +links on the symbolic link target.

How about:

   .Fn llink
   is the same as
   .Fn link
   except in the case where
   .Fa name1
   is a symbolic link. In this case
   .Fn llink
   creates an additional (hard) link to the symbolic link, whereas
   .Fn link
   creates an additional (hard) link to the symbolic link's target.

[I'm still not convinced it's worth supporting this feature ~forever,
but the patch may as well be good. :-) ]

--
David A. Holland
dholl...@netbsd.org
Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)
In article <20110801094633.ga17...@homeworld.netbsd.org>,
Emmanuel Dreyfus wrote:
>On Sun, Jul 31, 2011 at 06:36:53PM +0000, Christos Zoulas wrote:
>> I don't have an issue with it as long as:
>> - fsck does not get confused
>> - filesystems don't need to be modified to support it
>> - there is consensus that this is not harmful
>> - I am also ambivalent about exposing this in the native abi
>>   because it will only cause confusion.
>
>Attached is the patch that adds llink(2) and its documentation. The tests I
>ran are below (the llink program just calls llink(2)).
>
>fsck has no problem with it, ffs was not modified. As for it
>being harmful, I cannot imagine what could be done with it, but
>we could restrict it to root just in case.
>
>On confusion, well, I think the llink name speaks for itself.

Except for the ktruser() call, looks good to me (my personal opinion).

christos
Re: Adding linux_link(2) system call, second round
On Mon 01 Aug 2011 at 12:09:34 +0200, Joerg Sonnenberger wrote:
> You are adding a lot of complexity to workaround portability issues of a
> single application. Let's start the other way -- has FreeBSD added
> llink(2)? What about OSX? Solaris?

FreeBSD 8 has ln -P. From the manual, it seems that the system call used
for it must be linkat(2), which has an added flags argument. (A check
with ktrace confirms that.)

From FreeBSD 8.2's link(2):

SYNOPSIS
     #include <unistd.h>

     int
     link(const char *name1, const char *name2);

     int
     linkat(int fd1, const char *name1, int fd2, const char *name2,
         int flag);
...
     The linkat() system call is equivalent to link except in the case where
     either name1 or name2 or both are relative paths.  In this case a
     relative path name1 is interpreted relative to the directory associated
     with the file descriptor fd1 instead of the current working directory
     and similarly for name2 and the file descriptor fd2.

     Values for flag are constructed by a bitwise-inclusive OR of flags from
     the following list, defined in <fcntl.h>:

     AT_SYMLINK_FOLLOW
             If name1 names a symbolic link, a new link for the target of the
             symbolic link is created.

http://www.freebsd.org/cgi/man.cgi?query=link&apropos=0&sektion=2&manpath=FreeBSD+8.2-RELEASE&format=html

> Joerg

-Olaf.
--
___ Olaf 'Rhialto' Seibert  -- There's no point being grown-up if you
\X/ rhialto/at/xs4all.nl    -- can't be childish sometimes. -The 4th Doctor
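To make the fd-relative part of that manual excerpt concrete, here is a small
sketch in C for a system that already provides linkat(2). The helper names
(link_within_dir, demo_link_within_dir) and the scratch directory under /tmp
are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700	/* for linkat, O_DIRECTORY, mkdtemp */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create the hard link name2 for name1, with both
 * relative names resolved against the directory dir instead of the
 * current working directory. Returns 0, or -1 with errno set.
 */
int
link_within_dir(const char *dir, const char *name1, const char *name2,
    int flag)
{
	int dfd, rv, saved;

	assert(dir != NULL && name1 != NULL && name2 != NULL);

	dfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dfd == -1)
		return -1;
	rv = linkat(dfd, name1, dfd, name2, flag);
	saved = errno;		/* keep linkat's errno across close() */
	(void)close(dfd);
	errno = saved;
	return rv;
}

/* Self-check in a scratch directory; returns 0 on success. */
int
demo_link_within_dir(void)
{
	char dir[] = "/tmp/linkatXXXXXX";
	char a[64], b[64];
	struct stat st;
	int fd, ok;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(a, sizeof(a), "%s/a", dir);
	snprintf(b, sizeof(b), "%s/b", dir);

	fd = open(a, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd != -1)
		(void)close(fd);

	/* "a" and "b" are looked up relative to dir, not to the cwd. */
	ok = fd != -1 &&
	    link_within_dir(dir, "a", "b", 0) == 0 &&
	    stat(b, &st) == 0 && st.st_nlink == 2;

	(void)unlink(a);
	(void)unlink(b);
	(void)rmdir(dir);
	return ok ? 0 : -1;
}
```

Passing AT_FDCWD for both descriptors gives back the plain link(2) style of
lookup, which is the only case the thread proposes to support at first.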
Re: Adding linux_link(2) system call, second round
Rhialto wrote:

> LN(1)                FreeBSD General Commands Manual               LN(1)
(...)
>  -P    When creating a hard link to a symbolic link, create a hard link to
>        the symbolic link itself.  This option cancels the -L option.

I can add this to NetBSD as well if it is considered desirable.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Mon 01 Aug 2011 at 10:50:50 +0200, Matthias Drochner wrote:
> While the "DESCRIPTION" chapter doesn't tell it explicitly,
> we have the following in "ERRORS":
>
> [ELOOP]
> A loop exists in symbolic links encountered during resolution of the path1
> or path2 argument.
>
> This implies that the intention is that symlinks are followed.

Non-final symlinks are always followed, even for lchmod(2), readlink(2)
and similar functions, right? For instance, readlink(2)'s man page also
mentions ELOOP.

Another argument that allowing hard links to symlinks is "only natural":
the rename(2) operation does it in any case (on ffs). And one *can*
rename symlinks.

From /usr/src/sys/ufs/ufs/ufs_vnops.c:

/*
 * Rename vnode operation
 *	rename("foo", "bar");
 * is essentially
 *	unlink("bar");
 *	link("foo", "bar");
 *	unlink("foo");
 * but ``atomically''.  Can't do full commit without saving state in the
 * inode on disk which isn't feasible at this time.  Best we can do is
 * always guarantee the target exists.

The code below that doesn't appear to special-case symlinks, only
directories.

FreeBSD also allows hard links to symlinks, with the ln -P option.
(This must have been introduced in version 7 or 8; 6.1 doesn't have it
but 8.1 does.)

LN(1)                FreeBSD General Commands Manual               LN(1)

NAME
     ln, link -- link files

SYNOPSIS
     ln [-L | -P | -s [-F]] [-f | -iw] [-hnv] source_file [target_file]
     ln [-L | -P | -s [-F]] [-f | -iw] [-hnv] source_file ... target_dir
     link source_file target_file

DESCRIPTION
...
     -P    When creating a hard link to a symbolic link, create a hard link to
           the symbolic link itself.  This option cancels the -L option.

test$ ln -s foo bar
test$ l
total 8
drwxr-xr-x   2 olafs  vb   3 Aug  1 12:25 ./
drwxr-xr-x  23 olafs  vb  29 Aug  1 12:25 ../
lrwxr-xr-x   1 olafs  vb   3 Aug  1 12:25 bar@ -> foo
test$ ln -P bar baz
test$ l
total 8
drwxr-xr-x   2 olafs  vb   4 Aug  1 12:25 ./
drwxr-xr-x  23 olafs  vb  29 Aug  1 12:25 ../
lrwxr-xr-x   2 olafs  vb   3 Aug  1 12:25 bar@ -> foo
lrwxr-xr-x   2 olafs  vb   3 Aug  1 12:25 baz@ -> foo

I tested both on ffs and zfs. The results are the same.

-Olaf.
--
___ Olaf 'Rhialto' Seibert  -- There's no point being grown-up if you
\X/ rhialto/at/xs4all.nl    -- can't be childish sometimes. -The 4th Doctor
Re: Adding linux_link(2) system call, second round
Joerg Sonnenberger wrote:

> > You did not explain what problems it would introduce, did you?
>
> You are adding a lot of complexity to workaround portability issues of a
> single application.

It is not that complex. See the patch I posted this morning, the thing
is really simple, and it works quite well.

> Let's start the other way -- has FreeBSD added llink(2)?
> What about OSX? Solaris?

OSX did not in X.5, I do not know for the others. But we do not know if
developers of these systems are aware of this portability problem. We
may be the first that got there, it does not mean we have to be wrong.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Mon, Aug 01, 2011 at 04:05:33AM +0200, Emmanuel Dreyfus wrote:
> Joerg Sonnenberger wrote:
>
> > Given the very small number of programs that manage to mess up the
> > symlink usage, I'm kind of opposed to providing another system call just
> > as work around for them.
>
> You did not explain what problems it would introduce, did you?

You are adding a lot of complexity to workaround portability issues of a
single application. Let's start the other way -- has FreeBSD added
llink(2)? What about OSX? Solaris?

Joerg
[PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)
On Sun, Jul 31, 2011 at 06:36:53PM +0000, Christos Zoulas wrote:
> I don't have an issue with it as long as:
> - fsck does not get confused
> - filesystems don't need to be modified to support it
> - there is consensus that this is not harmful
> - I am also ambivalent about exposing this in the native abi
>   because it will only cause confusion.

Attached is the patch that adds llink(2) and its documentation. The tests I
ran are below (the llink program just calls llink(2)).

fsck has no problem with it, ffs was not modified. As for it
being harmful, I cannot imagine what could be done with it, but
we could restrict it to root just in case.

On confusion, well, I think the llink name speaks for itself.

# ls -li
total 1
3648 drwxr-xr-x  2 root  wheel  512 Aug  1 11:31 dir
   3 -rw-r--r--  2 root  wheel    0 Aug  1 11:30 file
   3 -rw-r--r--  2 root  wheel    0 Aug  1 11:30 hfile
   5 lrwxr-xr-x  1 root  wheel    3 Aug  1 11:31 sdir -> dir
   4 lrwxr-xr-x  1 root  wheel    4 Aug  1 11:31 sfile -> file
   6 lrwxr-xr-x  1 root  wheel   11 Aug  1 11:32 void -> nonexistent
# /home2/manu/llink sfile hsfile
# /home2/manu/llink sdir hsdir
# /home2/manu/llink void hvoid
# ls -li
total 1
3648 drwxr-xr-x  2 root  wheel  512 Aug  1 11:31 dir
   3 -rw-r--r--  2 root  wheel    0 Aug  1 11:30 file
   3 -rw-r--r--  2 root  wheel    0 Aug  1 11:30 hfile
   5 lrwxr-xr-x  2 root  wheel    3 Aug  1 11:31 hsdir -> dir
   4 lrwxr-xr-x  2 root  wheel    4 Aug  1 11:31 hsfile -> file
   6 lrwxr-xr-x  2 root  wheel   11 Aug  1 11:32 hvoid -> nonexistent
   5 lrwxr-xr-x  2 root  wheel    3 Aug  1 11:31 sdir -> dir
   4 lrwxr-xr-x  2 root  wheel    4 Aug  1 11:31 sfile -> file
   6 lrwxr-xr-x  2 root  wheel   11 Aug  1 11:32 void -> nonexistent

--
Emmanuel Dreyfus
m...@netbsd.org

Index: include/unistd.h
===================================================================
RCS file: /cvsroot/src/include/unistd.h,v
retrieving revision 1.126
diff -U4 -r1.126 unistd.h
--- include/unistd.h	26 Jun 2011 16:42:40 -0000	1.126
+++ include/unistd.h	1 Aug 2011 09:34:20 -0000
@@ -124,8 +124,9 @@
 pid_t	 getppid(void);
 uid_t	 getuid(void);
 int	 isatty(int);
 int	 link(const char *, const char *);
+int	 llink(const char *, const char *);
 long	 pathconf(const char *, int);
 int	 pause(void);
 int	 pipe(int *);
 #if __SSP_FORTIFY_LEVEL == 0
Index: sys/kern/vfs_syscalls.c
===================================================================
RCS file: /cvsroot/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.430
diff -U4 -r1.430 vfs_syscalls.c
--- sys/kern/vfs_syscalls.c	17 Jun 2011 14:23:51 -0000	1.430
+++ sys/kern/vfs_syscalls.c	1 Aug 2011 09:34:20 -0000
@@ -1769,27 +1769,32 @@
 }
 
 /*
  * Make a hard file link.
+ * The flag argument can be
+ * - FOLLOW for sys_link, to link to symlink target
+ * - NOFOLLOW for sys_llink, to link to symlink itself
  */
 /* ARGSUSED */
-int
-sys_link(struct lwp *l, const struct sys_link_args *uap, register_t *retval)
+static int
+do_sys_link(struct lwp *l, const char *path, const char *link,
+	    int flags, register_t *retval)
 {
-	/* {
-		syscallarg(const char *) path;
-		syscallarg(const char *) link;
-	} */
 	struct vnode *vp;
 	struct pathbuf *linkpb;
 	struct nameidata nd;
+	namei_simple_flags_t namei_simple_flags;
 	int error;
 
-	error = namei_simple_user(SCARG(uap, path),
-	    NSM_FOLLOW_TRYEMULROOT, &vp);
+	if (flags & FOLLOW)
+		namei_simple_flags = NSM_FOLLOW_TRYEMULROOT;
+	else
+		namei_simple_flags = NSM_NOFOLLOW_TRYEMULROOT;
+
+	error = namei_simple_user(path, namei_simple_flags, &vp);
 	if (error != 0)
 		return (error);
-	error = pathbuf_copyin(SCARG(uap, link), &linkpb);
+	error = pathbuf_copyin(link, &linkpb);
 	if (error) {
 		goto out1;
 	}
 	NDINIT(&nd, CREATE, LOCKPARENT | TRYEMULROOT, linkpb);
@@ -1826,8 +1831,35 @@
 	goto out2;
 }
 
 int
+sys_link(struct lwp *l, const struct sys_link_args *uap, register_t *retval)
+{
+	/* {
+		syscallarg(const char *) path;
+		syscallarg(const char *) link;
+	} */
+	const char *path = SCARG(uap, path);
+	const char *link = SCARG(uap, link);
+
+	return do_sys_link(l, path, link, FOLLOW, retval);
+}
+
+int
+sys_llink(struct lwp *l, const struct sys_llink_args *uap, register_t *retval)
+{
+	/* {
+		syscallarg(const char *) path;
+		syscallarg(const char *) link;
+	} */
+	const char *path = SCARG(uap, path);
+	const char *link = SCARG(uap, link);
+
+	return do_sys_link(l, path, link, NOFOLLOW, retval);
+}
+
+
+int
 do_sys_symlink(const char *patharg, const char *link, enum uio_seg seg)
 {
 	struct proc *p = curproc;
 	struct vattr vattr;
@@ -1850,
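The test run shown in this message can be approximated on a system that
already links symlinks themselves, for example Linux, where linkat(2) with a
zero flag does not follow the symlink. The sketch below is not the llink test
program from the message; the function names and the /tmp scratch directory
are assumptions. It checks what the ls -li output shows: same inode, link
count 2, and still a symlink.

```c
#define _XOPEN_SOURCE 700	/* for linkat, lstat, symlink, mkdtemp */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hard link the symlink src as dst without following it, then verify
 * that dst is a symlink sharing src's inode with a link count of 2.
 * Returns 0 if all of that holds, -1 otherwise.
 */
int
hardlink_symlink_check(const char *src, const char *dst)
{
	struct stat s, d;

	assert(src != NULL && dst != NULL);

	if (linkat(AT_FDCWD, src, AT_FDCWD, dst, 0) == -1)
		return -1;
	if (lstat(src, &s) == -1 || lstat(dst, &d) == -1)
		return -1;
	return (S_ISLNK(d.st_mode) && s.st_dev == d.st_dev &&
	    s.st_ino == d.st_ino && d.st_nlink == 2) ? 0 : -1;
}

/* Covers the two cases that fail with a symlink-following link(2). */
int
demo_hardlink_symlinks(void)
{
	char dir[] = "/tmp/llinkXXXXXX";
	char sdir[64], hsdir[64], voidp[64], hvoid[64];
	int ok;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(sdir, sizeof(sdir), "%s/sdir", dir);
	snprintf(hsdir, sizeof(hsdir), "%s/hsdir", dir);
	snprintf(voidp, sizeof(voidp), "%s/void", dir);
	snprintf(hvoid, sizeof(hvoid), "%s/hvoid", dir);

	ok = symlink(".", sdir) == 0 &&			/* symlink to a directory */
	    symlink("nonexistent", voidp) == 0 &&	/* dangling symlink */
	    hardlink_symlink_check(sdir, hsdir) == 0 &&
	    hardlink_symlink_check(voidp, hvoid) == 0;

	(void)unlink(hsdir);
	(void)unlink(sdir);
	(void)unlink(hvoid);
	(void)unlink(voidp);
	(void)rmdir(dir);
	return ok ? 0 : -1;
}
```

On a filesystem that does not allow hard links to symlinks, the linkat(2)
call simply fails and the demo returns -1.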
Re: Adding linux_link(2) system call, second round
m...@netbsd.org said:
> Both behaviors are standard compliant, since SUSv2 says nothing about
> resolving symlinks or not.

While the "DESCRIPTION" chapter doesn't tell it explicitly,
we have the following in "ERRORS":

[ELOOP]
A loop exists in symbolic links encountered during resolution of the path1
or path2 argument.

This implies that the intention is that symlinks are followed.

el...@imrryr.org said:
> Or perhaps llink(2) for symmetry with lchmod(2) and lstat(2).

This looks reasonable.

best regards
Matthias

Forschungszentrum Juelich GmbH
52425 Juelich
Registered office: Juelich
Registered in the commercial register of the Amtsgericht Dueren, No. HR B 3498
Chairman of the Supervisory Board: MinDirig Dr. Karl Eugen Huthmacher
Board of Directors: Prof. Dr. Achim Bachem (Chairman),
Dr. Ulrich Krafft (Deputy Chairman),
Prof. Dr.-Ing. Harald Bolt,
Prof. Dr. Sebastian M. Schmidt
Re: Adding linux_link(2) system call, second round
In article <20110731224944.ga23...@britannica.bec.de>,
Joerg Sonnenberger wrote:
>On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
>> Both behaviors are standard compliant, since SUSv2 says nothing about
>> resolving symlinks or not. I found at least one program (glusterfs),
>> which assumes the Linux behavior, and is a real pain to fix on NetBSD
>> because of that.
>
>The standard is explicitly open on this to allow filesystems that
>implement symlinks without using inodes. Essentially, it is valid to
>store a symlink inside the directory entry itself. That's one of the
>reasons why no change semantic is provided either.

And approximately this (storing the symlink data inside the source inode
without using an extra inode if the link target fit) was attempted in
BSD4.4 and it failed miserably. We had to undo it, and use separate
inodes again.

christos
Re: Adding linux_link(2) system call, second round
Joerg Sonnenberger wrote:

> Given the very small number of programs that manage to mess up the
> symlink usage, I'm kind of opposed to providing another system call just
> as work around for them.

You did not explain what problems it would introduce, did you?

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
David Holland wrote:

> You still haven't explained what glusterfs is doing that's so evil or
> why it can't be fixed by having it copy the symlink when that's the
> case in question.

glusterfs uses the native filesystem as its storage backend. When you
rename a filesystem object in the distributed and replicated setup, they
have to make sure it remains accessible by another client during the
operation.

Directories are all present on all servers and therefore are just
treated by a rename(2). Other objects are stored on some server and are
retrieved using a DHT. When they are renamed, they are treated by a
distributed link(2)/rename(2)/unlink(2) algorithm. This breaks on NetBSD
when the object is a symlink to a directory or a symlink to a
nonexistent target, since you cannot link(2) to such an object.

The fix is not straightforward, and requires a change in the way glusterfs
stores symlinks in its distributed and replicated setup. I suspect it
may involve treating such objects like directories, and having them
duplicated on all servers. An alternative would be to sacrifice the
guarantee that symlinks are available during a rename, at least for
NetBSD.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
> Quick summary for the impatient: NetBSD link(2) first resolves symlinks
> before doing the actual link to the target. As a result, NetBSD link(2)
> fails on symlinks to directories or to nonexistent targets.
>
> On the other side, Linux link(2) is dumb and just creates a second
> symlink with the same inode. Therefore it does not care about the
> symlink target, and will succeed even if it is a directory or if it is
> nonexistent.
>
> Both behaviors are standard compliant, since SUSv2 says nothing about
> resolving symlinks or not. I found at least one program (glusterfs),
> which assumes the Linux behavior, and is a real pain to fix on NetBSD
> because of that.

You still haven't explained what glusterfs is doing that's so evil or
why it can't be fixed by having it copy the symlink when that's the
case in question.

I remain not thrilled about adding this, mostly on the grounds that
adding variant functionality with no clear purpose or value tends to
create maintenance hassles in the long run.

--
David A. Holland
dholl...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
> Both behaviors are standard compliant, since SUSv2 says nothing about
> resolving symlinks or not. I found at least one program (glusterfs),
> which assumes the Linux behavior, and is a real pain to fix on NetBSD
> because of that.

The standard is explicitly open on this to allow filesystems that
implement symlinks without using inodes. Essentially, it is valid to
store a symlink inside the directory entry itself. That's one of the
reasons why no change semantic is provided either.

Given the very small number of programs that manage to mess up the
symlink usage, I'm kind of opposed to providing another system call just
as work around for them. Besides, NetBSD isn't the only implementation
following this strategy...

Joerg
Re: Adding linux_link(2) system call, second round
On Sun, Jul 31, 2011 at 06:36:53PM +0000, Christos Zoulas wrote:
>
> Also perhaps just call it link2(from, to, flags) in the long tradition
> of adding a number to existing syscalls when extending them ;-)

Or perhaps llink(2) for symmetry with lchmod(2) and lstat(2).

--
Roland Dowdeswell                      http://Imrryr.ORG/~elric/
Re: Adding linux_link(2) system call, second round
On Jul 31, 9:18pm, el...@imrryr.org ("Roland C. Dowdeswell") wrote:
-- Subject: Re: Adding linux_link(2) system call, second round

| On Sun, Jul 31, 2011 at 06:36:53PM +0000, Christos Zoulas wrote:
| >
| > Also perhaps just call it link2(from, to, flags) in the long tradition
| > of adding a number to existing syscalls when extending them ;-)
|
| Or perhaps llink(2) for symmetry with lchmod(2) and lstat(2).

I like that even more!

christos
Re: Adding linux_link(2) system call, second round
In article <1k5abxi.a8h289rm4jc3m%m...@netbsd.org>,
Emmanuel Dreyfus wrote:
>Quick summary for the impatient: NetBSD link(2) first resolves symlinks
>before doing the actual link to the target. As a result, NetBSD link(2)
>fails on symlinks to directories or to nonexistent targets.
>
>On the other side, Linux link(2) is dumb and just creates a second
>symlink with the same inode. Therefore it does not care about the
>symlink target, and will succeed even if it is a directory or if it is
>nonexistent.
>
>Both behaviors are standard compliant, since SUSv2 says nothing about
>resolving symlinks or not. I found at least one program (glusterfs),
>which assumes the Linux behavior, and is a real pain to fix on NetBSD
>because of that.
>
>I proposed to implement a linux_link(2), or lazy_link(2), whatever
>sounds nicer. It seems it did not reach consensus, but I am not sure I
>understood why: what are the problems that would be introduced by adding
>such a system call? At least I can tell what benefit it would have: it
>would ease porting from Linux.

I don't have an issue with it as long as:
    - fsck does not get confused
    - filesystems don't need to be modified to support it
    - there is consensus that this is not harmful
    - I am also ambivalent about exposing this in the native abi
      because it will only cause confusion.

Also perhaps just call it link2(from, to, flags) in the long tradition
of adding a number to existing syscalls when extending them ;-)

christos
Adding linux_link(2) system call, second round
Quick summary for the impatient: NetBSD link(2) first resolves symlinks
before doing the actual link to the target. As a result, NetBSD link(2)
fails on symlinks to directories or to nonexistent targets.

On the other side, Linux link(2) is dumb and just creates a second
symlink with the same inode. Therefore it does not care about the
symlink target, and will succeed even if it is a directory or if it is
nonexistent.

Both behaviors are standard compliant, since SUSv2 says nothing about
resolving symlinks or not. I found at least one program (glusterfs),
which assumes the Linux behavior, and is a real pain to fix on NetBSD
because of that.

I proposed to implement a linux_link(2), or lazy_link(2), whatever
sounds nicer. It seems it did not reach consensus, but I am not sure I
understood why: what are the problems that would be introduced by adding
such a system call? At least I can tell what benefit it would have: it
would ease porting from Linux.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
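The difference summarized above can be probed at run time with a few lines of
C. This is only a sketch: the function name link_follows_symlinks and the
scratch directory under /tmp are assumptions made for the example, and the
result depends on the operating system and filesystem it runs on.

```c
#define _XOPEN_SOURCE 700	/* for lstat, symlink, mkdtemp */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Probe link(2) on a symlink to a regular file.
 * Returns 1 if link(2) followed the symlink (the new name is a regular
 * file), 0 if it linked the symlink itself, -1 on any error.
 */
int
link_follows_symlinks(void)
{
	char dir[] = "/tmp/linkprobeXXXXXX";
	char file[64], sfile[64], hsfile[64];
	struct stat st;
	int fd, rv = -1;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(file, sizeof(file), "%s/file", dir);
	snprintf(sfile, sizeof(sfile), "%s/sfile", dir);
	snprintf(hsfile, sizeof(hsfile), "%s/hsfile", dir);

	fd = open(file, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd != -1) {
		(void)close(fd);
		/* sfile -> file, resolved relative to the scratch directory */
		if (symlink("file", sfile) == 0 &&
		    link(sfile, hsfile) == 0 &&
		    lstat(hsfile, &st) == 0)
			rv = S_ISLNK(st.st_mode) ? 0 : 1;
	}

	(void)unlink(hsfile);
	(void)unlink(sfile);
	(void)unlink(file);
	(void)rmdir(dir);
	return rv;
}
```

A configure-style check built on such a probe is one way an application could
choose between link(2) and a symlink-copy fallback, though it does not help
with the dangling-symlink and symlink-to-directory cases on systems where
link(2) follows.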