Re: Adding linux_link(2) system call, second round
In article <20110731224944.ga23...@britannica.bec.de>,
Joerg Sonnenberger wrote:
>On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
>> Both behaviors are standard compliant, since SUSv2 says nothing about
>> resolving symlinks or not. I found at least one program (glusterfs),
>> which assumes the Linux behavior, and is a real pain to fix on NetBSD
>> because of that.
>
>The standard is explicitly open on this to allow filesystems that
>implement symlinks without using inodes. Essentially, it is valid to
>store a symlink inside the directory entry itself. That's one of the
>reasons why no change semantic is provided either.

And approximately this (storing the symlink data inside the source inode
without using an extra inode if the link target fit) was attempted in
BSD4.4, and it failed miserably. We had to undo it and use separate
inodes again.

christos
Re: Adding linux_link(2) system call, second round
Joerg Sonnenberger wrote:

> Given the very small number of programs that manage to mess up the
> symlink usage, I'm kind of opposed to providing another system call just
> as a workaround for them.

You did not explain what problems it would introduce, did you?

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
David Holland wrote:

> You still haven't explained what glusterfs is doing that's so evil or
> why it can't be fixed by having it copy the symlink when that's the
> case in question.

glusterfs uses the native filesystem as its storage backend. When you
rename a filesystem object in the distributed and replicated setup, it
has to make sure the object remains accessible to other clients during
the operation.

Directories are present on all servers and are therefore handled with a
plain rename(2). Other objects are stored on some of the servers and are
retrieved using a DHT. When they are renamed, they are handled by a
distributed link(2)/rename(2)/unlink(2) algorithm. This breaks on NetBSD
when the object is a symlink to a directory or a symlink to a
nonexistent target, since you cannot link(2) to such an object.

The fix is not straightforward, and requires a change in the way
glusterfs stores symlinks in its distributed and replicated setup. I
suspect it may involve treating such objects like directories and
having them duplicated on all servers. An alternative would be to
sacrifice the guarantee that symlinks are available during a rename, at
least for NetBSD.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: Adding linux_link(2) system call, second round
On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
> Quick summary for the impatient: NetBSD link(2) first resolves symlinks
> before doing the actual link to the target. As a result, NetBSD link(2)
> fails on symlinks to directories or to nonexistent targets.
>
> On the other hand, Linux link(2) is dumb and just creates a second
> symlink with the same inode. Therefore it does not care about the
> symlink target, and will succeed even if it is a directory or if it is
> nonexistent.
>
> Both behaviors are standard compliant, since SUSv2 says nothing about
> resolving symlinks or not. I found at least one program (glusterfs),
> which assumes the Linux behavior, and is a real pain to fix on NetBSD
> because of that.

You still haven't explained what glusterfs is doing that's so evil or
why it can't be fixed by having it copy the symlink when that's the
case in question.

I remain not thrilled about adding this, mostly on the grounds that
adding variant functionality with no clear purpose or value tends to
create maintenance hassles in the long run.

--
David A. Holland
dholl...@netbsd.org
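The "copy the symlink" approach suggested above could look roughly like the following. This is only a minimal sketch, not glusterfs code: the helper name link_or_copy_symlink() is hypothetical, and it assumes a local POSIX filesystem rather than a distributed one.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>	/* PATH_MAX */
#include <sys/stat.h>	/* lstat, S_ISLNK */
#include <unistd.h>	/* link, readlink, symlink */

/*
 * Hypothetical helper: give `from` a second name `to`.
 * A symlink is copied with readlink(2) + symlink(2), so the result does
 * not depend on whether link(2) resolves symlinks. Anything else is
 * hard-linked as usual.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
link_or_copy_symlink(const char *from, const char *to)
{
	struct stat st;
	char target[PATH_MAX];
	ssize_t len;

	/* lstat(2) examines the symlink itself, not its target. */
	if (lstat(from, &st) == -1)
		return -1;

	if (!S_ISLNK(st.st_mode))
		return link(from, to);

	len = readlink(from, target, sizeof(target) - 1);
	if (len == -1)
		return -1;
	target[len] = '\0';	/* readlink(2) does not NUL-terminate */

	return symlink(target, to);
}
```

Note that the copy is a new symlink with its own identity, not a second name for the same object, so it is a weaker substitute for what a non-resolving link(2) provides.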
Re: Adding linux_link(2) system call, second round
On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
> Both behaviors are standard compliant, since SUSv2 says nothing about
> resolving symlinks or not. I found at least one program (glusterfs),
> which assumes the Linux behavior, and is a real pain to fix on NetBSD
> because of that.

The standard is explicitly open on this to allow filesystems that
implement symlinks without using inodes. Essentially, it is valid to
store a symlink inside the directory entry itself. That's one of the
reasons why no change semantic is provided either.

Given the very small number of programs that manage to mess up the
symlink usage, I'm kind of opposed to providing another system call just
as a workaround for them. Besides, NetBSD isn't the only implementation
following this strategy...

Joerg
Re: Adding linux_link(2) system call, second round
On Sun, Jul 31, 2011 at 06:36:53PM +, Christos Zoulas wrote:
>
> Also perhaps just call it link2(from, to, flags) in the long tradition
> of adding a number to existing syscalls when extending them ;-)

Or perhaps llink(2) for symmetry with lchmod(2) and lstat(2).

--
Roland Dowdeswell
http://Imrryr.ORG/~elric/
Re: Adding linux_link(2) system call, second round
On Jul 31, 9:18pm, el...@imrryr.org ("Roland C. Dowdeswell") wrote:
-- Subject: Re: Adding linux_link(2) system call, second round

| On Sun, Jul 31, 2011 at 06:36:53PM +, Christos Zoulas wrote:
| >
| > Also perhaps just call it link2(from, to, flags) in the long tradition
| > of adding a number to existing syscalls when extending them ;-)
|
| Or perhaps llink(2) for symmetry with lchmod(2) and lstat(2).

I like that even more!

christos
Re: Adding linux_link(2) system call, second round
In article <1k5abxi.a8h289rm4jc3m%m...@netbsd.org>,
Emmanuel Dreyfus wrote:
>Quick summary for the impatient: NetBSD link(2) first resolves symlinks
>before doing the actual link to the target. As a result, NetBSD link(2)
>fails on symlinks to directories or to nonexistent targets.
>
>On the other hand, Linux link(2) is dumb and just creates a second
>symlink with the same inode. Therefore it does not care about the
>symlink target, and will succeed even if it is a directory or if it is
>nonexistent.
>
>Both behaviors are standard compliant, since SUSv2 says nothing about
>resolving symlinks or not. I found at least one program (glusterfs),
>which assumes the Linux behavior, and is a real pain to fix on NetBSD
>because of that.
>
>I proposed to implement a linux_link(2), or lazy_link(2), whatever
>sounds nicer. It seems it did not reach consensus, but I am not sure I
>understood why: what are the problems that would be introduced by adding
>such a system call? At least I can tell what benefit it would have: it
>would ease porting from Linux.

I don't have an issue with it as long as:
- fsck does not get confused
- filesystems don't need to be modified to support it
- there is consensus that this is not harmful
- I am also ambivalent about exposing this in the native ABI because
  it will only cause confusion.

Also perhaps just call it link2(from, to, flags) in the long tradition
of adding a number to existing syscalls when extending them ;-)

christos
kcpuset(9) interface
Hello,

Here is a reworked dynamic CPU set implementation for the kernel (the
shared cpuset.c in src/common will be moved to libc) - a kcpuset(9)
interface:

http://www.netbsd.org/~rmind/kcpuset_ng.diff

It supports early use while the system is cold through a fix-up
mechanism, see kcpuset_sysinit(). That would enable us to use kcpuset(9)
in MD code, such as pmap(9).

The intention of the interface is to:

1) replace hard-coded parts (e.g. limited to uint32_t or the MAXCPUS
   constant) with a more dynamic mechanism
2) replace and unify duplicated CPU bitset code (e.g. in MIPS, PowerPC,
   sparc64, which have their own copies).

Comments?

--
Mindaugas
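To illustrate goal 1), here is a minimal userland sketch of a CPU set whose size is chosen at run time instead of being capped by a single uint32_t or a compile-time MAXCPUS. It is not the kcpuset(9) API from the diff; every name in it (struct cpuset_sketch, cpuset_sketch_create() and so on) is hypothetical, and a kernel version would use kernel allocators rather than calloc(3).

```c
#include <limits.h>	/* CHAR_BIT */
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>	/* calloc */

/* Hypothetical dynamically sized CPU set; NOT the kcpuset(9) interface. */
struct cpuset_sketch {
	size_t   ncpu;		/* number of CPUs the set can describe */
	uint32_t bits[];	/* ceil(ncpu / 32) words, zero-initialised */
};

#define WORD_BITS	(sizeof(uint32_t) * CHAR_BIT)

/* Allocate an empty set for ncpu CPUs; release it with free(3). */
struct cpuset_sketch *
cpuset_sketch_create(size_t ncpu)
{
	size_t nwords = (ncpu + WORD_BITS - 1) / WORD_BITS;
	struct cpuset_sketch *cs;

	cs = calloc(1, sizeof(*cs) + nwords * sizeof(uint32_t));
	if (cs != NULL)
		cs->ncpu = ncpu;
	return cs;
}

/* Mark a CPU as a member; out-of-range indices are ignored. */
void
cpuset_sketch_set(struct cpuset_sketch *cs, size_t cpu)
{
	if (cpu < cs->ncpu)
		cs->bits[cpu / WORD_BITS] |= UINT32_C(1) << (cpu % WORD_BITS);
}

/* Test membership; out-of-range indices are never members. */
bool
cpuset_sketch_isset(const struct cpuset_sketch *cs, size_t cpu)
{
	return cpu < cs->ncpu &&
	    (cs->bits[cpu / WORD_BITS] &
	    (UINT32_C(1) << (cpu % WORD_BITS))) != 0;
}
```

The same three operations work unchanged for 4 CPUs or 4096, which is the property hard-coded masks lack.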
exec and VM_MAP_TOPDOWN - chicken & egg?
I have a small (mostly conceptual) issue with sys/kern/exec_elf.c.

In my view the exec operation is a kind of constructor op for a vmspace,
but on the other hand exec needs to know where to put the interpreter,
which slightly differs if we are about to arrange for a topdown VM
layout.

My concrete issue popped up when I tried to exec in a proc that has no
p_vmspace at all yet - so it crashes when checking for VM_MAP_TOPDOWN in
the vmspace flags. This is easily worked around by this patch:

Index: exec_elf.c
===
RCS file: /cvsroot/src/sys/kern/exec_elf.c,v
retrieving revision 1.30
diff -c -u -p -r1.30 exec_elf.c
--- exec_elf.c 19 Jul 2011 19:45:36 - 1.30
+++ exec_elf.c 31 Jul 2011 18:01:22 -
@@ -84,6 +84,7 @@ __KERNEL_RCSID(1, "$NetBSD: exec_elf.c,v
 #include
 #include
+#include
 
 extern struct emul emul_netbsd;
 
@@ -406,9 +407,19 @@ elf_load_file(struct lwp *l, struct exec
 	u_long phsize;
 	Elf_Addr addr = *last;
 	struct proc *p;
+	bool use_topdown;
 
 	p = l->l_proc;
 
+	if (p->p_vmspace)
+		use_topdown = p->p_vmspace->vm_map.flags & VM_MAP_TOPDOWN;
+	else
+#ifdef __USING_TOPDOWN_VM
+		use_topdown = true;
+#else
+		use_topdown = false;
+#endif
+
 	/*
 	 * 1. open file
 	 * 2. read filehdr
@@ -552,7 +563,7 @@ elf_load_file(struct lwp *l, struct exec
 	flags = VMCMD_BASE;
 	if (addr == ELF_LINK_ADDR)
 		addr = ph0->p_vaddr;
-	if (p->p_vmspace->vm_map.flags & VM_MAP_TOPDOWN)
+	if (use_topdown)
 		addr = ELF_TRUNC(addr, ph0->p_align);
 	else
 		addr = ELF_ROUND(addr, ph0->p_align);

Obviously this is a hack.

Thinking about what happens in the normal case: we are about to create
the new vmspace, but the check tests the flags of the old vmspace. The
new vmspace will not inherit the flags, but will have the same default
as the use_topdown variable I added in the patch.

I would have expected that emulations would care, but I can't find
traces of it. And the only exec format that cares is ELF.
Wouldn't it be conceptually cleaner if the "we would like to arrange for
topdown VM, if possible" flag were part of struct exec_pack and
explicitly set upfront (maybe by just copying it from the current proc's
vmspace flags)? Object loaders and emulations could override it, and the
vmspace flag could later be set accordingly.

Am I missing something?

Martin
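The idea of carrying the flag in the exec package could be sketched in plain userland C as below. This is an illustration only: struct exec_pack_sketch and its ep_topdown field are hypothetical stand-ins, and the two macros are assumed to behave like the ELF_ROUND/ELF_TRUNC used in the patch (round up or down to a power-of-two alignment).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed equivalents of the macros in the patch; b must be a power of two. */
#define ELF_ROUND(a, b)	(((a) + (b) - 1) & ~((b) - 1))
#define ELF_TRUNC(a, b)	((a) & ~((b) - 1))

/* Hypothetical, heavily reduced stand-in for the exec package structure. */
struct exec_pack_sketch {
	/* hypothetical field: "arrange for topdown VM, if possible" */
	bool ep_topdown;
};

/*
 * Align the interpreter load address using only the flag carried in the
 * package, so no (possibly missing) old vmspace has to be consulted.
 */
uintptr_t
interp_load_addr(const struct exec_pack_sketch *epp, uintptr_t addr,
    uintptr_t align)
{
	return epp->ep_topdown ? ELF_TRUNC(addr, align)
	    : ELF_ROUND(addr, align);
}
```

With this shape, the caller decides the layout once when it fills in the package, an object loader or emulation may overwrite ep_topdown before the address is computed, and the new vmspace can be flagged from the same value afterwards.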
Adding linux_link(2) system call, second round
Quick summary for the impatient: NetBSD link(2) first resolves symlinks
before doing the actual link to the target. As a result, NetBSD link(2)
fails on symlinks to directories or to nonexistent targets.

On the other hand, Linux link(2) is dumb and just creates a second
symlink with the same inode. Therefore it does not care about the
symlink target, and will succeed even if it is a directory or if it is
nonexistent.

Both behaviors are standard compliant, since SUSv2 says nothing about
resolving symlinks or not. I found at least one program (glusterfs),
which assumes the Linux behavior, and is a real pain to fix on NetBSD
because of that.

I proposed to implement a linux_link(2), or lazy_link(2), whatever
sounds nicer. It seems it did not reach consensus, but I am not sure I
understood why: what are the problems that would be introduced by adding
such a system call? At least I can tell what benefit it would have: it
would ease porting from Linux.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org