re: fexecve
not really commenting on the proposal itself, but .. > Let us not forget that you need a binary inside the chroot that can > call fexecve() on a file descriptor or the ability to build such a > binary. this is only one buffer overflow away... ie, strength in layers would imply you should not rely this. .mrg.
Re: fexecve
>> (I'd actually _like_ to see something capabilityish, in which case >> "can use fexecve" would be a capability that could be removed, from >> init if need be, on systems that care about this sort of thing.) > Couldn't we have an enable/disable sysctl variable for this? Certainly. I would count that as "something capabilityish" - after all, assuming it's per-process, in what ways, aside from the APIs used to control it, does that differ from a capability? Or, to return for a moment to my roots, $ SET PROC/PRIV=FEXECVE /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: fexecve
> Date: Sun, 8 Sep 2019 14:03:03 -0400 > From: Thor Lancelot Simon > > On Sun, Sep 08, 2019 at 01:23:46PM -0400, Christos Zoulas wrote: > > > > Here's a simple fexecve(2) implementation. Comments? > > I think this is dangerous in systems which use chroot into filesystems > mounted noexec (or nosuid) and file-descriptor passing into the constrained > environment to feed data. Now new executables (and even setuid ones) can > be fed in, too. > > What can we do about that? It sounds like you're positing: - there is a chrooted process A - there is a colluding process B outside the chroot - they share a socket - B can open setuid executables and send their fds over the socket - A can now execute setuid executables outside the chroot How is this substantively different from the following? - there is a chrooted process A - there is a colluding process B outside the chroot - they share a socket - A can ask B to execute files by pathname and B will happily oblige - A can now execute setuid executables outside the chroot That is, under what meaningful circumstances can you rule out the first scenario but not the second one?
Re: fexecve
On Sun, 8 Sep 2019, Mouse wrote: (2) Losing the command name isn't good; lots of people turn process accounting on for logging (in fact, I'd assume 99.9% of people who turn process accounting on use it purely for logging) and it substantially decreases the utility if it's easily circumvented. Isn't the command name easy to lose and/or forge already, with links if nothng else? In any case, it seems to me this is one reason to make fexecve() optional. (I'd actually _like_ to see something capabilityish, in which case "can use fexecve" would be a capability that could be removed, from init if need be, on systems that care about this sort of thing.) Couldn't we have an enable/disable sysctl variable for this? (3) Setugid processes should be prohibited, or at least setugid dynamically-linked processes, because otherwise there's a window where a live update of a library might be used to run the old binary with a new set of libraries. How does fexecve() make anything possible here that wasn't possible before? It seems to me that updating .so libraries has always carried this risk, so I must be missing something. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B !DSPAM:5d756850213181273910470! ++--+---+ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired) | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com | | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org | ++--+---+
Re: fexecve
> (2) Losing the command name isn't good; lots of people turn process > accounting on for logging (in fact, I'd assume 99.9% of people who > turn process accounting on use it purely for logging) and it > substantially decreases the utility if it's easily circumvented. Isn't the command name easy to lose and/or forge already, with links if nothng else? In any case, it seems to me this is one reason to make fexecve() optional. (I'd actually _like_ to see something capabilityish, in which case "can use fexecve" would be a capability that could be removed, from init if need be, on systems that care about this sort of thing.) > (3) Setugid processes should be prohibited, or at least setugid > dynamically-linked processes, because otherwise there's a window > where a live update of a library might be used to run the old binary > with a new set of libraries. How does fexecve() make anything possible here that wasn't possible before? It seems to me that updating .so libraries has always carried this risk, so I must be missing something. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: fexecve
Christos Zoulas wrote: > - We can completely dissallow fexecve in chrooted environments. Full disk encryption (loaded with cgdroot.kmod) requires a complete system to be chrooted. -- Alex
Re: fexecve
> Here's a simple fexecve(2) implementation. Comments? Strikes me as a very good thing to have optionally available. I'd need to think a good deal more to decide whether I think it's a reasonable thing to have enabled by default. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: fexecve
On Sun, Sep 08, 2019 at 07:27:10PM +, David Holland wrote: > On Sun, Sep 08, 2019 at 01:23:46PM -0400, Christos Zoulas wrote: > > Here's a simple fexecve(2) implementation. Comments? > > Two additional things come to mind besides the chroot issue: ...for large values of two -- David A. Holland dholl...@netbsd.org
Re: fexecve
On Sun, Sep 08, 2019 at 01:23:46PM -0400, Christos Zoulas wrote: > Here's a simple fexecve(2) implementation. Comments? Two additional things come to mind besides the chroot issue: (1) For consistency of permission handling, it ought to require that the file be opened with O_EXEC, and check the execute bit at open time, not at exec time. However, I imagine that doing so is against whoever defined fexecve(). It also would create a gap in which a process could store up references to files that were previously executable and execute them later, perhaps after they've been modified. So that isn't unproblematic. But without that there are going to be a pile of odd corner cases, which I doubt we'll be able to identify in full beforehand, and some of which may turn out to be problematic as well. One that comes to mind is: if you open a program file for write while it's initially being written out, you can save that handle and use it later (after chmod) to both alter the program and execute it. On its own this isn't much use, but I'm not convinced that it can't be in combination with other circumstances. (2) Losing the command name isn't good; lots of people turn process accounting on for logging (in fact, I'd assume 99.9% of people who turn process accounting on use it purely for logging) and it substantially decreases the utility if it's easily circumvented. Also, breaking $ORIGIN for fexecve will probably lead to howling eventually. Not trivial to fix though. (3) Setugid processes should be prohibited, or at least setugid dynamically-linked processes, because otherwise there's a window where a live update of a library might be used to run the old binary with a new set of libraries. At least with ELF this can easily lead to UB that might be exploitable. -- David A. Holland dholl...@netbsd.org
Re: fexecve
In article <20190908183938.ga25...@panix.com>, Thor Lancelot Simon wrote: >On Sun, Sep 08, 2019 at 06:27:11PM -, Christos Zoulas wrote: >> In article <20190908180303.ga6...@panix.com>, >> Thor Lancelot Simon wrote: >> >On Sun, Sep 08, 2019 at 01:23:46PM -0400, Christos Zoulas wrote: >> >> >> >> Here's a simple fexecve(2) implementation. Comments? >> > >> >I think this is dangerous in systems which use chroot into filesystems >> >mounted noexec (or nosuid) and file-descriptor passing into the constrained >> >environment to feed data. Now new executables (and even setuid ones) can >> >be fed in, too. >> > >> >What can we do about that? >> >> - We can completely dissallow fexecve in chrooted environments. >> >> or >> >> - We can check the permissions of the mountpoint of the current working >> directory in addition to checking the mountpoint of the executable's >> vnode. > >I'd like to figure out a way to make this _optional_ in chrooted environments >because I think in a system designed to use it, it actually could provide a >security enhancement. At the same time, I'm worried about the effect on >systems designed as sketched out above but without this feature in mind. > >But I'm having trouble thinking through how it'd work. A flag of course and >a test, but on what -- the receive side of the socket when the chroot's >performed, perhaps? > >Or, maybe: > >1) Find a way to take the properties of the listen socket from which the > received-on socket was cloned into account; so if I chroot-then-listen > and I don't have a writable, executable filesystem in which to create > my listening socket, I can't receive an executable fd that way > >2) At chroot time, block executable fd passing on any socket that hasn't > been deliberately marked as "can receive executables" with fcntl > >Maybe those two in combination (neither looks easy, from my memory of the >relevant code, particularly #1) would work? Let us not forget that you need a binary inside the chroot that can call fexecve() on a file descriptor or the ability to build such a binary. christos
Re: fexecve
On Sun, Sep 08, 2019 at 06:27:11PM -, Christos Zoulas wrote: > In article <20190908180303.ga6...@panix.com>, > Thor Lancelot Simon wrote: > >On Sun, Sep 08, 2019 at 01:23:46PM -0400, Christos Zoulas wrote: > >> > >> Here's a simple fexecve(2) implementation. Comments? > > > >I think this is dangerous in systems which use chroot into filesystems > >mounted noexec (or nosuid) and file-descriptor passing into the constrained > >environment to feed data. Now new executables (and even setuid ones) can > >be fed in, too. > > > >What can we do about that? > > - We can completely dissallow fexecve in chrooted environments. > > or > > - We can check the permissions of the mountpoint of the current working > directory in addition to checking the mountpoint of the executable's > vnode. I'd like to figure out a way to make this _optional_ in chrooted environments because I think in a system designed to use it, it actually could provide a security enhancement. At the same time, I'm worried about the effect on systems designed as sketched out above but without this feature in mind. But I'm having trouble thinking through how it'd work. A flag of course and a test, but on what -- the receive side of the socket when the chroot's performed, perhaps? Or, maybe: 1) Find a way to take the properties of the listen socket from which the received-on socket was cloned into account; so if I chroot-then-listen and I don't have a writable, executable filesystem in which to create my listening socket, I can't receive an executable fd that way 2) At chroot time, block executable fd passing on any socket that hasn't been deliberately marked as "can receive executables" with fcntl Maybe those two in combination (neither looks easy, from my memory of the relevant code, particularly #1) would work? Thor
Re: fexecve
In article <20190908180303.ga6...@panix.com>, Thor Lancelot Simon wrote: >On Sun, Sep 08, 2019 at 01:23:46PM -0400, Christos Zoulas wrote: >> >> Here's a simple fexecve(2) implementation. Comments? > >I think this is dangerous in systems which use chroot into filesystems >mounted noexec (or nosuid) and file-descriptor passing into the constrained >environment to feed data. Now new executables (and even setuid ones) can >be fed in, too. > >What can we do about that? - We can completely dissallow fexecve in chrooted environments. or - We can check the permissions of the mountpoint of the current working directory in addition to checking the mountpoint of the executable's vnode. christos
Re: fexecve
On Sun, Sep 08, 2019 at 01:23:46PM -0400, Christos Zoulas wrote: > > Here's a simple fexecve(2) implementation. Comments? I think this is dangerous in systems which use chroot into filesystems mounted noexec (or nosuid) and file-descriptor passing into the constrained environment to feed data. Now new executables (and even setuid ones) can be fed in, too. What can we do about that? Thor
fexecve
Here's a simple fexecve(2) implementation. Comments? christos Index: include/unistd.h === RCS file: /cvsroot/src/include/unistd.h,v retrieving revision 1.151 diff -u -p -u -r1.151 unistd.h --- include/unistd.h18 Nov 2018 19:22:23 - 1.151 +++ include/unistd.h8 Sep 2019 17:20:08 - @@ -107,6 +107,7 @@ int execle(const char *, const char *, int execlp(const char *, const char *, ...); int execv(const char *, char * const *); int execve(const char *, char * const *, char * const *); +int fexecve(int, char * const *, char * const *); int execvp(const char *, char * const *); pid_t fork(void); longfpathconf(int, int); Index: sys/kern/exec_elf.c === RCS file: /cvsroot/src/sys/kern/exec_elf.c,v retrieving revision 1.98 diff -u -p -u -r1.98 exec_elf.c --- sys/kern/exec_elf.c 7 Jun 2019 23:35:52 - 1.98 +++ sys/kern/exec_elf.c 8 Sep 2019 17:20:08 - @@ -157,6 +157,7 @@ elf_populate_auxv(struct lwp *l, struct size_t len, vlen; AuxInfo ai[ELF_AUX_ENTRIES], *a, *execname; struct elf_args *ap; + char *path = l->l_proc->p_path; int error; a = ai; @@ -224,9 +225,11 @@ elf_populate_auxv(struct lwp *l, struct a->a_v = l->l_proc->p_stackbase; a++; - execname = a; - a->a_type = AT_SUN_EXECNAME; - a++; + if (*path == '/') { + execname = a; + a->a_type = AT_SUN_EXECNAME; + a++; + } exec_free_emul_arg(pack); } else { @@ -242,7 +245,6 @@ elf_populate_auxv(struct lwp *l, struct KASSERT(vlen <= sizeof(ai)); if (execname) { - char *path = l->l_proc->p_path; execname->a_v = (uintptr_t)(*stackp + vlen); len = strlen(path) + 1; if ((error = copyout(path, (*stackp + vlen), len)) != 0) Index: sys/kern/kern_exec.c === RCS file: /cvsroot/src/sys/kern/kern_exec.c,v retrieving revision 1.479 diff -u -p -u -r1.479 kern_exec.c --- sys/kern/kern_exec.c7 Sep 2019 15:34:44 - 1.479 +++ sys/kern/kern_exec.c8 Sep 2019 17:20:08 - @@ -309,7 +309,8 @@ exec_path_free(struct execve_data *data) { pathbuf_stringcopy_put(data->ed_pathbuf, data->ed_pathstring); pathbuf_destroy(data->ed_pathbuf); - PNBUF_PUT(data->ed_resolvedpathbuf); + if (data->ed_resolvedpathbuf) + PNBUF_PUT(data->ed_resolvedpathbuf); } /* @@ -343,22 +344,35 @@ check_exec(struct lwp *l, struct exec_pa { int error, i; struct vnode*vp; - struct nameidata nd; size_t resid; - // grab the absolute pathbuf here before namei() trashes it. - pathbuf_copystring(pb, epp->ep_resolvedname, PATH_MAX); - NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | TRYEMULROOT, pb); + if (epp->ep_resolvedname) { + struct nameidata nd; - /* first get the vnode */ - if ((error = namei(&nd)) != 0) - return error; - epp->ep_vp = vp = nd.ni_vp; + // grab the absolute pathbuf here before namei() trashes it. + pathbuf_copystring(pb, epp->ep_resolvedname, PATH_MAX); + NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | TRYEMULROOT, pb); + + /* first get the vnode */ + if ((error = namei(&nd)) != 0) + return error; + epp->ep_vp = vp = nd.ni_vp; #ifdef DIAGNOSTIC - /* paranoia (take this out once namei stuff stabilizes) */ - memset(nd.ni_pnbuf, '~', PATH_MAX); + /* paranoia (take this out once namei stuff stabilizes) */ + memset(nd.ni_pnbuf, '~', PATH_MAX); #endif + } else { + struct file *fp; + + if ((error = fd_getvnode(epp->ep_xfd, &fp)) != 0) + return error; + epp->ep_vp = vp = fp->f_vnode; + vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); + vref(vp); + fd_putfile(epp->ep_xfd); + } + /* check access and type */ if (vp->v_type != VREG) { @@ -387,19 +401,21 @@ check_exec(struct lwp *l, struct exec_pa /* unlock vp, since we need it unlocked from here on out. */ VOP_UNLOCK(vp); + if (epp->ep_resolvedname) { #if NVERIEXEC > 0 - error = veriexec_verify(l, vp, epp->ep_resolvedname, - epp->ep_flags & EXEC_INDIR ? VERIEXEC_INDIRECT : VERIEXEC_DIRECT, - NULL); - if (error) - goto bad2; + error = veriexec_verify(l, vp, epp->ep_resolvedname, + epp->ep_flags & EXEC_INDIR ? VERIEXEC_INDIREC
Re: pool guard
> On Sep 8, 2019, at 4:48 PM, Maxime Villard wrote: > > "Fixing" this entails first having a UVM that can scale up to and work > reasonably well with such gigantic amounts of RAM, which is far from being the > case currently. We will need to have SMP, NUMA, large pages, and in short, we > will likely have to rewrite UVM entirely. Hence "at some point". I am under no illusions that it won't be a large amount of work. At the very least, document the limitation so it's easy to find later. -- thorpej
Re: pool guard
Le 08/09/2019 à 08:54, Jason Thorpe a écrit : On Sep 8, 2019, at 9:29 AM, Maxime Villard wrote: Don't confuse VA and PA. NetBSD-amd64 supports 16TB of PA, so even if you have 48TB nvdimms it gets truncated to 16TB. Then, we have 32TB of VA, twice more than the maximum PA. So again, we are fine. Yes, but obviously we should fix that at some point. "Fixing" this entails first having a UVM that can scale up to and work reasonably well with such gigantic amounts of RAM, which is far from being the case currently. We will need to have SMP, NUMA, large pages, and in short, we will likely have to rewrite UVM entirely. By the time we have done all of that, the 64bit CPUs will already ship with 5-level page trees which multiply by 512 the size of the VA space, so again, we will be largely fine. Even otherwise, revisiting pool guards to increase the mapped/unmapped ratio is a very insignificant and easy task. I see no problem related to VA/PA amounts.