Hello Guix,

Anadon <joshua.r.marshall.1...@gmail.com> writes:

> Talking with iskarian on IRC, we've confirmed that guix successfully
> installs, almost successfully sets up (the init.d isn't set up to
> actually daemonize), but for `guix build`, `guix install` and `guix
> pull` all fail with "guix <SUB_CMD>: error: cannot kill processes for
> uid `998': failed with exit code 1" when using WSL1/2.

I've investigated this a bit and it seems to be an issue with the return
code from `kill` when no other processes owned by that user exist to
kill.  If I set a process to continually spawn with uid 998, I no longer
encounter the above error.  Also, if there is a zombie process under that
uid, I no longer encounter the above error until I manually kill it.
This is on WSL1; I do not know if this technique also applies to WSL2.

For reference, the relevant portion of `nix/libutil/util.cc`:

--8<---------------cut here---------------start------------->8---
    Pid pid = startProcess([&]() {

        if (setuid(uid) == -1)
            throw SysError("setting uid");

        while (true) {
#ifdef __APPLE__
            /* OSX's kill syscall takes a third parameter that, among
               other things, determines if kill(-1, signo) affects the
               calling process. In the OSX libc, it's set to true,
               which means "follow POSIX", which we don't want here */
            if (syscall(SYS_kill, -1, SIGKILL, false) == 0) break;
#elif __GNU__
            /* Killing all a user's processes using PID=-1 does currently
               not work on the Hurd.  */
            if (kill(getpid(), SIGKILL) == 0) break;
#else
            if (kill(-1, SIGKILL) == 0) break;
#endif
            if (errno == ESRCH) break; /* no more processes */
            if (errno != EINTR)
                throw SysError(format("cannot kill processes for uid `%1%'") % uid);
        }

        _exit(0);
    });

    int status = pid.wait(true);
#if __GNU__
    /* When the child killed itself, status = SIGKILL.  */
    if (status == SIGKILL) return;
#endif
    if (status != 0)
        throw Error(format("cannot kill processes for uid `%1%': %2%")
            % uid % statusToString(status));
--8<---------------cut here---------------end--------------->8---

Perhaps the way WSL handles the return code for `kill` is not as
expected?  On a cursory inspection, though, the relevant parts of WSL2's
kernel/signal.c seem the same as the vanilla Linux kernel...

Or perhaps for some reason `kill(-1, SIGKILL)` under WSL is attempting to
kill the calling process (why?) and failing, therefore returning an
error.  Note that the error code 1 reported by Guix does not seem to be
the actual errno reported by `kill`.

I've seen Guix working on WSL2 in the wild before [0] so this is a really
odd error.  I'm stumped.

[0] https://github.com/giuliano108/guix-packages/blob/master/notes/Guix-on-WSL2.md

--
Sarah
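
On the "exit code 1" point, here is a minimal sketch of why the number in
the message may not be an errno value.  It assumes that the child created
by `startProcess` exits with status 1 whenever the lambda throws (an
assumption; that part of util.cc is not in the excerpt above), so whatever
`kill` stored in errno never reaches the parent.  The helpers
`describeStatus`, `runInChild`, `failingKill`, `exitOneOnFailure` and
`exitWithErrno` are hypothetical names used only for illustration, and
`failingKill` passes an invalid signal number so that it fails harmlessly
with EINVAL instead of calling kill(-1, SIGKILL):

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: render a raw wait(2) status as text.  Only loosely
// modelled on the wording of the error message; not the real statusToString.
std::string describeStatus(int status) {
    if (WIFEXITED(status))
        return "failed with exit code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    return "died abnormally";
}

// Fork a child that runs `body` and exits with its return value.
// Returns the raw wait status, or -1 if fork/waitpid failed.
int runInChild(int (*body)()) {
    assert(body != nullptr);
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) _exit(body());          // child: never returns
    int status = 0;
    while (waitpid(pid, &status, 0) == -1)
        if (errno != EINTR) return -1;    // retry only when interrupted
    return status;
}

// A kill() call that is safe to run and always fails: signal number -1 is
// invalid, so kill() returns -1 with errno set to EINVAL.
static int failingKill() { return kill(getpid(), -1); }

// Assumed pattern: any failure in the child collapses to exit code 1,
// and the errno value stays behind in the child.
int exitOneOnFailure() { return failingKill() == 0 ? 0 : 1; }

// Diagnostic variant: forward errno through the exit code instead
// (works only because errno values fit in the 8-bit exit status).
int exitWithErrno() { return failingKill() == 0 ? 0 : errno; }
```

Calling `describeStatus(runInChild(exitOneOnFailure))` gives "failed with
exit code 1" regardless of which errno the failing call set, whereas
`WEXITSTATUS(runInChild(exitWithErrno))` comes back as EINVAL.
Temporarily making the child exit with errno, or print it to stderr, might
be a quick way to see what `kill(-1, SIGKILL)` actually reports under WSL.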