On Thu, Mar 11, 2021 at 10:19:52AM +0000, Tom Ellis wrote:

> SPJ Wrote:
> > I've just installed WSL2 and built GHC. I get this (single)
> > validation failure in libraries/unix/tests/getGroupEntryForName. It
> > seems to be just an error message wibble, but I can't push a change
> > to master because that'll affect everyone else.
>
> Interesting, I've only ever built GHC on WSL and WSL2. I've seen this
> error message on WSL2 during every test run, I think. I didn't
> realise that it never occurred on other platforms, let alone that it
> was WSL2 specific!

I am curious what specific version/branch of GHC (and associated
submodule commit of "unix") is being tested.  I've recently cleaned up
much of the upstream "unix" code that handles the group/passwd
databases, but I don't believe that GHC has yet switched to the newer
code.

A subtle facet of the delta points in the right direction:

    -getGroupEntryForName: getGroupEntryForName: does not exist (no such group)
    +getGroupEntryForName: getGroupEntryForName: does not exist (No such process)

Not only is it complaining about "process" rather than "group", but,
crucially, the case of the word "No" is different.  The variance is due
to the fact that there are two possible error paths in the group lookup
code:

    doubleAllocWhileERANGE loc enttype initlen unpack action =
      alloca $ go initlen
     where
      go len res = do
        r <- allocaBytes len $ \buf -> do
               rc <- action buf (fromIntegral len) res
               if rc /= 0
    --hard-error->  then return (Left rc)
                 else do p <- peek res
    --not-found-->       when (p == nullPtr) $ notFoundErr
                         fmap Right (unpack p)
        case r of
          Right x -> return x
          Left rc | Errno rc == eRANGE ->
            -- ERANGE means this is not an error
            -- we just have to try again with a larger buffer
            go (2 * len) res
          Left rc ->
    --1-->  ioError (errnoToIOError loc (Errno rc) Nothing Nothing)
      notFoundErr =
    --2-->  ioError $ flip ioeSetErrorString ("no such " ++ enttype)
                    $ mkIOError doesNotExistErrorType loc Nothing Nothing

The expected error path is "not-found" -> (2), where the group lookup
works, but no result is found (rc == 0, with a NULL result pointer).
This reports the lower-case "no such group".

The unexpected error path is "hard-error" -> (1): a non-zero return
from "getgrnam_r" (action), which is treated as an errno value to build
the error string, which ends up being "No such process".  On Linux
systems that's:

    ESRCH            3      /* No such process */

So the call to "getgrnam_r" failed by returning ESRCH, rather than 0.

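To make the two message sources concrete, here is a minimal C sketch.
It is not the "unix" library's code: "lookup_failure_text" is a
hypothetical helper, and it assumes (as I understand "base") that
errnoToIOError takes its text from strerror(3).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * rc != 0: path (1), the text comes from the C library via strerror(rc).
 * rc == 0: path (2), the text is a fixed lower-case "no such <enttype>".
 */
static void lookup_failure_text(int rc, const char *enttype,
                                char *out, size_t outlen)
{
    assert(enttype != NULL && out != NULL && outlen > 0);
    if (rc != 0)
        snprintf(out, outlen, "%s", strerror(rc));
    else
        snprintf(out, outlen, "no such %s", enttype);
}
```

With rc == ESRCH the text is whatever strerror(ESRCH) yields (on glibc
in the C locale, "No such process"); with rc == 0 it is always the
fixed "no such group".
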
The Linux manpage does not suggest to me that one might expect a
non-zero return from getgrnam_r(3) just from a missing entry in the
group file:

    RETURN VALUE
           The getgrnam() and getgrgid() functions return a pointer to a
           group structure, or NULL if the matching entry is not found or
           an error occurs.  If an error occurs, errno is set
           appropriately.  If one wants to check errno after the call, it
           should be set to zero before the call.

           The return value may point to a static area, and may be
           overwritten by subsequent calls to getgrent(3), getgrgid(), or
           getgrnam().  (Do not pass the returned pointer to free(3).)

           On success, getgrnam_r() and getgrgid_r() return zero, and
    --->   set *result to grp.  If no matching group record was found,
    --->   these functions return 0 and store NULL in *result.  In case
    --->   of error, an error number is returned, and NULL is stored in
    --->   *result.

    ERRORS
           0 or ENOENT or ESRCH or EBADF or EPERM or ...
                  The given name or gid was not found.

           EINTR  A signal was caught; see signal(7).

           EIO    I/O error.

           EMFILE The per-process limit on the number of open file
                  descriptors has been reached.

           ENFILE The system-wide limit on the total number of open files
                  has been reached.

           ENOMEM Insufficient memory to allocate group structure.

           ERANGE Insufficient buffer space supplied.

The "0 or ENOENT or ESRCH ..." text then plausibly applies to
getgrnam(3), and its legacy behaviour.  So the question is why the
lookup is failing.

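The getgrnam_r() contract quoted from the manpage can be sketched in C
as a three-way classification.  This is only an illustration under
stated assumptions: the enum, the function name, the initial buffer
size and the 1 MiB cap are all made up here, not taken from any
library.

```c
#include <assert.h>
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical names, for illustration only. */
enum lookup_status { LOOKUP_FOUND, LOOKUP_NOT_FOUND, LOOKUP_ERROR };

static enum lookup_status
classify_group_lookup(const char *name, gid_t *gid_out, int *err_out)
{
    size_t len = 1024;              /* arbitrary initial buffer size */

    assert(name != NULL);
    for (;;) {
        struct group g, *p = NULL;
        enum lookup_status st;
        char *buf = malloc(len);
        int rc;

        if (buf == NULL) {
            if (err_out) *err_out = ENOMEM;
            return LOOKUP_ERROR;
        }
        rc = getgrnam_r(name, &g, buf, len, &p);
        if (rc == ERANGE && len < ((size_t)1 << 20)) {
            /* Buffer too small: not an error yet, retry with twice the space. */
            free(buf);
            len *= 2;
            continue;
        }
        if (rc != 0) {
            /* Hard error: the return value is the error number. */
            st = LOOKUP_ERROR;
            if (err_out) *err_out = rc;
        } else if (p == NULL) {
            /* Documented "not found": zero return, NULL result. */
            st = LOOKUP_NOT_FOUND;
        } else {
            st = LOOKUP_FOUND;
            if (gid_out) *gid_out = p->gr_gid;
        }
        free(buf);
        return st;
    }
}
```

Per the manpage, a missing group should come back as LOOKUP_NOT_FOUND;
a LOOKUP_ERROR with *err_out == ESRCH would correspond to the failure
reported on WSL2.
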
To find out why, compiling the C program below and tracing it with
"strace" should tell the story:

    #include <sys/types.h>
    #include <grp.h>
    #include <errno.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        struct group g, *p;
        char buf[1024];
        int rc;

        /* Look up a group that should not exist. */
        errno = 0;
        rc = getgrnam_r("nosuchgrouphere", &g, buf, sizeof(buf), &p);
        /* %m is a glibc extension that prints strerror(errno). */
        printf("%p: %m(%d)\n", p, errno);
        /* Exit status 1 means a clean "not found": rc == 0, NULL result. */
        return (rc == 0 && p == NULL);
    }

On a Fedora 31 system I get:

    $ make g
    cc     g.c   -o g
    $ ./g
    (nil): Success(0)

If something else happens on WSL2, running

    $ strace -o g.trace ./g

may reveal something not going right during the lookup if the problem
is with some system call.  On the other hand, if the problem is
entirely in "user-land", then it may take more work to see what's going
on.

Is the group database on these systems backed just by local files or by
AD LDAP?  A look at the "group" entry in /etc/nsswitch.conf may shed
some light on how groups are found.

-- 
    Viktor.
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs