On Tue, Sep 27, 2022 at 12:14:28PM +0100, Richard W.M. Jones wrote: > On Mon, Sep 26, 2022 at 05:05:43PM -0500, Eric Blake wrote: > > nbd_connect_command (h, (char **) { NULL }) triggers SIGABRT when > > preparing to exec a NULL command name (during > > enter_STATE_CONNECT_COMMAND_START in v1.0). > > > > nbd_connect_command (h, NULL) in newer releases triggers SIGSEGV by > > trying to dereference NULL (with LIBNBD_DEBUG=1, during > > nbd_internal_printable_string_list in v1.4; otherwise, during > > nbd_internal_set_argv in v1.6). In older releases, it behaves the > > same as (char **) { NULL } and triggers SIGABRT. > > > > Both crashes are corner cases that have existed for a long time, and > > unlikely to hit in real code. Still, we shouldn't crash in a library. > > Is this an actual rule?
More of a rule of thumb. Letting a library kill an entire program with SIGSEGV because of a bug in the outer program is not nice. If we document it well, we can say it's the program's own fault; but right now, we don't document that a StringList parameter must be non-NULL, either in our prose, or with __attribute__((nonnull)) in our public .h files. However, note that the use of the compiler attribute has its own drawbacks - gcc treats it as license to optimize the implementation to skip the NULL check, but does not always warn about callers that violate the invariant and actually pass NULL in - if you use the attribute, you actually lose out on the ability to write a library that tries to document that passing NULL is dumb but which still sanely handles NULL - because the sane handling is optimized away. Libvirt tried doing this a while back, and ended up with their current practice of enabling attribute((nonnull)) for Coverity scanning, but disabling it for gcc compilation. > > I checked some libc functions and it seems like > > printf(NULL) => errno EINVAL > puts(NULL) => segfault > > Both functions _do_ have nonnull attributes on the parameters, so GCC > will warn (if it can statically analyze the situation). Yes, these are cases where the documentation is explicit that passing in NULL leads to undefined behavior (part of the C standard). Whether gcc actually allows the glibc implementation to actually check for NULL and turn it into errno EINVAL, or optimized it out and results in a SIGSEGV, is then a question for how the library itself was compiled while those .h attributes are present. > > > Forbid a NULL StringList in general (similar to how we already forbid > > a NULL String); which also means a StringList parameter implies > > may_set_error=true. Refactor nbd_internal_set_argv() to split out the > > vector population into a new helper function that can only fail with > > ENOMEM (this function will also be useful in later patches that want > > to copy a StringList, but can tolerate 0 elements), as well as to set > > errors itself (all 2 callers updated) since detecting NULL for argv[0] > > is unique to argv. > > > > Tests are added in the next patch, to make it easier to review by > > temporarily swapping patch order. > > > > Changes from: > > > > $ nbdsh -c 'h.connect_command([])' > > nbdsh: generator/states-connect.c:247: enter_STATE_CONNECT_COMMAND_START: > > Assertion `h->argv.ptr[0]' failed. > > Aborted (core dumped) > > I do agree that this is wrong. If I'm following the Python bindings > correctly what is happening is that we're calling > > nbd_connect_command (h, { NULL }); > > which is causing the first case above. Correct. The python bindings always pass a non-NULL array; but when the array is empty, there is no argv[0]. This fix is worth having no matter what. The other fix is whether we should special case passing in NULL instead of an empty array (the python bindings can't trigger it). We already do this sort of NULL checking for String arguments (again, where python bindings can't trigger that, either), so we do have precedence; but we could also take alternatives of better documenting that C code should not pass in NULL (it's no longer our fault if the program didn't obey the documentation invariants and gets a crash because they passed in NULL) and/or try to decorate our public .h with attribute((nonnull)) to allow compilers that are capable of warning users about bad inputs about their mistake (I'm not yet sure if gcc has quite reached that point; it wasn't there a few years ago when libvirt decided NOT to use attribute((nonnull)) in the public headers). > > > to: > > > > nbdsh: command line script failed: nbd_connect_command: missing command > > name in argv list: Invalid argument > > This is definitely an improvement for Python code (which really should > never crash). > > So I'm not sure about the total fix here. Definitely we should be > returning an error if a zero length list is passed to nbd_connect_* > functions. > > But one thing I especially don't like about libvirt is that it turns > various virFoo (NULL) calls into errors instead of segfaults, which in > turn makes it much easier to ignore serious errors in the calling > code. > > While I won't necessarily push too hard against this patch I don't > feel it's the right direction unless someone can persuade me > otherwise. I can split this into two parts; the obvious avoidance of the missing argv[0] (we want that no matter what), and the less-obvious behavior of what to do when NULL is passed to StringList (where we have several options of what we want it to do, and where a segv + static compiler flags that will warn users about suspect coding may indeed be nicer than blindly failing with EFAULT, for forcing the user to not write bad code - but I'm not yet sure if we have reliable compiler setups that can be used to get that environment). > > +++ b/generator/C.ml > > @@ -570,7 +570,7 @@ let > > need_out_label := true > > | Flags (n, flags) -> > > print_flags_check n flags None > > - | String n -> > > + | String n | StringList n -> > > let value = match errcode with > > | Some value -> value > > | None -> assert false in This is the hunk that enforces StringList must not be NULL (the one where attribute((nonnull)) may be worth exploring)... > > +++ b/lib/utils.c > > > > +/* Store argv into h, or diagnose an error on failure. */ > > +int > > +nbd_internal_set_argv (struct nbd_handle *h, char **argv) > > +{ > > + assert (argv); > > + > > + if (argv[0] == NULL) { > > + set_error (EINVAL, "missing command name in argv list"); > > + return -1; > > + } > > + if (nbd_internal_copy_string_list (&h->argv, argv) == -1) { While this hunk (and its helper function and updates to all callsites) is where diagnosing empty arrays is the noncontroversial fix. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs