On Thu, Jul 13, 2023 at 05:46:56PM +0100, Richard W.M. Jones wrote: > On Thu, Jul 13, 2023 at 04:18:03PM +0000, Tage Johansson wrote: > > > > On 7/13/2023 5:37 PM, Richard W.M. Jones wrote: > > >On Thu, Jul 13, 2023 at 03:05:30PM +0000, Tage Johansson wrote: > > >>On 7/13/2023 4:36 PM, Eric Blake wrote: > > >>>On Thu, Jul 13, 2023 at 01:37:49PM +0000, Tage Johansson wrote: > > >>>>On 7/13/2023 3:26 PM, Richard W.M. Jones wrote: > > >>>>>On Thu, Jul 13, 2023 at 08:01:09AM -0500, Eric Blake wrote: > > >>>>>>On Thu, Jul 13, 2023 at 07:13:37AM -0500, Eric Blake wrote: > > >>>>>>>>>I have replaced a call to `nbd_opt_info()` with a call to > > >>>>>>>>>`nbd_aio_opt_info()` and passed in a completion callback which just > > >>>>>>>>>calls `exit(EXIT_FAILURE)`. So if the completion callback is called > > >>>>>>>>>the test should fail, which it doesn't, at least not on my machine. > > >>>>>>>>Isn't that OK? Only .free is required to be called. > > >>>>>>>For the context callback (for opt_set_meta), .callback may be called > > >>>>>>>zero, one, or multiple times, but .free should be called exactly > > >>>>>>>once. > > >>>>>>>But for the completion callback, I agree that the docs state that > > >>>>>>>both > > >>>>>>>.callback and .free should each be called exactly once ("On the other > > >>>>>>>hand, if a completion callback is supplied (only possible with > > >>>>>>>asynchronous commands), it will always be reached exactly once, and > > >>>>>>>the completion callback must not ignore the value pointed to by > > >>>>>>>C<error>."); we are indeed missing the call to .callback. I'll work > > >>>>>>>on a patch. > > >>>>>>Eww, the bug is bigger than just nbd_aio_opt* not always calling > > >>>>>>completion.callback exactly once. With just this diff (to be folded > > >>>>>>into the larger patch I'm working on), I'm getting an assertion > > >>>>>>failure that we fail to call completion.callback for > > >>>>>>nbd_aio_pread_structured when calling nbd_close() prior to the command > > >>>>>>running to completion, so I'll have to fix that too. > > >>>>>Just to be clear about this, are we really sure the completion > > >>>>>callback should always be called once? I'm not clear why that should > > >>>>>be the case. (The .free callback however should be.) > > >>>>> > > >>>>>For example, if I call a function with bogus invalid parameters so > > >>>>>that it fails very early on (or when the handle is in an incorrect > > >>>>>state), should I expect the completion callback to be called? I would > > >>>>>expect not. > > >>>>> > > >>>>>Rich. > > >>>>The user needs a way to know if an error occurred. So the completion > > >>>>callback must be called if the asynchronous function did not fail > > >>>>(returned > > >>>>0). If the completion callback should be called, with the error > > >>>>parameter > > >>>>set, even when the asynchronous function immediately failed with a > > >>>>non-zero > > >>>>return value is another question. I see two possibilities: Either the > > >>>>completion callback should always be called. Or it should be called iff > > >>>>the > > >>>>asynchronous function returned 0 (did not fail). > > >>>Good point. > > >>> > > >>>Our documentation currently states (docs/libnbd.pod) that the > > >>>completion callback is ALWAYS called, but this is inconsistent with > > >>>current code - you are correct that at present, the completion > > >>>callback is NOT called if the aio command returns non-zero (easy test: > > >>>call nbd_aio_block_status before nbd_connect*). I'm game to updating > > >>>that part of the documentation to match existing practice (changing > > >>>our guarantee to the completion callback will be called once iff the > > >>>aio command reported success), since our testsuite wasn't covering it, > > >>>and it is probably an easier fix than munging the generator to call > > >>>completion.callback even for aio failures. Meanwhile, the .free > > >>>callback MUST be called unconditionally, and I think our testsuite is > > >>>already covering that. > > >>> > > >>>Life-cycle wise, I see the following sequence as being something we > > >>>could usefully rely on (although we don't yet have enough testsuite > > >>>coverage to prove that we already have it). Note that it is not only > > >>>important how many times things are called, but I would like it if we > > >>>can guarantee the specific ordering between them (neither .free should > > >>>be called until all .callback opportunities are complete finished, to > > >>>avoid any use-after-free issues regardless of which of the two .free > > >>>slots the programmer actually uses): > > >>> > > >>>- if aio command fails: > > >>>mid-command .callback: 0 calls > > >>>completion .callback: 0 calls > > >>>mid-command .free: 1 call > > >>>completion .free: 1 call > > >>> > > >>>- if aio command succeeds: > > >>>mid-command .callback: 0, 1, or multiple times > > >>>completion .callback: 1 call > > >>>mid-command .free: 1 call > > >>>completion .free: 1 call > > >>> > > >>>What I'm not sure of yet (without more code inspection) is whether we > > >>>can sometimes call completion.callback after mid-command.free. > > >>> > > >>>>By the way, if the error parameter is set in the completion callback, > > >>>>how > > >>>>can the user retrieve the error text? Is it possible to call > > >>>>`get_nbd_error(3)` from inside the completion callback? > > >>>You mean nbd_get_error(). We currently document that it is not safe > > >>>to call any nbd_* from within a callback. Maybe we could relax that > > >>>to carve out exceptions for nbd_get_error/errno as special cases, > > >>>since they only inspect thread-local state and can be used even when > > >>>there is no current nbd_handle. The problem is that I'm not sure if > > >>>there is a reliable string at that point in time: part of the reason > > >>>the completion callback has an errnum parameter is that the point of > > >>>reporting the completion failure may be in a different thread than > > >>>when the failure was first detected, and we only preserved a numeric > > >>>value in cmd->error rather than also preserving a string. > > >>> > > >>>Put another way, there is no way to guarantee that nbd_get_errno() > > >>>will return the same value as the errnum parameter to the callback; > > >>>and if that is true, then even if calling nbd_get_error() doesn't > > >>>crash or deadlock from within a callback, it is likewise probable that > > >>>any such string returned is not consistent with the errnum parameter > > >>>passed to the callback. > > >> > > >>So is there any safe way to get some description of the error from a > > >>completion callback apart from a non-zero number? It isn't too > > >>helpful to report to the user that the read operation faild with -1. > > >As I recall, from the callback, no. > > > > > >The error is not lost however, it will be returned by the API call > > >itself. eg. If you're in nbd_aio_opt_list -> callback (error) then > > >nbd_aio_opt_list will return -1 and at that point you can use > > >nbd_get_error to report the error. > > > > > > I don't understand. If I call `nbd_aio_opt_list()` with a completion > > callback. `nbd_aio_opt_list()` will return immediately, maybe with a > > successful result. But the command is not complete until > > `nbd_aio_is_connecting()` returns `false`, so the completion > > callback may be invoked with an error after `nbd_aio_opt_list()` has > > returned. > > It's returned by some later call, such as nbd_aio_notify_read. > > Basically libnbd doesn't create threads, so the only place that > callbacks can be called is when you call into libnbd. That function > that you call is what will return an error. > > > Also, does the value of `err` (passed into a caller) has any meaning > > like a known errno value or something else that can be converted to > > a description? Or is it just an arbitrary non zero integer? > > It's a C errno number. Setting it will change what nbd_get_errno > returns. > > In the OCaml bindings we actually provide a function to translate > OCaml <-> C errnos for this purpose: > > val errno_of_unix_error : Unix.error -> int > (** Return the raw C errno corresponding to a {!Unix.error}. This > can be used in callbacks to update the [int ref] parameter. *)
ocaml/tests/test_505_aio_pread_structured_callback.ml has an example of how this is used. The errno is translated twice, once from OCaml -> C inside the callback. And a second time from C -> OCaml when the error is returned by the NBD.* call. Rich. > Rich. > > -- > Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones > Read my programming and virtualization blog: http://rwmj.wordpress.com > nbdkit - Flexible, fast NBD server with plugins > https://gitlab.com/nbdkit/nbdkit > _______________________________________________ > Libguestfs mailing list > Libguestfs@redhat.com > https://listman.redhat.com/mailman/listinfo/libguestfs -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs