Fabiano Rosas <faro...@suse.de> writes: > Markus Armbruster <arm...@redhat.com> writes: > >> We use errno after calling Libibverbs functions that are not >> documented to set errno (manual page does not mention errno), or where >> the documentation is unclear ("returns [...] the value of errno on >> failure"). While this could be read as "sets errno and returns it", >> a glance at the source code[*] kills that hope: >> >> static inline int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr >> *wr, >> struct ibv_send_wr **bad_wr) >> { >> return qp->context->ops.post_send(qp, wr, bad_wr); >> } >> >> The callback can be >> >> static int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, >> struct ibv_send_wr **bad) >> { >> /* This version of driver supports RAW QP only. >> * Posting WR is done directly in the application. >> */ >> return EOPNOTSUPP; >> } >> >> Neither of them touches errno. >> >> One of these errno uses is easy to fix, so do that now. Several more >> will go away later in the series; add temporary FIXME commments. >> Three will remain; add TODO comments. TODO, not FIXME, because the >> bug might be in Libibverbs documentation. >> >> [*] https://github.com/linux-rdma/rdma-core.git >> commit 55fa316b4b18f258d8ac1ceb4aa5a7a35b094dcf >> >> Signed-off-by: Markus Armbruster <arm...@redhat.com> >> --- >> migration/rdma.c | 45 +++++++++++++++++++++++++++++++++++++++------ >> 1 file changed, 39 insertions(+), 6 deletions(-) >> >> diff --git a/migration/rdma.c b/migration/rdma.c >> index 28097ce604..bba8c99fa9 100644 >> --- a/migration/rdma.c >> +++ b/migration/rdma.c >> @@ -853,6 +853,12 @@ static int qemu_rdma_broken_ipv6_kernel(struct >> ibv_context *verbs, Error **errp) >> >> for (x = 0; x < num_devices; x++) { >> verbs = ibv_open_device(dev_list[x]); >> + /* >> + * ibv_open_device() is not documented to set errno. If >> + * it does, it's somebody else's doc bug. If it doesn't, >> + * the use of errno below is wrong. >> + * TODO Find out whether ibv_open_device() sets errno. >> + */ >> if (!verbs) { >> if (errno == EPERM) { >> continue; > > This function can call into glibc, so it's not unreasonable to expect > errno to be set at some point. We're not relying on errno to be set, > just taking an action if it happens to be.
errno may well be set on some failures. But it needs to be set on *all* failures to be reliable. If it's not, then its value on such failures comes from some unrelated, prior errno-setting failure. > I don't think someone would just decide to handle EPERM at this point > for no reason. Specially since the manual makes no mention to > errno. This was probably introduced after someone got bit by it. > > ... indeed the commit 5b61d57521 ("rdma: Fix qemu crash when IPv6 > address is used for migration") introduced this to fix a crash. I don't doubt the error recovery added there works in the case described by the commit message. I suspect it can break on unrelated failures.