On 25/09/2023 14:36, Markus Armbruster wrote: > "Zhijian Li (Fujitsu)" <lizhij...@fujitsu.com> writes: > >> On 18/09/2023 22:41, Markus Armbruster wrote: >>> The QEMUFileHooks methods don't come with a written contract. Digging >>> through the code calling them, we find: >>> >>> * save_page(): >> >> I'm fine with this >> >>> >>> Negative values RAM_SAVE_CONTROL_DELAYED and >>> RAM_SAVE_CONTROL_NOT_SUPP are special. Any other negative value is >>> an unspecified error. >>> >>> qemu_rdma_save_page() returns -EIO or rdma->error_state on error. I >>> believe the latter is always negative. Nothing stops either of them >>> to clash with the special values, though. Feels unlikely, but fix >>> it anyway to return only the special values and -1. >>> >>> * before_ram_iterate(), before_ram_iterate(): >> >> error code returned by before_ram_iterate() and before_ram_iterate() will be >> assigned to QEMUFile for upper layer. >> But it seems that no callers take care about the error ? Shouldn't let the >> callers >> to check the error instead ? > > The error values returned by qemu_rdma_registration_start() and > qemu_rdma_registration_stop() carry no additional information a caller > could check.
I think qemu_file_get_error(f) can be used for callers to check the error code. > > Both return either -EIO or rdma->error_state on error. The latter is > *not* a negative errno code. Evidence: qio_channel_rdma_shutdown() > assigns -1, qemu_rdma_block_for_wrid() assigns the error value of > qemu_rdma_poll(), which can be the error value of ibv_poll_cq(), which > is an unspecified negative value, ... > you are right. > I decided not to investigate what qemu-file.c does with the error values > after one quick glance at the code. It's confusing, and quite possibly > confused. But I'm already at 50+ patches, and am neither inclined nor > able to take on more migration cleanup at this time. Yeah, it's already big enough patch set. very thanks Reviewed-by: Li Zhijian <lizhij...@fujitsu.com> > >>> Negative value means error. qemu_rdma_registration_start() and >>> qemu_rdma_registration_stop() comply as far as I can tell. Make >>> them comply *obviously*, by returning -1 on error. >>> >>> * hook_ram_load: >>> >>> Negative value means error. rdma_load_hook() already returns -1 on >>> error. Leave it alone. >> >> see inline reply below, >> >>> >>> Signed-off-by: Markus Armbruster <arm...@redhat.com> >>> --- >>> migration/rdma.c | 79 +++++++++++++++++++++++------------------------- >>> 1 file changed, 37 insertions(+), 42 deletions(-) >>> >>> diff --git a/migration/rdma.c b/migration/rdma.c >>> index cc59155a50..46b5859268 100644 >>> --- a/migration/rdma.c >>> +++ b/migration/rdma.c >>> @@ -3219,12 +3219,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f, >>> rdma = qatomic_rcu_read(&rioc->rdmaout); >>> >>> if (!rdma) { >>> - return -EIO; >>> + return -1; >>> } >>> >>> - ret = check_error_state(rdma); >>> - if (ret) { >>> - return ret; >>> + if (check_error_state(rdma)) { >>> + return -1; >>> } >>> >>> qemu_fflush(f); >>> @@ -3290,9 +3289,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, >>> } >>> >>> return RAM_SAVE_CONTROL_DELAYED; >>> + >>> err: >>> rdma->error_state = ret; >>> - return ret; >>> + return -1; >>> } >>> >>> static void rdma_accept_incoming_migration(void *opaque); >>> @@ -3538,12 +3538,11 @@ static int qemu_rdma_registration_handle(QEMUFile >>> *f) >>> rdma = qatomic_rcu_read(&rioc->rdmain); >>> >>> if (!rdma) { >>> - return -EIO; >>> + return -1; >> >> that's because EIO is not accurate here ? > > It's because the function does not return a negative errno code, and > returning -EIO is misleading readers into assuming it does > >>> } >>> >>> - ret = check_error_state(rdma); >>> - if (ret) { >>> - return ret; >> >> Ditto > > Likewise. > > [...] >