* lizhij...@fujitsu.com (lizhij...@fujitsu.com) wrote: > > > On 14/05/2021 01.15, Dr. David Alan Gilbert wrote: > > * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: > >> A segmentation fault was triggered when i try to abort a postcopy + rdma > >> migration. > >> > >> since rdma_ack_cm_event releases a uninitialized cm_event in thise case. > >> > >> like below: > >> 2496 ret = rdma_get_cm_event(rdma->channel, &cm_event); > >> 2497 if (ret) { > >> 2498 perror("rdma_get_cm_event after rdma_connect"); > >> 2499 ERROR(errp, "connecting to destination!"); > >> 2500 rdma_ack_cm_event(cm_event); <<<< cause segmentation fault > >> 2501 goto err_rdma_source_connect; > >> 2502 } > >> > >> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> > > OK, that's an easy fix then; but I wonder if we should perhaps remove > > that rdma_ack_cm_event, if it's the get_cm_event that's failed? > > I also wondered, i checked the man page get_cm_event(3) which has not > documented > > and checked some rdma examples, some of them try to ack it[1], but some > not[2].
I think they're actually consistent: > [1]: > https://github.com/linux-rdma/rdma-core/blob/e381334c2915a5290565694947790d4aebaf2222/librdmacm/examples/mckey.c#L451 ret = rdma_get_cm_event(test.channel, &event); if (!ret) { ret = cma_handler(event->id, event); rdma_ack_cm_event(event); } Note it's '!ret' - so it's only doing the ack if the get_cm_event succeeded. > [2]: > https://github.com/linux-rdma/rdma-core/blob/e381334c2915a5290565694947790d4aebaf2222/librdmacm/examples/mckey.c#L342 ret = rdma_get_cm_event(test.channel, &event); if (ret) { perror("rdma_get_cm_event"); break; } that exits the loop (and skips the ack) in the (ret) - i.e. only on error - no ! Dave > Thanks > > > > > Still, > > > > > > Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> > > > >> --- > >> migration/rdma.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/migration/rdma.c b/migration/rdma.c > >> index 00eac34232..2dadb62aed 100644 > >> --- a/migration/rdma.c > >> +++ b/migration/rdma.c > >> @@ -2466,7 +2466,7 @@ static int qemu_rdma_connect(RDMAContext *rdma, > >> Error **errp) > >> .private_data = &cap, > >> .private_data_len = > >> sizeof(cap), > >> }; > >> - struct rdma_cm_event *cm_event; > >> + struct rdma_cm_event *cm_event = NULL; > >> int ret; > >> > >> /* > >> -- > >> 2.30.2 > >> > >> > >> -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK