Re: [PATCH] migration/rdma: Fix cm_event used before being initialized
On 17/05/2021 18.00, Dr. David Alan Gilbert wrote: > * lizhij...@fujitsu.com (lizhij...@fujitsu.com) wrote: >> >> On 14/05/2021 01.15, Dr. David Alan Gilbert wrote: >>> * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: A segmentation fault was triggered when i try to abort a postcopy + rdma migration. since rdma_ack_cm_event releases a uninitialized cm_event in thise case. like below: 2496 ret = rdma_get_cm_event(rdma->channel, _event); 2497 if (ret) { 2498 perror("rdma_get_cm_event after rdma_connect"); 2499 ERROR(errp, "connecting to destination!"); 2500 rdma_ack_cm_event(cm_event); cause segmentation fault 2501 goto err_rdma_source_connect; 2502 } Signed-off-by: Li Zhijian >>> OK, that's an easy fix then; but I wonder if we should perhaps remove >>> that rdma_ack_cm_event, if it's the get_cm_event that's failed? >> I also wondered, i checked the man page get_cm_event(3) which has not >> documented >> >> and checked some rdma examples, some of them try to ack it[1], but some >> not[2]. > I think they're actually consistent: You are right. I also checked rdma_get_cm_even() code, indeed, event will be changed only if rdma_get_cm_even() returns 0. So i agree to remove rdma_ack_cm_event(event) in error path. i will update the patch soon. Thanks Zhijian
Re: [PATCH] migration/rdma: Fix cm_event used before being initialized
* lizhij...@fujitsu.com (lizhij...@fujitsu.com) wrote: > > > On 14/05/2021 01.15, Dr. David Alan Gilbert wrote: > > * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: > >> A segmentation fault was triggered when i try to abort a postcopy + rdma > >> migration. > >> > >> since rdma_ack_cm_event releases a uninitialized cm_event in thise case. > >> > >> like below: > >> 2496 ret = rdma_get_cm_event(rdma->channel, _event); > >> 2497 if (ret) { > >> 2498 perror("rdma_get_cm_event after rdma_connect"); > >> 2499 ERROR(errp, "connecting to destination!"); > >> 2500 rdma_ack_cm_event(cm_event); cause segmentation fault > >> 2501 goto err_rdma_source_connect; > >> 2502 } > >> > >> Signed-off-by: Li Zhijian > > OK, that's an easy fix then; but I wonder if we should perhaps remove > > that rdma_ack_cm_event, if it's the get_cm_event that's failed? > > I also wondered, i checked the man page get_cm_event(3) which has not > documented > > and checked some rdma examples, some of them try to ack it[1], but some > not[2]. I think they're actually consistent: > [1]: > https://github.com/linux-rdma/rdma-core/blob/e381334c2915a5290565694947790d4aebaf/librdmacm/examples/mckey.c#L451 ret = rdma_get_cm_event(test.channel, ); if (!ret) { ret = cma_handler(event->id, event); rdma_ack_cm_event(event); } Note it's '!ret' - so it's only doing the ack if the get_cm_event succeeded. > [2]: > https://github.com/linux-rdma/rdma-core/blob/e381334c2915a5290565694947790d4aebaf/librdmacm/examples/mckey.c#L342 ret = rdma_get_cm_event(test.channel, ); if (ret) { perror("rdma_get_cm_event"); break; } that exits the loop (and skips the ack) in the (ret) - i.e. only on error - no ! Dave > Thanks > > > > > Still, > > > > > > Reviewed-by: Dr. David Alan Gilbert > > > >> --- > >> migration/rdma.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/migration/rdma.c b/migration/rdma.c > >> index 00eac34232..2dadb62aed 100644 > >> --- a/migration/rdma.c > >> +++ b/migration/rdma.c > >> @@ -2466,7 +2466,7 @@ static int qemu_rdma_connect(RDMAContext *rdma, > >> Error **errp) > >> .private_data = , > >> .private_data_len = > >> sizeof(cap), > >> }; > >> -struct rdma_cm_event *cm_event; > >> +struct rdma_cm_event *cm_event = NULL; > >> int ret; > >> > >> /* > >> -- > >> 2.30.2 > >> > >> > >> -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH] migration/rdma: Fix cm_event used before being initialized
On 14/05/2021 01.15, Dr. David Alan Gilbert wrote: > * Li Zhijian (lizhij...@cn.fujitsu.com) wrote: >> A segmentation fault was triggered when i try to abort a postcopy + rdma >> migration. >> >> since rdma_ack_cm_event releases a uninitialized cm_event in thise case. >> >> like below: >> 2496 ret = rdma_get_cm_event(rdma->channel, _event); >> 2497 if (ret) { >> 2498 perror("rdma_get_cm_event after rdma_connect"); >> 2499 ERROR(errp, "connecting to destination!"); >> 2500 rdma_ack_cm_event(cm_event); cause segmentation fault >> 2501 goto err_rdma_source_connect; >> 2502 } >> >> Signed-off-by: Li Zhijian > OK, that's an easy fix then; but I wonder if we should perhaps remove > that rdma_ack_cm_event, if it's the get_cm_event that's failed? I also wondered, i checked the man page get_cm_event(3) which has not documented and checked some rdma examples, some of them try to ack it[1], but some not[2]. [1]: https://github.com/linux-rdma/rdma-core/blob/e381334c2915a5290565694947790d4aebaf/librdmacm/examples/mckey.c#L451 [2]: https://github.com/linux-rdma/rdma-core/blob/e381334c2915a5290565694947790d4aebaf/librdmacm/examples/mckey.c#L342 Thanks > > Still, > > > Reviewed-by: Dr. David Alan Gilbert > >> --- >> migration/rdma.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/migration/rdma.c b/migration/rdma.c >> index 00eac34232..2dadb62aed 100644 >> --- a/migration/rdma.c >> +++ b/migration/rdma.c >> @@ -2466,7 +2466,7 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error >> **errp) >> .private_data = , >> .private_data_len = sizeof(cap), >> }; >> -struct rdma_cm_event *cm_event; >> +struct rdma_cm_event *cm_event = NULL; >> int ret; >> >> /* >> -- >> 2.30.2 >> >> >>
Re: [PATCH] migration/rdma: Fix cm_event used before being initialized
* Li Zhijian (lizhij...@cn.fujitsu.com) wrote: > A segmentation fault was triggered when i try to abort a postcopy + rdma > migration. > > since rdma_ack_cm_event releases a uninitialized cm_event in thise case. > > like below: > 2496 ret = rdma_get_cm_event(rdma->channel, _event); > 2497 if (ret) { > 2498 perror("rdma_get_cm_event after rdma_connect"); > 2499 ERROR(errp, "connecting to destination!"); > 2500 rdma_ack_cm_event(cm_event); cause segmentation fault > 2501 goto err_rdma_source_connect; > 2502 } > > Signed-off-by: Li Zhijian OK, that's an easy fix then; but I wonder if we should perhaps remove that rdma_ack_cm_event, if it's the get_cm_event that's failed? Still, Reviewed-by: Dr. David Alan Gilbert > --- > migration/rdma.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/migration/rdma.c b/migration/rdma.c > index 00eac34232..2dadb62aed 100644 > --- a/migration/rdma.c > +++ b/migration/rdma.c > @@ -2466,7 +2466,7 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error > **errp) >.private_data = , >.private_data_len = sizeof(cap), > }; > -struct rdma_cm_event *cm_event; > +struct rdma_cm_event *cm_event = NULL; > int ret; > > /* > -- > 2.30.2 > > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH] migration/rdma: Fix cm_event used before being initialized
On 5/13/21 6:13 PM, Philippe Mathieu-Daudé wrote: > On 5/13/21 1:37 PM, Li Zhijian wrote: >> A segmentation fault was triggered when i try to abort a postcopy + rdma >> migration. >> >> since rdma_ack_cm_event releases a uninitialized cm_event in thise case. >> >> like below: >> 2496 ret = rdma_get_cm_event(rdma->channel, _event); >> 2497 if (ret) { >> 2498 perror("rdma_get_cm_event after rdma_connect"); >> 2499 ERROR(errp, "connecting to destination!"); >> 2500 rdma_ack_cm_event(cm_event); cause segmentation fault >> 2501 goto err_rdma_source_connect; >> 2502 } >> >> Signed-off-by: Li Zhijian >> --- >> migration/rdma.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/migration/rdma.c b/migration/rdma.c >> index 00eac34232..2dadb62aed 100644 >> --- a/migration/rdma.c >> +++ b/migration/rdma.c >> @@ -2466,7 +2466,7 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error >> **errp) >>.private_data = , >>.private_data_len = sizeof(cap), >> }; >> -struct rdma_cm_event *cm_event; >> +struct rdma_cm_event *cm_event = NULL; >> int ret; >> >> /* >> > > Reviewed-by: Philippe Mathieu-Daudé > Cc: qemu-sta...@nongnu.org
Re: [PATCH] migration/rdma: Fix cm_event used before being initialized
On 5/13/21 1:37 PM, Li Zhijian wrote: > A segmentation fault was triggered when i try to abort a postcopy + rdma > migration. > > since rdma_ack_cm_event releases a uninitialized cm_event in thise case. > > like below: > 2496 ret = rdma_get_cm_event(rdma->channel, _event); > 2497 if (ret) { > 2498 perror("rdma_get_cm_event after rdma_connect"); > 2499 ERROR(errp, "connecting to destination!"); > 2500 rdma_ack_cm_event(cm_event); cause segmentation fault > 2501 goto err_rdma_source_connect; > 2502 } > > Signed-off-by: Li Zhijian > --- > migration/rdma.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/migration/rdma.c b/migration/rdma.c > index 00eac34232..2dadb62aed 100644 > --- a/migration/rdma.c > +++ b/migration/rdma.c > @@ -2466,7 +2466,7 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error > **errp) >.private_data = , >.private_data_len = sizeof(cap), > }; > -struct rdma_cm_event *cm_event; > +struct rdma_cm_event *cm_event = NULL; > int ret; > > /* > Reviewed-by: Philippe Mathieu-Daudé