Re: [ovs-dev] [PATCH] ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall.
On 25.11.2019 22:45, Ben Pfaff wrote: > On Fri, Nov 01, 2019 at 10:24:39PM +0100, Ilya Maximets wrote: >> Handling of OpenFlow PACKET_OUT implies pushing the packet through >> the datapath and packet processing inside the datapath could trigger >> an upcall. In case of system datapath, 'dpif_execute()' sends packet >> to the kernel module and returns. If any, upcall will be triggered >> inside the kernel and handled by a separate thread in userspace. >> But in case of userspace datapath full processing of the packet happens >> inside the 'dpif_execute()' in the same thread that handled PACKET_OUT. >> This causes an issue if upcall will lead to modification of flow rules. > > Thank you very much for the fix. It is simpler and cleaner than I > expected. It looks correct to me and it passes all of the tests for me > locally. > > Acked-by: Ben Pfaff Thanks! Applied to master and backported down to 2.7. There was a big refactoring between branches 2.6 and 2.7, so this patch could not be applied to 2.6. However, IIUC, handle_packet_out() on this branch doesn't take 'ofproto_mutex' at all. Looks a bit dangerous. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall.
On Fri, Nov 01, 2019 at 10:24:39PM +0100, Ilya Maximets wrote: > Handling of OpenFlow PACKET_OUT implies pushing the packet through > the datapath and packet processing inside the datapath could trigger > an upcall. In case of system datapath, 'dpif_execute()' sends packet > to the kernel module and returns. If any, upcall will be triggered > inside the kernel and handled by a separate thread in userspace. > But in case of userspace datapath full processing of the packet happens > inside the 'dpif_execute()' in the same thread that handled PACKET_OUT. > This causes an issue if upcall will lead to modification of flow rules. Thank you very much for the fix. It is simpler and cleaner than I expected. It looks correct to me and it passes all of the tests for me locally. Acked-by: Ben Pfaff ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall.
Hi Ben, I was not able to reproduce this issue in ovs sandbox, but able to test this patch on local testbed which had this issue and the patch works. Thanks & Regards, Anil Kumar. -Original Message- From: Ilya Maximets Sent: Friday, 22 November, 2019 12:02 AM To: Ilya Maximets ; ovs-dev@openvswitch.org; Ben Pfaff Cc: Anil Kumar Koli Subject: Re: [PATCH] ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall. On 01.11.2019 22:24, Ilya Maximets wrote: > Handling of OpenFlow PACKET_OUT implies pushing the packet through the > datapath and packet processing inside the datapath could trigger an > upcall. In case of system datapath, 'dpif_execute()' sends packet to > the kernel module and returns. If any, upcall will be triggered > inside the kernel and handled by a separate thread in userspace. > But in case of userspace datapath full processing of the packet > happens inside the 'dpif_execute()' in the same thread that handled > PACKET_OUT. > This causes an issue if upcall will lead to modification of flow rules. > For example, it could happen while processing of 'learn' actions. > Since whole handling of PACKET_OUT is protected by 'ofproto_mutex', > OVS will assert on attempt to take it recursively while processing > 'learn' actions: > >0 __GI_raise (sig=sig@entry=6) >1 __GI_abort () >2 ovs_abort_valist () >3 ovs_abort () >4 ovs_mutex_lock_at (where=where@entry=0xad4199 "ofproto/ofproto.c:5391") > >5 ofproto_flow_mod_learn () at ofproto/ofproto.c:5391 > >6 xlate_learn_action () at ofproto/ofproto-dpif-xlate.c:5378 > <'learn' action found> >7 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:6893 >8 xlate_recursively ()at ofproto/ofproto-dpif-xlate.c:4233 >9 xlate_table_action () at ofproto/ofproto-dpif-xlate.c:4361 > 10 in xlate_ofpact_resubmit () at ofproto/ofproto-dpif-xlate.c:4672 > 11 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:6773 > 12 xlate_actions ()at ofproto/ofproto-dpif-xlate.c:7570 > > 13 upcall_xlate () at ofproto/ofproto-dpif-upcall.c:1197 > 14 process_upcall () at ofproto/ofproto-dpif-upcall.c:1413 > 15 upcall_cb ()at ofproto/ofproto-dpif-upcall.c:1315 > 16 dp_netdev_upcall (DPIF_UC_MISS) at lib/dpif-netdev.c:6236 > > 17 handle_packet_upcall () at lib/dpif-netdev.c:6591 > 18 fast_path_processing () at lib/dpif-netdev.c:6709 > 19 dp_netdev_input__ ()at lib/dpif-netdev.c:6797 > 20 dp_netdev_recirculate ()at lib/dpif-netdev.c:6842 > 21 dp_execute_cb ()at lib/dpif-netdev.c:7158 > 22 odp_execute_actions () at lib/odp-execute.c:794 > 23 dp_netdev_execute_actions ()at lib/dpif-netdev.c:7332 > 24 dpif_netdev_execute () at lib/dpif-netdev.c:3725 > 25 dpif_netdev_operate () at lib/dpif-netdev.c:3756 > > 26 dpif_operate () at lib/dpif.c:1367 > 27 dpif_execute () at lib/dpif.c:1321 > 28 packet_execute () at ofproto/ofproto-dpif.c:4760 > 29 ofproto_packet_out_finish ()at ofproto/ofproto.c:3594 > > 30 handle_packet_out ()at ofproto/ofproto.c:3635 > 31 handle_single_part_openflow (OFPTYPE_PACKET_OUT) at > ofproto/ofproto.c:8400 > 32 handle_openflow () at > ofproto/ofproto.c:8587 > 33 ofconn_run () > 34 connmgr_run () > 35 ofproto_run () > 36 bridge_run__ () > 37 bridge_run () > 38 main () > > Fix that by splitting the 'ofproto_packet_out_finish()' in two parts. > First one that translates side-effects and requires holding 'ofproto_mutex' > and the second that only pushes packet to datapath. The second part > moved out of 'ofproto_mutex' because 'ofproto_mutex' is not required > and actually should not be taken in order to avoid recursive locking. > > Reported-by: Anil Kumar Koli > Reported-at: > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-April/048494.h > tml > Signed-off-by: Ilya Maximets > --- > > Previous discussion about implementation: > https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/362082.html > > I'm not able to reproduce the original issue, so I can not test if > this patch fixes it. > > ofproto/ofproto-dpif.c | 40 -- > ofproto/ofproto-provider.h | 12 +--- > ofproto/ofproto.c | 29 +++ > 3 files changed, 64 insertions(+), 17 deletions(-) Hi Ben, Could you, please, take a look at this patch? Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall.
On 01.11.2019 22:24, Ilya Maximets wrote: > Handling of OpenFlow PACKET_OUT implies pushing the packet through > the datapath and packet processing inside the datapath could trigger > an upcall. In case of system datapath, 'dpif_execute()' sends packet > to the kernel module and returns. If any, upcall will be triggered > inside the kernel and handled by a separate thread in userspace. > But in case of userspace datapath full processing of the packet happens > inside the 'dpif_execute()' in the same thread that handled PACKET_OUT. > This causes an issue if upcall will lead to modification of flow rules. > For example, it could happen while processing of 'learn' actions. > Since whole handling of PACKET_OUT is protected by 'ofproto_mutex', > OVS will assert on attempt to take it recursively while processing > 'learn' actions: > >0 __GI_raise (sig=sig@entry=6) >1 __GI_abort () >2 ovs_abort_valist () >3 ovs_abort () >4 ovs_mutex_lock_at (where=where@entry=0xad4199 "ofproto/ofproto.c:5391") > >5 ofproto_flow_mod_learn () at ofproto/ofproto.c:5391 > >6 xlate_learn_action () at ofproto/ofproto-dpif-xlate.c:5378 > <'learn' action found> >7 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:6893 >8 xlate_recursively ()at ofproto/ofproto-dpif-xlate.c:4233 >9 xlate_table_action () at ofproto/ofproto-dpif-xlate.c:4361 > 10 in xlate_ofpact_resubmit () at ofproto/ofproto-dpif-xlate.c:4672 > 11 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:6773 > 12 xlate_actions ()at ofproto/ofproto-dpif-xlate.c:7570 > > 13 upcall_xlate () at ofproto/ofproto-dpif-upcall.c:1197 > 14 process_upcall () at ofproto/ofproto-dpif-upcall.c:1413 > 15 upcall_cb ()at ofproto/ofproto-dpif-upcall.c:1315 > 16 dp_netdev_upcall (DPIF_UC_MISS) at lib/dpif-netdev.c:6236 > > 17 handle_packet_upcall () at lib/dpif-netdev.c:6591 > 18 fast_path_processing () at lib/dpif-netdev.c:6709 > 19 dp_netdev_input__ ()at lib/dpif-netdev.c:6797 > 20 dp_netdev_recirculate ()at lib/dpif-netdev.c:6842 > 21 dp_execute_cb ()at lib/dpif-netdev.c:7158 > 22 odp_execute_actions () at lib/odp-execute.c:794 > 23 dp_netdev_execute_actions ()at lib/dpif-netdev.c:7332 > 24 dpif_netdev_execute () at lib/dpif-netdev.c:3725 > 25 dpif_netdev_operate () at lib/dpif-netdev.c:3756 > > 26 dpif_operate () at lib/dpif.c:1367 > 27 dpif_execute () at lib/dpif.c:1321 > 28 packet_execute () at ofproto/ofproto-dpif.c:4760 > 29 ofproto_packet_out_finish ()at ofproto/ofproto.c:3594 > > 30 handle_packet_out ()at ofproto/ofproto.c:3635 > 31 handle_single_part_openflow (OFPTYPE_PACKET_OUT) at > ofproto/ofproto.c:8400 > 32 handle_openflow () at > ofproto/ofproto.c:8587 > 33 ofconn_run () > 34 connmgr_run () > 35 ofproto_run () > 36 bridge_run__ () > 37 bridge_run () > 38 main () > > Fix that by splitting the 'ofproto_packet_out_finish()' in two parts. > First one that translates side-effects and requires holding 'ofproto_mutex' > and the second that only pushes packet to datapath. The second part moved > out of 'ofproto_mutex' because 'ofproto_mutex' is not required and actually > should not be taken in order to avoid recursive locking. > > Reported-by: Anil Kumar Koli > Reported-at: > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-April/048494.html > Signed-off-by: Ilya Maximets > --- > > Previous discussion about implementation: > https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/362082.html > > I'm not able to reproduce the original issue, so I can not test if > this patch fixes it. > > ofproto/ofproto-dpif.c | 40 -- > ofproto/ofproto-provider.h | 12 +--- > ofproto/ofproto.c | 29 +++ > 3 files changed, 64 insertions(+), 17 deletions(-) Hi Ben, Could you, please, take a look at this patch? Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall.
Handling of OpenFlow PACKET_OUT implies pushing the packet through the datapath and packet processing inside the datapath could trigger an upcall. In case of system datapath, 'dpif_execute()' sends packet to the kernel module and returns. If any, upcall will be triggered inside the kernel and handled by a separate thread in userspace. But in case of userspace datapath full processing of the packet happens inside the 'dpif_execute()' in the same thread that handled PACKET_OUT. This causes an issue if upcall will lead to modification of flow rules. For example, it could happen while processing of 'learn' actions. Since whole handling of PACKET_OUT is protected by 'ofproto_mutex', OVS will assert on attempt to take it recursively while processing 'learn' actions: 0 __GI_raise (sig=sig@entry=6) 1 __GI_abort () 2 ovs_abort_valist () 3 ovs_abort () 4 ovs_mutex_lock_at (where=where@entry=0xad4199 "ofproto/ofproto.c:5391") 5 ofproto_flow_mod_learn () at ofproto/ofproto.c:5391 6 xlate_learn_action () at ofproto/ofproto-dpif-xlate.c:5378 <'learn' action found> 7 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:6893 8 xlate_recursively ()at ofproto/ofproto-dpif-xlate.c:4233 9 xlate_table_action () at ofproto/ofproto-dpif-xlate.c:4361 10 in xlate_ofpact_resubmit () at ofproto/ofproto-dpif-xlate.c:4672 11 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:6773 12 xlate_actions ()at ofproto/ofproto-dpif-xlate.c:7570 13 upcall_xlate () at ofproto/ofproto-dpif-upcall.c:1197 14 process_upcall () at ofproto/ofproto-dpif-upcall.c:1413 15 upcall_cb ()at ofproto/ofproto-dpif-upcall.c:1315 16 dp_netdev_upcall (DPIF_UC_MISS) at lib/dpif-netdev.c:6236 17 handle_packet_upcall () at lib/dpif-netdev.c:6591 18 fast_path_processing () at lib/dpif-netdev.c:6709 19 dp_netdev_input__ ()at lib/dpif-netdev.c:6797 20 dp_netdev_recirculate ()at lib/dpif-netdev.c:6842 21 dp_execute_cb ()at lib/dpif-netdev.c:7158 22 odp_execute_actions () at lib/odp-execute.c:794 23 dp_netdev_execute_actions ()at lib/dpif-netdev.c:7332 24 dpif_netdev_execute () at lib/dpif-netdev.c:3725 25 dpif_netdev_operate () at lib/dpif-netdev.c:3756 26 dpif_operate () at lib/dpif.c:1367 27 dpif_execute () at lib/dpif.c:1321 28 packet_execute () at ofproto/ofproto-dpif.c:4760 29 ofproto_packet_out_finish ()at ofproto/ofproto.c:3594 30 handle_packet_out ()at ofproto/ofproto.c:3635 31 handle_single_part_openflow (OFPTYPE_PACKET_OUT) at ofproto/ofproto.c:8400 32 handle_openflow () at ofproto/ofproto.c:8587 33 ofconn_run () 34 connmgr_run () 35 ofproto_run () 36 bridge_run__ () 37 bridge_run () 38 main () Fix that by splitting the 'ofproto_packet_out_finish()' in two parts. First one that translates side-effects and requires holding 'ofproto_mutex' and the second that only pushes packet to datapath. The second part moved out of 'ofproto_mutex' because 'ofproto_mutex' is not required and actually should not be taken in order to avoid recursive locking. Reported-by: Anil Kumar Koli Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-April/048494.html Signed-off-by: Ilya Maximets --- Previous discussion about implementation: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/362082.html I'm not able to reproduce the original issue, so I can not test if this patch fixes it. ofproto/ofproto-dpif.c | 40 -- ofproto/ofproto-provider.h | 12 +--- ofproto/ofproto.c | 29 +++ 3 files changed, 64 insertions(+), 17 deletions(-) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index c35ec3e61..0466952e2 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -4827,12 +4827,13 @@ ofproto_dpif_xcache_execute(struct ofproto_dpif *ofproto, } static void -packet_execute(struct ofproto *ofproto_, struct ofproto_packet_out *opo) +packet_execute_prepare(struct ofproto *ofproto_, + struct ofproto_packet_out *opo) OVS_REQUIRES(ofproto_mutex) { struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_); struct dpif_flow_stats stats; -struct dpif_execute execute; +struct dpif_execute *execute; struct ofproto_dpif_packet_out *aux = opo->aux; ovs_assert(aux); @@ -4841,22 +4842,40 @@ packet_execute(struct ofproto *ofproto_, struct ofproto_packet_out *opo) dpif_flow_stats_extract(opo->flow, opo->packet, time_msec(), &stats); ofproto_dpif_xcache_execute(ofproto, &aux->xcache, &sta