王亮 Liang Wang <wanglian...@didiglobal.com> writes:

> Hi, Conole. Thank you for pointing out the mistakes in my last patch. Yes, I 
> did missed some something important.
> The correct and complete way to fix this problem is like the following patch. 
> In fact, we have found and fixed this
> crash problem one year ago; and this patch has worked very well in our 
> production environment for 11 months.
>
> From 7e13a39cdbdfb484d4cb6b074f08471dc16d22b7 Mon Sep 17 00:00:00 2001
> From: Wang Liang <wanglian...@didiglobal.com>
> Date: Thu, 15 Apr 2021 13:52:01 +0800
> Subject: [PATCH] Fix userspace datapath crash caused by IP fragments
>
> The ovs userspace datapath will crash on openflow packet-out message
> with IP packet largger than MTU. This patch will fix the problem.
> When pre-processing the IP fragments, clone the packet if it has
> 'dnsteal' flag set.
>
> Signed-off-by: Wang Liang <wanglian...@didiglobal.com>
> ---
>  lib/ipf.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/lib/ipf.c b/lib/ipf.c
> index c20bcc0b3..176140afb 100644
> --- a/lib/ipf.c
> +++ b/lib/ipf.c
> @@ -811,7 +811,7 @@ ipf_process_frag(struct ipf *ipf, struct ipf_list 
> *ipf_list,
>               * recommend not setting the mempool number of buffers too low
>               * and also clamp the number of fragments. */
>              struct ipf_frag *frag = &ipf_list->frag_list[last_inuse_idx + 1];
> -            frag->pkt = pkt;
> +            frag->pkt = dnsteal ? dp_packet_clone(pkt) : pkt;
>              frag->start_data_byte = start_data_byte;
>              frag->end_data_byte = end_data_byte;
>              frag->dnsteal = dnsteal;
> @@ -1338,9 +1338,7 @@ ipf_destroy(struct ipf *ipf)
>          while (ipf_list->last_sent_idx < ipf_list->last_inuse_idx) {
>              struct dp_packet *pkt
>                  = ipf_list->frag_list[ipf_list->last_sent_idx + 1].pkt;
> -            if (!ipf_list->frag_list[ipf_list->last_sent_idx + 1].dnsteal) {
> -                dp_packet_delete(pkt);
> -            }
> +            dp_packet_delete(pkt);

This looks better.

There are no more uses of (struct ipf_frag *)->dnsteal - so I think we should
remove it from the struct as well.

I know you wrote that you've been running this in production for a year,
and I think the change looks correct, but is it possible to have a test
case (maybe even a pcap that we can extract) to go along with it?

>              atomic_count_dec(&ipf->nfrag);
>              ipf_list->last_sent_idx++;
>          }

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to