On Sat, 2017-05-27 at 01:43 +0000, wangyunjian wrote:
> 
> > -----Original Message-----
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Greg Rose
> > Sent: Friday, May 26, 2017 11:51 PM
> > To: ovs-dev@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH v2] netflow: Fix memory leak in
> > netflow_unref.
> > 
> > On Thu, 2017-05-25 at 21:05 -0700, Greg Rose wrote:
> > > On Thu, 2017-05-25 at 09:53 -0700, Greg Rose wrote:
> > > > On Mon, 2017-05-22 at 07:40 -0700, Greg Rose wrote:
> > > > > On Mon, 2017-05-22 at 12:55 +0800, Yunjian Wang wrote:
> > > > > > The memory leak was triggered each time on calling
> > > > > > netflow_unref() with containing netflow_flows. And flows need to be
> > removed and destroyed.
> > > > > >
> > > > > > Signed-off-by: Yunjian Wang <wangyunj...@huawei.com>
> > > > > > ---
> > > > > >  ofproto/netflow.c | 7 +++++++
> > > > > >  1 file changed, 7 insertions(+)
> > > > > >
> > > > > > diff --git a/ofproto/netflow.c b/ofproto/netflow.c index
> > > > > > 55f7814..29c5f3e 100644
> > > > > > --- a/ofproto/netflow.c
> > > > > > +++ b/ofproto/netflow.c
> > > > > > @@ -409,10 +409,17 @@ netflow_ref(const struct netflow *nf_)
> > > > > > void  netflow_unref(struct netflow *nf)  {
> > > > > > +    struct netflow_flow *nf_flow, *next;
> > > > > > +
> > > > > >      if (nf && ovs_refcount_unref_relaxed(&nf->ref_cnt) == 1) {
> > > > > >          atomic_count_dec(&netflow_count);
> > > > > >          collectors_destroy(nf->collectors);
> > > > > >          ofpbuf_uninit(&nf->packet);
> > > > > > +        HMAP_FOR_EACH_SAFE (nf_flow, next, hmap_node, &nf-
> > >flows) {
> > > > > > +            hmap_remove(&nf->flows, &nf_flow->hmap_node);
> > > > > > +            free(nf_flow);
> > > > > > +        }
> > > > > > +        hmap_destroy(&nf->flows);
> > > > > >          free(nf);
> > > > > >      }
> > > > > >  }
> > > > >
> > > > > This looks right to me.  The only other place I see the flows
> > > > > freed is when they're detected as idle.  If the flow is never
> > > > > detected as idle then I don't see anywhere else that they are
> > > > > freed up after the xzalloc in netflow_flow_update().
> > > > >
> > > > > Reviewed-by: Greg Rose <gvrose8...@gmail.com>
> > > > >
> > > >
> > > > I'm trying to test this but the condition never seems to be met and
> > > > thus the 4 lines of additional code free flows never executes.  Is
> > > > there some particular flow or type of network traffic that will execute 
> > > > this
> > code?
> > > >
> > > > I'd like to add a Tested-by for this but unless I can get the code
> > > > to execute I can't.
> > > >
> > > > - Greg
> > >
> > > OK, I've finally got my netflow configuration working right with
> > > ntopng collecting the flows.  I can test the patch tomorrow, it's
> > > getting late for me now.
> > >
> > > Thanks,
> > >
> > > - Greg
> > 
> > Actually, ntopng doesn't work as the flow collector itself (I spoke too
> > soon) but I am successfully creating netflows which I can see with tcpdump.
> > Then we are hitting the condition net netflow_unref() when I execute this
> > command:
> > 
> > ovs-vsctl clear Bridge int-br1 netflow
> > 
> > However, the '*' marked lines of code are never executed when I place a gdb
> > breakpoint on it:
> > 
> >         HMAP_FOR_EACH_SAFE (nf_flow, next, hmap_node, &nf->flows) {
> > *            hmap_remove(&nf->flows, &nf_flow->hmap_node);
> > *            free(nf_flow);
> >         }
> > 
> > It seems the code is never reached.  Unless we can see some example of the
> > code being reached and executed I worry about this patch.
> > 
> > Thanks,
> > 
> > - Greg
> 
> It is probable that the need to test many times.
> 
> Thanks, 
> 
> Yunjian
> 
> my ovs version: openvswitch-2.7.0
> 
> Test Script:
> ovs-vsctl add-br ovs
> ovs-vsctl add-port ovs eth1
> 
> for (( i=0; i<5000000; i=i+1 )); do
>       ovs-vsctl -- --id=@netflow create netflow targets=\"3.3.2.26:9996\" 
> active_timeout=30 -- set bridge ovs netflow=@netflow
>       sleep 3
>       ovs-vsctl clear bridge ovs netflow
>       sleep 1
> done
> 
> Test results:
> (gdb) b ofproto/netflow.c:419
> Breakpoint 1 at 0x4aeb74: file ofproto/netflow.c, line 419.
> (gdb) c
> Continuing.
> [Switching to Thread 0x7f09ed273700 (LWP 19046)]
> 
> Breakpoint 1, netflow_unref (nf=0x408dbd0) at ofproto/netflow.c:419
> 419               hmap_remove(&nf->flows, &nf_flow->hmap_node);
> (gdb) bt
> #0  netflow_unref (nf=0x408dbd0) at ofproto/netflow.c:419
> #1  0x00000000004a4347 in xlate_cache_clear_netflow (netflow=0x408dbd0, 
> flow=0x7f09dc011a90) at ofproto/ofproto-dpif-xlate-cache.c:200
> #2  0x00000000004a43e2 in xlate_cache_clear_entry (entry=0x7f09dc010d08) at 
> ofproto/ofproto-dpif-xlate-cache.c:221
> #3  0x00000000004a44dc in xlate_cache_clear (xcache=0x7f09dc01ca90) at 
> ofproto/ofproto-dpif-xlate-cache.c:262
> #4  0x00000000004a451e in xlate_cache_uninit (xcache=0x7f09dc01ca90) at 
> ofproto/ofproto-dpif-xlate-cache.c:271
> #5  0x00000000004a4544 in xlate_cache_delete (xcache=0x7f09dc01ca90) at 
> ofproto/ofproto-dpif-xlate-cache.c:278
> #6  0x00000000004911d9 in ukey_delete__ (ukey=0x40987d0) at 
> ofproto/ofproto-dpif-upcall.c:1801
> #7  0x000000000056c9f5 in ovsrcu_call_postponed () at lib/ovs-rcu.c:323
> #8  0x000000000056ca9b in ovsrcu_postpone_thread (arg=0x0) at 
> lib/ovs-rcu.c:338
> #9  0x000000000056efa0 in ovsthread_wrapper (aux_=0x3fa3c30) at 
> lib/ovs-thread.c:348
> #10 0x00007f09f1e44dc5 in start_thread () from /usr/lib64/libpthread.so.0
> #11 0x00007f09f0c0a71d in clone () from /usr/lib64/libc.so.6
> (gdb) l
> 414       if (nf && ovs_refcount_unref_relaxed(&nf->ref_cnt) == 1) {
> 415           atomic_count_dec(&netflow_count);
> 416           collectors_destroy(nf->collectors);
> 417           ofpbuf_uninit(&nf->packet);
> 418                     HMAP_FOR_EACH_SAFE (nf_flow, next, hmap_node, 
> &nf->flows) {
> 419               hmap_remove(&nf->flows, &nf_flow->hmap_node);
> 420               free(nf_flow);
> 421           }
> 422           free(nf);
> 423       }
> (gdb) info b
> Num     Type           Disp Enb Address            What
> 1       breakpoint     keep y   0x00000000004aeb74 in netflow_unref at 
> ofproto/netflow.c:419
>       breakpoint already hit 1 time
> (gdb) p nf->flows 
> $1 = {buckets = 0x408dc50, one = 0x3fd1fd0, mask = 0, n = 1}
> (gdb)

I haven't been able to hit this breakpoint in my own setup after running
this test for several days.  It may be due to different types of traffic
or some other subtle difference.  I'm running iperf sessions with
multiple ports/ip address combinations to create a lot of netflows and
netflow updates but perhaps that is not enough to trigger the right
condition.

In any case, the code looks correct as per my review and Yunjian has a
reproducer so I'll leave it to the maintainers as to what action they
will take.  I know that there were some further comments from Joe and
Ben.

Thanks,

- Greg

> 
> > 
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev



_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to