----- Original Message -----
> From: "Ryan Moats" <rmo...@us.ibm.com>
> To: "Lance Richardson" <lrich...@redhat.com>
> Cc: dev@openvswitch.org
> Sent: Monday, July 18, 2016 11:36:27 PM
> Subject: Re: [ovs-dev] [PATCH] netdev-dummy: fix crash with more than one     
> passive connection
> 
> "dev" <dev-boun...@openvswitch.org> wrote on 07/18/2016 11:29:28 AM:
> 
> >
> > > From: "Lance Richardson" <lrich...@redhat.com>
> > > To: dev@openvswitch.org
> > > Sent: Wednesday, July 6, 2016 7:39:52 PM
> > > Subject: [ovs-dev] [PATCH] netdev-dummy: fix crash with more than
> > one   passive connection
> > >
> > > Investigation found that Some of the occasional failures in the
> > > "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case are caused
> > > by ovs-vswitchd crashing with SIGSEGV. It turns out that the
> > > crash occurrs when the number of netdev-dummy passive connections
> > > transitions from 1 to 2.  When xrealloc() copies the array of
> > > dummy_packet_stream structures from the original buffer to a
> > > newly allocated one, the struct ovs_list txq member of the structure
> > > becomes corrupt (e.g. if ovs_list_is_empty() would have returned
> > > false before the copy, it will return true after the copy, which
> > > will lead to a crash when the bogus packet buffer on the list is
> > > dereferenced).
> > >
> > > Fix by taking a hint from David Wheeler and adding a level of
> > > indirection.
> > >
> > > Signed-off-by: Lance Richardson <lrich...@redhat.com>
> 
> [snip]
> 
> >
> > Here is a small script that reliably reproduces the crash in
> ovs-vswitchd.
> > I don't have an explanation for why we have two connections to the same
> > port in the "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case (this
> > happens infrequently), perhaps it's something like the active connect
> from
> > a peer ovs-vswitchd being interrupted and re-tried.
> >
> > #!/bin/bash
> >
> > set -ex
> >
> > PWD=$(pwd)
> > export PATH=${PWD}/vswitchd:${PWD}/utilities:${PWD}/ovsdb:$PATH
> >
> > ovs_setenv() {
> >     export OVS_RUNDIR=${PWD}/$1
> >     export OVS_LOGDIR=${PWD}/$1
> >     export OVS_DBDIR=${PWD}/$1
> >     export OVS_SYSCONFDIR=${PWD}/$1
> >     export OVS_PKGDATADIR=${PWD}/$1
> > }
> >
> > for x in repro_main repro_hv1 repro_hv2; do
> >     mkdir -p $x
> >     rm -f $x/*
> >     ovs_setenv $x
> >     : > $OVS_RUNDIR/.conf.db.~lock~
> >     ovsdb-tool create $OVS_RUNDIR/conf.db vswitchd/vswitch.ovsschema
> >     ovsdb-server --remote=punix:$OVS_RUNDIR/db.sock  -vconsole:off
> > --detach --no-chdir --pidfile --log-file
> >     ovs-vsctl --no-wait -- init
> >     ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -
> > vunixctl -vconsole:off --detach --no-chdir --pidfile --log-file
> > done
> >
> > ovs_setenv repro_main
> > ovs-vsctl add-br foo  \
> >           -- add-port foo p1 \
> >           -- set Interface p1
> options:pstream="punix:$PWD/repro_main/p1.sock"
> >
> > ovs_setenv repro_hv1
> > ovs-vsctl add-br br1 \
> >           -- add-port br1 p1 \
> >           -- set Interface p1
> options:stream="unix:$PWD/repro_main/p1.sock"
> >
> > ovs_setenv repro_hv2
> > ovs-vsctl add-br br1 \
> >           -- add-port br1 p1 \
> >           -- set Interface p1
> options:stream="unix:$PWD/repro_main/p1.sock"
> 
> I like this, but I think I'm going to be consistent and ask if you
> can spin a new version with the above script as a unit test so that
> we can see the crash before applying the rest of the patch and then
> verify that it doesn't happen with the patch.
> 
> Ryan
> 

Hi Ryan,

Thanks for the review.  I'm a little unclear on whether it's a good idea
to add a test case that doesn't verify any functionality and is no longer
useful as soon as it is merged.  I was rather hoping that the script provided
above would serve the "see the crash before applying" function.

I'm fine with writing and posting such a test case if that's the policy,
but I do wonder how many new test cases we'll have if one is written for
every crash that has been fixed, and how effective this will be at detecting
future crash causes given the enormous number of ways code can be changed
to produce a crash.

On the other hand it probably would be a good idea to add test cases to
verify netdev-dummy functionality (especially for things like multiple
passive connection support that aren't already used in other test cases).
This is probably a larger effort though.

Thanks,

   Lance
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to