Ben Pfaff <b...@ovn.org> wrote on 07/22/2016 02:21:01 PM:

> From: Ben Pfaff <b...@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: Guru Shetty <g...@ovn.org>, ovs dev <dev@openvswitch.org>
> Date: 07/22/2016 02:21 PM
> Subject: Re: [ovs-dev] ovn test failures
>
> On Fri, Jul 22, 2016 at 01:52:18PM -0500, Ryan Moats wrote:
> >
> > Guru Shetty <g...@ovn.org> wrote on 07/22/2016 12:31:43 PM:
> >
> > > From: Guru Shetty <g...@ovn.org>
> > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > Cc: Lance Richardson <lrich...@redhat.com>, ovs dev <dev@openvswitch.org>
> > > Date: 07/22/2016 12:31 PM
> > > Subject: Re: [ovs-dev] ovn test failures
> > >
> > > On 21 July 2016 at 06:05, Ryan Moats <rmo...@us.ibm.com> wrote:
> > > "dev" <dev-boun...@openvswitch.org> wrote on 07/21/2016 06:32:02 AM:
> > >
> > > > From: Lance Richardson <lrich...@redhat.com>
> > > > To: ovs dev <dev@openvswitch.org>
> > > > Date: 07/21/2016 06:32 AM
> > > > Subject: [ovs-dev] ovn test failures
> > > > Sent by: "dev" <dev-boun...@openvswitch.org>
> > > >
> > > > It seems the failure rate for OVN end-to-end tests went up significantly
> > > > when commit 70c7cfef188b5ae9940abd5b7d9fe46b1fa88c8e was merged earlier
> > > > this week.
> > > >
> > > > After this commit, 100 iterations of
> > > > "make check TESTSUITEFLAGS='-j8 -k ovn'"
> > > > gave (number of failures in left-most column):
> > > >    2 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1312)
> > > >   10 2183: ovn -- 2 HVs, 2 LS, 1 lport/LS, 2 peer LRs FAILED (ovn.at:2416)
> > > >   52 2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR FAILED (ovn.at:2529)
> > > >   45 2185: ovn -- 1 HV, 2 LSs, 1 lport/LS, 1 LR FAILED (ovn.at:2668)
> > > >   23 2186: ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes FAILED (ovn.at:2819)
> > > >   53 2188: ovn -- 2 HVs, 3 LRs connected via LS, static routes FAILED (ovn.at:3053)
> > > >   32 2189: ovn -- 2 HVs, 2 LRs connected via LS, gateway router FAILED (ovn.at:3237)
> > > >   50 2190: ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR FAILED (ovn.at:3389)
> > > >
> > > > Immediately prior to this (at commit
> > > > 48ff3e25abe31b761d2d3f3a2fd6ccaa783c79dc),
> > > > the number of failures per 100 iterations was much lower:
> > > >    1 2178: ovn -- 2 HVs, 4 lports/HV, localnet ports FAILED (ovn.at:1020)
> > > >    1 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1307)
> > > >    1 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1312)
> > > >    9 2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR FAILED (ovn.at:2529)
> > > >    7 2186: ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes FAILED (ovn.at:2819)
> > > >    1 2187: ovn -- send gratuitous arp on localnet FAILED (ovn.at:2874)
> > > >   16 2188: ovn -- 2 HVs, 3 LRs connected via LS, static routes FAILED (ovn.at:3053)
> > > >
> > > > Any ideas?
> > > >
> > > > Thanks,
> > > >
> > > > Lance
> > >
> > > As author of that patch, I will admit that those numbers are a
> > > bit disturbing, because they aren't consistent with what I was
> > > seeing while developing and testing the patch series.
> > >
> > > What they make me suspect is that that patch doesn't catch all
> > > state transitions (similar to what you uncovered with commit
> > > f94705d729459d808fd139c8f95d5f1f8d8becc6) correctly.
> > >
> > > Two things come to mind:
> > > 1) Make sure that all of the places where the code needs to request
> > >    a full process of tables are correctly handled.
> > > 2) If a later step in the process finds that an earlier step in
> > >    the process needs to process the database rows fully during the
> > >    next cycle, use poll_immediate_wake so that processing happens
> > >    sooner rather than later.
> > >
> > > Ryan,
> > > Were you planning to look at the failures? Should we revert the patch?
> > >
> >
> > Guru-
> >
> > Yes, I have been looking at the failures since Wed and I have a patch set
> > that addresses all of these failures. However, I'm travelling today, so I
> > won't be able to mail it until either late tonight or tomorrow morning
> > (US Central Time).
>
> We'll look forward to it. I think that these are probably affecting
> everyone who regularly runs the tests. It'll be nice to get them fixed
> soon.
I had time to make it happen from the Austin airport - http://openvswitch.org/pipermail/dev/2016-July/075942.html