On 12/17/13 20:42, RD Thrush wrote:
> On 12/17/13 20:01, Kenneth R Westerback wrote:
>> On Tue, Dec 17, 2013 at 07:32:02PM -0500, RD Thrush wrote:
>>> On 11/22/13 12:03, Stuart Henderson wrote:
>>>> On 2013/11/22 08:47, RD Thrush wrote:
>>>>> On 11/11/13 11:22, Stuart Henderson wrote:
>>>>>> On 2013/11/11 09:53, RD Thrush wrote:
>>>>>>>> Synopsis: Firewall panic with Nov 10 snapshot
>>>>>>>> Category: kernel
>>>>>>>> Environment:
>>>>>>> System : OpenBSD 5.4
>>>>>>> Details : OpenBSD 5.4-current (GENERIC) #142: Sun Nov 10 22:52:49 MST 2013
>>>>>>> dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
>>>>>>> Architecture: OpenBSD.i386
>>>>>>> Machine : i386
>>>>>>>> Description:
>>>>>>> Soekris 5501 firewall panics an hour after booting new snapshot.
>>>>>>> Appended is some ddb info as well as normal sendbug details.
>>>>>>>> How-To-Repeat:
>>>>>>> Don't know.
>>>>>>>> Fix:
>>>>>>> Revert to Nov 7 kernel
>>>>>>
>>>>>> I've reverted the bpf commit for now, it looks like the change is
>>>>>> invalidating assumptions of the conditional around bpf_read()'s
>>>>>> tsleep in bpf.c:439 ..
>>>>>
>>>>> It appears this problem still exists. I've had panic's on 3 machines
>>>>> since upgrading to the Nov 20 snap (2 amd64, 1 i386). I've attached
>>>>> the report from the x2 machine and am appending ddb's trace,ps,show
>>>>> registers and callout from all three. I have the full serial captures
>>>>> if more info is required.
>>>>
>>>> I've just updated things on my router at home and hit this (with
>>>> ladvd), including with the bpf.c commits reverted.
>>>>
>>>> I'm using "ladvd -Lz" and hit the panic pretty much as soon as it starts.
>>>
>>> The panic remains with today's snapshot. This problem originated with
>>> v1.84 of sys/net/bpf.c. Despite several reverts since, the problem
>>> remains. It is easy to reproduce, ie.:
>>
>> Not sure how we can point the finger at remnants of 1.84 since
>>
>> [ snip ]
>> i.e. only comment changes from 1.83 remain. What rev of bpf.c does work
>> for you?
>>
>> .... Ken
>
> I believe I had reverted to the Nov. 7 snapshot on the soekris 5501
> successfully. Since then I've tried various combinations of amd64/i386 &
> sp/mp, with newer snaps and reported some of them in this thread. I
> stopped using darkstat on the firewall after that initial report.
>
> I'm afraid I should have said the problem started w/ the Nov 11 snapshot
> rather than incorrectly point the finger at the bpf.c/bpfdesc.h changes.
>
> Unfortunately, I no longer have any 5.4 snapshots older than Nov 11...
>
> I do have a crash dump (and can easily reproduce another) but am not able
> to analyze them.
>
> What else can I do to help?

FWIW, I built a GENERIC kernel from cvs as of Nov 11 00:00 GMT and that
kernel did *not* panic. I noticed that although bpf.c was reverted,
bpfdesc.h was not. Reverting bpfdesc.h to before Nov 11 results in a
kernel that passes the darkstat exercise. Here's what I used:

Index: bpfdesc.h
===================================================================
RCS file: /a8v/pub2/cvsroot/OpenBSD/src/sys/net/bpfdesc.h,v
retrieving revision 1.21
diff -w -b -u -r1.21 bpfdesc.h
--- bpfdesc.h	12 Nov 2013 01:12:09 -0000	1.21
+++ bpfdesc.h	18 Dec 2013 05:24:05 -0000
@@ -67,8 +67,8 @@
 	int		bd_bufsize;	/* absolute length of buffers */
 	struct bpf_if *	bd_bif;		/* interface descriptor */
-	int		bd_rtout;	/* Read timeout in 'ticks' */
-	int		bd_rdStart;	/* when the read started */
+	u_long		bd_rtout;	/* Read timeout in 'ticks' */
+	u_long		bd_rdStart;	/* when the read started */
 	struct bpf_insn *bd_rfilter;	/* read filter code */
 	struct bpf_insn *bd_wfilter;	/* write filter code */
 	u_long		bd_rcount;	/* number of packets received */
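
For anyone following along, here is a minimal sketch of how the field type alone can change the outcome of a "start + timeout < ticks" style comparison. This is not code from bpf.c and not a diagnosis of the panic. The helper names and the exact form of the check are assumptions made up for illustration; the only thing taken from the diff above is the int versus u_long difference.

```c
/*
 * Hypothetical helpers, NOT taken from sys/net/bpf.c.
 * Both ask the same question: is (start + timeout) earlier than 'ticks'?
 */

/* Fields declared as int: the comparison is done in signed arithmetic. */
static int
deadline_before_ticks_int(int rd_start, int rtout, int ticks)
{
	return (rd_start + rtout) < ticks;
}

/*
 * Fields declared as unsigned long (u_long): 'ticks' is converted to
 * unsigned long before comparing, so a negative tick count becomes a
 * very large positive number.
 */
static int
deadline_before_ticks_ulong(unsigned long rd_start, unsigned long rtout,
    int ticks)
{
	return (rd_start + rtout) < (unsigned long)ticks;
}
```

With a positive tick count the two helpers agree, e.g. both return 1 for (100, 50, 200). With a negative tick count they disagree: for (100, 50, -5) the int version returns 0 and the unsigned long version returns 1. Whether the real code ever sees such values is a separate question that this sketch does not answer.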