Re: Lenovo G50-70 : kernel panic for sh /etc/netstart (current apr 1, amd64, re0, bsd.mp - reproducible)

Adam Wolk Sat, 04 Apr 2015 09:03:43 -0700

On Sat, Apr 4, 2015, at 03:36 PM, Adam Wolk wrote:
> On Sat, Apr 4, 2015, at 03:19 PM, Adam Wolk wrote:
> > > > You mentioned earlier some watchdog timeout.  Do you know if you always
> > > > see one when the pool corruption triggers?  You can type "dmesg" at the
> > > > ddb prompt to check if there's any weird message before the panic.
> > > > 
> > > 
> > > No. I only see the watchdog timeout if I don't try to start netstart
> > > by hand. If it's only started in the boot process due to the
> > > existence of /etc/hostname.re then after a while I see a blue
> > > watchdog timeout for re0 but it does not result in the kernel
> > > panicking.
> > > 
> > > If I start it manually then I immediately get a kernel panic. By
> > > manually I mean:
> > >  - init tries to netstart, passes without crash but doesn't get an
> > >  address
> > >  - I run sh /etc/netstart immediately after logging in
> > > 
> > > I just did that again to check dmesg from ddb. There's no watchdog
> > > entry there in this case. Just the last boot message (WARNING: / was
> > > not properly unmounted, as I had a hard crash) followed by the ddb
> > > 'panic' entry.
> > > 
> > > I can reproduce this issue every time, so I can help with debugging
> > > it given some newbie guidance ;)
> > > 
> > 
> > One more thing. It's worth noting that this only happens when re0 is
> > set to dhcp. If I manually assign an address the kernel doesn't panic,
> > and once I even had proper connectivity with a manual IP setup.
> > 
> > Regards,
> > Adam
> > 
> 
> OK, I think I found a really interesting thing with this. I removed
> /etc/hostname.re0, booted up and did the test again.
> 
> # echo dhcp > /etc/hostname.re0
> # sh /etc/netstart
> WARNING: /etc/hostname.re0 is insecure, fixing permissions
> DHCPDISCOVER on re0 - interval 3
> DHCPDISCOVER on re0 - interval 8
> DHCPDISCOVER on re0 - interval 15
> DHCPDISCOVER on re0 - interval 10
> DHCPDISCOVER on re0 - interval 15
> DHCPDISCOVER on re0 - interval 10
> No acceptable DHCPOFFERS received.
> No working leases in persistent database - sleeping.
> # sh /etc/netstart
> panic: pool_do_get: mc12k free list modified: page
> 0xffffff00a569d000;......
> SNIP
> 
> So as you can see, the issue is triggered by the second run of
> netstart, and only when re0 is set to dhcp. Could it be dhclient
> holding on to something? ps in ddb shows that two dhclient processes
> are running.
> 
> PS.
> attached the acpidump.
> 
> Regards,
> Adam
> Email had 1 attachment:
> + acpidump.tar.gz
>   1k (application/x-tar-gz)
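
Since the panic above only shows up on the second netstart run, with two
dhclient processes alive, it may help to record how many are running
before each run. A rough sketch only: count_procs is a helper name made
up for this mail, and it assumes ps accepts -A and -o comm= on the box.

```shell
#!/bin/sh
# count_procs NAME: print how many running processes have exactly this
# command name. Hypothetical helper, not part of netstart or dhclient.
count_procs() {
    # "comm=" prints only the command name and suppresses the header line;
    # grep -c counts matching lines, -x requires a whole-line match.
    ps -A -o comm= | grep -cx -- "$1"
}

# Example: note the count right before re-running netstart.
# echo "dhclient processes: $(count_procs dhclient)"
```

If this prints 1 before the second "sh /etc/netstart", the first dhclient
is still around when the next one gets started.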


Final data point. I now believe that the 'freezes' I'm experiencing are
actually caused by the watchdog timeout for re0 (even with manual inet
config). I tried several times; each time "re0: watchdog timeout" is the
last entry in /var/log/messages before the reboot.

Although I am not dropped into a ddb session, the system simply becomes
unresponsive, forcing me into a hard reboot.

I also can't reliably ping my gateway with the re0 driver. I managed to
get it working only once. The router sees the box but the box is unable
to ping the router ('host down').
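
For anyone who wants to check the same thing in their own logs, this is
roughly how the watchdog entries can be counted after a reboot. It is
only a sketch: the function names are made up for this mail, and the
match string "IFACE: watchdog timeout" is my assumption about the exact
wording the driver logs.

```shell
#!/bin/sh
# count_watchdog IFACE LOGFILE: print the number of watchdog timeout
# lines logged for IFACE. Hypothetical helper; the message text matched
# here is an assumption about what the driver prints.
count_watchdog() {
    grep -c -- "$1: watchdog timeout" "$2"
}

# last_is_watchdog IFACE LOGFILE: succeed if the last line of LOGFILE
# is a watchdog timeout for IFACE.
last_is_watchdog() {
    tail -n 1 "$2" | grep -q -- "$1: watchdog timeout"
}

# Example, run after rebooting from a freeze:
# count_watchdog re0 /var/log/messages
# last_is_watchdog re0 /var/log/messages && echo "last entry was a watchdog timeout"
```

Once the box is back up, the last line of the log will be a fresh boot
message, so the second check is mainly useful on a copy of the log taken
before new entries are appended.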

Re: Lenovo G50-70 : kernel panic for sh /etc/netstart (current apr 1, amd64, re0, bsd.mp - reproducible)
