On Mon, May 15, 2017 at 11:28 AM, Mike Belopuhov <m...@belopuhov.com> wrote:

> On Mon, May 15, 2017 at 11:18 -0400, Dan Cross wrote:
> > On Mon, May 15, 2017 at 11:01 AM, Mike Belopuhov <m...@belopuhov.com> wrote:
> > >
> > > Thanks for reporting this, however there's not enough info to follow
> > > up on this right now.  What is clear is that your provider is using
> > > an ancient version of Xen that doesn't even support the callback
> > > vector interrupt delivery (the emulated xspd0 device is delivering
> > > all interrupts).  We have developed code for Xen 4.5+ platforms and
> > > there was only some testing done by users on 3.x.  So, in a way, you
> > > can consider Xen 3.x to not be officially supported at this point.
> >
> > That's unfortunate. Sadly, this is common across two different providers
> > (Panix and rootbsd.net). The latter, I'm sure, would at least be interested
> > in coordinating with you guys to get a fix. I'll open a trouble ticket with
> > them.
> >
> > > Having said that, I've got a few questions:
> > >
> > >  - Do you see other write failures as well?
> >
> > Yes. E.g., syslogd had a similar write failure before the panic.
>
> Can you reproduce any of these write failures at will?
>

I'm not sure what you mean. If I induce the load conditions, then the VM
will panic fairly reliably.

> What happens when you just send a signal to dump the core?
> You can test this by running "sleep 100", and then call
> "pkill -ABRT -lf sleep".


I'm not sure what this shows, but sure I can do that:

: jaan; /bin/sleep 100&
[1] 20701
: jaan; pkill -ABRT -lf sleep
20701 sleep
: jaan;
[1]  + abort (core dumped)  /bin/sleep 100
: jaan; ls -l sleep.core
-rw-------  1 cross  staff  4208416 May 15 15:42 sleep.core
: jaan;

The panic-inducing condition seems to be that, for whatever reason, the
kernel gets into a funny state where processes like init(8) die due to
having part of their VM image corrupted; the kernel then panics because
`init` dies.

> > >  - Do you have swap enabled? (pstat -s)
> >
> > Yes; a gig:
> >
> > : jaan; pstat -s
> > Device      1K-blocks     Used    Avail Capacity  Priority
> > /dev/sd0b     1048249        0  1048249     0%    0
> > : jaan;
> >
>
> Do you see swap being used under your load?


I'm not sure. I can try to crash a machine again and poke at a kernel
variable from ddb to see; anything in particular you want me to look at?
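
One way to answer the swap question without going through ddb would be to
log `pstat -s` while the load runs. This is only a rough sketch, assuming
the output layout shown above (a header line, then one line per device with
"Used" in the third column); the function names and the one-second interval
are made up for illustration, and the units are whatever the header reports
(1K blocks in the output above).

```shell
#!/bin/sh
# Sketch: sum the "Used" column of `pstat -s`-style output read from stdin.
# swap_used_blocks is a hypothetical helper name, not an existing tool.
# Skips the header line and any "Total" summary line.
swap_used_blocks() {
    awk 'NR > 1 && $1 != "Total" { used += $3 } END { print used + 0 }'
}

# Print a timestamped reading once a second until interrupted (^C).
# The interval is arbitrary; start this before inducing the load.
log_swap() {
    while :; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$(pstat -s | swap_used_blocks)"
        sleep 1
    done
}
```

Running `log_swap > swap.log` in one terminal and the load in another
would leave a record of whether the used count ever moved off zero before
the panic, provided the log reaches disk before the machine goes down.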

> > >  - Do you see crashes when bsd.mp is used instead of a single processor
> > >    kernel (that's right, even on the single processor VM)?
> >
> > Yes; the panic happens whether using single- or multi-processor kernels.
>
> Good, nothing has slipped through those cracks again.
>

I can see the value in narrowing down the search space. :-)

        - Dan C.
