Tracking a Fatal Double Fault

1999-02-08 Thread tcobb
Can someone please give me a short guide
on how to track down a fatal double fault?
System is 3.0-19990205-STABLE, and I've written
down the fault info.

Thanks,

-Troy Cobb
 Circle Net, Inc.
 http://www.circle.net

To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-current" in the body of the message


Re: Tracking a Fatal Double Fault

1999-02-08 Thread Mike Smith
> Can someone please give me a short guide
> on how to track down a fatal double fault?
> System is 3.0-19990205-STABLE, and I've written
> down the fault info.

Ack.  It's actually pretty difficult.  You can start by trying to 
locate the PC for the fault in the kernel image, but the typical cause 
of a double fault is running out of kernel stack. 

Are you running any custom kernel code?
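
As a rough illustration of "locating the PC in the kernel image": one
way is to look the faulting address up in a symbol listing sorted by
address, such as the output of `nm -n` on the kernel file.  The sketch
below assumes that three-column format; the addresses in SAMPLE_NM are
made up for the example, and parse_nm/lookup are hypothetical helper
names, not part of any FreeBSD tool.

```python
import bisect

# Hypothetical excerpt of `nm -n`-style output: address, type, name.
# The addresses are invented for illustration only.
SAMPLE_NM = """\
f0100000 T btext
f0123400 T xl_intr
f0123a80 T xl_start
f0200000 T trap
"""

def parse_nm(text):
    """Return a sorted list of (address, name) pairs from nm-style lines."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # skip undefined symbols, which have no address
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def lookup(symbols, pc):
    """Return (name, offset) for the last symbol at or below pc, else None."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, pc) - 1
    if i < 0:
        return None  # pc lies below the first known symbol
    addr, name = symbols[i]
    return name, pc - addr

symbols = parse_nm(SAMPLE_NM)
name, offset = lookup(symbols, 0xf0123424)
print(f"{name}+{offset:#x}")
```

The lookup only names the function whose start address is nearest below
the PC; it says nothing about whether the PC really belongs to that
function, so the result is a starting point rather than a diagnosis.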

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  m...@smith.net.au
\\  The race is long, and in the  \\  msm...@freebsd.org
\\  end it's only with yourself.  \\  msm...@cdrom.com





RE: Tracking a Fatal Double Fault

1999-02-08 Thread tcobb
The machine is running a custom kernel, but nothing
very unusual.  My instinct is that it may be related to 
something with the 3Com 3c905B cards that I reported
earlier; I'm trying with Intel EtherExpresses right now
and getting no fault problems.

The double-fault does not occur consistently, unfortunately,
and typically only occurs during my rc.local stuff (loading
a bunch (100+) of chrooted daemons) on boot-up.

Would the eip/esp/ebp values be worth sending?


-Troy Cobb
 Circle Net, Inc.
 http://www.circle.net




Re: Tracking a Fatal Double Fault

1999-02-08 Thread Mike Smith
> The machine is running a custom kernel, but nothing
> very unusual.  My instinct is that it may be related to 
> something with the 3Com 3c905B cards that I reported
> earlier; I'm trying with Intel EtherExpresses right now
> and getting no fault problems.
> 
> The double-fault does not occur consistently, unfortunately,
> and typically only occurs during my rc.local stuff (loading
> a bunch (100+) of chrooted daemons) on boot-up.
> 
> Would the eip/esp/ebp values be worth sending?

They're meaningless without your kernel, but even then all you're going 
to be able to tell is where in the fault handler things died; you won't 
know the address of the original fault.

There's nothing immediately obvious in the xl driver that would suggest 
that it uses excessive kernel stack either.  8(  Maybe someone has some 
clues on measuring stack usage (or simply on how to increase the kernel 
stack allocation...).
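
On the question of measuring stack usage: one general technique (an
assumption here, not something the thread or FreeBSD 3.0 is known to
provide) is to fill the stack region with a sentinel byte up front and
later count how much of it was overwritten.  The sketch below only
simulates the idea on a byte array; SENTINEL, the 8192-byte size, and
the function names are all hypothetical.

```python
SENTINEL = 0xA5  # arbitrary fill byte, assumed not to be written by real use

def paint(size):
    """Create a simulated stack region filled entirely with the sentinel."""
    return bytearray([SENTINEL] * size)

def use_stack(stack, depth):
    """Simulate a downward-growing stack that dirties `depth` bytes from the top."""
    top = len(stack)
    if depth > top:
        raise ValueError("simulated stack overflow")
    for i in range(top - depth, top):
        stack[i] = 0x00

def high_water_mark(stack):
    """Return the deepest usage so far, in bytes from the top of the region."""
    untouched = 0
    for b in stack:  # scan upward from the lowest address
        if b != SENTINEL:
            break
        untouched += 1
    return len(stack) - untouched

stack = paint(8192)
for depth in (3000, 5000, 1200):  # three simulated call chains
    use_stack(stack, depth)
print(high_water_mark(stack))
```

The mark is only a lower bound: data that happens to equal the sentinel
at the deep end of the stack would make usage look shallower than it
was.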


-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  m...@smith.net.au
\\  The race is long, and in the  \\  msm...@freebsd.org
\\  end it's only with yourself.  \\  msm...@cdrom.com





Re: Tracking a Fatal Double Fault

1999-02-08 Thread Chuck Robey
On Mon, 8 Feb 1999, Mike Smith wrote:

> > The machine is running a custom kernel, but nothing
> > very unusual.  My instinct is that it may be related to 
> > something with the 3Com 3c905B cards that I reported
> > earlier; I'm trying with Intel EtherExpresses right now
> > and getting no fault problems.
> > 
> > The double-fault does not occur consistently, unfortunately,
> > and typically only occurs during my rc.local stuff (loading
> > a bunch (100+) of chrooted daemons) on boot-up.
> > 
> > Would the eip/esp/ebp values be worth sending?
> 
> They're meaningless without your kernel, but even then all you're going 
> to be able to tell is where in the fault handler things died; you won't 
> know the address of the original fault.
> 
> There's nothing immediately obvious in the xl driver that would suggest 
> that it uses excessive kernel stack either.  8(  Maybe someone has some 
> clues on measuring stack usage (or simply on how to increase the kernel 
> stack allocation...).

While you guys are on this subject, I'd like to sneak in a question on a
subject that's close by.  A while back, I had a kernel problem, and went
about finding out how to use kgdb and kernel dumps.  Unfortunately, by
the time I was completely ready to do it, the problem (you guys remember
the GPL_MATH_EMULATE thing?) went away.  Just for grins, I'd like to
force a kernel dump, so I can go the rest of the way in making sure my
test setup works, and I can get more used to it.  What's the safest way
to force a kernel dump (hopefully without screwing filesystems)?

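----------------------------+-----------------------------------------------
Chuck Robey                 | Interests include any kind of voice or data
chu...@glue.umd.edu         | communications topic, C programming, and Unix.
213 Lakeside Drive Apt T-1  |
Greenbelt, MD 20770         | I run picnic (FreeBSD-current)
(301) 220-2114              | and jaunt (Solaris7).
----------------------------+-----------------------------------------------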







RE: Tracking a Fatal Double Fault

1999-02-08 Thread tcobb
So a double-fault is always a kernel stack problem?

I find it suspicious that this same machine
also had trouble with the 3c905B flaking out --
dropping packets during an ifconfig alias, and
sometimes never reactivating the interface
according to what tcpdump shows.

The 3c905B problem repeats itself on EVERY machine
that I've installed them in (7 or so); the double faults
are infrequent on some of the busier machines, and almost
always occur during the initial boot process.


-Troy Cobb
 Circle Net, Inc.
 http://www.circle.net

>   -----Original Message-----
>   From: Mike Smith [mailto:m...@smith.net.au]
>   There's nothing immediately obvious in the xl driver that would
>   suggest that it uses excessive kernel stack either.  8(  Maybe
>   someone has some clues on measuring stack usage (or simply on how
>   to increase the kernel stack allocation...).
