More info (was Re: 4.11 panic every 23 hours 55 minutes or so)

2009-05-22 Thread Doug Lee
On Sun, May 17, 2009 at 07:06:57AM -0400, Doug Lee wrote:
> One of the weirder things I've seen in a while here...
> 
> OS: FreeBSD 4.11 (yeah I know, old, but generally stable)
> CPU: Intel(R) Pentium(R) 4 CPU 2.00GHz
> real memory  = 536608768 (524032K bytes)
> Hds: IDE
> 
> Problem:  Ever since a suspicious power outage (I say suspicious
> because we think a surge was also involved), this box has been
> exhibiting kernel panics about every 23 hours 55 minutes, give or
> take about 4 minutes either way.  Obviously hardware is suspect,
> and hopefully in line for upgrade; but as FreeBSD has always proven
> so stable for me, I'm curious what on earth could cause this sort
> of regular panic?
> 
> It's not time of day; if I reboot at 2:00 AM, 3:55 PM, or any other
> time, it's 23:55 or so later I get a panic, whenever that may be.
> I think this rules out cron jobs, external attacks, and load-based
> issues.

Update: I killed mysqld, four nfsiods, Apache2, mpd, and maybe a
couple more no-longer-needed processes two mornings ago.  I also
disabled them at that time in rc.conf.  The next morning, the system
restarted with a panic as usual, BUT...

This morning, on the first boot that never ran all those processes, I
have not seen a restart yet, and we're at 1 day 1 hour as I speak.

I looked in /var/at earlier in the week and never found any scheduled
jobs.  It shouldn't be cron either, since the panic tracks boot time,
not clock time.

Is there some way one of those processes, like mysqld, could be
scheduling an event to occur 24 hours after launch, without using
`at', and without having to be running 24 hours later?  Example:
Could mysqld schedule something without `at' that will run 24 hours
after mysqld starts even if mysqld is no longer running?

Also, is it even possible that any process could cause a kernel-mode
page fault without there being damaged hardware?  Example:  Could some
mysql file be so corrupt that it would panic a perfectly fine machine?
I should hope not, but I wonder.


-- 
Doug Lee         d...@dlee.org           http://www.dlee.org
SSB BART Group   doug@ssbbartgroup.com   http://www.ssbbartgroup.com
"When there is no enemy within, the enemies outside cannot hurt you."
--African Proverb
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"


Re: 4.11 panic every 23 hours 55 minutes or so

2009-05-18 Thread D C
On Sun, May 17, 2009 at 10:38 AM, Mark  wrote:

> -Original Message-
> From: owner-freebsd-questi...@freebsd.org
> [mailto:owner-freebsd-questi...@freebsd.org] On Behalf Of Mark Tinguely
> Sent: Sunday, 17 May 2009 17:30
> To: d...@dlee.org
> Cc: freebsd-questions@freebsd.org
> Subject: Re: 4.11 panic every 23 hours 55 minutes or so
>
> > Why does the panic happen 24 hours after the last reboot? Who
> > knows, and one may never know. Are you finally hitting the bad
> > RAM? Is the computer finally getting warm enough to act
> > up?
>
> Or running a heavy process around the time of the panic, which
> may (now) draw too much power for his seemingly damaged
> (PSU|DDR[12]|MOTHERBOARD|WHATEVER)?
>
> - Mark
>


MySQL showed the most processor time in your top output. Have you
checked it for daily maintenance jobs, like index rebuilds?



Re: 4.11 panic every 23 hours 55 minutes or so

2009-05-17 Thread Odhiambo ワシントン
On Sun, May 17, 2009 at 2:06 PM, Doug Lee  wrote:

> One of the weirder things I've seen in a while here...
>
> OS: FreeBSD 4.11 (yeah I know, old, but generally stable)
> CPU: Intel(R) Pentium(R) 4 CPU 2.00GHz
> real memory  = 536608768 (524032K bytes)
> Hds: IDE
>
> Problem:  Ever since a suspicious power outage (I say suspicious
> because we think a surge was also involved), this box has been
> exhibiting kernel panics about every 23 hours 55 minutes, give or
> take about 4 minutes either way.


Is there something in the cron that runs at around that time?

-- 
Best regards,
Odhiambo WASHINGTON,
Nairobi,KE
+254733744121/+254722743223
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
"Clothes make the man.  Naked people have little or no influence on
society."
  -- Mark Twain


RE: 4.11 panic every 23 hours 55 minutes or so

2009-05-17 Thread Mark
-Original Message-
From: owner-freebsd-questi...@freebsd.org
[mailto:owner-freebsd-questi...@freebsd.org] On Behalf Of Mark Tinguely
Sent: Sunday, 17 May 2009 17:30
To: d...@dlee.org
Cc: freebsd-questions@freebsd.org
Subject: Re: 4.11 panic every 23 hours 55 minutes or so

> Why does the panic happen 24 hours after the last reboot? Who
> knows, and one may never know. Are you finally hitting the bad
> RAM? Is the computer finally getting warm enough to act
> up?

Or running a heavy process around the time of the panic, which
may (now) draw too much power for his seemingly damaged
(PSU|DDR[12]|MOTHERBOARD|WHATEVER)?

- Mark



Re: 4.11 panic every 23 hours 55 minutes or so

2009-05-17 Thread Mark Tinguely

Trap 12 panics are usually hardware related, and many times the cause is RAM.

Since it happened after bad power, besides testing your RAM I would
make sure your power supply and fans are operating normally.

Why does the panic happen 24 hours after the last reboot? Who knows, and
one may never know. Are you finally hitting the bad RAM? Is the computer
finally getting warm enough to act up?

--Mark.


Re: 4.11 panic every 23 hours 55 minutes or so

2009-05-17 Thread Doug Lee
On Sun, May 17, 2009 at 09:42:05AM -0400, Mehmet Erol Sanliturk wrote:
>Another possibility:
>I live in an area near industrial factories.
>When they start up or shut down, they cause significant
>fluctuations in my home's power, to the point that even uninterruptible
>power supplies cannot compensate.
>Something similar may be happening in your area. A system like that may
>start at the same hour every day and cause a power fluctuation that
>reboots your computer(s).

That might explain the initial surge (UPSes are indeed in effect in
this office), but it won't explain the panics themselves, since they
clearly occur relative to boot time, not to real time.

-- 
Doug Lee         d...@dlee.org           http://www.dlee.org
SSB BART Group   doug@ssbbartgroup.com   http://www.ssbbartgroup.com
"Nearly all men can stand adversity, but if you want to test a man's
character, give him power." -Abraham Lincoln


Re: 4.11 panic every 23 hours 55 minutes or so

2009-05-17 Thread Mehmet Erol Sanliturk
On Sun, May 17, 2009 at 9:13 AM, Doug Lee  wrote:

> On Sun, May 17, 2009 at 07:39:46AM -0400, Glen Barber wrote:
> > On Sun, May 17, 2009 at 7:06 AM, Doug Lee  wrote:
> > > One of the weirder things I've seen in a while here...
> > >
> > > OS: FreeBSD 4.11 (yeah I know, old, but generally stable)
> > > CPU: Intel(R) Pentium(R) 4 CPU 2.00GHz
> > > real memory  = 536608768 (524032K bytes)
> > > Hds: IDE
> > >
>
> > Do you by chance have the kernel built with debugging enabled?
>
> Afraid not, nor much space in / for that.  I partitioned this system
> before /modules arrived, and I barely have enough space in / now
> (about 3 meg free).  That shouldn't affect this issue though; I do
> have separate /usr, /var, and /tmp.  I do mount /tmp and /var/run via
> MFS.
>
> > > Problem:  Ever since a suspicious power outage (I say suspicious
> > > because we think a surge was also involved), this box has been
> > > exhibiting kernel panics about every 23 hours 55 minutes, give or
> > > take about 4 minutes either way.  Obviously hardware is suspect,
> > > and hopefully in line for upgrade; but as FreeBSD has always proven
> > > so stable for me, I'm curious what on earth could cause this sort
> > > of regular panic?
> > >
> > > It's not time of day; if I reboot at 2:00 AM, 3:55 PM, or any other
> > > time, it's 23:55 or so later I get a panic, whenever that may be.
> > > I think this rules out cron jobs, external attacks, and load-based
> > > issues.
> > >
>
> > Perhaps a bad CMOS battery causing the system time to become
> > corrupted?  (I know it's a long shot...)
>
> Interesting idea, though I'd be surprised since I think the system
> time is set via ntpd, is it not?  `date' seems to recover nicely every
> time anyway.  A power surge could indeed play with CMOS though... but
> how would I test for this while the system is running?
>
> --
> Doug Lee         d...@dlee.org           http://www.dlee.org
> SSB BART Group   doug@ssbbartgroup.com
> http://www.ssbbartgroup.com
> "Pray devoutly, but hammer stoutly."
> --Sir William G. Benham
>



Another possibility:

I live in an area near industrial factories.

When they start up or shut down, they cause significant fluctuations in
my home's power, to the point that even uninterruptible power supplies
cannot compensate.

Something similar may be happening in your area. A system like that may
start at the same hour every day and cause a power fluctuation that
reboots your computer(s).


Thank you very much.

Mehmet Erol Sanliturk


Re: 4.11 panic every 23 hours 55 minutes or so

2009-05-17 Thread Doug Lee
On Sun, May 17, 2009 at 07:39:46AM -0400, Glen Barber wrote:
> On Sun, May 17, 2009 at 7:06 AM, Doug Lee  wrote:
> > One of the weirder things I've seen in a while here...
> >
> > OS: FreeBSD 4.11 (yeah I know, old, but generally stable)
> > CPU: Intel(R) Pentium(R) 4 CPU 2.00GHz
> > real memory  = 536608768 (524032K bytes)
> > Hds: IDE
> >

> Do you by chance have the kernel built with debugging enabled?

Afraid not, nor much space in / for that.  I partitioned this system
before /modules arrived, and I barely have enough space in / now
(about 3 meg free).  That shouldn't affect this issue though; I do
have separate /usr, /var, and /tmp.  I do mount /tmp and /var/run via
MFS.

> > Problem:  Ever since a suspicious power outage (I say suspicious
> > because we think a surge was also involved), this box has been
> > exhibiting kernel panics about every 23 hours 55 minutes, give or
> > take about 4 minutes either way.  Obviously hardware is suspect,
> > and hopefully in line for upgrade; but as FreeBSD has always proven
> > so stable for me, I'm curious what on earth could cause this sort
> > of regular panic?
> >
> > It's not time of day; if I reboot at 2:00 AM, 3:55 PM, or any other
> > time, it's 23:55 or so later I get a panic, whenever that may be.
> > I think this rules out cron jobs, external attacks, and load-based
> > issues.
> >

> Perhaps a bad CMOS battery causing the system time to become
> corrupted?  (I know it's a long shot...)

Interesting idea, though I'd be surprised since I think the system
time is set via ntpd, is it not?  `date' seems to recover nicely every
time anyway.  A power surge could indeed play with CMOS though... but
how would I test for this while the system is running?

-- 
Doug Lee         d...@dlee.org           http://www.dlee.org
SSB BART Group   doug@ssbbartgroup.com   http://www.ssbbartgroup.com
"Pray devoutly, but hammer stoutly."
--Sir William G. Benham


Re: 4.11 panic every 23 hours 55 minutes or so

2009-05-17 Thread Glen Barber
On Sun, May 17, 2009 at 7:06 AM, Doug Lee  wrote:
> One of the weirder things I've seen in a while here...
>
> OS: FreeBSD 4.11 (yeah I know, old, but generally stable)
> CPU: Intel(R) Pentium(R) 4 CPU 2.00GHz
> real memory  = 536608768 (524032K bytes)
> Hds: IDE
>

Do you by chance have the kernel built with debugging enabled?

> Problem:  Ever since a suspicious power outage (I say suspicious
> because we think a surge was also involved), this box has been
> exhibiting kernel panics about every 23 hours 55 minutes, give or
> take about 4 minutes either way.  Obviously hardware is suspect,
> and hopefully in line for upgrade; but as FreeBSD has always proven
> so stable for me, I'm curious what on earth could cause this sort
> of regular panic?
>
> It's not time of day; if I reboot at 2:00 AM, 3:55 PM, or any other
> time, it's 23:55 or so later I get a panic, whenever that may be.
> I think this rules out cron jobs, external attacks, and load-based
> issues.
>

Perhaps a bad CMOS battery causing the system time to become
corrupted?  (I know it's a long shot...)


-- 
Glen Barber


4.11 panic every 23 hours 55 minutes or so

2009-05-17 Thread Doug Lee
One of the weirder things I've seen in a while here...

OS: FreeBSD 4.11 (yeah I know, old, but generally stable)
CPU: Intel(R) Pentium(R) 4 CPU 2.00GHz
real memory  = 536608768 (524032K bytes)
Hds: IDE

Problem:  Ever since a suspicious power outage (I say suspicious
because we think a surge was also involved), this box has been
exhibiting kernel panics about every 23 hours 55 minutes, give or
take about 4 minutes either way.  Obviously hardware is suspect,
and hopefully in line for upgrade; but as FreeBSD has always proven
so stable for me, I'm curious what on earth could cause this sort
of regular panic?

It's not time of day; if I reboot at 2:00 AM, 3:55 PM, or any other
time, it's 23:55 or so later I get a panic, whenever that may be.
I think this rules out cron jobs, external attacks, and load-based
issues.

I'll provide the latest two panic logs below, followed by a few
rounds of output from `top -Sd1000', which is `top' with system
processes shown and dumb-terminal output, logged via ssh onto another
system so I could see the last report before the panic.  I have a
longer `top' log but I'll send that only on request to avoid an
even huger post.  Curiously, this is one of the few times the uptime
went past 24 hours, and also shows the current process as something
other than Idle.  I wonder if running `top' actually changed the
results.

Any info would be most welcome.  Please Cc me on responses.


- Panic from yesterday (double panic; this is typical) -

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xe4f5eb02
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc02af097
stack pointer           = 0x10:0xc04be8d4
frame pointer           = 0x10:0xc04be8d8
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          =
trap number             = 12
panic: page fault

syncing disks... 

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x30
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0377070
stack pointer           = 0x10:0xc04be6f4
frame pointer           = 0x10:0xc04be6fc
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio
trap number             = 12
panic: page fault
Uptime: 23h59m5s
Automatic reboot in 15 seconds - press a key on the console to abort

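For anyone reading the `fault code' line: on i386 the low three bits
of the page-fault error code give the access mode, the access type,
and whether the page was present.  A minimal Python sketch of that
decoding follows.  The function name decode_page_fault is made up for
illustration, and the wording just mirrors the strings in the panic
output above.

```python
def decode_page_fault(code: int) -> str:
    """Describe the low three bits of an x86 page-fault error code."""
    # Bit 2 (0x4): set if the access came from user mode.
    mode = "user" if code & 0x4 else "supervisor"
    # Bit 1 (0x2): set if the access was a write.
    access = "write" if code & 0x2 else "read"
    # Bit 0 (0x1): set for a protection violation, clear if the page
    # was simply not present.
    cause = "protection violation" if code & 0x1 else "page not present"
    return f"{mode} {access}, {cause}"


if __name__ == "__main__":
    for code in range(8):
        print(code, decode_page_fault(code))
```

An error code of 0 reads as "supervisor read, page not present",
which is what both faults above report.  The second fault address,
0x30, is small enough that it might be a field offset from a NULL
pointer, but that is only a guess without a debug kernel.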

- Panic from today (just one this time, NOT typical) -

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xe694f564
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc02a7b73
stack pointer           = 0x10:0xdb394da4
frame pointer           = 0x10:0xdb394e28
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 505 (nmbd)
interrupt mask          = none
trap number             = 12
panic: page fault

syncing disks... 1 
done
Uptime: 1d0h1m36s
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...

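A quick check of how close the two logged uptimes are to exactly 24
hours, as a small Python sketch.  The helper name uptime_seconds is
hypothetical; the input strings are the `Uptime:' values from the two
panic logs above.

```python
import re

DAY = 24 * 60 * 60  # seconds in 24 hours

# Matches strings like "23h59m5s" or "1d0h1m36s"; days, hours and
# minutes are optional, seconds are required.
UPTIME_RE = re.compile(r"(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(\d+)s")


def uptime_seconds(text: str) -> int:
    """Convert a panic-log uptime string to a number of seconds."""
    match = UPTIME_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised uptime string: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


if __name__ == "__main__":
    for text in ("23h59m5s", "1d0h1m36s"):
        secs = uptime_seconds(text)
        print(f"{text}: {secs} s, {secs - DAY:+d} s from 24h")
```

That works out to 86345 s (55 seconds short of a day) and 86496 s
(96 seconds past it), so both panics landed within about two minutes
of the 24-hour mark.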

- top -Sd1 from today (stopping at the above panic) -

last pid:  9015;  load averages:  0.02,  0.02,  0.00  up 1+00:01:56  06:37:45
76 processes:  1 running, 74 sleeping, 1 zombie
CPU states:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
Mem: 95M Active, 264M Inact, 63M Wired, 21M Cache, 60M Buf, 55M Free
Swap: 250M Total, 250M Free

  PID USERNAME   PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
  491 mysql        2    0 44788K 19244K poll     0:57  0.00%  0.00% mysqld
    9 root        18    0     0K     0K syncer   0:16  0.00%  0.00% syncer
  330 ssbdev       2    0  9744K  8188K poll     0:12  0.00%  0.00% python
  248 root         2    0  1336K   868K select   0:04  0.00%  0.00% ntpd
  224 root         2    0   464K   252K select   0:03  0.00%  0.00% natd
  416 root         2    0  3188K  1948K select   0:03  0.00%  0.00% sendmail
 6408 root        10    0  2868K  2292K nanslp   0:03  0.00%  0.00% perl5.8.8
  308 root         2    0  9884K  5732K select   0:02  0.00%  0.00% httpd
 6515 root        10    0 16548K 15872K nanslp   0:02  0.00%  0.00% perl5.8.8
  505 root         2    0  5364K  1796K select   0:02  0.00%  0.00% nmbd
 6738 root        10    0 16548K 15872K nanslp   0:01  0.00%  0.00% perl5.8.8
  650 dlee         2    0  5336K  1836K select   0:01  0.00%  0.00% sshd
 7778 root        10    0 16532K 15868K nanslp   0:01  0.00%  0.00% perl5.8.8
 8307 root        10    0 16532K 15868K nanslp   0:01  0.00%  0.00% perl5.8.8
  245 bind         2    0  2484K  1856K select   0:01  0.00%  0.00% named
  241 root         2    0  1000K   668K selec