Re: vm.loadavg high (by one) on idle Sun systems

Nick Gustas Tue, 12 Aug 2008 04:37:49 -0700

Daniel Ouellet wrote:

Hi,
Any idea how I could boot the system step by step to get an idea of where this bug might be isolated?
I have stripped the boot process as much as possible. This is a very old issue, but maybe there is a way to find out more about it. Looking at it more, I think it's possibly in the kernel scheduler. I can see this problem only on Sun systems, either the X1 or the V100 so far.
Rebooting the system will give you a load of either 1.08 to 1.12 or 0.08 to 0.12.
I have stripped the daemon startup as much as I can to show it clearly.
# cat /etc/rc.conf.local
sshd_flags=NO
sendmail_flags=NO
syslogd_flags=NO
inetd=NO                # almost always needed
and you can see there isn't anything running on the system to justify this load.
# ps -auxwk
USER       PID %CPU %MEM   VSZ   RSS TT  STAT  STARTED       TIME COMMAND
root         3 99.0  0.0     0     0 ??  DK     6:15PM    7:08.13 (idle0)
root         8  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (pagedaemon)
root         9  0.0  0.0     0     0 ??  DK     6:15PM    0:00.26 (reaper)
root        12  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (aiodoned)
root        11  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (update)
root        10  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (cleaner)
root        13  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (crypto)
root         0  0.0  0.0     0     0 ??  DKs    6:15PM    0:00.00 (swapper)
root         4  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (syswq)
root         2  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (kmthread)
root         1  0.0  0.1   616   408 ??  Is     6:15PM    0:00.01 /sbin/init
root         7  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (pfpurge)
root         6  0.0  0.0     0     0 ??  DK     6:15PM    0:00.00 (usbtask)
root         5  0.0  0.0     0     0 ??  DK     6:15PM    0:00.01 (usb0)
root      6772  0.0  0.2   664  1040 ??  Ss     6:15PM    0:00.02 cron
root     11233  0.0  0.1   552   528 00  Ss     6:15PM    0:00.08 -ksh (ksh)
root     32400  0.0  0.1   416   352 00  R+     6:22PM    0:00.00 ps -auxwk
However, you get this:

# uptime
 6:22PM  up 8 mins, 1 user, load averages: 1.08, 0.89, 0.48
# sysctl vm.loadavg
vm.loadavg=1.08 0.89 0.48
# sysctl kern.nprocs
kern.nprocs=17
# sysctl kern.version
kern.version=OpenBSD 4.4 (GENERIC) #1714: Wed Aug  6 13:31:49 MDT 2008
    [EMAIL PROTECTED]:/usr/src/sys/arch/sparc64/compile/GENERIC

# sysctl hw.model
hw.model=SUNW,UltraSPARC-IIe (rev 3.3) @ 548 MHz
I tried a few different things with boot -c, but so far I can't isolate where this might be.
The only thing I know is that it is ONLY and ALWAYS present from the start of the system.
So on boot it will either be off by one, or good.
It may take maybe 5 reboots to get the real reading (not off by 1), but you can get it.
Any suggestions on how I could get more details to dig into this?
I was thinking of maybe putting some kind of delay in the scheduler in case it might be possible to isolate it more that way, but I am not sure how I could do it.
Or maybe log from the scheduler which processes are added to or removed from the load average, but again no success doing that yet.
This is not really broken hardware, as I can reproduce it on way more than 20 different systems here.
This might affect something else in the scheduler: looking at the code, it looks like some processes are scheduled based on the load and how long they have run. So if the data is wrong, it may well lead to other issues caused by this.
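
To illustrate why a wrong load figure could matter to scheduling, here is a minimal Python sketch. It assumes the kernel follows the classic 4.4BSD decay filter, in which a process's estimated CPU usage is multiplied by 2*load / (2*load + 1) at each decay step; this has not been checked against the OpenBSD 4.4 source, and the function names are made up for the example.

```python
def decay_factor(load):
    """Fraction of past CPU usage kept per decay step.

    ASSUMPTION: classic 4.4BSD filter, 2*load / (2*load + 1).
    """
    return (2.0 * load) / (2.0 * load + 1.0)


def steps_to_forget(load, keep=0.10):
    """Count decay steps until less than `keep` of past CPU usage remains."""
    factor = decay_factor(load)
    remaining = 1.0
    steps = 0
    while remaining >= keep:
        remaining *= factor
        steps += 1
    return steps


# Compare the "good" boot (load 0.08) with the "off by one" boot (load 1.08).
for load in (0.08, 1.08):
    print("load %.2f: factor %.3f, steps to forget 90%%: %d"
          % (load, decay_factor(load), steps_to_forget(load)))
```

Under that assumption, a load reading that is too high by one makes past CPU usage fade noticeably more slowly, so a process that was busy for a while would keep a worse priority for longer than on a correctly reporting system.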
Any suggestions on what to try to dig into this more and maybe get more valuable information?
One thing is for sure: it's always either right, or off by one when present.

Thanks

Daniel

I had the same issue with an X1 at work; disabling USB with boot -c or config would eliminate the problem.

It's been over 6 months since I worked on this, and I won't be able to verify until Thursday, but I recall that leaving USB enabled and keeping a USB device such as a mouse or RS232 adapter plugged in would also bring the load back to 0. It seems the load could go back up by 1 once I unplugged the device, but it's been a while.

I assumed it was buggy USB hardware causing something USB-related in the kernel to block on I/O and raise the load by 1.
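
The "one stuck thread" idea can be sanity-checked with a small Python simulation. This is only a sketch: it assumes the load averages are exponentially damped averages of the runnable count, sampled every 5 seconds over 1-, 5- and 15-minute windows (the usual BSD-style scheme), and the names below are invented for the example rather than taken from the kernel.

```python
import math

SAMPLE_SECONDS = 5                 # ASSUMPTION: sampling interval
WINDOWS_SECONDS = (60, 300, 900)   # 1-, 5- and 15-minute windows


def simulate_load(runnable, uptime_seconds, start=(0.0, 0.0, 0.0)):
    """Return the three load averages after `uptime_seconds` with a
    constant number of tasks counted as runnable."""
    decay = [math.exp(-SAMPLE_SECONDS / float(w)) for w in WINDOWS_SECONDS]
    averages = list(start)
    for _ in range(uptime_seconds // SAMPLE_SECONDS):
        for i, d in enumerate(decay):
            # Each sample pulls the average toward the current runnable count.
            averages[i] = d * averages[i] + runnable * (1.0 - d)
    return tuple(averages)


# One task permanently counted as runnable, 8 minutes after boot.
print("%.2f %.2f %.2f" % simulate_load(1, 8 * 60))
```

With one task always counted, the sketch gives roughly 1.00, 0.80 and 0.41 after 8 minutes. That has the same general shape as the reported 1.08, 0.89, 0.48 at "up 8 mins", which fits a single thread being counted continuously from boot, though it does not say which thread it is.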


Hopefully this gives you something to go on.
