hi,
we've finally got an interesting panic that shed some light on the
'kernel diagnostic assertion "__mp_lock_held(&sched_lock) == 0" failed'
panics that plagued some heavily loaded systems.
apparently, there's a window in mi_switch (right after cpu_switchto) that
has sched_lock held but kernel_lock released, thus allowing other cpus
to take advantage of the kernel_lock:
cpu_switchto(p, nextproc);
} else {
p->p_stat = SONPROC;
}
clear_resched(curcpu());
SCHED_ASSERT_LOCKED();
/*
* To preserve lock ordering, we need to release the sched lock
* and grab it after we grab the big lock.
* In the future, when the sched lock isn't recursive, we'll
* just release it here.
*/
#ifdef MULTIPROCESSOR
__mp_unlock(&sched_lock);
#endif
unfortunately they can't do this because we check for a sched_lock
being held:
void
_kernel_lock(void)
{
SCHED_ASSERT_UNLOCKED();
__mp_lock(&kernel_lock);
}
it should be safe to take kernel_lock there because mi_switch:
1) doesn't do anything nasty there;
2) figures out whether process should take a kernel_lock later on:
#ifdef MULTIPROCESSOR
/*
* Reacquire the kernel_lock now. We do this after we've
* released the scheduler lock to avoid deadlock, and before
* we reacquire the interlock and the scheduler lock.
*/
if (p->p_flag & P_BIGLOCK)
__mp_acquire_count(&kernel_lock, hold_count);
__mp_acquire_count(&sched_lock, sched_count + 1);
#endif
opinions? should we just remove an assertion?
or am i trippin' on some high quality drugs? (:
ddb log session:
ddb{0}> tr
Debugger(d08cc9e6,e08b7e34,d08aaf94,e08b7e34,d0a44e28) at Debugger+0x4
panic(d08aaf94,d08346ee,d08a80e0,d08a83ed,16a) at panic+0x5d
__assert(d08346ee,d08a83ed,16a,d08a80e0,0) at __assert+0x2e
_kernel_lock(1,0,d09d15cc,7a120,0) at _kernel_lock+0x48
trap() at trap+0x5fb
--- trap (number -547639296) ---
Bad frame pointer: 0xd0ad48a0
0:
ddb{0}> show panic
kernel diagnostic assertion "__mp_lock_held(&sched_lock) == 0" failed:
file "..
/../../../kern/kern_lock.c", line 362
ddb{0}> show ps
No such command
ddb{0}> ps
PID PPID PGRP UID S FLAGS WAIT COMMAND
2111 1 2111 0 3 0x2040180 select sendmail
12050 1 12050 0 3 0x2004080 ttyin getty
3065 1 3065 0 3 0x2004080 ttyin getty
2937 1 2937 0 3 0x2004080 ttyin getty
15153 1 15153 0 3 0x2004080 ttyin getty
24343 1 24343 0 3 0x2004080 ttyin getty
631 1 631 0 3 0x2004080 ttyin getty
18404 1 18404 0 3 0x2000080 select cron
7380 1 7380 0 3 0x2000180 select inetd
23123 1 23123 0 3 0x2000080 select sshd
8900 27266 27156 83 3 0x2000180 poll ntpd
27266 27156 27156 83 3 0x2000180 poll ntpd
27156 1 27156 0 3 0x2000080 poll ntpd
22867 11112 11112 74 3 0x2000180 bpf pflogd
11112 1 11112 0 3 0x2000080 netio pflogd
15707 20626 20626 73 3 0x2000180 poll syslogd
20626 1 20626 0 3 0x2000088 netio syslogd
29035 1 29035 77 3 0x2000180 poll dhclient
19499 1 27610 0 3 0x2000080 poll dhclient
7362 1 7362 77 3 0x2000180 poll dhclient
22676 1 27610 0 3 0x2000080 poll dhclient
19 0 0 0 3 0x2100200 aiodoned aiodoned
18 0 0 0 3 0x2100200 syncer update
17 0 0 0 3 0x2100200 cleaner cleaner
16 0 0 0 3 0x100200 reaper reaper
15 0 0 0 3 0x2100200 pgdaemon pagedaemon
14 0 0 0 3 0x2100200 bored crypto
13 0 0 0 3 0x2100200 pftm pfpurge
12 0 0 0 3 0x2100200 usbtsk usbtask
11 0 0 0 3 0x2100200 usbatsk usbatsk
10 0 0 0 3 0x2100200 acpi0 acpi0
9 0 0 0 7 0x40100200 idle5
8 0 0 0 7 0x40100200 idle4
7 0 0 0 7 0x40100200 idle3
6 0 0 0 7 0x40100200 idle2
5 0 0 0 7 0x40100200 idle1
4 0 0 0 3 0x2100200 bored syswq
* 3 0 0 0 7 0x40100200 idle0
2 0 0 0 3 0x2100200 kmalloc kmthread
1 0 1 0 3 0x2004080 wait init
0 -1 0 0 3 0x2080200 scheduler swapper
ddb{0}> ddb cpu 1
No such command
ddb{0}> mach ddbcpu 1
Stopped at Debugger+0x4: popl %ebp
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS
PANIC!
IF RUNNING SMP, USE 'mach ddbcpu <#>' AND 'trace' ON OTHER PROCESSORS,
TOO.
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb{1}> tr
Debugger(d4320000,d082b116,1,202,1) at Debugger+0x4
i386_ipi_handler(c0,58,10,10,10) at i386_ipi_handler+0x5f
Xintripi() at Xintripi+0x47
--- interrupt ---
__mp_lock(d0a1d10c,7fffffff,da4154f0,d0a184a0,e08bbf5c) at __mp_lock+0x52
__mp_acquire_count(d0a1d10c,1,10,d4320000,d4320034,e08bbf8c,0,2,1,0,2f6d)
at __
mp_acquire_count+0x20
mi_switch(d0a1d10c,d4320000,da4154f0,e08bbf90,d03d3a53) at mi_switch+0x168
sched_idle(d4320000) at sched_idle+0x1cc
Bad frame pointer: 0xd0b8ae48
ddb{1}>