Re: pcm(4) related panic
Mathew Kanner <[EMAIL PROTECTED]> wrote:

> On Nov 25, Don Lewis wrote:
> > On 25 Nov, Don Lewis wrote:
> > > On 25 Nov, Artur Poplawski wrote:
> > >> Artur Poplawski <[EMAIL PROTECTED]> wrote:
> > >>
> > >>> Hello,
> > >>>
> > >>> On a 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
> > >>> like this:
> > >
> > >>> Sleeping on "swread" with the following non-sleepable locks held:
> > >>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
> > >>> /usr/src/sys/dev/sound/pcm/dsp.c:146
> > >
> > > This enables the panic.
> > >
> > >>> panic: sleeping thread (pid 583) owns a non-sleepable lock
> > >
> > > Then the panic happens when another thread tries to grab the mutex.
> > >
> > >
> > > The problem is that the pcm code attempts to hold a mutex across a call
> > > to uiomove(), which can sleep if the userland buffer that it is trying
> > > to access is paged out. Either the buffer has to be pre-wired before
> > > calling getchns(), or the mutex has to be dropped around the call to
> > > uiomove(). The amount of memory to be wired should be limited to
> > > 'sz' as calculated by chn_read() and chn_write(), which complicates the
> > > logic. Dropping the mutex probably has other issues.
> >
> > Following up to myself ...
> >
> > It might be safe to drop the mutex for the uiomove() call if the code
> > set flags to enforce a limit of one reader and one writer at a time to
> > keep the code from being re-entered. The buffer pointer manipulations
> > in sndbuf_dispose() and sndbuf_acquire() would probably still have to be
> > protected by the mutex. If this can be made to work, it would probably
> > be preferable to wiring the buffer. It would have a lot less CPU
> > overhead, and would work better with large buffers, which could still be
> > allowed to page normally.
>
> Don,
> I never would have suspected that uio might sleep and panic,
> thanks for the clue.
>
> Artur,
> Could you try the attached patch.

I've tried the patch -- and it works great! :-) I was unable to trigger
the panic with the patch applied, although I tried really hard -- so
I guess the problem is solved.

Mat and Don, I'm really very thankful for your help.

Best regards,
Artur
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: pcm(4) related panic
On Nov 25, Don Lewis wrote:
> On 25 Nov, Don Lewis wrote:
> > On 25 Nov, Artur Poplawski wrote:
> >> Artur Poplawski <[EMAIL PROTECTED]> wrote:
> >>
> >>> Hello,
> >>>
> >>> On a 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
> >>> like this:
> >
> >>> Sleeping on "swread" with the following non-sleepable locks held:
> >>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
> >>> /usr/src/sys/dev/sound/pcm/dsp.c:146
> >
> > This enables the panic.
> >
> >>> panic: sleeping thread (pid 583) owns a non-sleepable lock
> >
> > Then the panic happens when another thread tries to grab the mutex.
> >
> >
> > The problem is that the pcm code attempts to hold a mutex across a call
> > to uiomove(), which can sleep if the userland buffer that it is trying
> > to access is paged out. Either the buffer has to be pre-wired before
> > calling getchns(), or the mutex has to be dropped around the call to
> > uiomove(). The amount of memory to be wired should be limited to
> > 'sz' as calculated by chn_read() and chn_write(), which complicates the
> > logic. Dropping the mutex probably has other issues.
>
> Following up to myself ...
>
> It might be safe to drop the mutex for the uiomove() call if the code
> set flags to enforce a limit of one reader and one writer at a time to
> keep the code from being re-entered. The buffer pointer manipulations
> in sndbuf_dispose() and sndbuf_acquire() would probably still have to be
> protected by the mutex. If this can be made to work, it would probably
> be preferable to wiring the buffer. It would have a lot less CPU
> overhead, and would work better with large buffers, which could still be
> allowed to page normally.

Don,
	I never would have suspected that uio might sleep and panic,
thanks for the clue.

Artur,
	Could you try the attached patch.

	Thanks,
	--Mat

--
	Any idiot can face a crisis; it is this day-to-day living
	that wears you out.
			- Chekhov

--- channel.c	Sun Nov  9 04:17:22 2003
+++ /sys/dev/sound/pcm/channel.c	Wed Nov 26 02:21:14 2003
@@ -250,6 +250,8 @@
 {
 	int ret, timeout, newsize, count, sz;
 	struct snd_dbuf *bs = c->bufsoft;
+	void *off;
+	int t, x,togo,p;
 
 	CHN_LOCKASSERT(c);
 	/*
@@ -291,7 +293,22 @@
 			sz = MIN(sz, buf->uio_resid);
 			KASSERT(sz > 0, ("confusion in chn_write"));
 			/* printf("sz: %d\n", sz); */
+#if 0
 			ret = sndbuf_uiomove(bs, buf, sz);
+#else
+			togo = sz;
+			while (ret == 0 && togo> 0) {
+				p = sndbuf_getfreeptr(bs);
+				t = MIN(togo, sndbuf_getsize(bs) - p);
+				off = sndbuf_getbufofs(bs, p);
+				CHN_UNLOCK(c);
+				ret = uiomove(off, t, buf);
+				CHN_LOCK(c);
+				togo -= t;
+				x = sndbuf_acquire(bs, NULL, t);
+			}
+			ret = 0;
+#endif
 			if (ret == 0 && !(c->flags & CHN_F_TRIGGERED))
 				chn_start(c, 0);
 		}
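
For readers following the loop in the patch, here is a minimal userland sketch of the same chunking arithmetic: the copy is split so that no single chunk runs past the end of the ring buffer (t = MIN(togo, size - p)). Every name in it (struct ring, ring_write, copy_fn, plain_copy) is hypothetical and only stands in for the sndbuf_* helpers and uiomove(); no locking is shown. Unlike the patch, this sketch stops and returns the error when the copy fails, which is a choice made here and not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the soft buffer (struct snd_dbuf in the patch). */
struct ring {
    unsigned char *data;    /* backing storage */
    size_t size;            /* capacity in bytes */
    size_t freeptr;         /* offset of the next free byte */
    size_t ready;           /* bytes currently queued */
};

/* Hypothetical stand-in for uiomove(): a copy that may fail; 0 means success. */
typedef int (*copy_fn)(void *dst, const void *src, size_t len);

int plain_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/*
 * Copy sz bytes into the ring in contiguous chunks, wrapping at the end of
 * the storage. The caller must guarantee that sz fits in the free space.
 */
int ring_write(struct ring *r, const unsigned char *src, size_t sz, copy_fn copy)
{
    size_t togo = sz;
    int ret = 0;

    assert(sz <= r->size - r->ready);
    while (ret == 0 && togo > 0) {
        size_t p = r->freeptr;
        /* Never copy past the end of the backing storage in one chunk. */
        size_t t = togo < r->size - p ? togo : r->size - p;

        ret = copy(r->data + p, src, t);
        if (ret != 0)
            break;              /* do not account for bytes that failed */
        src += t;
        togo -= t;
        r->freeptr = (p + t) % r->size;
        r->ready += t;
    }
    return ret;
}
```

With an 8-byte ring whose free pointer sits at offset 6, writing 5 bytes takes two chunks: 2 bytes at offsets 6..7, then 3 bytes at offsets 0..2.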
Re: pcm(4) related panic
On 25 Nov, Don Lewis wrote:
> On 25 Nov, Artur Poplawski wrote:
>> Artur Poplawski <[EMAIL PROTECTED]> wrote:
>>
>>> Hello,
>>>
>>> On a 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
>>> like this:
>
>>> Sleeping on "swread" with the following non-sleepable locks held:
>>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
>>> /usr/src/sys/dev/sound/pcm/dsp.c:146
>
> This enables the panic.
>
>>> panic: sleeping thread (pid 583) owns a non-sleepable lock
>
> Then the panic happens when another thread tries to grab the mutex.
>
>
> The problem is that the pcm code attempts to hold a mutex across a call
> to uiomove(), which can sleep if the userland buffer that it is trying
> to access is paged out. Either the buffer has to be pre-wired before
> calling getchns(), or the mutex has to be dropped around the call to
> uiomove(). The amount of memory to be wired should be limited to
> 'sz' as calculated by chn_read() and chn_write(), which complicates the
> logic. Dropping the mutex probably has other issues.

Following up to myself ...

It might be safe to drop the mutex for the uiomove() call if the code
set flags to enforce a limit of one reader and one writer at a time to
keep the code from being re-entered. The buffer pointer manipulations
in sndbuf_dispose() and sndbuf_acquire() would probably still have to be
protected by the mutex. If this can be made to work, it would probably
be preferable to wiring the buffer. It would have a lot less CPU
overhead, and would work better with large buffers, which could still be
allowed to page normally.
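
The "drop the mutex, but use a flag to keep a second writer out" idea can be sketched in userland with POSIX threads. This is only an illustration under stated assumptions, not the pcm(4) code: struct chan, chan_init, chan_write and the writer_busy flag are hypothetical names, memcpy() stands in for uiomove(), and only the writer side is shown (a reader would need its own flag).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical channel; the names are illustrative, not the pcm(4) structures. */
struct chan {
    pthread_mutex_t lock;
    pthread_cond_t  idle;        /* signalled when writer_busy clears */
    int             writer_busy; /* 1 while a writer is in the unlocked copy */
    unsigned char   buf[64];
    size_t          used;        /* bytes committed; changed only under lock */
};

void chan_init(struct chan *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->idle, NULL);
    c->writer_busy = 0;
    c->used = 0;
}

/* Append up to len bytes; returns the number of bytes accepted. */
size_t chan_write(struct chan *c, const void *src, size_t len)
{
    size_t off, n;

    pthread_mutex_lock(&c->lock);
    /* Allow one writer at a time, so the unlocked copy cannot be re-entered. */
    while (c->writer_busy)
        pthread_cond_wait(&c->idle, &c->lock);
    c->writer_busy = 1;

    off = c->used;
    n = sizeof(c->buf) - off;
    if (n > len)
        n = len;

    /* Drop the lock around the copy that might block (uiomove() in the kernel). */
    pthread_mutex_unlock(&c->lock);
    memcpy(c->buf + off, src, n);
    pthread_mutex_lock(&c->lock);

    /* Pointer bookkeeping stays under the lock; the flag kept 'used' stable. */
    assert(c->used == off);
    c->used = off + n;
    c->writer_busy = 0;
    pthread_cond_broadcast(&c->idle);
    pthread_mutex_unlock(&c->lock);
    return n;
}
```

The flag does the job the mutex did across the copy: a second writer waits on the condition variable instead of entering, while the buffer offset is still read and updated only with the lock held.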
Re: pcm(4) related panic
On 25 Nov, Artur Poplawski wrote:
> Artur Poplawski <[EMAIL PROTECTED]> wrote:
>
>> Hello,
>>
>> On a 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
>> like this:

>> Sleeping on "swread" with the following non-sleepable locks held:
>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
>> /usr/src/sys/dev/sound/pcm/dsp.c:146

This enables the panic.

>> panic: sleeping thread (pid 583) owns a non-sleepable lock

Then the panic happens when another thread tries to grab the mutex.


The problem is that the pcm code attempts to hold a mutex across a call
to uiomove(), which can sleep if the userland buffer that it is trying
to access is paged out. Either the buffer has to be pre-wired before
calling getchns(), or the mutex has to be dropped around the call to
uiomove(). The amount of memory to be wired should be limited to
'sz' as calculated by chn_read() and chn_write(), which complicates the
logic. Dropping the mutex probably has other issues.
Re: pcm(4) related panic
Artur Poplawski <[EMAIL PROTECTED]> wrote:

> Hello,
>
> On a 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
> like this:
>
> (watch out for folded lines; the stack backtrace below is rewritten by
> hand from ddb)
>
> lock order reversal
> 1st 0xc22a45ac vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
> 2nd 0xc06c0420 swap_pager swhash (swap_pager swhash) @ \
> /usr/src/sys/vm/swap_pager.c:1838
> 3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
> Stack backtrace:
> backtrace
> witness_lock
> _mtx_lock_flags
> obj_allock
> slab_zalloc
> uma_zone_slab
> uma_zalloc_internal
> uma_zalloc_arg
> swp_pager_meta_build
> swap_pager_putpages
> default_pager_putpages
> vm_pageout_flush
> vm_pageout_clean
> vm_pageout_scan
> vm_pageout
> fork_exit
> fork_trampoline
>
> Sleeping on "swread" with the following non-sleepable locks held:
> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
> /usr/src/sys/dev/sound/pcm/dsp.c:146
> panic: sleeping thread (pid 583) owns a non-sleepable lock
> syncing disks, buffers remaining... 1410 1410 panic: mi_switch: \
> switch in a critical section
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> [... repeated few more times]
> Fatal double fault:
> eip = 0xc05e3916
> esp = 0xc8db8ff4
> ebp = 0xc8db9004
> panic: double fault
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> [...]
> And the machine suddenly reboots, so there is no coredump.
>
> eip address points close to:
> c05e3910 T sc_vtb_putc
>
> To reproduce this panic just start some audio player app (like xmms),
> and launch countless memory-eating applications (like mozilla ;>).
> The machine starts swapping, and it panics.
>
> % uname -a
> FreeBSD kaszanka.domek 5.2-BETA FreeBSD 5.2-BETA #0: Sun Nov 23 01:23:10\
> CET 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/KASZANKA i386
>
> dmesg fragments:
> CPU: AMD Athlon(tm) XP 2000+ (1666.73-MHz 686-class CPU)
> pcm0: port 0xec00-0xec3f irq 10 at device 8.0 on pci0
> pcm0:
> rl0: port 0xe800-0xe8ff mem \
> 0xdf00-0xdfff irq 10 at device 10.0 on pci0
> miibus0: on rl0
> rlphy0: on miibus0
> rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> rl1: port 0xe400-0xe4ff mem \
> 0xde00-0xdeff irq 10 at device 11.0 on pci0
> rlphy1: on miibus1
> rlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

In the meantime I've managed to get a coredump, by directly calling
doadump() from ddb. Results:

[EMAIL PROTECTED]:/usr/obj/usr/src/sys/KASZANKA# gdb -k kernel.debug /var/crash/vmcore.0
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
panic: sleeping thread (pid 568) owns a non-sleepable lock
panic messages:
---
panic: sleeping thread (pid 568) owns a non-sleepable lock

syncing disks, buffers remaining... panic: msleep
Dumping 128 MB
 16 32 48 64 80 96 112
---
Reading symbols from /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug...done.
Loaded symbols for /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug
Reading symbols from /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linux/linux.ko.debug...done.
Loaded symbols for /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linux/linux.ko.debug
Reading symbols from /boot/kernel/netgraph.ko...done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_ether.ko...done.
Loaded symbols for /boot/kernel/ng_ether.ko
Reading symbols from /boot/kernel/ng_pppoe.ko...done.
Loaded symbols for /boot/kernel/ng_pppoe.ko
Reading symbols from /boot/kernel/ng_socket.ko...done.
Loaded symbols for /boot/kernel/ng_socket.ko
Reading symbols from /boot/kernel/mga.ko...done.
Loaded symbols for /boot/kernel/mga.ko
#0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240
240 dumping++;
(kgdb) where
#0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1 0xc04292cd in db_fncall (dummy1=0, dummy2=0, dummy3=0, dummy4=0xc8dba7bc "à×hÀ") at /usr/src/sys/ddb/db_command.c:548
#2 0xc042906a in db_command (last_cmdp=0xc068ce80, cmd_table=0x0, aux_cmd_tablep=0xc06480c0, aux_cmd_tablep_end=0xc
pcm(4) related panic
Hello,

On a 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
like this:

(watch out for folded lines; the stack backtrace below is rewritten by
hand from ddb)

lock order reversal
1st 0xc22a45ac vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
2nd 0xc06c0420 swap_pager swhash (swap_pager swhash) @ \
/usr/src/sys/vm/swap_pager.c:1838
3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
Stack backtrace:
backtrace
witness_lock
_mtx_lock_flags
obj_allock
slab_zalloc
uma_zone_slab
uma_zalloc_internal
uma_zalloc_arg
swp_pager_meta_build
swap_pager_putpages
default_pager_putpages
vm_pageout_flush
vm_pageout_clean
vm_pageout_scan
vm_pageout
fork_exit
fork_trampoline

Sleeping on "swread" with the following non-sleepable locks held:
exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
/usr/src/sys/dev/sound/pcm/dsp.c:146
panic: sleeping thread (pid 583) owns a non-sleepable lock
syncing disks, buffers remaining... 1410 1410 panic: mi_switch: \
switch in a critical section
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
[... repeated few more times]
Fatal double fault:
eip = 0xc05e3916
esp = 0xc8db8ff4
ebp = 0xc8db9004
panic: double fault
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
[...]
And the machine suddenly reboots, so there is no coredump.

eip address points close to:
c05e3910 T sc_vtb_putc

To reproduce this panic just start some audio player app (like xmms),
and launch countless memory-eating applications (like mozilla ;>).
The machine starts swapping, and it panics.

% uname -a
FreeBSD kaszanka.domek 5.2-BETA FreeBSD 5.2-BETA #0: Sun Nov 23 01:23:10\
CET 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/KASZANKA i386

dmesg fragments:
CPU: AMD Athlon(tm) XP 2000+ (1666.73-MHz 686-class CPU)
pcm0: port 0xec00-0xec3f irq 10 at device 8.0 on pci0
pcm0:
rl0: port 0xe800-0xe8ff mem \
0xdf00-0xdfff irq 10 at device 10.0 on pci0
miibus0: on rl0
rlphy0: on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1: port 0xe400-0xe4ff mem \
0xde00-0xdeff irq 10 at device 11.0 on pci0
rlphy1: on miibus1
rlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

Regards,
Artur