Re: pcm(4) related panic

2003-11-26 Thread Artur Poplawski
Mathew Kanner <[EMAIL PROTECTED]> wrote:

> On Nov 25, Don Lewis wrote:
> > On 25 Nov, Don Lewis wrote:
> > > On 25 Nov, Artur Poplawski wrote:
> > >> Artur Poplawski <[EMAIL PROTECTED]> wrote:
> > >> 
> > >>> Hello,
> > >>>
> > >>> On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
> > >>> like this:
> > > 
> > >>> Sleeping on "swread" with the following non-sleepable locks held:
> > >>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
> > >>> /usr/src/sys/dev/sound/pcm/dsp.c:146
> > > 
> > > This enables the panic.
> > > 
> > >>> panic: sleeping thread (pid 583) owns a non-sleepable lock
> > > 
> > > Then the panic happens when another thread tries to grab the mutex.
> > > 
> > > 
> > > The problem is that the pcm code attempts to hold a mutex across a call
> > > to uiomove(), which can sleep if the userland buffer that it is trying
> > > to access is paged out.  Either the buffer has to be pre-wired before
> > > calling getchns(), or the mutex has to be dropped around the call to
> > > uiomove().  The amount of memory to be wired should be limited to
> > > 'sz' as calculated by chn_read() and chn_write(), which complicates the
> > > logic.  Dropping the mutex probably has other issues.
> > 
> > Following up to myself ...
> > 
> > It might be safe to drop the mutex for the uiomove() call if the code
> > set flags to enforce a limit of one reader and one writer at a time to
> > keep the code from being re-entered.  The buffer pointer manipulations
> > in sndbuf_dispose() and sndbuf_acquire() would probably still have to be
> > protected by the mutex.  If this can be made to work, it would probably
> > be preferable to wiring the buffer.  It would have a lot less CPU
> > overhead, and would work better with large buffers, which could still be
> > allowed to page normally.
> 
>   Don,
>   I never would have suspected that uio might sleep and panic,
> thanks for the clue.
> 
>   Artur,
>   Could you try the attached patch?

I've tried the patch -- and it works great! :-) I was unable to trigger
the panic with the patch applied, although I tried really hard -- so I 
guess the problem is solved. 

Mat and Don, I'm really very thankful for your help.

Best regards, Artur


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: pcm(4) related panic

2003-11-26 Thread Mathew Kanner
On Nov 25, Don Lewis wrote:
> On 25 Nov, Don Lewis wrote:
> > On 25 Nov, Artur Poplawski wrote:
> >> Artur Poplawski <[EMAIL PROTECTED]> wrote:
> >> 
> >>> Hello,  
> >>> 
> >>> On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
> >>> like this:
> > 
> >>> Sleeping on "swread" with the following non-sleepable locks held:
> >>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \   
> >>> /usr/src/sys/dev/sound/pcm/dsp.c:146
> > 
> > This enables the panic.
> > 
> >>> panic: sleeping thread (pid 583) owns a non-sleepable lock
> > 
> > Then the panic happens when another thread tries to grab the mutex.
> > 
> > 
> > The problem is that the pcm code attempts to hold a mutex across a call
> > to uiomove(), which can sleep if the userland buffer that it is trying
> > to access is paged out.  Either the buffer has to be pre-wired before
> > calling getchns(), or the mutex has to be dropped around the call to
> > uiomove().  The amount of memory to be wired should be limited to
> > 'sz' as calculated by chn_read() and chn_write(), which complicates the
> > logic.  Dropping the mutex probably has other issues.
> 
> Following up to myself ...
> 
> It might be safe to drop the mutex for the uiomove() call if the code
> set flags to enforce a limit of one reader and one writer at a time to
> keep the code from being re-entered.  The buffer pointer manipulations
> in sndbuf_dispose() and sndbuf_acquire() would probably still have to be
> protected by the mutex.  If this can be made to work, it would probably
> be preferable to wiring the buffer.  It would have a lot less CPU
> overhead, and would work better with large buffers, which could still be
> allowed to page normally.

Don,
I never would have suspected that uio might sleep and panic,
thanks for the clue.

Artur,
Could you try the attached patch?

Thanks,
--Mat

-- 
Any idiot can face a crisis; it is this day-to-day living
that wears you out.
- Chekhov
--- channel.c	Sun Nov  9 04:17:22 2003
+++ /sys/dev/sound/pcm/channel.c	Wed Nov 26 02:21:14 2003
@@ -250,6 +250,8 @@
 {
 	int ret, timeout, newsize, count, sz;
 	struct snd_dbuf *bs = c->bufsoft;
+	void *off;
+	int t, x,togo,p;
 
 	CHN_LOCKASSERT(c);
 	/*
@@ -291,7 +293,22 @@
 			sz = MIN(sz, buf->uio_resid);
 			KASSERT(sz > 0, ("confusion in chn_write"));
 			/* printf("sz: %d\n", sz); */
+#if 0
 			ret = sndbuf_uiomove(bs, buf, sz);
+#else
+			togo = sz;
+			while (ret == 0 && togo> 0) {
+				p = sndbuf_getfreeptr(bs);
+				t = MIN(togo, sndbuf_getsize(bs) - p);
+				off = sndbuf_getbufofs(bs, p);
+				CHN_UNLOCK(c);
+				ret = uiomove(off, t, buf);
+				CHN_LOCK(c);
+				togo -= t;
+				x = sndbuf_acquire(bs, NULL, t);
+			}
+			ret = 0;
+#endif
 			if (ret == 0 && !(c->flags & CHN_F_TRIGGERED))
 				chn_start(c, 0);
 		}
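
The loop in the patch limits each copy to MIN(togo, sndbuf_getsize(bs) - p), so a single uiomove() never runs past the end of the buffer. For illustration only, here is a rough userland sketch of that chunking, assuming a plain byte ring. struct ring, RING_SIZE and ring_put() are made-up names for this example, not part of the pcm code, and there is no locking here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 8	/* hypothetical size; the real one is set by the driver */

/* Hypothetical stand-in for the software buffer. */
struct ring {
	char data[RING_SIZE];
	size_t freeptr;		/* index where the next byte is written */
};

/*
 * Copy n bytes into the ring in chunks that never cross the end of
 * the array.  The caller is assumed to have checked that n bytes are
 * free, much as 'sz' is bounded before the loop in the patch.
 */
void
ring_put(struct ring *r, const char *src, size_t n)
{
	size_t togo = n, t;

	while (togo > 0) {
		t = RING_SIZE - r->freeptr;	/* contiguous room before the wrap */
		if (t > togo)
			t = togo;
		memcpy(r->data + r->freeptr, src, t);
		r->freeptr = (r->freeptr + t) % RING_SIZE;
		src += t;
		togo -= t;
	}
}
```

Starting with freeptr at 6 and writing 4 bytes, the first pass copies 2 bytes up to the end of the array and the second pass copies the remaining 2 at index 0.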


Re: pcm(4) related panic

2003-11-25 Thread Don Lewis
On 25 Nov, Don Lewis wrote:
> On 25 Nov, Artur Poplawski wrote:
>> Artur Poplawski <[EMAIL PROTECTED]> wrote:
>> 
>>> Hello,  
>>> 
>>> On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
>>> like this:
> 
>>> Sleeping on "swread" with the following non-sleepable locks held:
>>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \   
>>> /usr/src/sys/dev/sound/pcm/dsp.c:146
> 
> This enables the panic.
> 
>>> panic: sleeping thread (pid 583) owns a non-sleepable lock
> 
> Then the panic happens when another thread tries to grab the mutex.
> 
> 
> The problem is that the pcm code attempts to hold a mutex across a call
> to uiomove(), which can sleep if the userland buffer that it is trying
> to access is paged out.  Either the buffer has to be pre-wired before
> calling getchns(), or the mutex has to be dropped around the call to
> uiomove().  The amount of memory to be wired should be limited to
> 'sz' as calculated by chn_read() and chn_write(), which complicates the
> logic.  Dropping the mutex probably has other issues.

Following up to myself ...

It might be safe to drop the mutex for the uiomove() call if the code
set flags to enforce a limit of one reader and one writer at a time to
keep the code from being re-entered.  The buffer pointer manipulations
in sndbuf_dispose() and sndbuf_acquire() would probably still have to be
protected by the mutex.  If this can be made to work, it would probably
be preferable to wiring the buffer.  It would have a lot less CPU
overhead, and would work better with large buffers, which could still be
allowed to page normally.
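
A minimal userland sketch of the idea, assuming a pthread mutex in place of the channel mutex and memcpy() in place of uiomove(). struct chan, writer_busy and chan_write() are hypothetical names for this example and are not from the pcm code. Only writers touch 'fill' here; a real driver also has a consumer to account for.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pcm channel; not the real structure. */
struct chan {
	pthread_mutex_t lock;	/* plays the role of the channel mutex */
	int writer_busy;	/* set while one writer is inside the copy */
	char buf[64];		/* stands in for the software buffer */
	size_t fill;		/* bytes currently queued in buf */
};

/*
 * Queue up to len bytes.  The lock is dropped for the memcpy(), and
 * writer_busy keeps a second writer out while the lock is not held.
 * Returns 0 on success or EBUSY if another writer is active.
 */
int
chan_write(struct chan *c, const char *src, size_t len, size_t *done)
{
	size_t off, n;

	pthread_mutex_lock(&c->lock);
	if (c->writer_busy) {
		pthread_mutex_unlock(&c->lock);
		return (EBUSY);
	}
	c->writer_busy = 1;
	off = c->fill;
	n = sizeof(c->buf) - off;
	if (n > len)
		n = len;
	pthread_mutex_unlock(&c->lock);

	memcpy(c->buf + off, src, n);	/* may "sleep"; no lock held */

	pthread_mutex_lock(&c->lock);
	c->fill = off + n;		/* pointer update under the lock */
	c->writer_busy = 0;
	pthread_mutex_unlock(&c->lock);
	*done = n;
	return (0);
}
```

Whether a second writer should get an error, as here, or wait for the flag to clear is a separate design choice that this sketch does not try to settle.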


Re: pcm(4) related panic

2003-11-25 Thread Don Lewis
On 25 Nov, Artur Poplawski wrote:
> Artur Poplawski <[EMAIL PROTECTED]> wrote:
> 
>> Hello,  
>> 
>> On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
>> like this:

>> Sleeping on "swread" with the following non-sleepable locks held:
>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \   
>> /usr/src/sys/dev/sound/pcm/dsp.c:146

This enables the panic.

>> panic: sleeping thread (pid 583) owns a non-sleepable lock

Then the panic happens when another thread tries to grab the mutex.


The problem is that the pcm code attempts to hold a mutex across a call
to uiomove(), which can sleep if the userland buffer that it is trying
to access is paged out.  Either the buffer has to be pre-wired before
calling getchns(), or the mutex has to be dropped around the call to
uiomove().  The amount of memory to be wired should be limited to
'sz' as calculated by chn_read() and chn_write(), which complicates the
logic.  Dropping the mutex probably has other issues.
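
To make the ordering concrete, here is a toy userland model, not kernel code. A counter stands in for the non-sleepable locks a thread holds, and memcpy() stands in for uiomove(). All names (thread_state, copy_may_sleep() and so on) are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-thread bookkeeping, loosely like what a lock checker tracks. */
struct thread_state {
	int mutexes_held;	/* non-sleepable locks currently held */
	int violations;		/* sleeps attempted with such a lock held */
};

static void lock_chan(struct thread_state *td)   { td->mutexes_held++; }
static void unlock_chan(struct thread_state *td) { td->mutexes_held--; }

/* Stand-in for uiomove(): a copy that may have to sleep on a page-in. */
static void
copy_may_sleep(struct thread_state *td, void *dst, const void *src, size_t n)
{
	if (td->mutexes_held > 0)
		td->violations++;	/* the kernel would complain here */
	memcpy(dst, src, n);
}

/* Copy with the channel lock held throughout: the pattern that panics. */
void
write_locked(struct thread_state *td, void *dst, const void *src, size_t n)
{
	lock_chan(td);
	copy_may_sleep(td, dst, src, n);
	unlock_chan(td);
}

/* Copy with the lock dropped around the call that may sleep. */
void
write_unlocked(struct thread_state *td, void *dst, const void *src, size_t n)
{
	lock_chan(td);
	/* ... work out destination and size under the lock ... */
	unlock_chan(td);
	copy_may_sleep(td, dst, src, n);
	lock_chan(td);
	/* ... update buffer pointers under the lock ... */
	unlock_chan(td);
}
```

write_locked() records a violation on every call and write_unlocked() records none. The model says nothing about what else may change while the lock is dropped, which is the "other issues" part.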




Re: pcm(4) related panic

2003-11-25 Thread Artur Poplawski
Artur Poplawski <[EMAIL PROTECTED]> wrote:

> Hello,  
> 
> On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
> like this:
>  
> (watch out for folded lines; the stack backtrace below is rewritten by
> hand from ddb)
> 
> lock order reversal
>  1st 0xc22a45ac vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
>  2nd 0xc06c0420 swap_pager swhash (swap_pager swhash) @ \
> /usr/src/sys/vm/swap_pager.c:1838
>  3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
> Stack backtrace:
>   backtrace
>   witness_lock
>   _mtx_lock_flags
>   obj_alloc
>   slab_zalloc
>   uma_zone_slab
>   uma_zalloc_internal
>   uma_zalloc_arg
>   swp_pager_meta_build
>   swap_pager_putpages
>   default_pager_putpages
>   vm_pageout_flush
>   vm_pageout_clean
>   vm_pageout_scan
>   vm_pageout
>   fork_exit
>   fork_trampoline
> 
> Sleeping on "swread" with the following non-sleepable locks held:
> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \   
> /usr/src/sys/dev/sound/pcm/dsp.c:146
> panic: sleeping thread (pid 583) owns a non-sleepable lock
> syncing disks, buffers remaining... 1410 1410 panic: mi_switch: \ 
> switch in a critical section
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> [... repeated a few more times]
> Fatal double fault:
> eip = 0xc05e3916
> esp = 0xc8db8ff4
> ebp = 0xc8db9004
> panic: double fault
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s 
> panic: msleep
> Uptime: 1m45s
> panic: msleep
> Uptime: 1m45s
> [...]
> And the machine suddenly reboots, so there is no coredump.
>  
> eip address points close to:
> c05e3910 T sc_vtb_putc
>  
> To reproduce this panic just start some audio player app (like xmms), 
> and launch countless memory-eating applications (like mozilla ;>).
> The machine starts swapping, and it panics. 
> 
> % uname -a 
> FreeBSD kaszanka.domek 5.2-BETA FreeBSD 5.2-BETA #0: Sun Nov 23 01:23:10\ 
>  CET 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/KASZANKA i386 
> 
> dmesg fragments:
> CPU: AMD Athlon(tm) XP 2000+ (1666.73-MHz 686-class CPU)
> pcm0:  port 0xec00-0xec3f irq 10 at device 8.0 on pci0 
> pcm0: 
> rl0:  port 0xe800-0xe8ff mem \
>  0xdf00-0xdfff irq 10 at device 10.0 on pci0
> miibus0:  on rl0
> rlphy0:  on miibus0
> rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> rl1:  port 0xe400-0xe4ff mem \
>  0xde00-0xdeff irq 10 at device 11.0 on pci0
> rlphy1:  on miibus1
> rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto



In the meantime I've managed to get a coredump, by directly calling
doadump() from ddb. Results:


[EMAIL PROTECTED]:/usr/obj/usr/src/sys/KASZANKA# gdb -k kernel.debug /var/crash/vmcore.0
GNU gdb 5.2.1 (FreeBSD)  
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
panic: sleeping thread (pid 568) owns a non-sleepable lock
panic messages:
---
panic: sleeping thread (pid 568) owns a non-sleepable lock

syncing disks, buffers remaining... panic: msleep
Dumping 128 MB
 16 32 48 64 80 96 112
---
Reading symbols from /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug...done.
Loaded symbols for /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug
Reading symbols from /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linux/linux.ko.debug...done.
Loaded symbols for /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linux/linux.ko.debug
Reading symbols from /boot/kernel/netgraph.ko...done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_ether.ko...done.
Loaded symbols for /boot/kernel/ng_ether.ko
Reading symbols from /boot/kernel/ng_pppoe.ko...done.
Loaded symbols for /boot/kernel/ng_pppoe.ko
Reading symbols from /boot/kernel/ng_socket.ko...done.
Loaded symbols for /boot/kernel/ng_socket.ko
Reading symbols from /boot/kernel/mga.ko...done.
Loaded symbols for /boot/kernel/mga.ko
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
240 dumping++;
(kgdb) where
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1  0xc04292cd in db_fncall (dummy1=0, dummy2=0, dummy3=0, dummy4=0xc8dba7bc "à×hÀ") 
at /usr/src/sys/ddb/db_command.c:548
#2  0xc042906a in db_command (last_cmdp=0xc068ce80, cmd_table=0x0, 
aux_cmd_tablep=0xc06480c0, aux_cmd_tablep_end=0xc

pcm(4) related panic

2003-11-25 Thread Artur Poplawski
Hello,  

On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic
like this:
 
(watch out for folded lines; the stack backtrace below is rewritten by
hand from ddb)

lock order reversal
 1st 0xc22a45ac vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
 2nd 0xc06c0420 swap_pager swhash (swap_pager swhash) @ \
/usr/src/sys/vm/swap_pager.c:1838
 3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
Stack backtrace:
  backtrace
  witness_lock
  _mtx_lock_flags
  obj_alloc
  slab_zalloc
  uma_zone_slab
  uma_zalloc_internal
  uma_zalloc_arg
  swp_pager_meta_build
  swap_pager_putpages
  default_pager_putpages
  vm_pageout_flush
  vm_pageout_clean
  vm_pageout_scan
  vm_pageout
  fork_exit
  fork_trampoline

Sleeping on "swread" with the following non-sleepable locks held:
exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \   
/usr/src/sys/dev/sound/pcm/dsp.c:146
panic: sleeping thread (pid 583) owns a non-sleepable lock
syncing disks, buffers remaining... 1410 1410 panic: mi_switch: \ 
switch in a critical section
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
[... repeated a few more times]
Fatal double fault:
eip = 0xc05e3916
esp = 0xc8db8ff4
ebp = 0xc8db9004
panic: double fault
Uptime: 1m45s
panic: msleep
Uptime: 1m45s 
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
[...]
And the machine suddenly reboots, so there is no coredump.
 
eip address points close to:
c05e3910 T sc_vtb_putc
 
To reproduce this panic just start some audio player app (like xmms), 
and launch countless memory-eating applications (like mozilla ;>).
The machine starts swapping, and it panics. 

% uname -a 
FreeBSD kaszanka.domek 5.2-BETA FreeBSD 5.2-BETA #0: Sun Nov 23 01:23:10\ 
 CET 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/KASZANKA i386 

dmesg fragments:
CPU: AMD Athlon(tm) XP 2000+ (1666.73-MHz 686-class CPU)
pcm0:  port 0xec00-0xec3f irq 10 at device 8.0 on pci0 
pcm0: 
rl0:  port 0xe800-0xe8ff mem \
 0xdf00-0xdfff irq 10 at device 10.0 on pci0
miibus0:  on rl0
rlphy0:  on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1:  port 0xe400-0xe4ff mem \
 0xde00-0xdeff irq 10 at device 11.0 on pci0
rlphy1:  on miibus1
rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

Regards, Artur