Re: Patch #3 Re: UMA panic under load

2002-12-19 Thread Kris Kennaway
On Tue, Dec 17, 2002 at 05:06:06PM -0500, Brian F. Feldman wrote:
> Matthew Dillon [EMAIL PROTECTED] wrote:
> > Whoop.  Ok, here's a new patch.  I think this covers all the cases.
> > I've done some testing and it appears to do the right thing, please
> > look it over (the last patch had type-o's and didn't cover the correct
> > cases).
>
> I haven't tested, since I haven't provoked that specific panic on my
> machines, but it does appear that it would indeed fix both issues.  Kris,
> can you confirm that it makes the machines work properly?

I'm testing the committed version on the alpha cluster and my local
sparc machine.  It should be fairly obvious within a few days if it is
working.

Kris





Re: UMA panic under load

2002-12-14 Thread Brian F. Feldman
John Baldwin [EMAIL PROTECTED] wrote:
>
> On 12-Dec-2002 Kris Kennaway wrote:
> > I got this on an alpha tonight.  It was under heavy load at the time
> > (18 simultaneous package builds had just been spawned on the machine).
> > Any ideas?
> >
> > Slab at 0xfc00042d3fb8, freei 2 = 0.
> > panic: Duplicate free of item 0xfc00042d22e0 from zone 0xfc0007d31800(VMSPACE)
> >
> > db_print_backtrace() at db_print_backtrace+0x18
> > panic() at panic+0x104
> > uma_dbg_free() at uma_dbg_free+0x170
> > uma_zfree_arg() at uma_zfree_arg+0x150
> > vmspace_free() at vmspace_free+0xe4
> > swapout_procs() at swapout_procs+0x428
> > vm_daemon() at vm_daemon+0x74
> > fork_exit() at fork_exit+0xe0
> > exception_return() at exception_return
> > --- root of call graph ---
> > panic
> > Stopped at  Debugger+0x34:  zapnot  v0,#0xf,v0  v0=0x0
> > db>
>
> I have seen this on a couple of different arch's I think.  A vmspace
> shouldn't be free'd here, its refcount should not be that low.
> I wonder if something is free'ing the vmspace w/o dropping the refcount?

The problem appears to be that swapout_procs() is swapping out a process
that is in the process of exiting (in exit1()) and has already
relinquished its vmspace, but has not set PRS_ZOMBIE yet (which would
prevent the swapout).  It's clearly not correct for a process in exit1()
to be swapped out, and the vmspace refcount _needs_ to be decremented in
the correct place or resources are NEVER freed when the race is lost.

-- 
Brian Fundakowski Feldman   \'[ FreeBSD ]''\
   [EMAIL PROTECTED]   [EMAIL PROTECTED]  \  The Power to Serve! \
 Opinions expressed are my own.   \,,\



To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: UMA panic under load

2002-12-14 Thread Jake Burkholder
Apparently, On Sat, Dec 14, 2002 at 07:37:31PM -0500,
Brian F. Feldman said words to the effect of;

> John Baldwin [EMAIL PROTECTED] wrote:
> >
> > On 12-Dec-2002 Kris Kennaway wrote:
> > > I got this on an alpha tonight.  It was under heavy load at the time
> > > (18 simultaneous package builds had just been spawned on the machine).
> > > Any ideas?
> > >
> > > Slab at 0xfc00042d3fb8, freei 2 = 0.
> > > panic: Duplicate free of item 0xfc00042d22e0 from zone 0xfc0007d31800(VMSPACE)
> > >
> > > db_print_backtrace() at db_print_backtrace+0x18
> > > panic() at panic+0x104
> > > uma_dbg_free() at uma_dbg_free+0x170
> > > uma_zfree_arg() at uma_zfree_arg+0x150
> > > vmspace_free() at vmspace_free+0xe4
> > > swapout_procs() at swapout_procs+0x428
> > > vm_daemon() at vm_daemon+0x74
> > > fork_exit() at fork_exit+0xe0
> > > exception_return() at exception_return
> > > --- root of call graph ---
> > > panic
> > > Stopped at  Debugger+0x34:  zapnot  v0,#0xf,v0  v0=0x0
> > > db>
> >
> > I have seen this on a couple of different arch's I think.  A vmspace
> > shouldn't be free'd here, its refcount should not be that low.
> > I wonder if something is free'ing the vmspace w/o dropping the refcount?
>
> The problem appears to be that swapout_procs() is swapping out a process
> that is in the process of exiting (in exit1()) and has already
> relinquished its vmspace, but has not set PRS_ZOMBIE yet (which would
> prevent the swapout).  It's clearly not correct for a process in exit1()
> to be swapped out, and the vmspace refcount _needs_ to be decremented in
> the correct place or resources are NEVER freed when the race is lost.

P_WEXIT is set, so the process won't get swapped out.  The problem is that
the vmspace refcnt is 0 when swapout_procs is called, since it was
decremented in exit1.  The refcnt is incremented before p_flag is tested
for P_WEXIT, the swapout is skipped because it's found to be set, and then
vmspace_free is called which decrements the refcnt to 0 and prematurely
frees the vmspace.  Decrementing the refcnt in exit1 breaks the normal
reference count semantics because the vmspace is not being freed then.
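
The sequence described above can be modelled in user space.  This is
only a sketch: the toy_* names are hypothetical stand-ins for the kernel
routines, there is no locking, and the free is just counted rather than
handed to UMA.  Which of the two frees trips the debug check in a real
kernel depends on timing; the model only shows that two happen.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct vmspace; only the fields the race needs. */
struct toy_vmspace {
	int refcnt;	/* models vm_refcnt */
	int swapouts;	/* swapouts actually performed */
	int frees;	/* how many times the structure was "freed" */
};

/* Models vmspace_free(): drop a reference, free on the last one. */
static void
toy_vmspace_free(struct toy_vmspace *vm)
{
	if (--vm->refcnt == 0)
		vm->frees++;
}

/* Models the exit1() step: the count drops but the real free is deferred. */
static void
toy_exit1(struct toy_vmspace *vm)
{
	--vm->refcnt;		/* reaches 0, structure still allocated */
}

/* Models swapout_procs(): take a reference, test the flag, release. */
static void
toy_swapout(struct toy_vmspace *vm, bool p_wexit)
{
	++vm->refcnt;		/* 0 -> 1 on an exiting process */
	if (!p_wexit)
		vm->swapouts++;	/* skipped here because P_WEXIT is set */
	toy_vmspace_free(vm);	/* 1 -> 0: premature free */
}

/* Models the deferred free done when the exiting process is reaped. */
static void
toy_exitfree(struct toy_vmspace *vm)
{
	vm->frees++;
}

/* Runs the interleaving; a result of 2 is the duplicate free. */
int
toy_race(void)
{
	struct toy_vmspace vm = { .refcnt = 1 };

	toy_exit1(&vm);
	toy_swapout(&vm, true);
	toy_exitfree(&vm);
	return (vm.frees);
}
```

Calling toy_race() reports two frees for a single vmspace, which is the
condition the "Duplicate free" check in the panic message guards against.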

Jake




Re: UMA panic under load

2002-12-14 Thread Matthew Dillon

:The problem appears to be that swapout_procs() is swapping out a process
:that is in the process of exiting (in exit1()) and has already
:relinquished its vmspace, but has not set PRS_ZOMBIE yet (which would
:prevent the swapout).  It's clearly not correct for a process in exit1()
:to be swapped out, and the vmspace refcount _needs_ to be decremented in
:the correct place or resources are NEVER freed when the race is lost.

P_WEXIT is set before the vmspace is released.  It may be sufficient
to have swapout_procs() ignore processes with P_WEXIT set.

-Matt




Re: UMA panic under load

2002-12-14 Thread Matthew Dillon

:P_WEXIT is set, so the process won't get swapped out.  The problem is that
:the vmspace refcnt is 0 when swapout_procs is called, since it was
:decremented in exit1.  The refcnt is incremented before p_flag is tested
:for P_WEXIT, the swapout is skipped because it's found to be set, and then
:vmspace_free is called which decrements the refcnt to 0 and prematurely
:frees the vmspace.  Decrementing the refcnt in exit1 breaks the normal
:reference count semantics because the vmspace is not being freed then.
:
:Jake

Yup, I see it.  We could just move the P_WEXIT test but I wonder how
many other places the vmspace might be bumped and then released.  The
real bug appears to be in exit1().
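
As a rough illustration of "moving the P_WEXIT test", the sketch below
checks the flag before the reference is taken, so an exiting process's
count is never touched.  The toy_* names are hypothetical, and the model
ignores the locking a real kernel would need to keep the flag test and
the increment atomic with respect to exit1().

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vmspace {
	int refcnt;	/* models vm_refcnt */
	int frees;	/* times the structure was "freed" */
};

struct toy_proc {
	bool p_wexit;			/* models P_WEXIT in p_flag */
	struct toy_vmspace *vm;		/* models p_vmspace */
};

/* Models vmspace_free(): drop a reference, free on the last one. */
static void
toy_vmspace_free(struct toy_vmspace *vm)
{
	if (--vm->refcnt == 0)
		vm->frees++;
}

/* Flag test moved ahead of the reference bump. */
static bool
toy_swapout_checked(struct toy_proc *p)
{
	if (p->p_wexit)
		return (false);		/* leave exiting processes alone */
	++p->vm->refcnt;
	/* ... the swapout work itself would go here ... */
	toy_vmspace_free(p->vm);
	return (true);
}

/* Returns how many frees one scan causes; 0 means nothing was freed early. */
int
toy_frees_after_scan(bool exiting)
{
	/* An exiting process has already dropped its count to 0 in exit1(). */
	struct toy_vmspace vm = { .refcnt = exiting ? 0 : 1 };
	struct toy_proc p = { .p_wexit = exiting, .vm = &vm };

	(void)toy_swapout_checked(&p);
	return (vm.frees);
}
```

This only covers the one caller.  Any other code path that bumps and
releases the count on an exiting process would still hit the same
problem, which is the concern raised above.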

I seem to recall we hit this situation a few months ago.  I thought 
it had been fixed.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]




Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
What about something like this.  If the vm_refcnt is still being
decremented too early, could it be moved to just before the thread_exit()
call?

-Matt


Index: kern/kern_exit.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.187
diff -u -r1.187 kern_exit.c
--- kern/kern_exit.c	10 Dec 2002 02:33:44 -0000	1.187
+++ kern/kern_exit.c	15 Dec 2002 01:08:21 -0000
@@ -288,7 +288,7 @@
 	 * Can't free the entire vmspace as the kernel stack
 	 * may be mapped within that space also.
 	 */
-	if (--vm->vm_refcnt == 0) {
+	if (vm->vm_refcnt == 1) {
 		if (vm->vm_shm)
 			shmexit(p);
 		vm_page_lock_queues();
@@ -298,7 +298,9 @@
 		(void) vm_map_remove(&vm->vm_map, vm_map_min(&vm->vm_map),
 		    vm_map_max(&vm->vm_map));
 		vm->vm_freer = p;
+		KASSERT(vm->vm_refcnt == 1, ("expected vm_refcnt of 1"));
 	}
+	--vm->vm_refcnt;
 
 	sx_xlock(&proctree_lock);
 	if (SESS_LEADER(p)) {




Re: UMA panic under load

2002-12-14 Thread Brian F. Feldman
Jake Burkholder [EMAIL PROTECTED] wrote:
> Apparently, On Sat, Dec 14, 2002 at 07:37:31PM -0500,
>   Brian F. Feldman said words to the effect of;
>
> > John Baldwin [EMAIL PROTECTED] wrote:
> > >
> > > On 12-Dec-2002 Kris Kennaway wrote:
> > > > I got this on an alpha tonight.  It was under heavy load at the time
> > > > (18 simultaneous package builds had just been spawned on the machine).
> > > > Any ideas?
> > > >
> > > > Slab at 0xfc00042d3fb8, freei 2 = 0.
> > > > panic: Duplicate free of item 0xfc00042d22e0 from zone 0xfc0007d31800(VMSPACE)
> > > >
> > > > db_print_backtrace() at db_print_backtrace+0x18
> > > > panic() at panic+0x104
> > > > uma_dbg_free() at uma_dbg_free+0x170
> > > > uma_zfree_arg() at uma_zfree_arg+0x150
> > > > vmspace_free() at vmspace_free+0xe4
> > > > swapout_procs() at swapout_procs+0x428
> > > > vm_daemon() at vm_daemon+0x74
> > > > fork_exit() at fork_exit+0xe0
> > > > exception_return() at exception_return
> > > > --- root of call graph ---
> > > > panic
> > > > Stopped at  Debugger+0x34:  zapnot  v0,#0xf,v0  v0=0x0
> > > > db>
> > >
> > > I have seen this on a couple of different arch's I think.  A vmspace
> > > shouldn't be free'd here, its refcount should not be that low.
> > > I wonder if something is free'ing the vmspace w/o dropping the refcount?
> >
> > The problem appears to be that swapout_procs() is swapping out a process
> > that is in the process of exiting (in exit1()) and has already
> > relinquished its vmspace, but has not set PRS_ZOMBIE yet (which would
> > prevent the swapout).  It's clearly not correct for a process in exit1()
> > to be swapped out, and the vmspace refcount _needs_ to be decremented in
> > the correct place or resources are NEVER freed when the race is lost.
>
> P_WEXIT is set, so the process won't get swapped out.  The problem is that
> the vmspace refcnt is 0 when swapout_procs is called, since it was
> decremented in exit1.  The refcnt is incremented before p_flag is tested
> for P_WEXIT, the swapout is skipped because it's found to be set, and then
> vmspace_free is called which decrements the refcnt to 0 and prematurely
> frees the vmspace.  Decrementing the refcnt in exit1 breaks the normal
> reference count semantics because the vmspace is not being freed then.

There are no normal reference count semantics; exit1() attempts to free 
parts of the vmspace.  Sounds to me like a simple solution is to check for 
P_WEXIT both before and after incrementing the vmspace refcount.







Re: UMA panic under load

2002-12-14 Thread Brian F. Feldman
Matthew Dillon [EMAIL PROTECTED] wrote:
> What about something like this.  If the vm_refcnt is still being
> decremented too early, could it be moved to just before the thread_exit()
> call?

The problem that had to be fixed by removing this race was that two
processes with the same vmspace can exit at the same time, and the
vm_refcnt could be 2 the entire time, so neither would perform the current
if (--vm->vm_refcnt == 0) { block.
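
A small user-space model of that hazard, under the assumption that the
test and the deferred decrement can be separated by another exiting
process (for example because the code in between can sleep).  The toy_*
names are hypothetical and the "teardown" is only counted.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vmspace {
	int refcnt;	/* models vm_refcnt */
	int teardowns;	/* times the early-release block ran */
};

/* Deferred-decrement ordering, step 1: test whether we are the last user. */
static bool
toy_late_check(const struct toy_vmspace *vm)
{
	return (vm->refcnt == 1);
}

/* Deferred-decrement ordering, step 2: tear down if step 1 said so, then drop. */
static void
toy_late_finish(struct toy_vmspace *vm, bool was_last)
{
	if (was_last)
		vm->teardowns++;
	--vm->refcnt;
}

/* Current ordering: decrement and test in a single step. */
static void
toy_early_exit(struct toy_vmspace *vm)
{
	if (--vm->refcnt == 0)
		vm->teardowns++;
}

/* Two sharers exit together with the decrement deferred. */
int
toy_late_decrement(void)
{
	struct toy_vmspace vm = { .refcnt = 2 };
	bool a = toy_late_check(&vm);	/* A sees 2 */
	bool b = toy_late_check(&vm);	/* B also sees 2 */

	toy_late_finish(&vm, a);
	toy_late_finish(&vm, b);
	return (vm.teardowns);
}

/* Two sharers exit together with the decrement done up front. */
int
toy_early_decrement(void)
{
	struct toy_vmspace vm = { .refcnt = 2 };

	toy_early_exit(&vm);	/* 2 -> 1, not last */
	toy_early_exit(&vm);	/* 1 -> 0, runs the teardown */
	return (vm.teardowns);
}
```

With the deferred decrement neither process runs the teardown, so the
resources are never released; with the up-front decrement exactly one
of them does.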







Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
It's a big mess.  exit1() sets up vm->vm_freer = p and then
vmspace_exitfree() tests that and calls vmspace_dofree().  It
looks like vm->vm_freer is acting like an exit-lock, so only
one process/thread actually frees the vmspace.  But there are
still some serious race conditions.  If two threads go into exit1()
at the same time but vmspace_exitfree() is called in the reverse
order, the first call to vmspace_exitfree() winds up freeing
the vmspace, and the first process's vmspace might be ripped out from
under it.

On the flip side if several threads go into exit1() at the same time
the vmspace's ref count may never be seen to be '0' if we move the
decrement to later on in the code.

So my 'what if we did this' patch will fix one problem and create 
another.  The reference count must be decremented where it is currently
being decremented in exit1() or there is a chance that multiple exit1()'s
will not see the ref count drop to 0 (or be equal to 1).

On the flip side (again), vmspace_exitfree() really should not call
vmspace_dofree() unless it is the last process, which is not necessarily
the same process that detected the ref count going to 0 in exit1().  

It's like we need a second ref count field for the vmspace structure, one
to determine when the initial bunch of garbage can be freed up
(sysV shared memory and such), and another to determine when
vmspace_dofree() can actually be called.

-Matt

:There are no normal reference count semantics; exit1() attempts to free 
:parts of the vmspace.  Sounds to me like a simple solution is to check for 
:P_WEXIT both before and after incrementing the vmspace refcount.




Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
Here's another go at a patch (untested).

-Matt

Index: kern/kern_exit.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.187
diff -u -r1.187 kern_exit.c
--- kern/kern_exit.c	10 Dec 2002 02:33:44 -0000	1.187
+++ kern/kern_exit.c	15 Dec 2002 01:36:35 -0000
@@ -289,6 +289,7 @@
 	 * may be mapped within that space also.
 	 */
 	if (--vm->vm_refcnt == 0) {
+		++vm->vm_exitingcnt;
 		if (vm->vm_shm)
 			shmexit(p);
 		vm_page_lock_queues();
@@ -297,7 +298,6 @@
 		vm_page_unlock_queues();
 		(void) vm_map_remove(&vm->vm_map, vm_map_min(&vm->vm_map),
 		    vm_map_max(&vm->vm_map));
-		vm->vm_freer = p;
 	}
 
 	sx_xlock(&proctree_lock);
Index: vm/vm_map.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_map.c,v
retrieving revision 1.273
diff -u -r1.273 vm_map.c
--- vm/vm_map.c	1 Dec 2002 18:57:56 -0000	1.273
+++ vm/vm_map.c	15 Dec 2002 01:40:39 -0000
@@ -258,7 +258,7 @@
 	vm->vm_map.pmap = vmspace_pmap(vm);	/* XXX */
 	vm->vm_refcnt = 1;
 	vm->vm_shm = NULL;
-	vm->vm_freer = NULL;
+	vm->vm_exitingcount = 0;
 	return (vm);
 }
 
@@ -304,7 +304,7 @@
 	if (vm->vm_refcnt == 0)
 		panic("vmspace_free: attempt to free already freed vmspace");
 
-	if (--vm->vm_refcnt == 0)
+	if (--vm->vm_refcnt == 0 && vm->vm_exitingcount == 0)
 		vmspace_dofree(vm);
 }
 
@@ -314,9 +314,10 @@
 	struct vmspace *vm;
 
 	GIANT_REQUIRED;
-	if (p == p->p_vmspace->vm_freer) {
-		vm = p->p_vmspace;
-		p->p_vmspace = NULL;
+	vm = p->p_vmspace;
+	p->p_vmspace = NULL;
+	if (--vm->vm_exitingcount == 0) {
+		KASSERT(vm->vm_refcnt == 0, ("vm_refcnt was not 0"));
 		vmspace_dofree(vm);
 	}
 }
Index: vm/vm_map.h
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_map.h,v
retrieving revision 1.92
diff -u -r1.92 vm_map.h
--- vm/vm_map.h	22 Sep 2002 04:33:43 -0000	1.92
+++ vm/vm_map.h	15 Dec 2002 01:38:29 -0000
@@ -219,7 +219,7 @@
 	caddr_t vm_daddr;	/* (c) user virtual address of data */
 	caddr_t vm_maxsaddr;	/* user VA at max stack growth */
 #define	vm_endcopy vm_freer
-	struct proc *vm_freer;	/* vm freed on whose behalf */
+	int	vm_exitingcnt;	/* several processes zombied in exit1  */
 };
 
 #ifdef _KERNEL




(lots of posts today Matt!) Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
   oops, sorry, I blew that patch.  exitingcnt would have to be incremented
   unconditionally.

-Matt


Index: kern/kern_exit.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.187
diff -u -r1.187 kern_exit.c
--- kern/kern_exit.c	10 Dec 2002 02:33:44 -0000	1.187
+++ kern/kern_exit.c	15 Dec 2002 01:42:12 -0000
@@ -288,6 +288,7 @@
 	 * Can't free the entire vmspace as the kernel stack
 	 * may be mapped within that space also.
 	 */
+	++vm->vm_exitingcnt;
 	if (--vm->vm_refcnt == 0) {
 		if (vm->vm_shm)
 			shmexit(p);
@@ -297,7 +298,6 @@
 		vm_page_unlock_queues();
 		(void) vm_map_remove(&vm->vm_map, vm_map_min(&vm->vm_map),
 		    vm_map_max(&vm->vm_map));
-		vm->vm_freer = p;
 	}
 
 	sx_xlock(&proctree_lock);
Index: vm/vm_map.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_map.c,v
retrieving revision 1.273
diff -u -r1.273 vm_map.c
--- vm/vm_map.c	1 Dec 2002 18:57:56 -0000	1.273
+++ vm/vm_map.c	15 Dec 2002 01:40:39 -0000
@@ -258,7 +258,7 @@
 	vm->vm_map.pmap = vmspace_pmap(vm);	/* XXX */
 	vm->vm_refcnt = 1;
 	vm->vm_shm = NULL;
-	vm->vm_freer = NULL;
+	vm->vm_exitingcount = 0;
 	return (vm);
 }
 
@@ -304,7 +304,7 @@
 	if (vm->vm_refcnt == 0)
 		panic("vmspace_free: attempt to free already freed vmspace");
 
-	if (--vm->vm_refcnt == 0)
+	if (--vm->vm_refcnt == 0 && vm->vm_exitingcount == 0)
 		vmspace_dofree(vm);
 }
 
@@ -314,9 +314,10 @@
 	struct vmspace *vm;
 
 	GIANT_REQUIRED;
-	if (p == p->p_vmspace->vm_freer) {
-		vm = p->p_vmspace;
-		p->p_vmspace = NULL;
+	vm = p->p_vmspace;
+	p->p_vmspace = NULL;
+	if (--vm->vm_exitingcount == 0) {
+		KASSERT(vm->vm_refcnt == 0, ("vm_refcnt was not 0"));
 		vmspace_dofree(vm);
 	}
 }
Index: vm/vm_map.h
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_map.h,v
retrieving revision 1.92
diff -u -r1.92 vm_map.h
--- vm/vm_map.h	22 Sep 2002 04:33:43 -0000	1.92
+++ vm/vm_map.h	15 Dec 2002 01:38:29 -0000
@@ -219,7 +219,7 @@
 	caddr_t vm_daddr;	/* (c) user virtual address of data */
 	caddr_t vm_maxsaddr;	/* user VA at max stack growth */
 #define	vm_endcopy vm_freer
-	struct proc *vm_freer;	/* vm freed on whose behalf */
+	int	vm_exitingcnt;	/* several processes zombied in exit1  */
 };
 
 #ifdef _KERNEL




Patch #3 Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
Whoop.  Ok, here's a new patch.  I think this covers all the cases.
I've done some testing and it appears to do the right thing, please
look it over (the last patch had type-o's and didn't cover the correct
cases).

-Matt

Index: kern/kern_exit.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.187
diff -u -r1.187 kern_exit.c
--- kern/kern_exit.c	10 Dec 2002 02:33:44 -0000	1.187
+++ kern/kern_exit.c	15 Dec 2002 01:45:15 -0000
@@ -287,7 +287,15 @@
 	 * Need to do this early enough that we can still sleep.
 	 * Can't free the entire vmspace as the kernel stack
 	 * may be mapped within that space also.
+	 *
+	 * Processes sharing the same vmspace may exit in one order, and
+	 * get cleaned up by vmspace_exit() in a different order.  The
+	 * last exiting process to reach this point releases as much of
+	 * the environment as it can, and the last process cleaned up
+	 * by vmspace_exit() (which decrements exitingcnt) cleans up the
+	 * remainder.
 	 */
+	++vm->vm_exitingcnt;
 	if (--vm->vm_refcnt == 0) {
 		if (vm->vm_shm)
 			shmexit(p);
@@ -297,7 +305,6 @@
 		vm_page_unlock_queues();
 		(void) vm_map_remove(&vm->vm_map, vm_map_min(&vm->vm_map),
 		    vm_map_max(&vm->vm_map));
-		vm->vm_freer = p;
 	}
 
 	sx_xlock(&proctree_lock);
Index: vm/vm_map.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_map.c,v
retrieving revision 1.273
diff -u -r1.273 vm_map.c
--- vm/vm_map.c	1 Dec 2002 18:57:56 -0000	1.273
+++ vm/vm_map.c	15 Dec 2002 02:05:13 -0000
@@ -258,7 +258,7 @@
 	vm->vm_map.pmap = vmspace_pmap(vm);	/* XXX */
 	vm->vm_refcnt = 1;
 	vm->vm_shm = NULL;
-	vm->vm_freer = NULL;
+	vm->vm_exitingcnt = 0;
 	return (vm);
 }
 
@@ -304,7 +304,7 @@
 	if (vm->vm_refcnt == 0)
 		panic("vmspace_free: attempt to free already freed vmspace");
 
-	if (--vm->vm_refcnt == 0)
+	if (--vm->vm_refcnt == 0 && vm->vm_exitingcnt == 0)
 		vmspace_dofree(vm);
 }
 
@@ -314,11 +314,22 @@
 	struct vmspace *vm;
 
 	GIANT_REQUIRED;
-	if (p == p->p_vmspace->vm_freer) {
-		vm = p->p_vmspace;
-		p->p_vmspace = NULL;
+	vm = p->p_vmspace;
+	p->p_vmspace = NULL;
+
+	/*
+	 * cleanup by parent process wait()ing on exiting child.  vm_refcnt
+	 * may not be 0 (e.g. fork() and child exits without exec()ing).
+	 * exitingcnt may increment above 0 and drop back down to zero
+	 * several times while vm_refcnt is held non-zero.  vm_refcnt
+	 * may also increment above 0 and drop back down to zero several
+	 * times while vm_exitingcnt is held non-zero.
+	 *
+	 * The last wait on the exiting child's vmspace will clean up
+	 * the remainder of the vmspace.
+	 */
+	if (--vm->vm_exitingcnt == 0 && vm->vm_refcnt == 0)
 		vmspace_dofree(vm);
-	}
 }
 
 /*
Index: vm/vm_map.h
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_map.h,v
retrieving revision 1.92
diff -u -r1.92 vm_map.h
--- vm/vm_map.h	22 Sep 2002 04:33:43 -0000	1.92
+++ vm/vm_map.h	15 Dec 2002 01:47:49 -0000
@@ -218,8 +218,8 @@
 	caddr_t vm_taddr;	/* (c) user virtual address of text */
 	caddr_t vm_daddr;	/* (c) user virtual address of data */
 	caddr_t vm_maxsaddr;	/* user VA at max stack growth */
-#define	vm_endcopy vm_freer
-	struct proc *vm_freer;	/* vm freed on whose behalf */
+#define	vm_endcopy vm_exitingcnt
+	int	vm_exitingcnt;	/* several processes zombied in exit1  */
 };
 
 #ifdef _KERNEL
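
The two-counter rule in this patch can be exercised outside the kernel.
What follows is a hedged model, not the committed code: the toy_* names
are hypothetical, the early teardown and vmspace_dofree() are reduced to
counters, and there is no locking.

```c
#include <assert.h>

struct toy_vmspace {
	int refcnt;	/* models vm_refcnt */
	int exitingcnt;	/* models vm_exitingcnt */
	int teardowns;	/* early release done in exit1() */
	int frees;	/* models calls to vmspace_dofree() */
};

/* Models the exit1() hunk: always count the exiter, last user tears down. */
static void
toy_exit1(struct toy_vmspace *vm)
{
	++vm->exitingcnt;
	if (--vm->refcnt == 0)
		vm->teardowns++;
}

/* Models vmspace_free(): free only when nobody is still exiting. */
static void
toy_vmspace_free(struct toy_vmspace *vm)
{
	if (--vm->refcnt == 0 && vm->exitingcnt == 0)
		vm->frees++;
}

/* Models vmspace_exitfree(): the last reaped exiter does the final free. */
static void
toy_vmspace_exitfree(struct toy_vmspace *vm)
{
	if (--vm->exitingcnt == 0 && vm->refcnt == 0)
		vm->frees++;
}

/* Models a transient reference such as the one taken by swapout_procs(). */
static void
toy_transient_ref(struct toy_vmspace *vm)
{
	++vm->refcnt;
	toy_vmspace_free(vm);
}

/* Two sharers exit, are reaped in reverse order, with transient refs between. */
struct toy_vmspace
toy_two_counters(void)
{
	struct toy_vmspace vm = { .refcnt = 2 };

	toy_exit1(&vm);			/* A: refcnt 1, exitingcnt 1 */
	toy_exit1(&vm);			/* B: refcnt 0, exitingcnt 2, teardown */
	toy_transient_ref(&vm);		/* refcnt 0 -> 1 -> 0, no free */
	toy_vmspace_exitfree(&vm);	/* B reaped first: exitingcnt 1 */
	toy_transient_ref(&vm);		/* still no free */
	toy_vmspace_exitfree(&vm);	/* A reaped: both zero, final free */
	return (vm);
}
```

In this interleaving the early teardown and the final free each happen
exactly once, even though the reference count bounces through zero while
exiters are still pending.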




Re: Another UMA panic under load

2002-12-13 Thread Terry Lambert
Andrew Gallatin wrote:
> Ugh. Since it may call kmem_malloc(), UMA must hold Giant.
>
> This is the same problem the mbuf system has, and it's what's
> keeping network device drivers under Giant in 5.0.
>
> Both subsystems should probably have GIANT_REQUIRED at all entry
> points so as to catch locking problems like this earlier.

No, they should probably be wired into machdep.c, instead.

It was pretty obvious (to me) that UMA could not use the kmem
primitives, if it wanted to avoid Giant, even right at the
beginning of integration.  I just assumed that this was known,
and that it would be dealt with later, using one of several
approaches.

IMO, the easiest approach is mapping all physical RAM into the KVA
at the start of life, and then apportioning it out from there.

-- Terry




UMA panic under load

2002-12-12 Thread Kris Kennaway
I got this on an alpha tonight.  It was under heavy load at the time
(18 simultaneous package builds had just been spawned on the machine).
Any ideas?

Slab at 0xfc00042d3fb8, freei 2 = 0.
panic: Duplicate free of item 0xfc00042d22e0 from zone 0xfc0007d31800(VMSPACE)

db_print_backtrace() at db_print_backtrace+0x18
panic() at panic+0x104
uma_dbg_free() at uma_dbg_free+0x170
uma_zfree_arg() at uma_zfree_arg+0x150
vmspace_free() at vmspace_free+0xe4
swapout_procs() at swapout_procs+0x428
vm_daemon() at vm_daemon+0x74
fork_exit() at fork_exit+0xe0
exception_return() at exception_return
--- root of call graph ---
panic
Stopped at  Debugger+0x34:  zapnot  v0,#0xf,v0  v0=0x0
db>

Kris





RE: UMA panic under load

2002-12-12 Thread John Baldwin

On 12-Dec-2002 Kris Kennaway wrote:
> I got this on an alpha tonight.  It was under heavy load at the time
> (18 simultaneous package builds had just been spawned on the machine).
> Any ideas?
>
> Slab at 0xfc00042d3fb8, freei 2 = 0.
> panic: Duplicate free of item 0xfc00042d22e0 from zone 0xfc0007d31800(VMSPACE)
>
> db_print_backtrace() at db_print_backtrace+0x18
> panic() at panic+0x104
> uma_dbg_free() at uma_dbg_free+0x170
> uma_zfree_arg() at uma_zfree_arg+0x150
> vmspace_free() at vmspace_free+0xe4
> swapout_procs() at swapout_procs+0x428
> vm_daemon() at vm_daemon+0x74
> fork_exit() at fork_exit+0xe0
> exception_return() at exception_return
> --- root of call graph ---
> panic
> Stopped at  Debugger+0x34:  zapnot  v0,#0xf,v0  v0=0x0
> db>

I have seen this on a couple of different arch's I think.  A vmspace
shouldn't be free'd here, its refcount should not be that low.
I wonder if something is free'ing the vmspace w/o dropping the refcount?

-- 

John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




Another UMA panic under load

2002-12-12 Thread Kris Kennaway
I think this is the same one I reported a few days ago (another alpha
under heavy load).

panic: mutex Giant not owned at /local0/src-client/sys/vm/vm_kern.c:312
db_print_backtrace() at db_print_backtrace+0x18
panic() at panic+0x104
_mtx_assert() at _mtx_assert+0xb4
kmem_malloc() at kmem_malloc+0x50
page_alloc() at page_alloc+0x3c
uma_large_malloc() at uma_large_malloc+0x58
malloc() at malloc+0x10c
fdalloc() at fdalloc+0x1b0
do_dup() at do_dup+0x1a4
dup2() at dup2+0x24
syscall() at syscall+0x338
XentSys() at XentSys+0x64
--- syscall (90) ---
--- user mode ---
panic

Kris




Re: Another UMA panic under load

2002-12-12 Thread Andrew Gallatin

Ugh. Since it may call kmem_malloc(), UMA must hold Giant.

This is the same problem the mbuf system has, and it's what's
keeping network device drivers under Giant in 5.0.

Both subsystems should probably have GIANT_REQUIRED at all entry
points so as to catch locking problems like this earlier.
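
A sketch of why an assertion at the entry point catches more than one
buried in the page allocator, assuming only some allocations actually
reach kmem_malloc().  Everything here is hypothetical: the toy_* names,
the TOY_LARGE threshold, and the violation counter, which stands in for
the panic a real assertion would raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	TOY_LARGE	4096	/* assumed cutoff for the page-allocator path */

static bool toy_giant_held;	/* models "current thread owns Giant" */
static int toy_violations;	/* assertions that would have panicked */

/* Models GIANT_REQUIRED, counting instead of panicking. */
#define	TOY_GIANT_REQUIRED() do {					\
	if (!toy_giant_held)						\
		toy_violations++;					\
} while (0)

/* Models kmem_malloc(): only reached by some allocations. */
static void
toy_kmem_malloc(size_t size)
{
	(void)size;
	TOY_GIANT_REQUIRED();
}

/* Models the allocator entry point, with the entry check optional. */
static void
toy_malloc(size_t size, bool check_at_entry)
{
	if (check_at_entry)
		TOY_GIANT_REQUIRED();
	if (size > TOY_LARGE)
		toy_kmem_malloc(size);
}

/* Allocates without Giant held and reports how many checks fired. */
int
toy_unlocked_violations(size_t size, bool check_at_entry)
{
	toy_giant_held = false;
	toy_violations = 0;
	toy_malloc(size, check_at_entry);
	return (toy_violations);
}
```

A small unlocked allocation goes unnoticed when only the deep check
exists, and is flagged immediately once the entry point asserts too, so
the caller's locking bug no longer depends on the allocation size.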

Drew


Kris Kennaway writes:
> I think this is the same one I reported a few days ago (another alpha
> under heavy load).
>
> panic: mutex Giant not owned at /local0/src-client/sys/vm/vm_kern.c:312
> db_print_backtrace() at db_print_backtrace+0x18
> panic() at panic+0x104
> _mtx_assert() at _mtx_assert+0xb4
> kmem_malloc() at kmem_malloc+0x50
> page_alloc() at page_alloc+0x3c
> uma_large_malloc() at uma_large_malloc+0x58
> malloc() at malloc+0x10c
> fdalloc() at fdalloc+0x1b0
> do_dup() at do_dup+0x1a4
> dup2() at dup2+0x24
> syscall() at syscall+0x338
> XentSys() at XentSys+0x64
> --- syscall (90) ---
> --- user mode ---
> panic
>
> Kris
