On Thu, Feb 02, 2012 at 12:16:39AM +0200, Коньков Евгений wrote:
> Reproduced again.
> The bug is repeatable:
> 1. radiusd + mod_perl + example.pl (it connects to Firebird) +
> Firebird
> 2. restart Firebird
> 3. try to restart radiusd
> 4. the process falls into the STOP state

> # ps awx | grep radi
>  9438  ??  TLs     5:10.12 /usr/local/sbin/radiusd
> 27603   2  S+      0:00.00 grep radi
> # procstat -k 9438
>   PID    TID COMM             TDNAME           KSTACK
>  9438 100080 radiusd          -                mi_switch sleepq_switch 
> sleepq_wait _sx_xlock_hard _sx_xlock _vm_map_lock_upgrade vm_map_lookup 
> vm_fault_hold vm_fault trap_pfault trap calltrap
>  9438 100195 radiusd          -                mi_switch sleepq_switch 
> sleepq_wait __lockmgr_args ffs_lock VOP_LOCK1_APV _vn_lock 
> vm_object_deallocate unlock_and_deallocate vm_fault_hold vm_fault trap_pfault 
> trap calltrap
>  9438 101144 radiusd          -                mi_switch 
> thread_suspend_switch thread_single exit1 sigexit postsig ast doreti_ast
> # ps wHl9438
>   UID   PID  PPID CPU PRI NI    VSZ    RSS MWCHAN STAT  TT     TIME COMMAND
>   133  9438     1   0  20  0 351124 322000 user m TLs   ??  0:03.65 
> /usr/local/sbin/radiusd
>   133  9438     1   0  20  0 351124 322000 ufs    TLs   ??  0:00.00 
> /usr/local/sbin/radiusd
>   133  9438     1   0  20  0 351124 322000 -      TLs   ??  0:05.28 
> /usr/local/sbin/radiusd

> If I can supply any other useful debug info, please answer as soon as you
> can; I cannot wait too long. Thank you.

OK, this looks like it may be useful for someone who knows more about
the VM system than I do. It is very likely a FreeBSD kernel bug though,
so building freeradius and/or firebird with debug information is
unlikely to be useful (apart from perturbing a race condition, if the
problem is related to a race condition).

My analysis: thread 101144 is attempting to shut down the process in
response to a signal, but needs to wait for 100080 and 100195 to finish
page fault processing. For thread 100195, page fault processing resulted
in deallocating a VM object based on some sort of file, and it is
blocked waiting on the vnode lock for the file. It may or may not hold a
lock on a user map. Thread 100080 needs to lock a user map to continue
processing (this means the fault is either a copy-on-write fault or the
first write to anonymous memory). It seems that 100080 is not holding
the vnode lock that 100195 needs.
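
As a rough illustration of how these stacks can be read mechanically, here
is a minimal Python sketch (not part of any FreeBSD tool). It assumes
`procstat -k` output with one thread per line, as the quoted stacks would
look before mail wrapping. The function name `classify` and the `MARKERS`
table are made up for this example, and the frame-to-meaning mapping only
covers the three stacks in this report.

```python
# Sample input: the three kernel stacks quoted in this message,
# rejoined onto one line per thread (the mail wrapped them).
SAMPLE = """\
  PID    TID COMM             TDNAME           KSTACK
 9438 100080 radiusd - mi_switch sleepq_switch sleepq_wait _sx_xlock_hard _sx_xlock _vm_map_lock_upgrade vm_map_lookup vm_fault_hold vm_fault trap_pfault trap calltrap
 9438 100195 radiusd - mi_switch sleepq_switch sleepq_wait __lockmgr_args ffs_lock VOP_LOCK1_APV _vn_lock vm_object_deallocate unlock_and_deallocate vm_fault_hold vm_fault trap_pfault trap calltrap
 9438 101144 radiusd - mi_switch thread_suspend_switch thread_single exit1 sigexit postsig ast doreti_ast
"""

# Hypothetical table: a frame name and what its presence suggests.
# Only the frames seen in this report are listed.
MARKERS = [
    ("_vm_map_lock_upgrade", "waiting for the VM map lock"),
    ("_vn_lock", "waiting for a vnode lock"),
    ("thread_single", "waiting for the other threads to suspend"),
]


def classify(text):
    """Map each thread ID to a short description of what it waits on."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a thread row.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        tid = int(fields[1])
        frames = fields[4:]  # everything after PID, TID, COMM, TDNAME
        label = "unknown"
        for frame, description in MARKERS:
            if frame in frames:
                label = description
                break
        result[tid] = label
    return result


if __name__ == "__main__":
    for tid, label in classify(SAMPLE).items():
        print(tid, label)
```

This only labels what each thread is blocked on; it cannot say who owns
the lock, which is what the DDB output is needed for.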

If you have DDB (kernel debugger) and witness compiled in, the DDB
command
  show locks
will show who owns these locks. This is probably

The output of
  procstat -kka
may be useful (like the previous procstat command but for all threads in
the system and with offsets from each function).

The output of
  procstat -v 9438
is the memory mappings of the process. It could be that this command
gets stuck because of the locks.

-- 
Jilles Tjoelker

Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
