I've been running the gcc testsuite remotely, which involves establishing an
ssh connection, running the test, returning the results, and terminating the
ssh connection. Terminating the connection results in a SIGCHLD being sent to
the parent sshd process as the child goes away. Things proceed normally as
scores of connections come up and go away. Eventually, however, the situation
arises where sigacthandler() for the process detects that it is in a critical
section and therefore defers the signal processing. Deferral means that
take_deferred_signal() is later called as part of the do_exit_critical()
processing.

What I'm seeing is that take_deferred_signal() issues a sigresend syscall
with sig=18 (SIGCHLD), siginfo=NULL, and mask={0,0,0,0}. The syscall results
in the t_sig_check flag being set, so we go through post_syscall(), which then
checks for pending signals. In post_syscall(), ISSIG_PENDING returns '18', so
we call issig_forreal(). This in turn calls fsig(), which examines the mask
sent as part of the sigresend syscall. Because this mask is zero,
issig_forreal() returns false, which means psig() is not called and no signal
reaches the sshd parent. The end result is that we start accumulating defunct
processes.
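
To make the pending-versus-mask distinction concrete, here is a minimal
user-level sketch in portable POSIX C. It is not the libc or kernel code path
described above, and it says nothing about how sigresend behaves; it only shows
the behaviour I would expect from the outside: a SIGCHLD that arrives while
held stays pending and is delivered once it is unblocked. The names
sigchld_pending_demo() and on_sigchld() are made up for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler; sig_atomic_t keeps the store async-signal-safe. */
static volatile sig_atomic_t sigchld_seen = 0;

static void on_sigchld(int sig)
{
    (void)sig;
    sigchld_seen = 1;
}

/*
 * Hypothetical helper: returns 1 if SIGCHLD stayed pending while blocked
 * and was delivered once unblocked, 0 if not, -1 on setup failure.
 */
int sigchld_pending_demo(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, cur, pending;
    siginfo_t info;
    pid_t pid;
    int was_pending = 0, seen_early = 0, result = -1;

    sigchld_seen = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) != 0)
        return -1;

    /* Hold SIGCHLD; only a loose analogy to deferring it in a critical section. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) != 0) {
        sigaction(SIGCHLD, &old_sa, NULL);
        return -1;
    }
    sigprocmask(SIG_BLOCK, NULL, &cur);      /* query the current mask */
    assert(sigismember(&cur, SIGCHLD) == 1); /* sanity check: it is held */

    pid = fork();
    if (pid == 0)
        _exit(0);                            /* child exits immediately */

    if (pid > 0) {
        /* Wait for the child to exit without reaping it (WNOWAIT). */
        if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == 0) {
            sigpending(&pending);
            was_pending = sigismember(&pending, SIGCHLD);
            seen_early = sigchld_seen;       /* handler must not have run yet */

            /* A pending signal is delivered before the unblock call returns. */
            sigprocmask(SIG_UNBLOCK, &block, NULL);
            result = (was_pending == 1) && !seen_early && sigchld_seen;
        }
        waitpid(pid, NULL, 0);               /* reap, so no defunct child remains */
    }

    /* Restore the caller's mask and SIGCHLD disposition. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return result;
}
```

Called from a single-threaded main(), sigchld_pending_demo() should return 1.
That is the outcome I expected for the deferred SIGCHLD in sshd as well, which
is why the lost signal looks wrong to me.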

My question is: is the mask being used by take_deferred_signal() incorrect, or
should the kernel keep this signal pending? I suspect the former, and I'm
curious what could be producing the {0,0,0,0} sigset_t.

Neale
 
 
This message posted from opensolaris.org
