I've been running the gcc testsuite remotely, which involves establishing an ssh
connection, running the test, returning the results, and terminating the ssh
connection. The termination of the ssh connection results in a SIGCHLD being
sent to the parent sshd process as the child goes away. Things proceed normally
as scores of connections come up and go away. Eventually, however, the
situation arises where sigacthandler for the process detects that it's in a
critical section and so defers the signal processing. The deferred signal is
then handled by take_deferred_signal(), which is called as part of the
do_exit_critical() processing.
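
For anyone not familiar with the deferral idea, here is a rough user-space sketch of it. This is not the libc code: every toy_* name is made up for illustration, raise() stands in for the sigresend syscall, and the sketch remembers only one deferred signal and ignores the races the real code has to handle.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical state, loosely modelled on the description above. */
static volatile sig_atomic_t toy_in_critical;   /* nonzero inside a critical section */
static volatile sig_atomic_t toy_deferred_sig;  /* signal saved for later, 0 if none */
static volatile sig_atomic_t toy_handled_count; /* signals actually processed */

/* Plays the role of sigacthandler: defer if in a critical section. */
static void toy_handler(int sig)
{
    if (toy_in_critical) {
        toy_deferred_sig = sig;  /* remember it, do no work now */
        return;
    }
    toy_handled_count++;         /* normal processing */
}

/* Install toy_handler with sigaction(); returns 0 on success. */
static int toy_install(int sig)
{
    struct sigaction sa;

    sa.sa_handler = toy_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

static void toy_enter_critical(void)
{
    toy_in_critical = 1;
}

/* Plays the role of do_exit_critical() + take_deferred_signal(). */
static void toy_exit_critical(void)
{
    assert(toy_in_critical);     /* catch unbalanced enter/exit */
    toy_in_critical = 0;
    if (toy_deferred_sig != 0) {
        int sig = toy_deferred_sig;

        toy_deferred_sig = 0;
        raise(sig);              /* re-send the signal to ourselves */
    }
}
```

The point of the sketch is that the deferred signal only gets processed if the re-send on leaving the critical section actually results in delivery.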

What I'm seeing is that take_deferred_signal() issues a sigresend syscall
with sig=18 (SIGCHLD), siginfo=NULL, and mask={0,0,0,0}. The syscall results in
the t_sig_check flag being set, so we go through post_syscall(), which then
checks for pending signals. In post_syscall(), ISSIG_PENDING returns '18', so we
then call issig_forreal(). This in turn calls fsig(), which examines the mask
sent as part of the sigresend syscall. As the mask is zero, issig_forreal()
returns false, which means psig() is not called and no signal reaches the sshd
parent. The end result is that we start accumulating defunct processes.
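
To make the symptom concrete, here is a toy model of the check as I've described it. It is a sketch only and not the kernel's fsig(): the toy_* names and the four-word layout are assumptions for illustration, and it assumes the mask selects which pending signals are eligible, which is how the behaviour looks from the outside.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SIGWORDS 4   /* four 32-bit words, matching the {0,0,0,0} shape */

typedef struct {
    uint32_t bits[TOY_SIGWORDS];
} toy_sigset_t;

/* Set the bit for signal number sig (1-based). */
static void toy_addset(toy_sigset_t *set, int sig)
{
    assert(sig >= 1 && sig <= TOY_SIGWORDS * 32);
    set->bits[(sig - 1) / 32] |= (uint32_t)1 << ((sig - 1) % 32);
}

/*
 * Return the lowest-numbered signal present in both pending and mask,
 * or 0 if there is none.
 */
static int toy_first_deliverable(const toy_sigset_t *pending,
                                 const toy_sigset_t *mask)
{
    for (int w = 0; w < TOY_SIGWORDS; w++) {
        uint32_t both = pending->bits[w] & mask->bits[w];

        for (int b = 0; b < 32 && both != 0; b++) {
            if (both & ((uint32_t)1 << b))
                return w * 32 + b + 1;
        }
    }
    return 0;
}
```

In this model, signal 18 pending against an all-zero mask yields 0, so nothing would be delivered, whereas the same pending set against a mask containing 18 yields 18.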

My question is: is the mask being used by take_deferred_signal() incorrect, or
should the kernel keep this signal pending? I suspect it's probably the former
and am curious as to what may be producing the {0,0,0,0} sigset_t.

Neale
This message posted from opensolaris.org