> Basically we have parent that
> 
> 1. blocks all signals
> 2. spawns a thread (SigHandler)
> 3. vforks()
>    - child does execve("/bin/ls",NULL)
>    - parent sleep() loops forever
> 
> The SigHandler thread
> 
> 1. blocks all signals
> 2. does a sigwait() loop
>      if SIGCHLD delivered,
>        _exit()
> 
> On regular PC, this runs as expected - as soon as 'ls' completes,
> everything shuts down.
> 
> However on MicroBlaze / noMMU, the child of the vfork() ('ls') becomes a
> Zombie and never exits, thus SIGCHLD never gets raised, and it all
> hangs.

> Reports of behaviour on another noMMU arch would be greatly appreciated.

Two other people confirmed the same problem.  (I haven't tried it.)

Xin Xie wrote:
> My first suspicion is that using LinuxThreads in uClibc means any
> thread is treated almost the same as a process. So the child's SIGCHLD
> would never be delivered to the SigHandler thread; instead the signal
> goes to the main process, which is doing the sleep(1) loop (by default,
> SIGCHLD is ignored). On MMU Linux, they use NPTL threads, which means a
> thread can choose to listen for a signal.

All signals are blocked in step 1, so if the threads were POSIX
threads, only the thread calling sigwait() should get the SIGCHLD.

With MMU, we have NPTL threads, which are quite POSIX-compliant and
behave as expected.  But with NOMMU, we have only the older
LinuxThreads.  (The same goes for old MMU distros, but they're not seen
much now.)
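
For reference, here is a minimal sketch of the POSIX pattern the test
relies on.  It is not John's actual program: the function names are
made up, it blocks only SIGCHLD rather than all signals, and the child
just calls _exit() instead of exec'ing 'ls'.  On a POSIX-compliant
threads library the helper thread should be the one that sees the
SIGCHLD.

```c
#define _DEFAULT_SOURCE
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Helper thread: wait synchronously for SIGCHLD and report what arrived. */
static void *sigchld_waiter(void *arg)
{
    const sigset_t *set = arg;
    int sig = 0;

    if (sigwait(set, &sig) != 0)
        sig = -1;
    return (void *)(intptr_t)sig;
}

/* Hypothetical demo: returns the signal number seen by the helper
 * thread, or -1 on error. */
int sigchld_seen_by_helper(void)
{
    static sigset_t set;    /* static so the helper can safely read it */
    pthread_t tid;
    void *result = NULL;
    pid_t pid;

    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);

    /* Block before creating the thread so the helper inherits the mask. */
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, sigchld_waiter, &set) != 0)
        return -1;

    pid = vfork();
    if (pid == 0)
        _exit(0);           /* child: exit at once, which raises SIGCHLD */
    if (pid < 0) {
        pthread_cancel(tid);    /* sigwait() is a cancellation point */
        pthread_join(tid, NULL);
        return -1;
    }

    pthread_join(tid, &result); /* returns once the helper got a signal */
    waitpid(pid, NULL, 0);      /* reap the child so no zombie is left */
    return (int)(intptr_t)result;
}
```

If the explanation below is right, the same code under LinuxThreads
would leave the helper stuck in sigwait(), because the signal is aimed
at the thread that called vfork().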

LinuxThreads does not behave like POSIX threads for signals.  Each
thread is really a separate kernel process with its own PID, so a
signal is delivered to one specific thread, not to any thread in the
process.

(There is a hidden monitoring thread, but that blocks SIGCHLD _and_
it's not the target of SIGCHLD anyway.  Internally LinuxThreads uses a
different signal to detect threads dying.)

Because a signal is not delivered to just any thread in the process,
the SIGCHLD will be delivered to the thread which calls vfork().
Unfortunately, that thread is blocked in vfork()...

A pretty good case can be made that vfork() should unblock when the
child becomes a zombie...

To detect that, you will need the specific thread which calls vfork()
to call waitpid/wait/waitid to wait for the child process, or to call
sigwait() for SIGCHLD, or to unblock SIGCHLD and handle it.
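
As a rough sketch of the first option (the thread that calls vfork()
also calls waitpid()), something like the following should do.  The
helper name and its return convention are my own invention, not taken
from the test program, and I have not run it on a NOMMU target.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: start argv[0] with vfork() and reap it from the
 * same thread.  Returns the child's exit status, or -1 on error or if
 * the child did not exit normally. */
int spawn_and_reap(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* After vfork() the child may only exec or _exit. */
        execve(argv[0], argv, environ);
        _exit(127);             /* exec failed */
    }

    /* Parent: wait for this specific child, retrying if interrupted. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because waitpid() does not depend on which thread receives SIGCHLD, this
sidesteps the signal routing question entirely.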

> Is this a noMMU-ism?  If so, any simple workarounds ?  Otherwise, I'll
> have to go digging in arch/microblaze :(

It's not directly to do with MMU.  It's a LinuxThreads-ism, as you're
not using the newer, better NPTL implementation of pthreads.  Someone
said you can't, on no-MMU.  I don't know if that's true, or if the
problem is NPTL not yet ported to some target architectures.

If you don't like the LinuxThreads behaviour, you could try porting
NPTL to your system.  In principle it can work on NOMMU systems.  It
will require some kernel support, for thread-specific data.  Lots of
people would like NPTL ported, and it's better than LinuxThreads in
every way, including standard POSIX behaviour and speed.

Or, you can change your program to work with LinuxThreads' non-standard
signal behaviour.
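
One way to do that, sketched under the assumption that you are free to
move the vfork() into the SigHandler thread: have the same thread block
SIGCHLD, call vfork(), and then sigwait() for the signal.  The function
name is hypothetical and this is untested on LinuxThreads.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: the calling thread blocks SIGCHLD, starts the
 * child and then waits for SIGCHLD itself.  Returns the child's exit
 * status, or -1 on error or abnormal termination. */
int spawn_and_sigwait(char *const argv[])
{
    sigset_t set;
    int sig, status;
    pid_t pid, r;

    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, environ);
        _exit(127);             /* exec failed */
    }

    for (;;) {
        if (sigwait(&set, &sig) != 0)
            return -1;
        /* A stale SIGCHLD may arrive first, so check before leaving. */
        r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            break;
        if (r < 0)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point of the design is only that the thread which is the child's
parent is also the one waiting for the signal, so it should not matter
whether signals are routed per process or per thread.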

Phil Wilshire wrote:
> Hi John,
> 
> This also crashes on the Blackfin BF537 Stamp (2008R1)
> Here is the (very nice) crash dump
> /var/SigTest
> NULL pointer access (probably)

Oh dear!  The program, library or kernel has a nasty bug then.

It shouldn't crash under any circumstances, on any architecture, MMU
or not, if the code follows the outlined steps.

It should just get stuck in "parent sleep() loops forever" in one
thread, and "does a sigwait() loop" in the other.  Manually sending
SIGCHLD to the second thread should make the program terminate cleanly.

Phil, you may want to investigate if this is triggering a bug in your
kernel or libc.

-- Jamie