Hello POEers.

This is a heads-up about problems with POE and signals.  If you use a signal
handler in POE, you are probably affected.

As of 1.006, POE's event queue manipulation code isn't signal safe.  If a signal
occurs while POE is removing an event from the queue (example:
$poe_kernel->remove_alarm( $aid )), there is a chance that POE will remove the
wrong element.  POE::Queue::XS::Array probably doesn't suffer from this
problem.  Long-running daemons that fork a lot of sub-processes (example: pre-
or post-forking daemons) are highly susceptible to this problem.

What's more, there is a race condition between verifying the queue and calling
select() or poll().

You have probably been affected by this bug without knowing it.  If you have
seen mysterious hanging processes, POE events going missing or other
heisenbugs, this bug is a prime suspect.

Rather than try to patch around the problem, we are going to adopt a more
robust solution:

Semantically, the signal handlers and the main queue are running in different
threads.  The only robust way to implement threads is share-nothing.  This
means that the signal handlers have to talk to the main queue via some sort of
conduit like a socket or pipe.
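
To make the idea concrete, here is a minimal sketch of the pipe approach in
plain Perl.  It is not the code from the patch: it uses the built-in pipe()
rather than POE::Pipe::OneWay, and the names %code_for, @seen and
dispatch_pending_signals() are made up for illustration.  It also assumes a
POSIX-like system where SIGUSR1 exists.

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# The conduit: the handler writes to one end, the main loop reads the other.
pipe( my $sig_read, my $sig_write ) or die "pipe failed: $!";
$sig_read->blocking(0);
$sig_write->blocking(0);

# One byte per signal we care about (an arbitrary, illustrative encoding).
my %code_for = ( USR1 => 'U', CHLD => 'C' );
my %name_for = reverse %code_for;

for my $name ( keys %code_for ) {
    my $byte = $code_for{$name};
    # The handler never touches the queue; it only writes a byte to the pipe.
    $SIG{$name} = sub { syswrite( $sig_write, $byte ) };
}

# Stand-in for the event queue.  Only the main loop ever modifies it.
my @seen;

sub dispatch_pending_signals {
    my ($timeout) = @_;
    my $select = IO::Select->new($sig_read);
    return unless $select->can_read($timeout);

    # Drain whatever the handlers have written so far.
    my $n = sysread( $sig_read, my $buf, 256 );
    return unless $n;

    push @seen, map { $name_for{$_} } split //, $buf;
    return;
}

# Usage: signal ourselves, then let the "main loop" notice it via the pipe.
kill 'USR1', $$;
dispatch_pending_signals(1);
print "saw: @seen\n";
```

Because the read end of the pipe is just another filehandle, the main loop
can watch it with the same select() or poll() call it already uses for
everything else, so a signal wakes the loop up instead of racing with it.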

The RT ticket is here:
   https://rt.cpan.org/Ticket/Display.html?id=47966

The second patch implements a signal pipe using POE::Pipe::OneWay, which is
highly portable. The previous unsafe behaviour is available if the
POE_USE_SIGNAL_PIPE=0 env var is set.

The patch has already been tested on Linux (CentOS 4) and Mac OS X against all
the POE::Loops that we could install.  If possible, it would be very much
appreciated if you could test it on your systems.  Please report your results
here on the list or via RT.

A developer release is planned, to see how much smoke the CPAN testers
release.  A general CPAN release will follow.

Thank you,

-Leolo


