Have you tried the latest POE release? 1.284 is seven releases behind current, and there have been at least two signal-related fixes since.

If the problem persists with the latest release, I would appreciate a small, self-contained test case. This sounds like a nasty bug, and I'd like to fix it if I can. First, however, I need to see it breaking so I can track down the problem.

The filehandle to JSON.pm is peculiar. POE doesn't mention JSON at all, so I wonder where the filehandle came from.

1) macbookpoe:~% cd projects/poe/poe
1) macbookpoe:~/projects/poe/poe% ack JSON
1) macbookpoe:~/projects/poe/poe% ack -ai JSON
1) macbookpoe:~/projects/poe/poe%

--
Rocco Caputo - rcap...@pobox.com


On Sep 1, 2010, at 13:28, Ellery, Michael wrote:

Hello,

We use POE to manage a group of child processes doing processing in our system, using the POE::Wheel::Run plugin.

We recently upgraded our version of POE from 0.9989 to 1.284 and noticed a dramatic change in the performance of the system. With the older version of POE, the child processes consumed most of the CPU, which is expected since they do all the processing. With the newer version of POE, however, we notice that the parent process (the POE session manager) consumes all the CPU and starves the children. The parent process is really just there to manage the children, so it should consume very few resources.

I did some investigation with strace and noticed that the parent process was calling select/read in a very tight loop, and the filehandle it was trying to read from was just a Perl library file on our system (I think it was JSON.pm, but that doesn't matter). What's more, the list of FDs passed to select included 10 sockets (presumably to the child processes) and this one filehandle to the JSON.pm module that the Perl process has open. Since a disk file is always ready to read, select always returns immediately and reports this file as readable, hence the tight loop.
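
The select() behaviour described above is a property of the operating system rather than of POE or Perl, so it can be reproduced in isolation. Below is a minimal sketch in Python, assuming a Unix-like system. Nothing in it comes from the original setup: the helper name ready_to_read is made up for illustration, the empty pipe stands in for an idle child socket, and the temporary file stands in for the stray library filehandle.

```python
import os
import select
import tempfile


def ready_to_read(fds, timeout=0.0):
    """Return the subset of fds that select() reports as readable."""
    readable, _, _ = select.select(fds, [], [], timeout)
    return readable


# Hypothetical stand-in for an idle child socket: a pipe with no data in it.
pipe_r, pipe_w = os.pipe()

# Hypothetical stand-in for the stray library filehandle: a regular disk file.
tmp = tempfile.TemporaryFile()
tmp.write(b"package Example;\n1;\n")
tmp.flush()
tmp.seek(0)
file_fd = tmp.fileno()

ready = ready_to_read([pipe_r, file_fd])
idle_pipe_ready = pipe_r in ready    # nothing to read, so not reported
disk_file_ready = file_fd in ready   # regular files are always reported

# Consume the file's contents, then poll again at end of file.
os.read(file_fd, 4096)
still_ready_at_eof = file_fd in ready_to_read([file_fd])

print("idle pipe ready:   ", idle_pipe_ready)
print("disk file ready:   ", disk_file_ready)
print("ready again at EOF:", still_ready_at_eof)

os.close(pipe_r)
os.close(pipe_w)
tmp.close()
```

If the disk file is reported as readable on every call, including at end of file, an event loop that has such a descriptor in its read set never blocks in select(), which would match the tight loop seen in strace.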

I did a little investigating in the POE code and noticed that there is a lot of new code to handle signals differently. As an experiment, I tried setting USE_SIGNAL_PIPE to 0 in my script before loading POE and the behavior goes back to normal (that is, the parent process no longer consumes all the CPU).

So, has anyone else observed this behavior when using Wheel::Run to manage child processes? Is my workaround a reasonable one, or should I fix this problem some other way? Based on my observations, the fundamental problem seems to be that POE is calling select() with one incorrect file descriptor in its list, which causes select to return immediately and POE to try to read from a disk file instead of from a socket to a child process.

Thanks,
Mike Ellery

