Zombie Trouble Shooting

John Ackley Thu, 06 Apr 2006 04:55:18 -0700

I have a Perl script that becomes a zombie.

It runs fine for days or weeks, checking for new data every 60 seconds.
But after running for a long time (on Red Hat 9, Fedora Core 4, and now
Fedora Core 5) it stays in memory as a live process and keeps its
TCP/IP connections open, but stops doing any work.


Here you see two zombies that stopped functioning days after start
and were still in the process table several weeks later:
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
kp2a      2866  0.0  0.9  14032  9556 ?        S    Mar27   0:13 /usr/bin/perl -w ./packet.pl aa2mf
kp2a      2868  0.0  0.9  14024  9456 ?        S    Mar27   0:05 /usr/bin/perl -w ./packet.pl k5utd


I have tried for several years to find the problem, as you can see
from the sequence of OS upgrades!  No luck!  The script writes a log,
so I know the last good checkpoint, which is at the bottom of
an infinite loop:

Mar 29 08:56:45.327 ./packet.pl line 259:
   end while remote
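
A minimal sketch of how a checkpoint line in that format could be produced. The names format_checkpoint and checkpoint are hypothetical and not taken from packet.pl; the sketch assumes the log goes to STDERR, which is unbuffered, so the last line written before a hang is not lost in a buffer.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::HiRes qw(gettimeofday);

# Build a log entry such as
#   "Mar 29 08:56:45.327 ./packet.pl line 259:\n   end while remote\n"
# from an explicit time, file, and line (kept separate so it is easy to test).
sub format_checkpoint {
    my ($message, $file, $line, $seconds, $micros) = @_;
    my $stamp = strftime('%b %d %H:%M:%S', localtime($seconds));
    return sprintf("%s.%03d %s line %d:\n   %s\n",
        $stamp, int($micros / 1000), $file, $line, $message);
}

# Log a checkpoint for the caller's file and line, with millisecond resolution.
sub checkpoint {
    my ($message) = @_;
    my (undef, $file, $line) = caller;
    my ($seconds, $micros) = gettimeofday();
    print {*STDERR} format_checkpoint($message, $file, $line, $seconds, $micros);
    return;
}
```

Calling checkpoint('begin while remote') at the top of the loop as well as checkpoint('end while remote') at the bottom would narrow down which statement never returns.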

I have also included logic to reconnect and restart the remote
network processes should the remote end stop responding.
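
Reconnect logic of this kind only fires if a silent remote end is actually noticed, so here is a hedged sketch of a read that gives up after a timeout instead of blocking forever. It uses IO::Select and sysread from core Perl; read_with_timeout is a hypothetical helper, not the code in packet.pl.

```perl
use strict;
use warnings;
use IO::Select;

# Wait up to $timeout seconds for data on $handle.
# Returns the bytes read, '' on end-of-file, or undef on timeout or read error.
sub read_with_timeout {
    my ($handle, $timeout) = @_;
    my $selector = IO::Select->new($handle);
    return undef unless $selector->can_read($timeout);    # nothing arrived in time
    my $count = sysread($handle, my $buffer, 4096);
    return undef unless defined $count;                   # read error
    return $buffer;                                       # '' means the peer closed
}
```

A caller could treat undef as "remote end not responding" and '' as "connection closed", and trigger the reconnect in both cases.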

Can anyone recommend any debugging tools or techniques?

I do not think that perldebug would be appropriate.
tcpdump might help if I arrange for a rotating log
to keep only the last few minutes of traffic.
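
A rough sketch of the rotating capture, assuming tcpdump's -C (rotate after about N million bytes) and -W (number of files to cycle through) options as described in its man page. The rotation is by size rather than by time, so "the last few minutes" is only approximate. The host name and capture path are placeholders, and tcpdump_ring_command is a hypothetical helper.

```perl
use strict;
use warnings;

# Build the argument list for a size-limited tcpdump ring buffer.
sub tcpdump_ring_command {
    my ($remote_host, $capture_file) = @_;
    return (
        'tcpdump',
        '-i', 'any',            # all interfaces (Linux)
        '-n',                   # do not resolve addresses to names
        '-w', $capture_file,    # write raw packets to this file
        '-C', 1,                # start a new file after roughly 1 million bytes
        '-W', 5,                # cycle through 5 files, overwriting the oldest
        'tcp', 'and', 'host', $remote_host,
    );
}

# Placeholder host and path; the capture itself normally needs root.
my @command = tcpdump_ring_command('remote.example.org', '/var/tmp/packet-ring.pcap');
print join(' ', @command), "\n";
```

The printed command can be started by hand or from an init script; once a script stops logging, the capture files hold the most recent traffic on its connections.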

As a workaround I could, I guess, restart them daily with
kill -HUP, or start them fresh every minute from cron,
but I would like to solve the mystery!

Thanks.


