[Freevo-devel] "PANIC can't kill program" even though program is dead

Thorsten Pferdekämper Mon, 31 Dec 2007 03:29:53 -0800

Hi,

I was wondering why channel switching takes so long on my box and why it is 
always accompanied by a "PANIC can't kill program" message in main-0.log.
Here is what I believe is the cause:


Basically, the problem is that os.waitpid and subprocess.Popen.poll do not 
really work together. It seems that only one of them can be used, but not 
both.

Here are the details:

I have a special plugin for watching TV from DVB-S. This plugin uses a program 
(dvbcat) which cannot be stopped from outside other than by sending it a 
signal.
The plugin controls that program through childapp.ChildApp. When switching 
the channel (or when leaving TV), it calls childapp.ChildApp.kill to 
terminate dvbcat. This means that signal 15 (SIGTERM) is sent to the program 
here:

        if signal:
            _debug_('killing pid %s signal %s' % (self.child.pid, signal), 1)
            try:
                os.kill(self.child.pid, signal)
            except OSError:
                pass

After that, ChildApp.kill calls wait(), which in turn calls waitpid(). 
waitpid() eventually returns True (I believe usually on the second call). 
However, a bit further down in ChildApp.kill(), there is...
 
        for i in range(5):
            if self.child.poll() != None:
                break
            time.sleep(0.1)

At this point, the program is already dead (killed by os.kill and reaped by 
os.waitpid). self.child.poll(), however, always returns None. In the code 
that follows, the system tries very hard to kill the already dead process.

It seems that the subprocess module does not notice that the process has 
died if os.waitpid was called before. I have written a small test program 
which uses both waitpid and poll in different scenarios. It seems that only 
the combinations
        os.kill
        os.waitpid
or
        os.kill
        subprocess.Popen.poll
reliably kill the process and return the correct return value. Maybe before 
the subprocess module existed, os.kill needed the waitpid. Now just using 
poll also seems to be fine, but only if waitpid has not been called before. 
I have the impression that poll() always returns None once waitpid has been 
called, regardless of whether the process has ended or not.
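
A minimal sketch of the likely mechanism, not the original test program. It 
assumes Python 3 on a POSIX system; reap_twice and the sleeping child (a 
stand-in for dvbcat) are made up for illustration. The point is that the 
exit status of a child can be collected only once: after os.waitpid has 
consumed it, a second wait on the same pid fails.

```python
import os
import signal
import subprocess
import sys


def reap_twice(sig=signal.SIGTERM):
    """Kill a child, reap it with os.waitpid, then try to reap it again.

    Returns (reaped_pid_matches, terminating_signal, second_wait_failed).
    """
    # Hypothetical stand-in for dvbcat: a child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    os.kill(child.pid, sig)

    # First wait: blocks until the child is dead and consumes its status.
    pid, status = os.waitpid(child.pid, 0)
    term_sig = os.WTERMSIG(status) if os.WIFSIGNALED(status) else None

    # Second wait: the status is gone, so the OS reports "no such child".
    try:
        os.waitpid(child.pid, os.WNOHANG)
        second_wait_failed = False
    except OSError:
        second_wait_failed = True

    return pid == child.pid, term_sig, second_wait_failed
```

On POSIX, Popen.poll() is built on the same non-blocking waitpid call, so 
once the status has been consumed outside the Popen object there is nothing 
left for poll() to read. How poll() reacts to that failure differs between 
Python versions.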

After finding that out, I wondered why the normal player processes do not have 
this problem. Here is what I believe is the reason:
These processes use ChildApp2 and for stopping, the method ChildApp2.stop is 
called. This sends some command to the player application and then checks 
with isAlive() if the player finished. isAlive in turn just uses poll(). 
Usually, the player application really quits soon, so isAlive() eventually 
returns False and the loop (see below) quits.
 
        if cmd and self.isAlive():
            self.write(cmd)
            # wait for the app to terminate itself
            for i in range(60):
                if not self.isAlive():
                    break
                time.sleep(0.1)

After that, kill() is only called if self.isAlive() is still true. Usually, 
this is not the case.
In other words, poll() is called without a preceding waitpid(), and usually 
waitpid() is not called at all. This is a scenario which really works.
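
One way to stay within that working scenario when a signal is needed is to 
let the Popen object do all the reaping. The following is only a sketch 
under assumptions, not the Freevo code: it assumes Python 3 (where 
Popen.wait accepts a timeout), and kill_child and its default timeout are 
hypothetical. The default of 0.5 seconds mirrors the five 0.1 second polls 
in ChildApp.kill.

```python
import signal
import subprocess
import sys


def kill_child(child, sig=signal.SIGTERM, timeout=0.5):
    """Signal a subprocess.Popen child and reap it via the Popen object only.

    No os.waitpid call is made here, so the Popen object always sees the
    exit status itself. Returns the child's return code.
    """
    if child.poll() is None:      # still running
        child.send_signal(sig)
    try:
        return child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        child.kill()              # escalate to SIGKILL
        return child.wait()
```

For a child terminated by a signal, the return code is the negative signal 
number, so a child stopped by signal 15 yields -15.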

Regards,
        Thorsten



