The following issue has been set as RELATED TO issue 0000361. 
====================================================================== 
http://www.dbmail.org/mantis/view.php?id=370 
====================================================================== 
Reported By:                ryo
Assigned To:                
====================================================================== 
Project:                    DBMail
Issue ID:                   370
Category:                   General
Reproducibility:            sometimes
Severity:                   minor
Priority:                   normal
Status:                     feedback
target:                     2.1.7 
====================================================================== 
Date Submitted:             21-Jun-06 13:57 CEST
Last Modified:              22-Jun-06 07:02 CEST
====================================================================== 
Summary:                    waitpid() in ParentSigHandler() function should be
called multiple times for zombies.
Description: 
This issue is related to http://www.dbmail.org/mantis/view.php?id=361
(already resolved).

I updated to svn2184, but I still got some zombies.

I think that when the parent process receives multiple SIGCHLD signals
at the same time, the signal handler is called only once on UNIX systems.
So waitpid() in the ParentSigHandler() function should be called multiple
times to reap all zombies.
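
A minimal sketch of the idea above, assuming a POSIX system: because
several pending SIGCHLD signals can be merged into one delivery, the
handler keeps calling waitpid() with WNOHANG until no exited child is
left. The names sigchld_handler, install_sigchld_handler and reaped are
hypothetical and are not taken from dbmail or from the attached patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of children reaped so far (hypothetical counter for illustration). */
static volatile sig_atomic_t reaped = 0;

/* Hypothetical stand-in for a parent SIGCHLD handler. */
static void sigchld_handler(int signo)
{
    int saved_errno = errno;    /* waitpid() may overwrite errno */
    int status;

    assert(signo == SIGCHLD);
    /* One delivery may stand for several exited children, so drain them all.
     * WNOHANG makes waitpid() return 0 or -1 instead of blocking. */
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;

    errno = saved_errno;
}

/* Install the handler; returns 0 on success, -1 on error. */
static int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

With a single waitpid() call per handler invocation, children whose
signals were merged would stay zombies until another child exits.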

In addition, it seems that the waitpid() function is not called
(after kill() is called) in the manage_stop_children function.
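
As a rough illustration of this second point, assuming a POSIX system,
a parent that signals a child can collect its status right afterwards
so that the child does not linger as a zombie. The helper stop_and_reap
is hypothetical and is not the code in dbmail or in the attached patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: send SIGTERM to one child, then collect its status.
 * Returns 0 and fills *status on success, -1 on error. */
static int stop_and_reap(pid_t pid, int *status)
{
    assert(pid > 0);

    if (kill(pid, SIGTERM) == -1)
        return -1;

    /* Blocking wait for this one child; retry if a signal interrupts it. */
    while (waitpid(pid, status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}
```

The blocking wait here assumes the child really terminates on SIGTERM;
a child that ignores the signal would make this call hang, so a real
implementation might prefer WNOHANG with a retry limit.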

I have created a patch. Please see the attached file (dbmail-nozombie.patch).

======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          0000363 Somtimes the count of grandchild proces...
related to          0000361 IMAP zombies after about a day.
====================================================================== 

---------------------------------------------------------------------- 
 paul - 21-Jun-06 16:30  
---------------------------------------------------------------------- 
patch accepted 

---------------------------------------------------------------------- 
 aaron - 21-Jun-06 18:09  
---------------------------------------------------------------------- 
Wait, is this guaranteed not to deadlock? I think I'd rather have a for
loop with a finite number of attempts, followed by some ominous trace
message that there's a zombie on the loose. 

---------------------------------------------------------------------- 
 ryo - 22-Jun-06 04:19  
---------------------------------------------------------------------- 
According to the man page for waitpid, the return value of the waitpid
function is as follows.

 RETURN VALUE
      The process ID of the child which exited, or zero if WNOHANG
      was used and no child was available, or -1 on error
      (in which case errno is set to an appropriate value).

If no zombie process exists, the waitpid function returns
zero or -1. So the while loop in my patch should not lock up
the program.

If you are worried about the while loop locking up, you could limit
the number of iterations (e.g., to HARD_MAX_CHILDREN),
for example as follows.

 #include <pool.h> 
      :
    int cnt = 0;    
      :
    /* test the counter first so that a reaped child is never skipped */
    while ((cnt++ < HARD_MAX_CHILDREN) &&
           (chpid = waitpid(-1, &sig, WNOHANG)) > 0)
        scoreboard_release(chpid); 

---------------------------------------------------------------------- 
 aaron - 22-Jun-06 06:12  
---------------------------------------------------------------------- 
Oh wait, sorry, I misread the patch. The situation that might lead to an
infinite loop is if there is some child that is not a zombie and yet does
not terminate. Is there a possibility of this occurring? 

---------------------------------------------------------------------- 
 ryo - 22-Jun-06 07:02  
---------------------------------------------------------------------- 
I think that as long as the waitpid() system call has no bugs, an
infinite loop cannot occur.

But I think the program had better limit the number of loop iterations
as a precaution (like the example in note 0001263). 
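
The case raised in note 0001264 (a child that is alive and not a
zombie) can be checked with a small sketch, assuming POSIX waitpid()
semantics: with WNOHANG the call returns 0 for a running child, so the
loop ends at once, and the attempt limit acts only as a safety net. The
function reap_bounded and its max_attempts parameter are hypothetical
names, not dbmail code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap exited children without blocking, making at most max_attempts
 * waitpid() calls. Returns the number of children reaped. */
static int reap_bounded(int max_attempts)
{
    int status;
    int count = 0;

    assert(max_attempts > 0);
    for (int i = 0; i < max_attempts; i++) {
        pid_t pid = waitpid(-1, &status, WNOHANG);

        /* 0: children exist but none has exited yet.
         * -1: no children at all, or another error. */
        if (pid <= 0)
            break;
        count++;    /* a real server would release the child's slot here */
    }
    return count;
}
```

Checking the counter before calling waitpid() means every child that is
reaped is also accounted for, even when the limit is reached.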

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
21-Jun-06 13:57 ryo            New Issue                                    
21-Jun-06 13:57 ryo            File Added: dbmail-nozombie.patch
21-Jun-06 16:30 paul           target                    => 2.1.7           
21-Jun-06 16:30 paul           Note Added: 0001256                          
21-Jun-06 16:30 paul           Status                   new => resolved     
21-Jun-06 16:30 paul           Resolution               open => fixed       
21-Jun-06 16:30 paul           Fixed in Version          => SVN Trunk       
21-Jun-06 18:09 aaron          Note Added: 0001261                          
22-Jun-06 04:19 ryo            Status                   resolved => feedback
22-Jun-06 04:19 ryo            Resolution               fixed => reopened   
22-Jun-06 04:19 ryo            Note Added: 0001263                          
22-Jun-06 06:12 aaron          Note Added: 0001264                          
22-Jun-06 07:02 ryo            Note Added: 0001265                          
03-Jul-06 12:54 paul           Relationship added       related to 0000363  
03-Jul-06 12:56 paul           Relationship added       related to 0000361  
======================================================================
