On 07/29/2012 12:46 AM, Chet Ramey wrote:
> On 7/27/12 9:50 AM, Michael Haubenwallner wrote:
> 
>> With attached patch I haven't been able to break the testcase below so far
>> on that AIX 6.1 box here.
>>
>> But still, the other one using the $()-childs still fails.
> 
> Try the attached patch for that.

After collecting the patches and cleaning up the now-unused code, the attached
patch seems to fix both CHILD_MAX-related problems on that AIX box here,
without using the RECYCLES_PIDS workaround.

Thank you!

/haubi/
Bash assumes that pids aren't reused until sysconf(_SC_CHILD_MAX) immediate
children (the dynamic value) have been created, and that pid values ascend
and then wrap around.

However, as specified by POSIX, conforming kernels only guarantee that pids
are not reused for CHILD_MAX immediate children (the static value).
Additionally, AIX (at least) does not guarantee ascending pid values at all.
In fact, AIX reuses pids after its CHILD_MAX value of 128 in somewhat random
order under some configurations or loads, resulting in race conditions like these:
http://lists.gnu.org/archive/html/bug-bash/2008-07/msg00117.html
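
To make the static/dynamic distinction above concrete, here is a minimal
sketch that reads both values. The helper names static_child_max() and
dynamic_child_max() are hypothetical and not part of bash. The fallback of 25
is an assumption taken from the POSIX.1-2001 minimum for _POSIX_CHILD_MAX.

```c
#include <limits.h>
#include <unistd.h>

/* Hypothetical helper: the compile-time (static) CHILD_MAX, if the
   platform defines one, else the POSIX minimum as a fallback. */
long static_child_max(void)
{
#if defined(CHILD_MAX)
    return CHILD_MAX;
#elif defined(_POSIX_CHILD_MAX)
    return _POSIX_CHILD_MAX;
#else
    return 25;  /* assumption: POSIX.1-2001 minimum value */
#endif
}

/* Hypothetical helper: the run-time (dynamic) value. sysconf() returns -1
   when the limit is indeterminate, so callers must check for that. */
long dynamic_child_max(void)
{
    return sysconf(_SC_CHILD_MAX);
}
```

On a system where the dynamic value is larger than the static one, relying on
the dynamic value overestimates how long a pid stays unused.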

This looks similar to a problem with Cygwin, where RECYCLES_PIDS is defined
as the workaround, but that isn't really correct for AIX (and maybe Interix):
http://www.cygwin.com/ml/cygwin/2004-09/msg00882.html
http://www.cygwin.com/ml/cygwin/2002-08/msg00449.html
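
The following standalone sketch shows why the first_pid/pid_wrap heuristic
that the patch removes from jobs.c cannot cope with non-ascending pids. The
wrap_observe() logic mirrors the removed lines. The struct, the function
names, and the pid sequences are hypothetical and only serve as illustration.

```c
#include <sys/types.h>

#define NO_PID ((pid_t)-1)

/* Hypothetical container for the two flags the old code kept as globals. */
struct wrap_state {
    pid_t first_pid;
    int pid_wrap;
};

void wrap_init(struct wrap_state *s)
{
    s->first_pid = NO_PID;
    s->pid_wrap = -1;
}

/* Same decision chain as the removed code; returns nonzero once the
   heuristic believes the pid space has wrapped around. */
int wrap_observe(struct wrap_state *s, pid_t pid)
{
    if (s->first_pid == NO_PID)
        s->first_pid = pid;
    else if (s->pid_wrap == -1 && pid < s->first_pid)
        s->pid_wrap = 0;
    else if (s->pid_wrap == 0 && pid >= s->first_pid)
        s->pid_wrap = 1;
    return s->pid_wrap > 0;
}

/* Returns the index of the first pid that repeats an earlier one while the
   heuristic still reports "no wrap", or -1 if no reuse is missed. */
int first_missed_reuse(const pid_t *pids, int n)
{
    struct wrap_state s;
    int i, j;

    wrap_init(&s);
    for (i = 0; i < n; i++) {
        int wrapped = wrap_observe(&s, pids[i]);
        for (j = 0; j < i; j++)
            if (pids[j] == pids[i] && !wrapped)
                return i;
    }
    return -1;
}
```

For the sequence 500, 700, 900, 700 the reuse of pid 700 goes unnoticed,
because no pid ever drops below first_pid. For the ascending-and-wrapping
sequence 500, 600, 100, 500 the heuristic does flag the wrap in time.
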
*** jobs.c.orig	2012-08-20 16:23:51 +0200
--- jobs.c	2012-08-20 16:51:36 +0200
***************
*** 317,324 ****
  static char retcode_name_buffer[64];
  
- /* flags to detect pid wraparound */
- static pid_t first_pid = NO_PID;
- static int pid_wrap = -1;
- 
  #if !defined (_POSIX_VERSION)
  
--- 317,320 ----
***************
*** 347,352 ****
  {
    js = zerojs;
-   first_pid = NO_PID;
-   pid_wrap = -1;
  }
  
--- 343,346 ----
***************
*** 1823,1833 ****
  	 as the proper pgrp if this is the first child. */
  
-       if (first_pid == NO_PID)
- 	first_pid = pid;
-       else if (pid_wrap == -1 && pid < first_pid)
- 	pid_wrap = 0;
-       else if (pid_wrap == 0 && pid >= first_pid)
- 	pid_wrap = 1;
- 
        if (job_control)
  	{
--- 1817,1820 ----
***************
*** 1863,1875 ****
  #endif
  
!       if (pid_wrap > 0)
! 	delete_old_job (pid);
  
! #if !defined (RECYCLES_PIDS)
!       /* Only check for saved status if we've saved more than CHILD_MAX
! 	 statuses, unless the system recycles pids. */
!       if ((js.c_reaped + bgpids.npid) >= js.c_childmax)
! #endif
! 	bgp_delete (pid);		/* new process, discard any saved status */
  
        last_made_pid = pid;
--- 1850,1856 ----
  #endif
  
!       delete_old_job (pid);
  
!       bgp_delete (pid);		/* new process, discard any saved status */
  
        last_made_pid = pid;
*** execute_cmd.c.orig	2012-08-20 16:36:10 +0200
--- execute_cmd.c	2012-08-20 16:51:14 +0200
***************
*** 742,748 ****
  
  	/* XXX - this is something to watch out for if there are problems
! 	   when the shell is compiled without job control. */
! 	if (already_making_children && pipe_out == NO_PIPE &&
! 	    last_made_pid != last_pid)
  	  {
  	    stop_pipeline (asynchronous, (COMMAND *)NULL);
--- 742,750 ----
  
  	/* XXX - this is something to watch out for if there are problems
! 	   when the shell is compiled without job control.  Don't worry about
! 	   whether or not last_made_pid == last_pid; already_making_children
! 	   tells us whether or not there are unwaited-for children to wait
! 	   for and reap. */
! 	if (already_making_children && pipe_out == NO_PIPE)
  	  {
  	    stop_pipeline (asynchronous, (COMMAND *)NULL);
