Re: Patch(?): bash-2.05/jobs.c loses interrupts

2001-05-01 Thread Pavel Machek

Hi!

> > Linux-2.4.4 has a change, for which I must accept blame,
> > where fork() runs the child first, reducing unnecessary copy-on-write
> > page duplications, because the child will usually promptly do an
> > exec().  I understand this is pretty standard in most unixes.
> > 
> > Peter Osterlund noticed an annoying side effect of this,
> > which I think is a bash bug.  He wrote:
> > 
> > > Another thing is that the bash loop "while true ; do /bin/true ; done" is
> > > not possible to interrupt with ctrl-c.
> > 
> > I have reproduced this problem on a single CPU system.
> > I also modified my kernel to sometimes run the fork child first
> > and sometimes not.  In that case, that loop would sometimes
> > abort on a control-C and sometimes ignore it, but ignoring it
> > would not make the loop less likely to abort on another control-C.
> > I'm pretty sure the control-C was being delivered only to the child
> > due to a race condition in bash, which may be mandated by posix.
> 
> Did you reconfigure and rebuild bash on your machine running the 2.4
> kernel, or just use a bash binary built on a previous kernel
> version?

This is a nasty race condition. I do not believe you can test for it in
configure.

This might happen on 2.4.3 (occasionally) too. The kernel is permitted to
do any kind of scheduling!

Pavel

> Bash has an autoconf test that will, if it detects the need to do so,
> force the job control code to synchronize between parent and child
> when setting up the process group for a new pipeline.  It may be the
> case that you have to reconfigure and rebuild bash to enable that code.
> 
> Look for PGRP_PIPE in config.h.


-- 
I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Patch(?): bash-2.05/jobs.c loses interrupts

2001-04-30 Thread Adam J. Richter

>>  Linux-2.4.4 has a change, for which I must accept blame,
>> where fork() runs the child first, reducing unnecessary copy-on-write
>> page duplications, because the child will usually promptly do an
>> exec().  I understand this is pretty standard in most unixes.
>> 
>>  Peter Osterlund noticed an annoying side effect of this,
>> which I think is a bash bug.  He wrote:
>> 
>> > Another thing is that the bash loop "while true ; do /bin/true ; done" is
>> > not possible to interrupt with ctrl-c.
>> 
>>  I have reproduced this problem on a single CPU system.
>> I also modified my kernel to sometimes run the fork child first
>> and sometimes not.  In that case, that loop would sometimes
>> abort on a control-C and sometimes ignore it, but ignoring it
>> would not make the loop less likely to abort on another control-C.
>> I'm pretty sure the control-C was being delivered only to the child
>> due to a race condition in bash, which may be mandated by posix.

>Did you reconfigure and rebuild bash on your machine running the 2.4
>kernel, or just use a bash binary built on a previous kernel version?

>Bash has an autoconf test that will, if it detects the need to do so,
>force the job control code to synchronize between parent and child
>when setting up the process group for a new pipeline.  It may be the
>case that you have to reconfigure and rebuild bash to enable that code.

>Look for PGRP_PIPE in config.h.

Rebuilding bash from pristine 2.05 sources under such a kernel
does *not* solve the problem.  PGRP_PIPE is undef'ed in the resulting
config.h.

Adam J. Richter __ __   4880 Stevens Creek Blvd, Suite 104
[EMAIL PROTECTED] \ /  San Jose, California 95129-1034
+1 408 261-6630 | g g d r a s i l   United States of America
fax +1 408 261-6631  "Free Software For The Rest Of Us."




Re: Patch(?): bash-2.05/jobs.c loses interrupts

2001-04-30 Thread Chet Ramey

>   Linux-2.4.4 has a change, for which I must accept blame,
> where fork() runs the child first, reducing unnecessary copy-on-write
> page duplications, because the child will usually promptly do an
> exec().  I understand this is pretty standard in most unixes.
> 
>   Peter Osterlund noticed an annoying side effect of this,
> which I think is a bash bug.  He wrote:
> 
> > Another thing is that the bash loop "while true ; do /bin/true ; done" is
> > not possible to interrupt with ctrl-c.
> 
>   I have reproduced this problem on a single CPU system.
> I also modified my kernel to sometimes run the fork child first
> and sometimes not.  In that case, that loop would sometimes
> abort on a control-C and sometimes ignore it, but ignoring it
> would not make the loop less likely to abort on another control-C.
> I'm pretty sure the control-C was being delivered only to the child
> due to a race condition in bash, which may be mandated by posix.

Did you reconfigure and rebuild bash on your machine running the 2.4
kernel, or just use a bash binary built on a previous kernel version?

Bash has an autoconf test that will, if it detects the need to do so,
force the job control code to synchronize between parent and child
when setting up the process group for a new pipeline.  It may be the
case that you have to reconfigure and rebuild bash to enable that code.

Look for PGRP_PIPE in config.h.
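
For readers who have not seen this kind of parent/child synchronization,
here is a minimal sketch of a pipe-based handshake.  It is not bash's code:
the function name spawn_with_handshake and the "gate" pipe are made up for
illustration, and it only shows the general idea of holding a child back
until the parent has finished its process group set-up.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not taken from bash.
   Fork a child that blocks on a pipe until the parent has finished
   setting up the process group.  Returns 1 if the child was the leader
   of its own process group when the parent released it, 0 if it was
   not, and -1 on error. */
static int spawn_with_handshake(void)
{
    int gate[2];
    int in_group, status;
    char byte;
    pid_t pid;

    if (pipe(gate) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(gate[0]);
        close(gate[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: drop the write end so that EOF can be seen. */
        close(gate[1]);
        setpgid(0, 0);
        /* Block here until the parent closes its write end. */
        while (read(gate[0], &byte, 1) > 0)
            ;
        _exit(0);               /* a real shell would exec here */
    }

    /* Parent: make the same setpgid() call as the child, so the group
       exists no matter which process was scheduled first. */
    close(gate[0]);
    setpgid(pid, pid);
    in_group = (getpgid(pid) == pid);

    close(gate[1]);             /* release the child */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return in_group;
}
```

Because the child cannot get past read() until the parent closes the pipe,
the order in which the scheduler runs the two processes after fork() no
longer matters for the set-up that happens before the release.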

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
( ``Discere est Dolere'' -- chet)

Chet Ramey, CWRU    [EMAIL PROTECTED]    http://cnswww.cns.cwru.edu/~chet/



Patch(?): bash-2.05/jobs.c loses interrupts

2001-04-29 Thread Adam J. Richter

Linux-2.4.4 has a change, for which I must accept blame,
where fork() runs the child first, reducing unnecessary copy-on-write
page duplications, because the child will usually promptly do an
exec().  I understand this is pretty standard in most unixes.

Peter Osterlund noticed an annoying side effect of this,
which I think is a bash bug.  He wrote:

> Another thing is that the bash loop "while true ; do /bin/true ; done" is
> not possible to interrupt with ctrl-c.

I have reproduced this problem on a single CPU system.
I also modified my kernel to sometimes run the fork child first
and sometimes not.  In that case, that loop would sometimes
abort on a control-C and sometimes ignore it, but ignoring it
would not make the loop less likely to abort on another control-C.
I'm pretty sure the control-C was being delivered only to the child
due to a race condition in bash, which may be mandated by posix.

I am pretty sure that the reason for this behavior is that
make_child() in bash-2.05/jobs.c has the child define itself
as a new process group and set the terminal's process group to it.
The parent will eventually also set its pgid to the child's pid when
it finally runs, but, in this example, /bin/true will probably run to
completion before that.  So, there is a period of time when the
child has set itself up as a distinct process group and pointed
the terminal to it, but the parent has not yet joined that process
group, so only the child will receive a ^C that happens during this
time.  This is the case basically 100% of the time if you do
a "while true ; do /bin/true ; done" loop under linux-2.4.4 on a
1GHz Pentium III (slower CPUs may not have enough cycles per time
slice to make this race happen reliably, as I do not see it on a
similar 866MHz Pentium III).

I think the correct fix is for bash to have the parent
set the foreground process group of the terminal, not to have the child
do it.  In fact, there are comments to this effect in bash-2.05/jobs.c,
although they do not explain why this is not currently done.  I have
attached a patch which is my guess at how to implement the change.
I know it fixes the "while true ; do /bin/true ; done" example.
I think that there may be some other loose ends to clean up, though.
For example, there is now potentially a time window when only the
parent will receive a control-C, so it may be necessary for the
parent to signal the child if the parent sees a signal as soon as
it has unblocked them.
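
Stripped of the bash internals, the ordering proposed above looks roughly
like the following sketch.  It is an illustration under assumptions, not
the patch itself: give_terminal and run_foreground are hypothetical names,
and the terminal steps are skipped when tty_fd is not a terminal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: make pgrp the foreground process group of the
   terminal on tty_fd.  SIGTTOU is blocked around the call so that a
   caller that is already in the background is not stopped. */
static int give_terminal(int tty_fd, pid_t pgrp)
{
    sigset_t block, saved;
    int rc;

    sigemptyset(&block);
    sigaddset(&block, SIGTTOU);
    sigprocmask(SIG_BLOCK, &block, &saved);
    rc = tcsetpgrp(tty_fd, pgrp);
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return rc;
}

/* Hypothetical helper: run argv in its own process group and wait for
   it.  The parent, not the child, hands over the terminal.  Returns the
   child's exit status, or -1 on error or abnormal termination. */
static int run_foreground(int tty_fd, char *const argv[])
{
    int status;
    int have_tty = isatty(tty_fd);
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        setpgid(0, 0);          /* child: only joins its own group */
        execvp(argv[0], argv);
        _exit(127);             /* exec failed */
    }

    setpgid(pid, pid);          /* parent: same call, may fail harmlessly */
    if (have_tty)
        give_terminal(tty_fd, pid);

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    if (have_tty)
        give_terminal(tty_fd, getpgrp());   /* take the terminal back */

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this ordering the terminal never points at a process group that the
parent does not yet know about.  The cost is the window described above:
between fork() and the parent's tcsetpgrp() call the child is not yet in
the foreground, so a ^C typed there reaches only the parent's group, and a
child that touches the terminal very early could be stopped.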

-- 
Adam J. Richter __ __   4880 Stevens Creek Blvd, Suite 104
[EMAIL PROTECTED] \ /  San Jose, California 95129-1034
+1 408 261-6630 | g g d r a s i l   United States of America
fax +1 408 261-6631  "Free Software For The Rest Of Us."


--- bash-2.05/jobs.c Mon Mar 26 10:08:24 2001
+++ bash/jobs.c Sat Apr 28 23:51:33 2001
@@ -1202,17 +1202,6 @@
 #if defined (PGRP_PIPE)
           if (pipeline_pgrp == mypid)
             {
-#endif
-              /* By convention (and assumption above), if
-                 pipeline_pgrp == shell_pgrp, we are making a child for
-                 command substitution.
-                 In this case, we don't want to give the terminal to the
-                 shell's process group (we could be in the middle of a
-                 pipeline, for example). */
-              if (async_p == 0 && pipeline_pgrp != shell_pgrp)
-                give_terminal_to (pipeline_pgrp, 0);
-
-#if defined (PGRP_PIPE)
               pipe_read (pgrp_pipe);
             }
 #endif
@@ -1251,9 +1240,14 @@
           if (pipeline_pgrp == 0)
             {
               pipeline_pgrp = pid;
-              /* Don't twiddle terminal pgrps in the parent!  This is the bug,
-                 not the good thing of twiddling them in the child! */
-              /* give_terminal_to (pipeline_pgrp, 0); */
+              /* By convention (and assumption above), if
+                 pipeline_pgrp == shell_pgrp, we are making a child for
+                 command substitution.
+                 In this case, we don't want to give the terminal to the
+                 shell's process group (we could be in the middle of a
+                 pipeline, for example). */
+              if (async_p == 0 && pipeline_pgrp != shell_pgrp)
+                give_terminal_to (pipeline_pgrp, 0);
             }
           /* This is done on the recommendation of the Rationale section of
              the POSIX 1003.1 standard, where it discusses job control and


