Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-22 Thread Chet Ramey

On 3/16/21 8:04 AM, Michael Felt wrote:





Decided to give bash-5.1 a try. I doubt it is major, but I get as far as:

"../../../src/bash-5.1.0/lib/sh/tmpfile.c", line 289.11: 1506-068 (W) 
Operation between types "char*" and "int" is not allowed.

ld: 0711-317 ERROR: Undefined symbol: .mkdtemp
ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.
make: 1254-004 The error code from the last command is 8.


I figured this out. There is a typo in config-bot.h that prevents
USE_MKDTEMP from being disabled if configure doesn't find it.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-22 Thread Chet Ramey

On 3/20/21 3:15 PM, Michael Felt wrote:

Scraping through this - thanks for the lessons aka explanations.

On 18/03/2021 16:08, Chet Ramey wrote:

On 3/18/21 5:53 AM, Michael Felt wrote:

Yes, something to test. Thx. The ojdk scenario is:
/usr/bin/printf > >(tee -a stdout.log) 2> >(tee -a stderr.log).


So, yes, in this case it is working because printf is the parent - 
(which I never seemed to find actually calling open() of the file. It 
seems to be using the fd opened by the child - in a magical way).


It's the redirection. The shell does the open, since the filename resulting
from process substitution is the target of a redirection operator. This is
a common idiom -- so common, in fact, that people have interpreted it to
mean that the entire `> >(xxx)' is a single operator.

However, the shell expands redirections in the child process it forks to
exec printf, so that child shell is what does the process substitution.
That might be the problem here.


I think that ended up being the problem. When the process substitution is
used in a redirection, and the command is a normal command found in the
file system and executed with execve(2), there's nothing left to remove
the FIFOs when the command completes.

The best solution is for the shell to remove the FIFOs created as part of
redirection (they've already been opened) before calling execve.
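
A minimal sketch of that ordering in plain sh, outside of bash itself: open
the FIFO, remove its name, then exec the command. The function name
run_with_fifo_stdout, the sh-np-demo path and the use of cat as a stand-in
for the tee side are assumptions made for illustration. This is not bash's
implementation, and it assumes mktemp -d is available.

```sh
#!/bin/sh
# Sketch only: emulate "open the FIFO, unlink it, then exec the command".
# run_with_fifo_stdout and the sh-np-demo name are hypothetical.

run_with_fifo_stdout() {
    log=$1; shift
    dir=$(mktemp -d) || return 1
    fifo=$dir/sh-np-demo
    mkfifo "$fifo" || return 1

    # Stand-in for the >(tee -a log) side: an asynchronous reader.
    cat "$fifo" >> "$log" &
    reader=$!

    (
        # Child "shell": perform the redirection. Opening a FIFO for
        # writing blocks until a reader has opened the other end.
        exec > "$fifo"
        # Both ends are open now, so the name is no longer needed.
        rm -f "$fifo"
        # Replace the child with the real command. No shell code runs
        # after this point, and none has to.
        exec "$@"
    )
    status=$?

    wait "$reader"    # the reader sees EOF once the command has exited
    rmdir "$dir"      # succeeds only because the FIFO is already gone
    return "$status"
}
```

Called as `run_with_fifo_stdout build.log /usr/bin/printf 'hello\n'`, the
text ends up in build.log and no FIFO is left behind. The unlink is safe
here only because the redirection has already completed the open.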




Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-20 Thread Michael Felt

Scraping through this - thanks for the lessons aka explanations.

On 18/03/2021 16:08, Chet Ramey wrote:

On 3/18/21 5:53 AM, Michael Felt wrote:

Yes, something to test. Thx. The ojdk scenario is:
/usr/bin/printf > >(tee -a stdout.log) 2> >(tee -a stderr.log).


So, yes, in this case it is working because printf is the parent - 
(which I never seemed to find actually calling open() of the file. It 
seems to be using the fd opened by the child - in a magical way).


It's the redirection. The shell does the open, since the filename resulting
from process substitution is the target of a redirection operator. This is
a common idiom -- so common, in fact, that people have interpreted it to
mean that the entire `> >(xxx)' is a single operator.

However, the shell expands redirections in the child process it forks to
exec printf, so that child shell is what does the process substitution.
That might be the problem here.

The command itself doesn't do anything, though. `tee' just sits there
waiting for data to write to log files. It has no purpose. I'm not sure
what the intent is.

If you wrapped that command into a script, it's unlikely that either `tee'
would exit (why would they?) before `printf' exits and the script
completes. In bash-5.0, there would be nothing to remove the FIFOs.

If I understand correctly, the commands are generated by gmake as it
processes targets.




This is defined to provide `diff' with two arguments. Let's call them

/var/tmp/sh-np12345
and
/var/tmp/sh-np67890

So diff runs, sees two arguments, opens both files, and does its thing.
Diff has to see two filenames when it runs, otherwise it's an error.

But what I thought I was seeing is that diff is the PARENT calling 
substitute_process() that create(s) a child process that reads/writes 
to a fifo file.


Yes and no. Process substitution is a word expansion that results in
a filename. The stuff between the parens defines a command that writes to
or reads from a pipe expressed as a filename (/dev/fd/NN or a FIFO) that is
the result of the word expansion. In this case, the process substitution is
the target of a redirection, so the shell performs that word expansion
before it execs diff.

a) the child process never returns - it `exits` via, iirc, 
sh_exit(result) and the end of the routine


It executes the specified command and exits.

Got it: must remember - initially it is bash busy with word expansion (a
new bash child for each 'process substitution').



b) the parent gets the filename (pathname) - but I never see it 
actually opening it - only (when using bash -x) seeing the name in 
the -x command expansion.


It doesn't have to. The filename itself is the expansion: it's an object
you can use to communicate with an asynchronous process. This is how you
can have programs that expect a filename use program-generated output, for
instance, without using a temp file.

Yes - for me at least it is much easier to fathom as input - that ends
and behaves/looks/feels like EOF.


In this case, it opens the FIFO because it is the target of a redirection.

Nods: just as it would if it was the output from a program that had just
run 'moments' before.



Now, let's say your change is there. The shell still runs

diff /var/tmp/sh-np12345 /var/tmp/sh-np67890

but, depending on how processes get scheduled, the shell forked to run
the process substitutions has already unlinked those FIFOs. Diff will
error out. The user will be unhappy. I will get bug reports.

You have introduced a race condition. You may not get hit by it, but
you cannot guarantee that no one will.

No I cannot - and for now it is a `hack` to solve a bigger issue. 
With 3500 calls in a single build I hope the race occurs - and I'll 
finally see where the PARENT actually uses the name returned.


You mean in terms of using the filename as an argument to a shell builtin?
Otherwise you'll have to trace into other child process execution.


/usr/bin/printf is not a built-in (afaik)

If I understand correctly - from printf perspective we have

/usr/bin/printf "Some formatted message" > /tmp/sh-np.123456 2> /tmp/sh-np.9876543 &


And, if for ease of discussion we say program1 is PID-123456 and program2 
is PID-987654 - these programs have no way of knowing their stdin is 
named /tmp/sh-np-something?
BING: as the Dutch (used to) say - the quarter drops - the other 
programs (e.g., tee) have no fifo knowledge - they are who/what they are.
What may be needed for this situation - rather than directly execve() 
the program - is yet another fork (for the execve()), a wait for that 
program to hit its EOF on input, and then the sleeping 'word expansion 
child' cleans up the fifo file it created for the communication path.


* Am I getting closer? :)





The shell can't unlink the FIFO until it can guarantee that the
processes that need to open it have opened it, and it can't guarantee
that in the general case. It has to wait until the process completes,
at least, and even that 

Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-18 Thread Chet Ramey

On 3/18/21 5:53 AM, Michael Felt wrote:

Yes, something to test. Thx. The ojdk scenario is:
/usr/bin/printf > >(tee -a stdout.log) 2> >(tee -a stderr.log).


So, yes, in this case it is working because printf is the parent - (which I 
never seemed to find actually calling open() of the file. It seems to be 
using the fd opened by the child - in a magical way).


It's the redirection. The shell does the open, since the filename resulting
from process substitution is the target of a redirection operator. This is
a common idiom -- so common, in fact, that people have interpreted it to
mean that the entire `> >(xxx)' is a single operator.

However, the shell expands redirections in the child process it forks to
exec printf, so that child shell is what does the process substitution.
That might be the problem here.

The command itself doesn't do anything, though. `tee' just sits there
waiting for data to write to log files. It has no purpose. I'm not sure
what the intent is.

If you wrapped that command into a script, it's unlikely that either `tee'
would exit (why would they?) before `printf' exits and the script
completes. In bash-5.0, there would be nothing to remove the FIFOs.



This is defined to provide `diff' with two arguments. Let's call them

/var/tmp/sh-np12345
and
/var/tmp/sh-np67890

So diff runs, sees two arguments, opens both files, and does its thing.
Diff has to see two filenames when it runs, otherwise it's an error.

But what I thought I was seeing is that diff is the PARENT calling 
substitute_process() that create(s) a child process that reads/writes to a 
fifo file.


Yes and no. Process substitution is a word expansion that results in
a filename. The stuff between the parens defines a command that writes to
or reads from a pipe expressed as a filename (/dev/fd/NN or a FIFO) that is
the result of the word expansion. In this case, the process substitution is
the target of a redirection, so the shell performs that word expansion
before it execs diff.

a) the child process never returns - it `exits` via, iirc, sh_exit(result) 
and the end of the routine


It executes the specified command and exits.


b) the parent gets the filename (pathname) - but I never see it actually 
opening it - only (when using bash -x) seeing the name in the -x command 
expansion.


It doesn't have to. The filename itself is the expansion: it's an object
you can use to communicate with an asynchronous process. This is how you
can have programs that expect a filename use program-generated output, for
instance, without using a temp file.

In this case, it opens the FIFO because it is the target of a redirection.
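
One way to see the expansion for what it is: hand the substituted words to
echo instead of to a program that opens them. The sketch below assumes a
bash binary is on PATH and invokes it explicitly, since plain sh has no
<(...) syntax. The function name show_expansion is made up for the example.
What gets printed depends on the platform: /dev/fd/NN names where the
system provides them, FIFO pathnames otherwise.

```sh
#!/bin/sh
# Sketch only: print the words that process substitution expands to.
# show_expansion is a hypothetical helper, not part of bash.

show_expansion() {
    # echo never opens its arguments; it only prints the two pathnames
    # that replaced <(true) and >(true) on the command line.
    bash -c 'echo <(true) >(true)'
}

show_expansion
```

Each substitution yields its own pathname, which is why a command such as
diff sees two ordinary-looking filename arguments.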


Now, let's say your change is there. The shell still runs

diff /var/tmp/sh-np12345 /var/tmp/sh-np67890

but, depending on how processes get scheduled, the shell forked to run
the process substitutions has already unlinked those FIFOs. Diff will
error out. The user will be unhappy. I will get bug reports.

You have introduced a race condition. You may not get hit by it, but
you cannot guarantee that no one will.

No I cannot - and for now it is a `hack` to solve a bigger issue. With 3500 
calls in a single build I hope the race occurs - and I'll finally see where 
the PARENT actually uses the name returned.


You mean in terms of using the filename as an argument to a shell builtin?
Otherwise you'll have to trace into other child process execution.



The shell can't unlink the FIFO until it can guarantee that the
processes that need to open it have opened it, and it can't guarantee
that in the general case. It has to wait until the process completes,
at least, and even that might not be correct.

Again, my issue was with >(command) substitution - where the `files` get 
written to by the parent - rather than reading them.


It doesn't matter. Let's try that scenario. A FIFO reader can live forever;
just waiting for someone to open the FIFO to write to it. In this case,
the child process opens the FIFO for read, and blocks until another
process opens it for write. That's the shell, since it's the target of a
redirection, but it doesn't have to be (the filename could just be passed
to another process as an argument). The file descriptor gets passed to
printf as its stdout (and printf apparently does nothing with it) and then
closed as part of the process exiting. When that happens, the tee should
get EOF and exit. The shell notices that tee exits and cleans up the FIFO.
If the shell exits, for instance, before the tee exits, nothing cleans up 
the FIFO.
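
The lifecycle described above can be walked through by hand with an
explicit FIFO. This is a sketch in plain sh under stated assumptions: cat
stands in for tee, the names fifo_roundtrip and sh-np-demo are invented for
the example, and mktemp -d is assumed to exist. It mirrors the sequence of
events, not bash's internal bookkeeping.

```sh
#!/bin/sh
# Sketch only: reader blocks, writer opens via redirection, reader gets
# EOF, and only then is the FIFO removed. All names are hypothetical.

fifo_roundtrip() {
    dir=$(mktemp -d) || return 1
    fifo=$dir/sh-np-demo
    out=$dir/out
    mkfifo "$fifo" || return 1

    # Reader: opens the FIFO for reading and blocks until a writer appears.
    cat "$fifo" > "$out" &
    reader=$!

    # Writer: the shell performs the open as part of the redirection;
    # printf only ever sees an already-open stdout.
    printf '%s\n' "$1" > "$fifo"

    # Closing the last write descriptor delivers EOF to the reader.
    wait "$reader"

    # The reader has exited, so nothing still depends on the name.
    rm -f "$fifo"
    cat "$out"
    rm -f "$out"
    rmdir "$dir"
}

fifo_roundtrip "hello from the writer"
```

If the script were to exit right after the printf, without the wait and the
rm, the FIFO would be left behind, which is the leftover sh-np-* case.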




p.s. it is not my call to ask why they do not use regular redirection or 
pipes. Feels much simpler - but some people cannot miss the opportunity to 
use something blinky and shiny.


p.p.s. - If you have `words of wisdom` re: why this approach is much better 
than `standard` redirection - I am all ears!


If you want to send the output to the terminal (or wherever) as well as a
log file, something like `tee' is required. If you want to keep stderr and

Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-18 Thread Michael Felt


On 17/03/2021 23:12, Chet Ramey wrote:

On 3/17/21 3:29 PM, Michael Felt wrote:
I tried as many combinations of commands as I could - and it seems 
that the regular behavior of dup2 on the opened fifo is enough to 
maintain communication.


It's not, since FIFOs exist in the file system and have to be available to
open(2) when the other process (consumer, producer) wants to use them.

I didn't expect it to be perfect (see thx etc below). Better: I needed 
help in telling when it would most likely fail! :)




Going into a system test (ie. a normal AdoptOpenJDK build process) 
that has nearly 3500 commands, each with two process_substitution 
commands.


Consider the following scenario. You want to perform a regression test of
two versions of a program, each of which produces textual output. You run

diff <(program-version-1 args) <(program-version-2 args)


Yes, something to test. Thx. The ojdk scenario is:
/usr/bin/printf > >(tee -a stdout.log) 2> >(tee -a stderr.log).


So, yes, in this case it is working because printf is the parent - 
(which I never seemed to find actually calling open() of the file. It 
seems to be using the fd opened by the child - in a magical way).




This is defined to provide `diff' with two arguments. Let's call them

/var/tmp/sh-np12345
and
/var/tmp/sh-np67890

So diff runs, sees two arguments, opens both files, and does its thing.
Diff has to see two filenames when it runs, otherwise it's an error.

But what I thought I was seeing is that diff is the PARENT calling 
substitute_process() that create(s) a child process that reads/writes to 
a fifo file.
a) the child process never returns - it `exits` via, iirc, 
sh_exit(result) and the end of the routine
b) the parent gets the filename (pathname) - but I never see it actually 
opening it - only (when using bash -x) seeing the name in the -x command 
expansion.

Now, let's say your change is there. The shell still runs

diff /var/tmp/sh-np12345 /var/tmp/sh-np67890

but, depending on how processes get scheduled, the shell forked to run
the process substitutions has already unlinked those FIFOs. Diff will
error out. The user will be unhappy. I will get bug reports.

You have introduced a race condition. You may not get hit by it, but
you cannot guarantee that no one will.

No I cannot - and for now it is a `hack` to solve a bigger issue. With 
3500 calls in a single build I hope the race occurs - and I'll finally 
see where the PARENT actually uses the name returned.


The shell can't unlink the FIFO until it can guarantee that the
processes that need to open it have opened it, and it can't guarantee
that in the general case. It has to wait until the process completes,
at least, and even that might not be correct.

Again, my issue was with >(command) substitution - where the `files` get 
written to by the parent - rather than reading them.


p.s. it is not my call to ask why they do not use regular redirection or 
pipes. Feels much simpler - but some people cannot miss the opportunity 
to use something blinky and shiny.


p.p.s. - If you have `words of wisdom` re: why this approach is much 
better than `standard` redirection - I am all ears!


*** Thanks again for the time to reply ***


That's why the last-ditch approach is to remove all remaining FIFOs
when the shell exits.

btw: other than the one open in the middle of process_substitution() 
I did not see anywhere where another process even tries to open the 
file.


They are not necessarily shell processes, but what if they were? Since a
FIFO is an object in the file system, you can just open(2) it. That's
ostensibly the advantage of FIFOs.



what I also noticed is that the process, (iirc) that opens the file - 
never 'returns' - it ends via sh_exit() and the end of the routine.


Of course. It's a process that is forked to run the command specified in
the process substitution. What else does it need to do?





Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-17 Thread Chet Ramey

On 3/17/21 3:29 PM, Michael Felt wrote:
I tried as many combinations of commands as I could - and it seems that the 
regular behavior of dup2 on the opened fifo is enough to maintain 
communication.


It's not, since FIFOs exist in the file system and have to be available to
open(2) when the other process (consumer, producer) wants to use them.



Going into a system test (ie. a normal AdoptOpenJDK build process) that has 
nearly 3500 commands, each with two process_substitution commands.


Consider the following scenario. You want to perform a regression test of
two versions of a program, each of which produces textual output.  You run

diff <(program-version-1 args) <(program-version-2 args)

This is defined to provide `diff' with two arguments. Let's call them

/var/tmp/sh-np12345
and
/var/tmp/sh-np67890

So diff runs, sees two arguments, opens both files, and does its thing.
Diff has to see two filenames when it runs, otherwise it's an error.

Now, let's say your change is there. The shell still runs

diff /var/tmp/sh-np12345 /var/tmp/sh-np67890

but, depending on how processes get scheduled, the shell forked to run
the process substitutions has already unlinked those FIFOs. Diff will
error out. The user will be unhappy. I will get bug reports.

You have introduced a race condition. You may not get hit by it, but
you cannot guarantee that no one will.

The shell can't unlink the FIFO until it can guarantee that the
processes that need to open it have opened it, and it can't guarantee
that in the general case. It has to wait until the process completes,
at least, and even that might not be correct.

That's why the last-ditch approach is to remove all remaining FIFOs
when the shell exits.
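
The failure that a premature unlink would cause can be shown in isolation.
The sketch below does not reproduce the scheduling inside bash; it simply
removes the name before the consumer runs, which is the outcome the race
would produce. The function name open_after_unlink and the sh-np-demo path
are assumptions for the example, and mktemp -d is assumed to be available.

```sh
#!/bin/sh
# Sketch only: a consumer that is handed a pathname cannot open it once
# the name has been removed. All names here are hypothetical.

open_after_unlink() {
    dir=$(mktemp -d) || return 1
    fifo=$dir/sh-np-demo
    mkfifo "$fifo" || return 1

    # Simulate the name disappearing before the consumer gets to run.
    rm -f "$fifo"

    # The consumer knows only the pathname, so its open fails.
    if cat "$fifo" 2>/dev/null; then
        result=opened
    else
        result=failed
    fi

    rmdir "$dir"
    echo "$result"
}

open_after_unlink
```

A program like diff is in the same position as cat here: it receives a
name on its command line and has no other handle on the pipe.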

btw: other than the one open in the middle of process_substitution() I did 
not see anywhere where another process even tries to open the file.


They are not necessarily shell processes, but what if they were? Since a
FIFO is an object in the file system, you can just open(2) it. That's
ostensibly the advantage of FIFOs.



what I also noticed is that the process, (iirc) that opens the file - never 
'returns' - it ends via sh_exit() and the end of the routine.


Of course. It's a process that is forked to run the command specified in
the process substitution. What else does it need to do?




Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-17 Thread Michael Felt
I tried as many combinations of commands as I could - and it seems that 
the regular behavior of dup2 on the opened fifo is enough to maintain 
communication.


Going into a system test (ie. a normal AdoptOpenJDK build process) that 
has nearly 3500 commands, each with two process_substitution commands.


Maybe that fails - and then I'll have yet another test.

btw: other than the one open in the middle of process_substitution() I 
did not see anywhere where another process even tries to open the file.


what I also noticed is that the process, (iirc) that opens the file - 
never 'returns' - it ends via sh_exit() and the end of the routine.


Next time - I'll save all of my debug changes. Got a bit too rigorous 
when I cleaned up.


On 17/03/2021 19:03, Chet Ramey wrote:

On 3/17/21 11:52 AM, Michael Felt wrote:
OK - this process on github has not gone exactly as I intended - 
merged with master - while I wanted to update, ie., merge with branch 
5.0.18. So, the link may not be accurate.


This is not correct. Process substitution is a word expansion that results
in a pathname. You can't just remove the pathname after the child opens it.
How will other processes that want to communicate with the process
substitution use it?





Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-17 Thread Chet Ramey

On 3/17/21 11:17 AM, Michael Felt wrote:


On 11/03/2021 18:11, Chet Ramey wrote:

On 3/11/21 11:28 AM, Michael Felt wrote:

Hi,

Issue: AdoptOpenJDK build process makes bash calls in a particular way. 
An abbreviated (shorter pathnames) example is:


```
bash-5.0$ /usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" \
    > >(/usr/bin/tee -a /home/aixtools/build.log) \
    2> >(/usr/bin/tee -a /home/aixtools/build.log >&2)
Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'
```


I believe this is fixed in bash-5.1.


I added some debug statements to try and catch what is not happening. It 
seems that the fifo_list[i].proc value is never being set to (pid_t)-1 
so any call to `unlink_fifo()` or `unlink_fifo_list()` does not unlink 
the special file created.


Probably because the process substitution does not exit before the shell 
does.


I spent several days debugging - and, basically, they never get cleared 
because the fifo_struct never gets the (pid_t) -1 value assigned.


Although the `reap` function does get called - there is never anything to do.

The routine that does assign the (pid_t) -1 value is `wait`*something - and 
this is only called via an interrupt (aka signal) - as far as I could see.


Probably because the process substitution does not exit before the shell
does. The shell doesn't wait for asynchronous processes before it exits,
so if the process doesn't exit and the shell doesn't get a SIGCHLD, it
won't reap that process.

The bash-5.1 solution is pretty heavy-handed: unlink all FIFOs it thinks
still exist before the shell exits.
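
The same idea can be approximated from the outside with an EXIT trap. This
is a sketch under stated assumptions, not the code in bash-5.1 (which,
according to this thread, does the work in unlink_all_fifos() in subst.c).
The function name demo_exit_cleanup, the space-separated $fifos list and
the sh-np-N names are invented for the example, and the temporary
directory path is assumed to contain no whitespace.

```sh
#!/bin/sh
# Sketch only: remember every FIFO that was created and remove whatever
# is still recorded when the (sub)shell exits. Names are hypothetical.

demo_exit_cleanup() {
    dir=$(mktemp -d) || return 1
    (
        fifos=
        cleanup() {
            # Heavy-handed on purpose: remove everything on the list,
            # without checking whether a reader or writer is still alive.
            for f in $fifos; do
                rm -f "$f"
            done
        }
        trap cleanup EXIT

        for n in 1 2; do
            mkfifo "$dir/sh-np-$n" && fifos="$fifos $dir/sh-np-$n"
        done

        # Commands using the FIFOs would run here. The subshell then
        # exits without removing anything itself.
        exit 0
    )
    remaining=$(ls "$dir" | wc -l)
    rmdir "$dir"
    echo $remaining
}

demo_exit_cleanup
```

The trade-off matches the one described above: nothing is left in the file
system, at the cost of not knowing whether some process still wanted the
name.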





Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-17 Thread Chet Ramey

On 3/17/21 11:52 AM, Michael Felt wrote:
OK - this process on github has not gone exactly as I intended - merged 
with master - while I wanted to update, ie., merge with branch 5.0.18. So, 
the link may not be accurate.


This is not correct. Process substitution is a word expansion that results
in a pathname. You can't just remove the pathname after the child opens it.
How will other processes that want to communicate with the process
substitution use it?




Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-17 Thread Michael Felt
OK - this process on github has not gone exactly as I intended - merged 
with master - while I wanted to update, ie., merge with branch 5.0.18. 
So, the link may not be accurate.


The change is simple:

diff --git a/subst.c b/subst.c
index 843c9d39..3792e45c 100644
--- a/subst.c
+++ b/subst.c
@@ -5926,6 +5926,8 @@ process_substitute (string, open_for_read_in_child)
 #if !defined (HAVE_DEV_FD)
   /* Open the named pipe in the child. */
   fd = open (pathname, open_for_read_in_child ? O_RDONLY : O_WRONLY);
+  /* now that the file is open (or not) * unlink it to keep garbage down */
+  unlink(pathname);
   if (fd < 0)
 {
   /* Two separate strings for ease of translation. */


On 17/03/2021 16:17, Michael Felt wrote:


On 11/03/2021 18:11, Chet Ramey wrote:

On 3/11/21 11:28 AM, Michael Felt wrote:

Hi,

Issue: AdoptOpenJDK build process makes bash calls in a particular 
way. An abbreviated (shorter pathnames) example is:


```
bash-5.0$ /usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" \
    > >(/usr/bin/tee -a /home/aixtools/build.log) \
    2> >(/usr/bin/tee -a /home/aixtools/build.log >&2)
Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'
```


I believe this is fixed in bash-5.1.


I added some debug statements to try and catch what is not 
happening. It seems that the fifo_list[i].proc value is never being 
set to (pid_t)-1 so any call to `unlink_fifo()` or 
`unlink_fifo_list()` does not unlink the special file created.


Probably because the process substitution does not exit before the 
shell does.


I spent several days debugging - and, basically, they never get 
cleared because the fifo_struct never gets the (pid_t) -1 value assigned.


Although the `reap` function does get called - there is never anything 
to do.


The routine that does assign the (pid_t) -1 value is `wait`*something 
- and this is only called via an interrupt (aka signal) - as far as I 
could see.



In the end I came up with a very simple - basically historical - 
solution for working with temporary files that do not need to survive 
the process: unlink() the file immediately after open().


As I need to document for AdoptOpenJDK I created a mirror of savannah 
(git) and created a PR: https://github.com/aixtools/bash/pull/2


I expect much more testing is warranted - as to potential side-effects 
with the fifo struct (which is no longer accurate, as the file may (read: 
should) already be unlinked).


Hope this helps,

Michael





Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-17 Thread Michael Felt


On 11/03/2021 18:11, Chet Ramey wrote:

On 3/11/21 11:28 AM, Michael Felt wrote:

Hi,

Issue: AdoptOpenJDK build process makes bash calls in a particular 
way. An abbreviated (shorter pathnames) example is:


```
bash-5.0$ /usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" \
    > >(/usr/bin/tee -a /home/aixtools/build.log) \
    2> >(/usr/bin/tee -a /home/aixtools/build.log >&2)
Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'
```


I believe this is fixed in bash-5.1.


I added some debug statements to try and catch what is not happening. 
It seems that the fifo_list[i].proc value is never being set to 
(pid_t)-1 so any call to `unlink_fifo()` or `unlink_fifo_list()` does 
not unlink the special file created.


Probably because the process substitution does not exit before the 
shell does.


I spent several days debugging - and, basically, they never get cleared 
because the fifo_struct never gets the (pid_t) -1 value assigned.


Although the `reap` function does get called - there is never anything 
to do.


The routine that does assign the (pid_t) -1 value is `wait`*something - 
and this is only called via an interrupt (aka signal) - as far as I 
could see.



In the end I came up with a very simple - basically historical - solution 
for working with temporary files that do not need to survive the 
process: unlink() the file immediately after open().


As I need to document for AdoptOpenJDK I created a mirror of savannah 
(git) and created a PR: https://github.com/aixtools/bash/pull/2


I expect much more testing is warranted - as to potential side-effects 
with the fifo struct (which is no longer accurate, as the file may (read: 
should) already be unlinked).


Hope this helps,

Michael





Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-16 Thread Michael Felt


On 16/03/2021 16:21, Chet Ramey wrote:

On 3/16/21 11:07 AM, Michael Felt wrote:


On 16/03/2021 14:38, Chet Ramey wrote:

On 3/16/21 8:04 AM, Michael Felt wrote:

Decided to give bash-5.1 a try. I doubt it is major, but I get as 
far as:


"../../../src/bash-5.1.0/lib/sh/tmpfile.c", line 289.11: 1506-068 
(W) Operation between types "char*" and "int" is not allowed.

ld: 0711-317 ERROR: Undefined symbol: .mkdtemp


Then how does configure find it? It's a POSIX function, and that 
file includes the appropriate headers.
Haven't looked at configure yet - but do not find it in the usual 
places:


root@x065:[/data/prj/gnu/bash/bash-5.0.18]grep mkdtemp /usr/include/*.h
root@x065:[/data/prj/gnu/bash/bash-5.0.18]grep mkdtemp 
/usr/include/sys/*.h


Also, not found on AIX 6.1 (TL9), but did find on AIX 7.1 TL4.


Sure, but configure checks for it, and bash only uses mkdtemp if configure
finds it. Why does configure find it?

Not sure. I'll have time again Thursday. Will be back.







Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-16 Thread Chet Ramey

On 3/16/21 11:07 AM, Michael Felt wrote:


On 16/03/2021 14:38, Chet Ramey wrote:

On 3/16/21 8:04 AM, Michael Felt wrote:


Decided to give bash-5.1 a try. I doubt it is major, but I get as far as:

"../../../src/bash-5.1.0/lib/sh/tmpfile.c", line 289.11: 1506-068 (W) 
Operation between types "char*" and "int" is not allowed.

ld: 0711-317 ERROR: Undefined symbol: .mkdtemp


Then how does configure find it? It's a POSIX function, and that file 
includes the appropriate headers.

Haven't looked at configure yet - but do not find it in the usual places:

root@x065:[/data/prj/gnu/bash/bash-5.0.18]grep mkdtemp /usr/include/*.h
root@x065:[/data/prj/gnu/bash/bash-5.0.18]grep mkdtemp /usr/include/sys/*.h

Also, not found on AIX 6.1 (TL9), but did find on AIX 7.1 TL4.


Sure, but configure checks for it, and bash only uses mkdtemp if configure
finds it. Why does configure find it?





Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-16 Thread Michael Felt


On 16/03/2021 14:38, Chet Ramey wrote:

On 3/16/21 8:04 AM, Michael Felt wrote:

Decided to give bash-5.1 a try. I doubt it is major, but I get as far 
as:


"../../../src/bash-5.1.0/lib/sh/tmpfile.c", line 289.11: 1506-068 (W) 
Operation between types "char*" and "int" is not allowed.

ld: 0711-317 ERROR: Undefined symbol: .mkdtemp


Then how does configure find it? It's a POSIX function, and that file 
includes the appropriate headers.

Haven't looked at configure yet - but do not find it in the usual places:

root@x065:[/data/prj/gnu/bash/bash-5.0.18]grep mkdtemp /usr/include/*.h
root@x065:[/data/prj/gnu/bash/bash-5.0.18]grep mkdtemp /usr/include/sys/*.h

Also, not found on AIX 6.1 (TL9), but did find on AIX 7.1 TL4.

Hope this helps,

Michael






Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-16 Thread Chet Ramey

On 3/16/21 8:04 AM, Michael Felt wrote:


Decided to give bash-5.1 a try. I doubt it is major, but I get as far as:

"../../../src/bash-5.1.0/lib/sh/tmpfile.c", line 289.11: 1506-068 (W) 
Operation between types "char*" and "int" is not allowed.

ld: 0711-317 ERROR: Undefined symbol: .mkdtemp


Then how does configure find it? It's a POSIX function, and that file 
includes the appropriate headers.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-16 Thread Michael Felt


On 11/03/2021 22:27, Chet Ramey wrote:

On 3/11/21 3:55 PM, Michael Felt (aixtools) wrote:



Sent from my iPhone


On 11 Mar 2021, at 18:15, Chet Ramey  wrote:

On 3/11/21 11:28 AM, Michael Felt wrote:

Hi,
Issue: AdoptOpenJDK build process makes bash calls in a particular 
way. An abbreviated (shorter pathnames) example is:

```
bash-5.0$ /usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" > >(/usr/bin/tee -a /home/aixtools/build.log) 2> >(/usr/bin/tee -a /home/aixtools/build.log >&2)
Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'
```


I believe this is fixed in bash-5.1.
Would it be difficult to give me a hint for 5.0? I could test further 
now that I have a command that generates the issue.


I can't reproduce it, but you can look at unlink_all_fifos() in bash-5.1.
It's defined in subst.c and called in shell.c.


FYI: Been digging in bash-5.0.18 - learning...

Decided to give bash-5.1 a try. I doubt it is major, but I get as far as:

"../../../src/bash-5.1.0/lib/sh/tmpfile.c", line 289.11: 1506-068 (W) 
Operation between types "char*" and "int" is not allowed.

ld: 0711-317 ERROR: Undefined symbol: .mkdtemp
ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.

make: 1254-004 The error code from the last command is 8.


Using AIX 5.3 TL12 and xlc/xlC-v11, on a system largely stripped of 
other OSS.


I'll also give it a go on a more public server.

Michael

Probably because the process substitution does not exit before the 
shell does.
I was hoping that is what the wait routines were for. Also noticed 
that the second fifo never gets a pid.


Bash doesn't wait for asynchronous processes before it exits unless 
you use `wait'.






Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-11 Thread Chet Ramey

On 3/11/21 3:55 PM, Michael Felt (aixtools) wrote:



Sent from my iPhone


On 11 Mar 2021, at 18:15, Chet Ramey  wrote:

On 3/11/21 11:28 AM, Michael Felt wrote:

Hi,
Issue: AdoptOpenJDK build process makes bash calls in a particular way. An 
abbreviated (shorter pathnames) example is:
```
bash-5.0$ /usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" > >(/usr/bin/tee -a /home/aixtools/build.log) 2> >(/usr/bin/tee -a /home/aixtools/build.log >&2)
Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'
```


I believe this is fixed in bash-5.1.

Would it be difficult to give me a hint for 5.0? I could test further now that I 
have a command that generates the issue.


I can't reproduce it, but you can look at unlink_all_fifos() in bash-5.1.
It's defined in subst.c and called in shell.c.


Probably because the process substitution does not exit before the shell does.

I was hoping that is what the wait routines were for. Also noticed that the 
second fifo never gets a pid.


Bash doesn't wait for asynchronous processes before it exits unless you use
`wait'.
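
A minimal sketch of what using `wait' with a process substitution can look like. It assumes bash 4.4 or later, where `$!` holds the PID of the most recent process substitution; the temp-file log and the message text are placeholders, not taken from the AdoptOpenJDK build.

```shell
#!/usr/bin/env bash
# Sketch: wait for a process substitution before the script exits.
# Assumes bash 4.4+, where $! is set to the PID of the last process
# substitution. The log file is a throwaway temp file for illustration.

log=$(mktemp)

# The brace group makes the current shell (not a forked child) perform the
# redirection, so $! is set in this shell and can be waited for.
{ printf 'Building targets\n'; } > >(tee -a "$log" >/dev/null)
procsub_pid=$!

# Without this wait, the script could exit while tee is still running.
wait "$procsub_pid"
wait_status=$?

cat "$log"
```

The brace group matters when the command is external (such as /usr/bin/printf): in that case the shell expands redirections in the forked child, so the parent never sees the process substitution's PID.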


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-11 Thread Michael Felt (aixtools)


Sent from my iPhone

> On 11 Mar 2021, at 18:15, Chet Ramey  wrote:
> 
> On 3/11/21 11:28 AM, Michael Felt wrote:
>> Hi,
>> Issue: AdoptOpenJDK build process makes bash calls in a particular way. An 
>> abbreviated (shorter pathnames) example is:
>> ```
>> bash-5.0$ /usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" > >(/usr/bin/tee -a /home/aixtools/build.log) 2> >(/usr/bin/tee -a /home/aixtools/build.log >&2)
>> Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'
>> ```
> 
> I believe this is fixed in bash-5.1.
Would it be difficult to give me a hint for 5.0? I could test further now that I 
have a command that generates the issue.
> 
> 
>> I added some debug statements to try and catch what is not happening. It 
>> seems that the fifo_list[i].proc value is never being set to (pid_t)-1 so 
>> any call to `unlink_fifo()` or `unlink_fifo_list()` does not unlink the 
>> special file created.
> 
> Probably because the process substitution does not exit before the shell does.
I was hoping that is what the wait routines were for. Also noticed that the 
second fifo never gets a pid. 
> -- 
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>  ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: so-called pipe files (sh-np-*) do not get deleted when processes close.

2021-03-11 Thread Chet Ramey

On 3/11/21 11:28 AM, Michael Felt wrote:

Hi,

Issue: AdoptOpenJDK build process makes bash calls in a particular way. An 
abbreviated (shorter pathnames) example is:


```
bash-5.0$ /usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" > >(/usr/bin/tee -a /home/aixtools/build.log) 2> >(/usr/bin/tee -a /home/aixtools/build.log >&2)
Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'
```


I believe this is fixed in bash-5.1.


I added some debug statements to try and catch what is not happening. It 
seems that the fifo_list[i].proc value is never being set to (pid_t)-1 so 
any call to `unlink_fifo()` or `unlink_fifo_list()` does not unlink the 
special file created.
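
To check whether special files are being left behind, the leftover FIFOs can be counted after the command finishes. This is a sketch only: `count_leftover_fifos` is a hypothetical helper, and the default directory (`$TMPDIR`, falling back to /tmp) is an assumption about where the sh-np-* files land on a given system.

```shell
#!/usr/bin/env bash
# Sketch: count leftover named pipes matching sh-np-* in a directory.
# count_leftover_fifos is a hypothetical helper, not part of bash.
# The default directory is an assumption; pass the real one as $1.
count_leftover_fifos() {
  local dir=${1:-${TMPDIR:-/tmp}} n=0 f
  for f in "$dir"/sh-np-*; do
    # -p is true only for FIFOs, so regular files and an unmatched
    # glob pattern are both skipped.
    if [ -p "$f" ]; then
      n=$((n + 1))
    fi
  done
  printf '%s\n' "$n"
}

count_leftover_fifos
```

Running it before and after the printf command makes it easy to see whether each invocation leaves new files behind.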


Probably because the process substitution does not exit before the shell does.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/