Hi,
Even after the last patch I can still see random ACATS failures on a
stock Debian etch x86_64 machine (gcc13). I've added many traces to the
ACATS script and I can now see a common pattern, and it's not related
to Ada multi-threading or wrong code generation.
First, the ACATS script itself is relatively straightforward: loop
over the tests, copy some files, call gnatmake, then run the
compiled test and check the output.
The issue comes from a /bin/sh behaviour that is surprising to me.
Could a /bin/sh expert help me figure out the following?
=> gcc/testsuite/ada/acats/run_all.sh
<<
#!/bin/sh
# Run ACATS with the GNU Ada compiler
...
target_gnatmake () {
  echo gnatmake --GCC=\"$GCC\" $gnatflags $gccflags $* -largs $EXTERNAL_OBJECTS --GCC=\"$GCC\"
  gnatmake --GCC="$GCC" $gnatflags $gccflags $* -largs $EXTERNAL_OBJECTS --GCC="$GCC"
}
...
while ...
...
  target_gnatmake $extraflags -I$dir/support $main >> $dir/acats.log 2>&1
  if [ $? -ne 0 ]; then
    display "FAIL: $i"
    failed="${failed}${i} "
    clean_dir
    continue
  fi
  echo "RUN $binmain" >> $dir/acats.log
  cd $dir/run
  ZSTAMP=none
  if [ ! -x $dir/tests/$chapter/$i/$binmain ]; then
    sync
    ZSTAMP=$(date '+%Y%m%dT%H%M%S')
    ls -l $dir/tests/$chapter/$i/ > /home/guerby/tmp/acats/postsync-${i}-${ZSTAMP} 2>&1
    ps fauxwwwww > /home/guerby/tmp/acats/psfauxw1-${i}-${ZSTAMP} 2>&1
  fi
  target_run $dir/tests/$chapter/$i/$binmain > $dir/tests/$chapter/$i/${i}.log 2>&1
...
>>
Now the common fail pattern is as follows:
1/ target_gnatmake succeeds, that is, we don't enter the first "if".
2/ However, even though gnatmake has succeeded, we enter the second "if"
because there's no executable in the dir, as shown by the "ls -l" output:
=> postsync-c48005b-20090813T202815
<<
total 44
-rw-r--r-- 1 guerby guerby 10345 2009-08-13 20:28 b~c48005b.adb
-rw-r--r-- 1 guerby guerby 12375 2009-08-13 20:28 b~c48005b.ads
-rw-r--r-- 1 guerby guerby 2786 2009-08-13 20:28 c48005b.adb
-rw-r--r-- 1 guerby guerby 784 2009-08-13 20:28 c48005b.ali
-rw-r--r-- 1 guerby guerby 12 2009-08-13 20:28 c48005b.lst
-rw-r--r-- 1 guerby guerby 3208 2009-08-13 20:28 c48005b.o
>>
3/ Here is the point I find surprising: the "ps fauxwwwww" run in the
second "if" shows that, even though the script is fully sequential,
at least one gnatmake subprocess (collect-ld) is still marked as running
*in parallel* with the ps command in the subsequent "if" of the script!
=> psfauxw1-c48005b-20090813T202815
<<
...
guerby 7715 1.3 0.0 12176 1936 ? SN 20:20 0:06 \_ /bin/sh /home/guerby/trunk/gcc/testsuite/ada/acats/run_all.sh
guerby 7794 0.0 0.0 10796 2476 ? SN 20:28 0:00 \_ gnatmake --GCC=/home/guerby/build/gcc/xgcc -B/home/guerby/build/gcc/ -gnatws -O2 -I/home/guerby/build/gcc/testsuite/ada/acats/support c48005b.adb -largs --GCC=/home/guerby/build/gcc/xgcc -B/home/guerby/build/gcc/
guerby 7803 0.0 0.0 4048 1228 ? SN 20:28 0:00 | \_ /home/guerby/build/gcc/gnatlink c48005b.ali --GCC=/home/guerby/build/gcc/xgcc -B/home/guerby/build/gcc/
guerby 7809 0.0 0.0 2880 584 ? SN 20:28 0:00 | \_ /home/guerby/build/gcc/xgcc b~c48005b.o ... -o c48005b ...
guerby 7810 0.0 0.0 2756 444 ? SN 20:28 0:00 | \_ /home/guerby/build/gcc/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o c48005b ...
guerby 7811 0.0 0.0 11548 1500 ? RN 20:28 0:00 | \_ /bin/sh /home/guerby/build/gcc/collect-ld --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o c48005b ....
guerby 7808 0.0 0.0 14328 1156 ? RN 20:28 0:00 \_ ps fauxwwwww
...
>>
4/ We run the executable, but since it's not there we get:
=> c48005b.log
<<
/home/guerby/trunk/gcc/testsuite/ada/acats/run_all.sh: line 16: /home/guerby/build/gcc/testsuite/ada/acats/tests/c4/c48005b/c48005b: Permission denied
>>
5/ After the run an empty file appears in another "ls -l"
(not shown in the script above):
-rw-r--r-- 1 guerby guerby 0 2009-08-13 20:28 c48005b
6/ After waiting one more second ("sleep 1", not shown above), the
full file finally appears in "ls -l":
-rwxr-xr-x 1 guerby guerby 1164960 2009-08-13 20:28 c48005b
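
Steps 5/ and 6/ suggest that the executable does eventually show up with
its execute bit set. As a stopgap only, a polling helper along the following
lines could be called before running the test. This is a minimal sketch, not
part of run_all.sh: the name wait_for_executable, its arguments and the
demonstration paths are all hypothetical, and it would only hide the symptom
without explaining why the shell moved on early.

```sh
#!/bin/sh
# Hypothetical helper (not in run_all.sh): poll once per second until the
# file "$1" exists, is executable and is non-empty, for at most "$2" tries
# (default 5). Returns 0 if the file became ready, non-zero otherwise.
wait_for_executable () {
  file=$1
  tries=${2:-5}
  while [ "$tries" -gt 0 ]; do
    if [ -x "$file" ] && [ -s "$file" ]; then
      return 0
    fi
    sleep 1
    tries=$((tries - 1))
  done
  # One last check after the final sleep.
  [ -x "$file" ] && [ -s "$file" ]
}

# Demonstration with made-up paths: a background job creates the program
# one second late, mimicking the delayed appearance seen in steps 5/ and 6/.
demo_dir=$(mktemp -d)
( sleep 1
  printf '#!/bin/sh\nexit 0\n' > "$demo_dir/prog"
  chmod +x "$demo_dir/prog" ) &

wait_for_executable "$demo_dir/prog" 5
found=$?      # 0: the late file was picked up

wait_for_executable "$demo_dir/missing" 1
missing=$?    # non-zero: this file never appears

wait
rm -rf "$demo_dir"
echo "found=$found missing=$missing"
```

In the loop above, such a call would sit just before target_run, with a
FAIL reported on timeout.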
Any idea why /bin/sh is running things in parallel instead of
sequentially?
Could some code in
gnatmake/gnatlink/xgcc/collect2/collect-ld cause it?
gue...@gcc13:~$ /bin/sh --version
GNU bash, version 3.1.17(1)-release (x86_64-pc-linux-gnu)
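
To check the expectation independently of gnatmake, a small standalone
script could reproduce the same call shape (a shell function whose output
is appended to a log, followed by a test of "$?") and verify that the shell
only continues once the function has finished. This is only a sketch under
assumptions: slow_step is a hypothetical stand-in for target_gnatmake, and
the 2-second sleep is arbitrary.

```sh
#!/bin/sh
# Hypothetical stand-in for target_gnatmake: a function that takes a while
# and whose output is redirected by the caller, as in the ACATS loop.
slow_step () {
  echo "start"
  sleep 2
  echo "done"
}

log=$(mktemp)

start=$(date +%s)
slow_step >> "$log" 2>&1
status=$?
end=$(date +%s)

# If the shell really waited, at least 2 seconds have elapsed and the
# last line of the log is the one printed after the sleep.
elapsed=$((end - start))
last_line=$(tail -n 1 "$log")
rm -f "$log"

echo "status=$status elapsed=${elapsed}s last_line=$last_line"
if [ "$elapsed" -lt 2 ] || [ "$last_line" != "done" ]; then
  echo "the shell continued before the command had finished"
fi
```

A sequential shell should never print the final warning; running this in a
loop on the affected machine would show whether the shell alone can exhibit
the behaviour.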
Thanks in advance,
Laurent