Hi all,
I seem to have run into another unpleasant behavior of CTI-TET. While
working on better handling of potential mktemp(1) failures, I also
wanted to remove already created temporary files upon abnormal test
case/purpose termination so they are not left lying around, e.g. when
the user sends SIGTERM to the run_test script.
Browsing through the code of STF-based test suites, I often see the
following lines (ksh):
# Make sure we cleanup
trap cleanup 1 2 15
The function cleanup() will then do whatever cleanup it can do upon
receiving one of the specified signals.
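
For illustration, here is a minimal sketch of that pattern combined with a mktemp(1) failure check. It is not taken from any STF suite; the TMPFILE variable, the /tmp/tp_demo template and the body of cleanup() are made up for the example.

```shell
#!/bin/sh
# Hypothetical sketch: TMPFILE, the template name and this cleanup()
# body are illustrative only, not code from an STF test suite.
TMPFILE=

cleanup() {
    # Remove the temporary file only if mktemp actually created one.
    [ -n "$TMPFILE" ] && rm -f "$TMPFILE"
    TMPFILE=
}

# Make sure we cleanup on SIGHUP (1), SIGINT (2) and SIGTERM (15),
# then exit with a non-zero status.
trap 'cleanup; exit 1' 1 2 15

# Bail out early if mktemp(1) fails instead of continuing with an
# empty file name.
TMPFILE=$(mktemp /tmp/tp_demo.XXXXXX) || {
    echo "mktemp failed" >&2
    exit 1
}
```

Note that a POSIX shell runs the trap action only after the current foreground command has completed, so a handler like this does not fire while the script is still sitting in a long sleep.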
I did a little experiment with a test case that consists of multiple
test purposes and hit Ctrl-C in the middle of the test run. To my
surprise there were no leftover temporary files. So I modified one of
the test purposes to create a temporary file, sleep for a long period of
time, then remove the temporary file and exit. This is how it went:
1. we are happily running:
18269 -sh
18273 bash
24075 /bin/bash
1071 /usr/bin/ksh /opt/SUNWstc-tetlite/contrib/ctitools/bin/run_test nc
1168 tcc -p -e -a /var/tmp/testzone_1071/CONFIG -j /var/tmp/results.10
1169 /usr/bin/ksh -p tc_hflag all
1193 /usr/bin/ksh -p tc_hflag all
1196 /usr/bin/ksh93 /bin/sleep 30
2. here I hit Ctrl-C:
18269 -sh
18273 bash
24075 /bin/bash
1071 /usr/bin/ksh /opt/SUNWstc-tetlite/contrib/ctitools/bin/run_test nc
1168 tcc -p -e -a /var/tmp/testzone_1071/CONFIG -j /var/tmp/results.10
1169 <defunct>
1193 /usr/bin/ksh -p tc_hflag all
1196 /usr/bin/ksh93 /bin/sleep 30
3. run_test has exited but the test purpose is still running:
1193 /usr/bin/ksh -p tc_hflag all
1196 /usr/bin/ksh93 /bin/sleep 30
The test purpose carried on and removed the temporary file at the end.
This means that one cannot be sure all test case processing is really
done once run_test has finished, which is a bit shocking to me. Running
run_test again while a test purpose is still running could easily
produce unpredictable behavior and confusion with more complicated test
suites.
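
The effect can be reproduced outside of CTI-TET. The sketch below is generic shell, not CTI-TET code, and all names in it (pidfile, parent, grandchild, the /tmp/orphan_demo template) are made up for the demonstration. It starts a shell that launches a background sleep, sends SIGTERM to that shell only, and then checks whether the sleep is still around.

```shell
#!/bin/sh
# Hypothetical demonstration; nothing here comes from CTI-TET itself.
pidfile=$(mktemp /tmp/orphan_demo.XXXXXX) || exit 1

# The "parent" shell starts a long-running child of its own (our
# grandchild), records its PID and then waits for it.
sh -c 'sleep 30 & echo $! > "$1"; wait' sh "$pidfile" &
parent=$!

# Wait until the grandchild PID has been written.
while [ ! -s "$pidfile" ]; do sleep 1; done
grandchild=$(cat "$pidfile")

# Terminate the parent only, the way run_test does with tcc.
kill -TERM "$parent"
wait "$parent" 2>/dev/null || true

# kill -0 sends no signal, it only checks that the process exists.
if kill -0 "$grandchild" 2>/dev/null; then
    survived=yes
else
    survived=no
fi
echo "grandchild $grandchild survived: $survived"

# Tidy up after the demonstration.
kill -TERM "$grandchild" 2>/dev/null
rm -f "$pidfile"
```

A signal sent to a single PID is delivered to that process alone, so unless the receiver passes it on to its children, they keep running after it is gone.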
Looking into run_test.ksh, SIGTERM is handled by the
stcnv-clone/usr/src/tools/tet/contrib/ctitools/src/utils/run_test.ksh:cleanup()
function, which sends SIGTERM to the tcc program:
if [[ ! -z $TCC_PID ]]
then
	echo kill -TERM $TCC_PID
	kill -TERM $TCC_PID
	wait $TCC_PID
	TCC_PID=
fi
However, tcc does not seem to do anything about its children after
receiving the signal. This is its SIGTERM handler,
stcnv-clone/usr/src/tools/tet/src/tet3/tcc/sigtrap.c:initial_sigtrap():
static void initial_sigtrap(sig)
int sig;
{
	static char text[] = "TCC shutdown on signal";

	if (jnl_usable())
		(void) fprintf(stderr, "%s %d\n", text, sig);

	fatal(0, text, tet_i2a(sig));
}
This means that my effort to make the test suite more robust against
abnormal termination is probably futile, since fatal() seems to be
only a slightly better version of exit(). Has anyone had a different
experience?
v.