Hey guys,
We had a bit of a situation today when our system crashed due to a
problem with an SMSPDSE address space and a looping Co:Z batch job. We
don't know the root cause of the problem yet, but we suspect it may be a
bug with PDSE (there are PTFs for 2.1). Unfortunately, it had Co:Z
fingerprints that I had to explain to my management.
*The symptoms:*
One of our testers ran a job on Friday where they had edited a Co:Z
batch job to turn on tracing and had replaced the
ssh-options=-oStrictHostKeyChecking=no
option with ssh-options=-vvv. The tester knows very little about Co:Z or
ssh and had not set up any key files, so it's a reasonable mistake to
make.
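For comparison, I assume the tester could have kept the host key option
and added the trace flag on the same line, since as far as I understand
ssh-options is passed through to the ssh command line. I have not
verified this against the Co:Z documentation, so treat it as a sketch:

```
ssh-options=-vvv -oStrictHostKeyChecking=no
```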
The batch job was using a password file. The complete command deck is as
follows:
//COZCONF DD DISP=SHR,DSN=SYS3.COZ.SAMPJCL(COZCFGD)
// DD *
ssh-options=-vvv
server-env-PASSWD_DSN=//FUW130.DEVT.INSTALL(VAGPASS)
server-env-SSH_ASKPASS=/usr/local/coz/bin/read_passwd_dsn.sh
server-env-DISPLAY=none
/*
The job output:
/usr/local/coz/bin/read_passwd_dsn.sh prompt: "Please type 'yes' or 'no': "
fromdsn(FUW130.DEVT.INSTALL(VAGPASS))[N]: 1 records/200 bytes read; 8
bytes written in 0 milliseconds.
debug1: read_passphrase: can't open /dev/tty: EDC5128I No such device.
(errno2=0x056201A9)
The ssh_config specified "StrictHostKeyChecking ask", so the job went
into a loop, continuously spawning OMVS processes. It ran for three days,
creating quite a lot of trace data on the spool, which is not a big
problem in itself, but this morning we started to experience abends,
starting with an S04F and then lots of S0D6 abends in the spawned Co:Z
process. I have looked at the dumps and it looks like PDSE became
unstable and couldn't be accessed via a PC call. Because the Co:Z
launcher was looping and each spawned task was abending with a dump, our
system became unstable: the dump address space couldn't keep up and
spilled storage over into ESQA, eventually bringing our system down.
We had to IPL to resolve the problem.
We have sent the original dump to IBM as we suspect the problem is with
PDSE, but my question is: can we tweak the Co:Z launcher so it doesn't
loop when the options are specified as above?
Is it possible to just bail on the first password error?
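To make the question concrete, here is a rough sketch of the kind of
guard I have in mind, done on our side as a wrapper around the askpass
script rather than in the launcher itself. It relies on ssh passing the
prompt text as the first argument (as the job output above shows) and
answers "no" to a host key confirmation instead of feeding it the
password. The function name askpass_guard and the REAL_ASKPASS variable
are made up for illustration and are not part of Co:Z. I have not tested
this on z/OS.

```shell
#!/bin/sh
# Hypothetical SSH_ASKPASS wrapper (not part of Co:Z).
# REAL_ASKPASS is an assumed variable naming the real password helper;
# it defaults to the script already used in the job above.

askpass_guard() {
    prompt=$1
    case "$prompt" in
        *yes/no*|*"type 'yes'"*)
            # Host key confirmation prompt: decline instead of
            # replying with the password, which ssh would reject
            # and ask again.
            echo "no"
            ;;
        *)
            # Anything else is treated as a password prompt and
            # handed to the real helper unchanged.
            "${REAL_ASKPASS:-/usr/local/coz/bin/read_passwd_dsn.sh}" "$prompt"
            ;;
    esac
}

# Run only when ssh invokes the script with a prompt argument.
if [ "$#" -gt 0 ]; then
    askpass_guard "$1"
fi
```

The idea would be to point server-env-SSH_ASKPASS at wherever this
wrapper is saved. With a "no" reply I would expect ssh to fail host key
verification and exit rather than prompt again, but that is an
assumption on my part.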
Regards,
David