Re: S106 abends after copying into LINKLIST

Seymour J Metz Tue, 09 Oct 2018 08:27:55 -0700

The BLDL entry for a load module has included the TTR and length of the first 
text record since Old Man Noach cornered the market in Gopher Wood. Fetch 
chained the read of the text record to another read; the following record could 
never be a text record. There was no need to read the ESD. It seems highly 
unlikely that this has changed.
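
To make the directory shortcut concrete, here is a minimal Python sketch that pulls the first text record's TTR and length out of the user-data part of a load module directory entry. The 3-byte TTR layout (two bytes of relative track, one byte of record number) is standard. The two field offsets are an assumption taken from memory of the PDS2 mapping and should be checked against IHAPDS before being relied on. The names `decode_ttr` and `first_text_record` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TTR:
    track: int   # track number relative to the start of the data set
    record: int  # record number on that track


def decode_ttr(raw: bytes) -> TTR:
    """Decode a 3-byte TTR: two bytes of relative track, one byte of record number."""
    if len(raw) != 3:
        raise ValueError("a TTR is exactly 3 bytes")
    return TTR(track=int.from_bytes(raw[0:2], "big"), record=raw[2])


# ASSUMPTION: offsets within the load module user data, recalled from the
# PDS2 mapping; verify against IHAPDS.
FIRST_TEXT_TTR_OFFSET = 0    # 3-byte TTR of the first text record
FIRST_TEXT_LEN_OFFSET = 13   # 2-byte length of the first text record


def first_text_record(user_data: bytes) -> Tuple[TTR, int]:
    """Return (TTR, length) of the first text record from the user data."""
    ttr = decode_ttr(user_data[FIRST_TEXT_TTR_OFFSET:FIRST_TEXT_TTR_OFFSET + 3])
    raw_len = user_data[FIRST_TEXT_LEN_OFFSET:FIRST_TEXT_LEN_OFFSET + 2]
    return ttr, int.from_bytes(raw_len, "big")
```

With both values in hand, fetch can go straight to the first text record with an exact-length read and never touch the ESD.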

Of course, none of this applies to program objects.

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

________________________________________
From: IBM Mainframe Discussion List <IBM-MAIN@listserv.ua.edu> on behalf of 
John Eells <ee...@us.ibm.com>
Sent: Monday, October 8, 2018 3:44 PM
To: IBM-MAIN@listserv.ua.edu
Subject: Re: S106 abends after copying into LINKLIST

I do not know whether LLA keeps a pointer to the first text record
(though it might), but it would certainly need the preceding associated
control and ESD records to be cached as well if the first read done is
for a text record.  I would expect that, since the ESD and control
records encode their own length, they are read with the SLI bit on in
the CCW, so that incorrect length does not cause any sort of I/O error,
logical or otherwise.  The same goes for RLD records, and it might also
apply to others.

Based on some research I did a long time ago, here is how I believe
things work:

The control record contains a CCW fragment to be used in constructing
the Read CCW for the next text record, unless it's the last.  PCI
processing is used to chain onto the channel program to get the entire
module in one shot unless the system is so busy the PCI can't be
serviced in time to add to the chain and the I/O operation terminates.
In that case, I believe it's restarted where it left off.
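
The chaining described above can be sketched as a toy Python model. It is not the real record format: each record is a plain dict, the control record carries only the length of the next text record (`None` when there is no further text), and PCI timing is ignored entirely. `fetch` and `FetchError` are hypothetical names.

```python
class FetchError(Exception):
    """Stands in for the 'logical I/O error' a failed channel program would cause."""


def fetch(records, first_text_index, first_text_len):
    """Load all text, starting at the first text record known from the directory.

    records: records in on-disk order, each either
      {"kind": "text", "data": bytes} or
      {"kind": "control", "next_text_len": int or None}
    """
    loaded = bytearray()
    index, expected_len = first_text_index, first_text_len
    while expected_len is not None:
        if index + 1 >= len(records):
            raise FetchError(f"no record pair at position {index}")
        text, follower = records[index], records[index + 1]
        # The read is built with an exact length, so any mismatch fails.
        if text["kind"] != "text" or len(text["data"]) != expected_len:
            raise FetchError(f"record {index}: expected {expected_len}-byte text record")
        if follower["kind"] != "control":
            raise FetchError(f"record {index + 1}: expected a control record")
        loaded += text["data"]
        # The control record supplies what is needed to read the next text record.
        expected_len = follower["next_text_len"]
        index += 2
    return bytes(loaded)
```

Each pass through the loop corresponds to one text read chained to one read of the following non-text record.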

The read CCW for the text record should be constructed using the
specific length stored in the control record, and I would not expect the
SLI bit to be used for that CCW.  On that basis, I would agree that if
the first "text" record you read does not have the expected length that
the unexpected status back from the device would likely result in a
"logical I/O error."  However, it's possible that SLI is used for the
read (I have not read the code), and that would make other reasons
(empty track, no record at that location on a track, additional extents,
etc.) more likely culprits for ABEND106-F RC40.  For performance
reasons, though, I would expect that SLI is not set.  This code was
originally written before control unit cache existed and was designed to
be really good at avoiding unnecessary disk latency.  And, of course, we
might change details in the code at any time (though why we would ever
want to is a good question!).
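
For illustration, here is a small Python sketch of what the SLI bit changes. It packs a format-0 CCW (command, 24-bit address, flags, reserved byte, 2-byte count) and applies a deliberately simplified incorrect-length test. The flag values and the CKD Read Data command code are the architected ones. The helpers `format0_ccw` and `incorrect_length` are hypothetical and say nothing about what fetch actually builds.

```python
import struct

READ_DATA = 0x06   # CKD Read Data command code
FLAG_CC = 0x40     # command chaining
FLAG_SLI = 0x20    # suppress incorrect-length indication
FLAG_PCI = 0x08    # program-controlled interruption


def format0_ccw(command, address, flags, count):
    """Pack a format-0 CCW: command(1) address(3) flags(1) reserved(1) count(2)."""
    if not 0 <= address < 1 << 24:
        raise ValueError("format-0 CCW data addresses are 24-bit")
    return struct.pack(">B3sBBH", command, address.to_bytes(3, "big"), flags, 0, count)


def incorrect_length(ccw, actual_record_len):
    """Toy check: would this read be flagged with incorrect length?"""
    _cmd, _addr, flags, _reserved, count = struct.unpack(">B3sBBH", ccw)
    return actual_record_len != count and not (flags & FLAG_SLI)


# Exact-length read, as expected for a text record: a wrong-sized record fails.
strict = format0_ccw(READ_DATA, 0x012000, FLAG_CC, 1024)
# Same read with SLI on: the length mismatch is not reported.
lenient = format0_ccw(READ_DATA, 0x012000, FLAG_CC | FLAG_SLI, 1024)
```

In this model `incorrect_length(strict, 2048)` is true and `incorrect_length(lenient, 2048)` is false, which is the distinction that decides whether a wrong-length record at a stale TTR ends the channel program.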

The text records themselves are of variable length.  They have a minimum
length of 1024 bytes, and a maximum length of the track length or block
size, whichever is smaller.  The Binder (and COPYMOD) try to write the
minimum possible number based on those limits.  They issue TRKBAL to
find out how much space is left on the track, and write records on
following tracks as needed to finish writing a load module.  (This is
why 32760 is the best block size for load libraries.)

Because the directory pointers to PDS members are TTR pointers, load
modules generally do not start on a new track.  This
means that large block sizes rarely if ever result in uniform text
record lengths.  They do result in fewer text records if the modules'
lengths exceed a lower block size, though.
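
The sizing rule can be sketched in Python as a rough model. It ignores all per-record overhead on the track (count fields, gaps), so the numbers are illustrative only. The function name, its parameters, and the figures used in the examples that follow are assumptions, not Binder internals.

```python
MIN_TEXT = 1024  # minimum text record length mentioned above


def text_record_lengths(module_len, blksize, first_track_balance, track_capacity):
    """Return the text record lengths written for one module in this toy model.

    first_track_balance plays the role of what TRKBAL reports for the track
    the module starts on; later tracks start with track_capacity free bytes.
    """
    if blksize < MIN_TEXT or track_capacity < MIN_TEXT:
        raise ValueError("blksize and track_capacity must be at least MIN_TEXT")
    lengths = []
    remaining = module_len
    balance = first_track_balance
    while remaining > 0:
        limit = min(balance, blksize)
        if limit < MIN_TEXT and limit < remaining:
            # Too little room left for a worthwhile record: go to the next track.
            balance = track_capacity
            continue
        size = min(remaining, limit)
        lengths.append(size)
        remaining -= size
        balance -= size
    return lengths
```

For a 70000-byte module starting with 20000 bytes left on its first track and an assumed 56664-byte track, a block size of 32760 gives three records of 20000, 32760 and 17240 bytes in this model. A block size of 6144 gives many more records, and they are still not all the same length.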

All the above applies to load modules.  I have no idea how this works
under the covers for program objects, but Program Management Advanced
Facilities documents load module records.

Just some random additional info to reinforce the "except under narrow
circumstances, with sufficient advance reflection, and malice--er, risk
acceptance-aforethought, don't update running systems' data sets" others
have already expressed.

Michael Stein wrote:
<snip>
>
> It's been a while but from what I remember about program fetch
> here's a guess.
>
> Looking up S106 RC 0F reason code 40:
>
>     either an uncorrectable I/O error occurred or an error in the
>     load module caused the channel program to fail.
>
> Well, let's assume the hardware is working, so this isn't a "real" I/O
> error caused by some hardware problem.  And there are no dataset
> extent changes, only the overwriting of the dataset to empty it
> out and then copying in the new modules.
>
> Well the EOF pointer for the dataset got moved toward the front after
> the directory.  This caused the new modules to be written starting at
> the new EOF over the old modules.
>
> And LLA still has the directory entries for the old modules, not the new
> ones.  These now point into the new modules.  LLA's information includes
> specific information on the first block of text of each old module:
>
>    - the TTR of the first block of text
>    - the length of the first block of text
>    - the linkage editor assigned origin of first block of text
>
> This allows program fetch to start with reading first text block,
> rather than having to start at the beginning of the module.   Fetch can
> build a CCW to directly read the first block since it knows the TTR of
> the block and its length and also the storage address (storage area +
> block origin).
>
> Since the old modules were overwritten, it's certain that the block at
> the old location isn't the expected one.  There might not be a block there
> giving no record found, there might be an EOF, or there might be a
> different-length block, causing fetch's channel program to end with
> incorrect length.
>
> This would explain the S106 RC 0F reason code 40.
>
> This isn't that bad.  The length of the wrong block/module might
> have matched.  I wonder if program fetch could successfully load the
> wrong module.
>
> Now with a blocksize of 32760, possibly each module will fit in one block
> and they likely have different sizes so this wrong module case might
> be unlikely.  Or something else might prevent loading the wrong module
> (what?)  Or it may be possible to have a successful program fetch with
> the wrong module.  And then attempt to execute it with the parameters
> and environment of the old module.
>
> What would that cause?  Program checks?  Mangled data?
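
The stale-pointer outcomes listed above can be played out in a toy Python model. The data set is a dict of tracks, each holding a list of records, with a zero-length record standing in for an EOF mark. `read_at` is a hypothetical helper, and the result strings are labels for this sketch rather than real status codes.

```python
def read_at(tracks, ttr, expected_len):
    """Try an exact-length read at a (track, record) pointer.

    tracks: dict mapping relative track number -> list of records (bytes);
            a zero-length record models an EOF mark.
    """
    track, record = ttr
    records = tracks.get(track, [])
    if not 1 <= record <= len(records):
        return "no record found"
    data = records[record - 1]
    if len(data) == 0:
        return "end of file"
    if len(data) != expected_len:
        return "incorrect length"
    return "read ok"


# The data set after it was emptied and refilled: two new records, then EOF.
new_contents = {0: [b"x" * 4096, b"y" * 1024, b""]}

# Stale (TTR, length) pairs still cached from the old directory.
stale_entries = {
    "MODA": ((0, 2), 2048),   # a record is there, but with a different length
    "MODB": ((0, 3), 3000),   # lands on the new EOF mark
    "MODC": ((0, 4), 512),    # nothing at that location any more
    "MODD": ((0, 1), 4096),   # length happens to match: wrong module gets read
}

results = {name: read_at(new_contents, ttr, length)
           for name, (ttr, length) in stale_entries.items()}
```

Only the last case gets past the read, and that is the worrying one: nothing in this model distinguishes the new record from the one the stale entry was built for.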

--
John Eells
IBM Poughkeepsie
ee...@us.ibm.com

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
