Re: AW: Re: Follow-on: S0C4-11 abend caused by BASSM to address with all X'00' bytes

nitz-ibm Wed, 27 Jul 2016 00:22:19 -0700

> >I don't know how LE plays that game. A SLIP of the 0C4 would have true 
> >contents. 
> System Trace would also help with this information but it is not in the dump 
> produced by the "little helpers". Setting a SLIP in production is a not so 
> short adventure trip; and SLIPping for a 0C4 is not trivial as long as I 
> don't have more information what actually happens. 
> I have asked that the job be changed to switch off the 'little helpers' and a 
> SYSMDUMP DD be added. Hopefully I will have more information next time it 
> fails.


Just my 2 cents: As long as TRAP(ON,SPIE) is set (and you said you cannot turn 
it off for various reasons) slip will not get control because (E)SPIE gets 
control before slip. So your slip trap probably won't match at all. And your 
sysmdump may contain a dump, but not the first 0C4. I speak from three years of 
dealing with LE in the picture (lucky for me, it was only LE). If it were me, I 
would a) try to find out if that dump option thing is customizable and, if not, 
kick the vendor in the behind - very hard - for disabling installation-set dump 
options, and b) I would try to figure out where that bit is set and zap it off 
to get the full set of dump options that I have defined (everything except the 
IBM-software-support-all-time-favourite of ALLNUC which is unnecessary for most 
problems anyway, but gets copied every time a slip dump is requested).

>Machine State:
> ILC..... 0002    Interruption Code..... 0004
That's the ZMCH and that is what FLIH recorded. It gets copied early in LE 
processing and is the first problem that occurred. 

>RTPSW1... 478D0400  A31A7BB8
>RTPSW2... 00020011  231A7800
>What can I learn from this? How do I properly use these fields in dump 
>analysis?

There's your PIC 11 in RTPSW2. So following the first 0C4-4 (see ZMCH in LE) 
you got (at least one) subsequent 0C4-11. If both fields are set, it means that 
while RTM was still dealing with the first problem, a second problem occurred. 
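
To make the reading of those two fields concrete, here is a minimal Python 
sketch that splits the hex values quoted above. It rests on assumptions: that 
RTPSW1 is an 8-byte ESA/390-style PSW (key in bits 8-11, problem state in bit 
15, AMODE bit in front of a 31-bit address), and that the first word of RTPSW2 
holds the ILC in its upper halfword and the interruption code in its lower 
halfword, which is how the PIC 11 was read off above. The function names are 
made up for illustration; the second word of RTPSW2 is returned undecoded.

```python
def decode_short_psw(word1_hex, word2_hex):
    """Split an 8-byte PSW (assumed ESA/390 short format) into a few fields."""
    w1 = int(word1_hex, 16)
    w2 = int(word2_hex, 16)
    return {
        "key": (w1 >> 20) & 0xF,                  # PSW bits 8-11
        "problem_state": bool((w1 >> 16) & 0x1),  # PSW bit 15
        "amode31": bool(w2 >> 31),                # PSW bit 32
        "address": w2 & 0x7FFFFFFF,               # PSW bits 33-63
    }


def decode_rtpsw2(word1_hex, word2_hex):
    """Split RTPSW2, assuming ILC / interruption code in the first word."""
    w1 = int(word1_hex, 16)
    return {
        "ilc": (w1 >> 16) & 0xFFFF,
        "interruption_code": w1 & 0xFFFF,
        "second_word": int(word2_hex, 16),        # left undecoded here
    }


if __name__ == "__main__":
    psw = decode_short_psw("478D0400", "A31A7BB8")
    print("key", psw["key"], "problem state", psw["problem_state"],
          "amode31", psw["amode31"], "address %08X" % psw["address"])

    pgm = decode_rtpsw2("00020011", "231A7800")
    print("ILC %d, interruption code %04X" % (pgm["ilc"],
                                               pgm["interruption_code"]))
```

Under those assumptions, RTPSW1 decodes to a key 8, problem state, AMODE 31 PSW 
with instruction address 231A7BB8, and RTPSW2 carries ILC 2 with interruption 
code 0011.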

I think that the fields in the XSB may have also been reused by the later 
problem, which means at the time of the dump things are definitely not the way 
they were at the time of the first problem anymore (they never are when LE has 
a chance to get at something first). I'm sure that Jim will correct me if I 
said something wrong here.

If it were me, I would try to find out what address is supposed to be at 
r7+x'90'. Assuming that DCA$DCT is a vendor control block and not one belonging 
to JES2. To do that, take a slip dump of the same program execution in test 
(the equivalent of address 24D90BE0 in your code snippet), find the same 
control block in it using the eyecatcher, look at that storage and see if the 
addresses look similar to what you have in the error case. If they do, then 
find out where the address at x'90' points to. Maybe that will give a clue. 
Another option would be getmain/freemain trace (if you can set that up in 
production) presumably for SP0, KEY8. 

Another idea - get yourself IPCS access to private storage in other address 
spaces (a FACILITY class profile) and while the job is running, look at the 
same control block - I am betting that it will always be at offset DC20. In 
IPCS active storage (using the asid) you can then use the pointer command (?) 
to see where offset x'90' points to. Maybe it is getmained then!

Not sure that you mentioned it - is the problem reproducible at will?

Barbara

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
