Re: SVCDUMP capture phase statistics.....

Jim Mulder Thu, 10 Apr 2008 14:31:52 -0700

IBM Mainframe Discussion List <IBM-MAIN@BAMA.UA.EDU> wrote on 04/10/2008 
02:05:33 AM:


> ..and how to interpret them.
> 
> Yesterday connect:direct took another of those abend0c4 that 
> Sterling always tells us 'they're all fixed'. They're all from 
> ISTAICPT in SRB mode... And of course they always occur in 
> production where there is extremely high load (both CPU and workload)
> 
> The problem was that it took a full minute between IST413I VTAM
> DUMPING FOR JOB NDM and IEA794I SVC DUMP HAS CAPTURED, with 28 seconds
> of system-wide non-dispatchability due to Q=YES. This causes TCPIP
> to get 'adjacency failures' and to drop lots of MQ channel 
> connections, which has a major impact on customers connected to us. 
> Which means a lot of management attention.
> 
> The dump statistics tell me this:
> Total dump capture time           00:00:57.956068 
> System nondispatchability start   04/09/2008 15:04:43.405987 
> System set nondispatchable        04/09/2008 15:04:43.406106 
> Global storage start              04/09/2008 15:04:43.053199 
> Global storage end                04/09/2008 15:04:48.466431 
> Global storage capture time       00:00:05.413231 
> System reset dispatchable         04/09/2008 15:05:12.204912 
> System was nondispatchable        00:00:28.798924
> 
> Asid 016A (NDM): 
>   Local storage start             04/09/2008 15:05:12.204988 
>   Local storage end               09/18/2042 01:53:47.370496  <-- 
>   Local storage capture time      10:48:35.165507
> 
>   Tasks reset dispatchable        04/09/2008 15:05:39.356416 
>   Tasks were nondispatchable      00:00:27.151414
> 
> Exit address                    04353880 
>   Home ASID                       0005     DUMPSRV 
>   Exit time                       00:00:20.810908 
>   Exit attributes:  Global, Sdump, SYSMDUMP
> 
> I've got three questions here:
> 1. Why is there this interesting time stamp that says the dumps will
> be finished in 2042?

  That's just how a timestamp of all zeros gets converted by
BLSUXTOD.  I kind of hacked that IEAVTSFS formatter together in a
hurry one evening when I really wanted to look at some of the 
SDUMP statistics, but didn't want to decipher it by hand.
So I didn't think at the time to check for zeros before
converting it.  The more interesting question would be why
the local storage end timestamp apparently didn't get stored. 
This is probably because of the partial dump reason code that 
says only fixed storage was dumped for the address space.  This
happens when SDUMP detects that the dump task in the address 
space never got started (after 25 seconds), so it instead
dumps the fixed frames of the address space, accessing them
via their real storage addresses.  Apparently this processing 
doesn't set the Local Storage End timestamp (that can be corrected
in a future release).  So the question would be, why didn't
the NDM dump task get going for 25 seconds (that should be 25 
seconds after resetting system nondispatchability, I think).
Since NDM's LSQA should be in the dump, possibly 
SUMM FOR ASID(x'16A') might have some clues.  Is the dump task
still in a wait?  Is the ECB POSTed? 
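
On the 2042 date itself, here is a minimal Python sketch of a TOD clock
conversion. It assumes the standard TOD format (epoch 1900-01-01 00:00
UTC, bit 51 ticking once per microsecond, leap seconds ignored). It is
not the actual BLSUXTOD or IEAVTSFS logic, and the function names are
made up for illustration. The 64-bit TOD clock wraps at 2**64 units,
which is 2042-09-17 23:53:47.370496 UTC. The value in the posted
statistics is exactly two hours later, which would be consistent with a
local time offset being applied, though that part is only an inference.
The second function shows the kind of zero check mentioned above.

```python
from datetime import datetime, timedelta

# TOD clock epoch (UTC, leap seconds ignored in this sketch)
TOD_EPOCH = datetime(1900, 1, 1)


def tod_to_datetime(tod: int) -> datetime:
    """Convert a 64-bit TOD clock value to a naive UTC datetime.

    Bit 51 of the TOD clock ticks once per microsecond, so shifting
    right by 12 bits yields microseconds since the epoch.
    """
    return TOD_EPOCH + timedelta(microseconds=tod >> 12)


def format_tod(tod: int) -> str:
    """Hypothetical formatter that checks for zero before converting."""
    if tod == 0:
        return "(timestamp not stored)"
    return tod_to_datetime(tod).strftime("%m/%d/%Y %H:%M:%S.%f")


# Wrap point of the 64-bit TOD clock
print(tod_to_datetime(2**64))  # 2042-09-17 23:53:47.370496
# A zero timestamp is reported as missing instead of being converted
print(format_tod(0))           # (timestamp not stored)
```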

> 2. Global capture phase was a mere 5 seconds, why did it take 24 
> seconds after global capture was finished for the system to become 
> dispatchable again?

  System nondispatchability is not reset until the global 
exits complete.

> 3. What the heck took DUMPSRV 20 seconds in the exit?

  You are probably running in GRS Star mode, possibly with the 
option that requests all data from all systems for SDATA=GRSQ. 
Does your GRSCNFxx specify GRSQ(ALL)? 
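
The arithmetic behind answers 2 and 3 can be checked against the
statistics quoted at the top. A small Python sketch, with the timestamps
copied from the post and variable names chosen only for illustration:

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%Y %H:%M:%S.%f"


def ts(text: str) -> datetime:
    """Parse a timestamp in the format used by the dump statistics."""
    return datetime.strptime(text, FMT)


# Values copied from the posted SDUMP statistics
nondisp_start = ts("04/09/2008 15:04:43.405987")  # System nondispatchability start
global_end = ts("04/09/2008 15:04:48.466431")     # Global storage end
reset_disp = ts("04/09/2008 15:05:12.204912")     # System reset dispatchable
exit_time = timedelta(seconds=20, microseconds=810908)  # DUMPSRV exit time

# Time between the end of global capture and the system being reset
gap = reset_disp - global_end

print("Global storage end -> reset dispatchable:", gap)
print("Spent in the DUMPSRV global exit:        ", exit_time)
print("Remainder:                               ", gap - exit_time)
print("System nondispatchable in total:         ", reset_disp - nondisp_start)
```

By this calculation about 20.8 of the 23.7 seconds between the end of
global storage capture and the reset of system nondispatchability are
spent in that one global exit.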

> 
> ==> FLAGS SET IN SDUSDATA: Dump all PSAs, current PSA, nucleus SQA, LSQA,
> rgn-private area, LPA mod. for rgn, trace, CSA, SWA, summary dump
> ==> FLAGS SET IN SDUFLAG2: SUBPLST, KEYLIST, LISTD
> ==> FLAGS SET IN SDUCNTL1: SRB
> ==> FLAGS SET IN SDUTYP1: FAILRC
> ==> FLAGS SET IN SDUEXIT: GRSQ, MASTER trace, SMSX, XESDATA, IOS, RSM, OE
> ==> FLAGS SET IN SDUSDAT3: IO
> 
> The dump is 8929 trks big and was partial, MAXSPACE is 1500M, 6 
> logical cps, 8.7G real.
> partial dump reason codes:
> During dump processing of local storage, the system issued a PURGEDQ
> because a hung address space was detected. This will result in the 
> loss of some storage related to the address space.
> During dump processing of a possibly hung address space, dump 
> processing obtained only fixed storage for the address space
> 
> NDM runs in a discretionary SC, VTAM in SYSSTC.
> 
> Any idea what's going on? (I am hoping to get a faster answer/ideas 
> what to change here than by opening an ETR with IBM, especially as 
> this may be some sort of tuning problem, except for the 2042 time
> stamp.)


Jim Mulder   z/OS System Test   IBM Corp.  Poughkeepsie,  NY

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
