>From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of 
>Hunkeler Peter (KIUK 3)
Sent: Wednesday, March 11, 2009 7:28 AM
To: IBM-MAIN@bama.ua.edu
Subject: How much does GTF trace impact system throughput? 

>So far, in all the cases where I had to run GTF to gather 
debugging information, I could force the problem to occur
and therefore GTF was only run for a short period of time
and the impact on system throughput was not questioned.

>This time, we've got a problem with two of our printers that
only occurs sporadically. I'm planning to run GTF continuously,
one instance per printer, e.g. PSF, until the problem occurs. 
Each GTF will be limited to collect trace data for a single PSF 
instance.

>I'm trying to figure out the impact this will have on overall 
system throughput. How much is GTF "eating". Any other caveats?

>Peter Hunkeler
CREDIT SUISSE


I haven't seen any recent, precise figures on GTF overhead.  About 20 years ago 
Dave Halbig ran several experiments to measure GTF's overhead when tracing 
various combinations of events, and he presented the experiments and results in 
a paper at either SHARE or CMG.  One of his conclusions was that the overhead 
is acceptable if the problem that requires GTF is serious enough.  Another was 
that the more events you trace, the greater the overhead, so filter as much as 
possible to reduce the number of events traced.  A third was that the more 
frequently the traceable events occur, the greater the overhead.  All three 
conclusions are intuitively obvious.  Bottom line:  your mileage may vary.  Use 
GTF when you must, and set it up so that the fewest possible events are traced.

In your specific case, for example, trace ONLY the two printers' device numbers 
rather than all I/O devices.  Filter by jobname as well if you can.  If you 
only need the I/O interrupt trace records, do not also trace SSCH or any other 
I/O-related events (CSCH, HSCH, MSCH, PCI, etc.).  Also trace only as many CCWs, 
and as many bytes transferred per CCW, as you think you will need to diagnose 
the problem.

The GTF hook for the event class will fire regardless of how finely you are 
filtering.  If you are tracing I/O interrupts for only two printers' device 
numbers, then the hook will fire for every I/O interrupt.  The hook causes a 
program interrupt, which takes you into Program Interrupt FLIH, which takes you 
into a GTF module which applies all your filtering criteria, then exits most of 
the time back to the interrupted code (which was a disabled I/O interrupt 
handling module in IOS).  Once in a while a trace record will be generated.

Once you have paid the cost of getting through the filter, the code that 
actually writes a trace record probably adds little on top, because at least 
some of the filtering logic runs for every I/O interrupt regardless of device 
number or device class.  So you incur some overhead on every I/O interrupt, 
even though most interrupts will not be traced.  The filtering may add 100 to 
200 instructions to the path length of processing an I/O interrupt.  That path 
is several thousand instructions without GTF, so another 100 to 200 is not a 
large percentage increase.  But how many thousands of I/O interrupts does your 
system handle per second?
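
To put rough numbers on that question, here is a back-of-the-envelope 
calculation in Python.  Only the 100 to 200 extra instructions and the 
"several thousand" base path length come from the estimate above; the 
interrupt rate and the processor speed are invented inputs that you would 
replace with your own measurements.  With these made-up inputs the filtering 
costs 1.5 million instructions per second, about 0.3% of the assumed 
processor, and lengthens the interrupt path by 5%.

```python
def gtf_filter_overhead(interrupts_per_sec: float,
                        extra_instr_per_interrupt: float,
                        base_instr_per_interrupt: float,
                        cpu_instr_per_sec: float):
    """Estimate the cost of running the GTF filter on every I/O interrupt.

    Returns a tuple of:
      extra instructions executed per second,
      fraction of one processor consumed by the filtering,
      relative increase in the I/O interrupt path length.
    """
    extra_per_sec = interrupts_per_sec * extra_instr_per_interrupt
    cpu_fraction = extra_per_sec / cpu_instr_per_sec
    path_increase = extra_instr_per_interrupt / base_instr_per_interrupt
    return extra_per_sec, cpu_fraction, path_increase


if __name__ == "__main__":
    extra, cpu, path = gtf_filter_overhead(
        interrupts_per_sec=10_000,        # assumption: measure on your system
        extra_instr_per_interrupt=150,    # midpoint of the 100 to 200 estimate
        base_instr_per_interrupt=3_000,   # assumption for "several thousand"
        cpu_instr_per_sec=500_000_000,    # assumption: hypothetical processor
    )
    print(f"extra instructions/sec: {extra:,.0f}")
    print(f"share of one processor: {cpu:.2%}")
    print(f"path length increase:   {path:.1%}")
```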

I think you will have a lower total overhead if you have only one instance of 
GTF that is defined to trace events from either of the two printers involved.

Your mileage may vary.

Bill Fairchild
Rocket Software

