Re: FICON Saturation?
Thanks for that Bill, it's certainly filled in a few gaps in my understanding of PAVs, but I'm still not convinced that is what caused this sort to run for 6 times its normal elapsed time. I'm seeing many other jobs with the same symptoms.

Here's the device activity report:

Time     S  --- Activity ---   act  con  disc  pend
            rate  respt  iosq    %    %     %     %
22.40.00 S  38.7   .026  .024    3    2     0     1
22.38.20 S  38.0   .023  .021    2    2     0     1
22.36.40 S  39.8   .024  .022    3    2     0     1
22.35.00 S  22.1   .046  .044    1    1     0     0
22.33.20 S  21.5   .047  .045    1    1     0     0
22.31.40 S  13.3   .075  .073    1    1     0     0
22.30.00 S  18.8   .052  .050    1    1     0     0
22.28.20 S  14.0   .072  .070    1    1     0     0
22.26.40 S  12.0   .084  .082    1    1     0     0
22.25.00 S  12.6   .081  .079    1    1     0     0
22.23.20 S  13.4   .072  .070    1    1     0     0
22.21.40 S  12.7   .076  .074    1    1     0     0

Can I confirm that with a RESPT of .026 and an IOSQT of .024 the job was queuing for 95% (ish) of the time? Or am I misreading that?

As I understood it, in the ESCON days CHANNEL PATH BUSY was reported in PEND time. Now that FICON never presents CHANNEL BUSY, if the channel has run out of resources (OEs/credits), this shows up in IOSQ time instead. I'd love someone to tell me I'm wrong, as that would rule out my current suspicions.

Could anyone point me to where/how RMF can report FICON metrics, or how to spot the signs of saturation in device reports? If I can rule this out, I can focus on other issues. Failing that, does anyone know how to set up ANALFIOE in MXGSAS? I asked our performance team and was told that they don't 'DO' DASD.

Thanks for any help.

Joe

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
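As a rough check on the "95% (ish)" question in the message above, the IOSQ share of response time can be worked out per interval. This is only a sketch: the figures are transcribed from the quoted report, and reading each value such as ".0243" as iosq = .024 followed by act% = 3 is an assumption (it matches the ".026 / .024" pair Joe quotes). The names `intervals` and `iosq_share` are made up for illustration.

```python
# Interval data transcribed from the device activity report in the message:
# (time, I/O rate, response time, IOSQ time).
# ASSUMPTION: a run-together value like ".0243" is iosq=.024 then act%=3.
intervals = [
    ("22.40.00", 38.7, 0.026, 0.024),
    ("22.38.20", 38.0, 0.023, 0.021),
    ("22.36.40", 39.8, 0.024, 0.022),
    ("22.35.00", 22.1, 0.046, 0.044),
    ("22.33.20", 21.5, 0.047, 0.045),
    ("22.31.40", 13.3, 0.075, 0.073),
    ("22.30.00", 18.8, 0.052, 0.050),
    ("22.28.20", 14.0, 0.072, 0.070),
    ("22.26.40", 12.0, 0.084, 0.082),
    ("22.25.00", 12.6, 0.081, 0.079),
    ("22.23.20", 13.4, 0.072, 0.070),
    ("22.21.40", 12.7, 0.076, 0.074),
]


def iosq_share(respt, iosq):
    """Fraction of device response time spent in IOSQ (0.0 if respt is zero)."""
    return iosq / respt if respt else 0.0


# Map each interval's timestamp to its IOSQ share of response time.
shares = {t: iosq_share(respt, iosq) for t, _rate, respt, iosq in intervals}

for t, share in shares.items():
    print(f"{t}  IOSQ = {share:.0%} of response time")
```

Under that reading, IOSQ accounts for roughly 91% to 98% of response time in every interval, and the remainder (respt minus iosq) is .002 throughout. So almost all of the reported delay is in whatever IOSQ is measuring, regardless of the I/O rate.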
Re: FICON Saturation?
Joe,

In FICON channel environments, if the channel has run out of resources (OEs/credits), this is not shown in IOSQ time. It is recorded as PENDING time.

As you certainly know, IOSQ time is time spent inside the operating system, queued on the UCB. During this state, WLM and IOS control I/O priority, and PAV can reduce IOSQ delay.

In an XRC environment, high IOSQ time for a primary volume may signify LONG BUSY; if the long busy is at the LCU level, all volumes in that LCU can show long IOSQ times, even volumes with a low access rate.

You can get some metrics on FICON utilization by analyzing the port status for the DASD subsystem in RMF.

Hope this helps.

Paolo Cacciari
Business Continuity and Recovery Services, IBM Global Services - South Region, EMEA
Via Darwin 85, 20019 Settimo Milanese (MI) – Italy - MISET001
The goal is to be prepared for a disaster, not to continually plan for a successful test
[EMAIL PROTECTED] / +39 051 41.36799 / Mobile: +39 335 6287584 / +39 02 596.23288
Re: FICON Saturation?
Many thanks Paolo, I'll certainly check out the RMF bits and revisit ANTMON.

With XRC in mind, could anyone point me to a description of LONGBUSY? (I've only been around XRC for 10 weeks.) I did look at the ANTMON figures for the related secondary volume and saw very low consistency delays and no appearances in the top 25 volume list. LONGBUSY looked to be 0 (but as I don't understand LONGBUSY, I may have misread that as well).

Regards,
Joe
Re: FICON Saturation?
-- snip --
I'm seeing a DFSORT that normally runs in 6 minutes take 33 minutes with the same record counts. It's accessing 1 disk that has an extremely low activity rate, response times not that high, IOSQ time appears to be 95% of resp time and device delays of 99 - 100%. One VSAM cluster is being
-- snip --

A high IOSQ doesn't sound good. Does this device have any PAVs assigned to it (look at RMF)? Are there any other tasks that are also using this volume? Is it possibly a duplexed volume in XRC that is being put on long busy due to high I/O activity in general (ANT... messages in SYSLOG)?

RMF will also show you FICON usage if you want to eliminate that as a possible cause.

John
Re: FICON Saturation?
John,

I've already checked the device on ANTMON and checked the ANTX messages. Nothing. The device is online to 7 systems and had zero activity on the other 6. Its highest activity on the system in question was only 38, so it certainly wasn't a stressed volume.

I didn't think the PAV would help as it was to the same dataset? No other jobs/datasets active on that volume.

Where can I see FICON channel stuff on RMF? That would be a tremendous help. I thought RMF didn't give any useful metrics for FICON. %utilised doesn't mean squat with FICON? I've been wrong so many times before though.

JJ
Re: FICON Saturation?
In a message dated 11/8/2006 8:51:48 A.M. Central Standard Time, [EMAIL PROTECTED] writes:

I didn't think the PAV would help as it was to the same dataset? No other jobs/datasets active on that volume.

PAV can indeed help even if all I/O is to only one data set. With PAV, you can now have any number of simultaneous I/Os going on within a single data set without any mutual performance impact, as long as all of the following are true:

(1) all software which generates channel programs to go against that data set has been modernized to produce Define Extent limits that are as small as possible (preferably a single track);
(2) the workload involved with that data set does not cause many writes into the same track or track ranges that are involved with other, simultaneous, read-only I/Os;
(3) there are enough channel paths into the controller and enough internal paths within the controller to allow that level of simultaneity;
(4) most of the I/Os use only cache storage within the controller and do not directly involve the device (meaning cache hits, cache fast writes, and/or DASD fast writes).

Bill Fairchild
Re: FICON Saturation?
More on minimizing the range of tracks in the Define Extent parameters:

Until the advent of PAV on IBM 2105 (ESS) controllers, there was no demonstrable incentive to code the logic necessary to change the Define Extent parameter on every I/O so that only the track(s) accessed by that one I/O would be in the range of tracks that could possibly be accessed. So the typical access method's code would copy the beginning and ending CCHH values from the extent descriptor into the Define Extent parameters. It might have been more elegant and forward-thinking to minimize the extent, but there was no possible performance gain in doing so.

When PAV became available, this practice, no longer the best one, caused unnecessary contention within a single data set and made the performance gains possible with PAV unattainable. So access methods were changed to use a better practice. Which access methods were changed depended, as always, on what IBM viewed as strategic products. Thus DB2 was changed immediately. I don't know for sure, but I would suspect the change went into the Media Manager code, thus allowing all other strategic products that use Media Manager to benefit from PAV as well as DB2. Other products will not have their code changed unless those products turn into squeaky wheels that need some grease.

In early 2001 I discovered that DB2's I/O requests were having their extents shrunk to the minimum necessary for the I/O. I don't know when the change was introduced, though.

There are some subroutines in IOS involved in building channel program prefixes that would be the logical place where IBM could insert new code to shrink extents for all access methods, but I don't think this will ever happen. Only strategic I/Os are ever optimized.

Bill Fairchild
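The effect of shrinking the Define Extent range can be illustrated with a toy model. This is a sketch under stated assumptions, not the controller's actual logic: the conflict rule (two I/Os collide when their extent ranges overlap and at least one of them writes) is a simplification of condition (2) in the earlier message, and `IORequest`, `conflicts`, `count_conflicts` and the track numbers are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class IORequest:
    name: str
    first_track: int  # first track of the Define Extent range
    last_track: int   # last track of the Define Extent range (inclusive)
    is_write: bool


def conflicts(a, b):
    """ASSUMED rule: extent ranges overlap and at least one I/O is a write."""
    overlap = a.first_track <= b.last_track and b.first_track <= a.last_track
    return overlap and (a.is_write or b.is_write)


def count_conflicts(requests):
    """Number of request pairs that would have to be serialized."""
    return sum(conflicts(a, b) for a, b in combinations(requests, 2))


# Hypothetical data set on tracks 0..999 with three concurrent I/Os,
# each touching a single, different track: (name, track, is_write).
DATASET = (0, 999)
touched = [("read-A", 10, False), ("write-B", 500, True), ("read-C", 900, False)]

# Old practice: every I/O carries the whole data set extent.
whole_extent = [IORequest(n, DATASET[0], DATASET[1], w) for n, _t, w in touched]

# Newer practice: each I/O carries only the track it actually touches.
minimal_extent = [IORequest(n, t, t, w) for n, t, w in touched]

print(count_conflicts(whole_extent))    # 2: the write collides with both reads
print(count_conflicts(minimal_extent))  # 0: no ranges overlap
```

In this model the three I/Os never touch the same track, yet with whole-data-set extents the write still collides with both reads. With single-track extents nothing collides, which is the gain described above.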