Re: Degraded I/O performance in 1.10?
George,

I would be glad to open a PMR. I am just not sure which component I would assign it to.

Dave Jousma
Assistant Vice President, Mainframe Services
david.jou...@53.com
1830 East Paris, Grand Rapids, MI 49546 MD RSCB1G
p 616.653.8429 f 616.653.8497

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of George Kozakos
Sent: Tuesday, August 18, 2009 7:11 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

David Jousma wrote:
Thanks for asking. We are still trying to evaluate the situation. So far, we are leaning towards degraded I/O performance. Of course elapsed times have gone up, but not due to CPU constraint. It is a pretty elusive situation.

Hi David,

The reason I asked is that even though the CPUs are not constrained, the problem could still be due to CPU rather than I/O performance. Without seeing any data I do not want to point to a specific APAR and would recommend opening a PMR.

Regards,
George Kozakos
z/OS Software Service, Level 2 Supervisor

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
Re: Degraded I/O performance in 1.10?
David Jousma wrote:
I would be glad to open a PMR. I am just not sure which component I would assign it to.

Hi David,

You should open a PMR against the Supervisor component, with the description that batch job elapsed times have increased after migration to z/OS R10. From your updates I suspect that OA29595 is a possibility, but I would need to see a dump of one of the batch jobs via a PMR.

Regards,
George Kozakos
z/OS Software Service, Level 2 Supervisor
Re: Degraded I/O performance in 1.10?
Hi David,

Have you verified that the problem is due to degraded I/O performance, or could it be longer elapsed times for the same amount of CPU time?

Regards,
George Kozakos
z/OS Software Service, Level 2 Supervisor
Re: Degraded I/O performance in 1.10?
Longer elapsed time for the same amount of CPU time implies a bottleneck: the job is waiting for CPU, waiting for I/O, or paging.

Joel Wolpert
Performance and Capacity Planning consultant
WEBSITE: www.perfconsultant.com

----- Original Message -----
From: George Kozakos gkoza...@au1.ibm.com
Newsgroups: bit.listserv.ibm-main
To: IBM-MAIN@bama.ua.edu
Sent: Tuesday, August 18, 2009 5:11 PM
Subject: Re: Degraded I/O performance in 1.10?

Hi David,

Have you verified that the problem is due to degraded I/O performance, or could it be longer elapsed times for the same amount of CPU time?

Regards,
George Kozakos
z/OS Software Service, Level 2 Supervisor
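The point in the reply above can be read as simple accounting: whatever part of elapsed time is not CPU time was spent waiting for something. The sketch below is only an illustration of that subtraction. The class and field names are invented for the example and are not RMF or SMF field names, and the figures are made up.

```python
from dataclasses import dataclass


@dataclass
class JobRun:
    """One execution of a batch job (all figures hypothetical)."""
    elapsed_s: float  # wall-clock seconds
    cpu_s: float      # CPU seconds consumed

    @property
    def wait_s(self) -> float:
        # Elapsed time not spent executing: CPU queueing, I/O, paging, etc.
        return self.elapsed_s - self.cpu_s


def added_wait(before: JobRun, after: JobRun) -> float:
    """Extra non-CPU seconds in the later run compared with the earlier one."""
    return after.wait_s - before.wait_s


before = JobRun(elapsed_s=600.0, cpu_s=120.0)
after = JobRun(elapsed_s=780.0, cpu_s=120.0)
print(added_wait(before, after))  # 180.0
```

The subtraction says only that 180 seconds of waiting were added. It does not say whether the wait was for CPU, I/O, or paging; that breakdown has to come from the performance reports.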
Re: Degraded I/O performance in 1.10?
George,

Thanks for asking. We are still trying to evaluate the situation. So far, we are leaning towards degraded I/O performance. Of course elapsed times have gone up, but not due to CPU constraint. It is a pretty elusive situation.

Dave Jousma

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of George Kozakos
Sent: Tuesday, August 18, 2009 5:11 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

Hi David,

Have you verified that the problem is due to degraded I/O performance, or could it be longer elapsed times for the same amount of CPU time?

Regards,
George Kozakos
z/OS Software Service, Level 2 Supervisor
Re: Degraded I/O performance in 1.10?
David Jousma wrote:
Thanks for asking. We are still trying to evaluate the situation. So far, we are leaning towards degraded I/O performance. Of course elapsed times have gone up, but not due to CPU constraint. It is a pretty elusive situation.

Hi David,

The reason I asked is that even though the CPUs are not constrained, the problem could still be due to CPU rather than I/O performance. Without seeing any data I do not want to point to a specific APAR and would recommend opening a PMR.

Regards,
George Kozakos
z/OS Software Service, Level 2 Supervisor
Re: Degraded I/O performance in 1.10?
---snip---
Thanks for asking. We are still trying to evaluate the situation. So far, we are leaning towards degraded I/O performance. Of course elapsed times have gone up, but not due to CPU constraint. It is a pretty elusive situation.
---unsnip---

Sounds to me like time for a serious I/O analysis. There is no simple way to do this; you just pore through reports, making notes and comparisons.

Are you doing any DASD mirroring? If so, the problem becomes even more elusive; you'll have to compare mirrored volumes with non-mirrored volumes for comparable I/O rates. IIRC, when the mirror gets overloaded, it starts to disable the CPU-DASD paths in an effort to catch up; in the worst case, you may find yourself running a single path to the distressed DASD.

HTH.

Rick
Re: Degraded I/O performance in 1.10?
Brian,

Thanks for the response. We have been doing more digging, and are looking at our storage arrays to make sure everything is performing as planned.

Dave Jousma

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Brian Westerman
Sent: Friday, August 14, 2009 11:00 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

Back to the original question/problem.

I'm assuming that your programmers are not complaining that the problem is that the number of I/Os or EXCPs has gone up, because they could probably check those figures for themselves in the actual job output, but that it feels to them like jobs that do a lot of I/O seem to be taking longer to run. This could be any of several issues related to your parmlib settings or WLM settings where you are penalizing high I/O, or it could be a hardware issue that coincided with your OS upgrade.

I couldn't even count the number of problems that I have searched on during and after upgrades that turned out to be something that the site's CE decided to implement during the outage. So don't limit your searching to z/OS 1.10 possibilities, as it could very well be a hardware issue that you had very little control over.

Check to be sure that your WLM settings have not changed in an unwarranted manner. This may not be an issue of everything being bad, just that some jobs are now taking longer while a lot of others are running faster. I think you should probably err on the side of caution and assume that they have a point until you can prove otherwise. They won't believe you anyway without proof. If you were allowed to function without proof, you would be one of them. :)

Have you checked to be sure that your PAV settings are still there? You may have lost your dynamic PAV in the quest for HyperPAV.

Also, you may want to see if your CE (IBM or other) has made changes to your RAID. It's possible that you may have lost some cache, or some of the features are not set as they were previously.

Is it only certain datasets, or certain volumes (or subsets of volumes), that appear to be affected? For instance, is it only a few VSAM files that exhibit the perceived problem? What has changed (if anything) about their location?

Once you can quantify something concrete, it will make the job much easier. Once you locate some common threads you can start to zoom in on where the issue is presenting itself and figure out what may have changed.

It's also completely possible that there may not be a problem, but programmers (being what they are) will need you to prove that nothing has changed. If you check everything and see absolutely no difference in the jobs, then you can move into that response.

If you need to contact me offline about this, feel free to do so and let me know what I can do to help.

Brian
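One way to "quantify something concrete", as suggested above, is to normalise elapsed time by EXCP count for the same job before and after the upgrade and list the jobs that got noticeably worse. The sketch below assumes the per-job figures have already been collected by hand from job output. The job names, numbers, and the 10% threshold are all hypothetical.

```python
def slower_jobs(before, after, threshold=1.10):
    """Return {job: ratio} where elapsed seconds per EXCP grew by more than threshold.

    before/after map a job name to (elapsed_seconds, excp_count).
    """
    flagged = {}
    for job, (elapsed_b, excp_b) in before.items():
        if job not in after or excp_b == 0:
            continue  # nothing to compare against
        elapsed_a, excp_a = after[job]
        if excp_a == 0:
            continue
        ratio = (elapsed_a / excp_a) / (elapsed_b / excp_b)
        if ratio > threshold:
            flagged[job] = round(ratio, 2)
    return flagged


# Hypothetical figures for two jobs, before and after the upgrade.
before = {"PAYROLL1": (600.0, 200_000), "GLPOST02": (300.0, 50_000)}
after = {"PAYROLL1": (780.0, 200_000), "GLPOST02": (295.0, 50_000)}
print(slower_jobs(before, after))  # {'PAYROLL1': 1.3}
```

A list like this only narrows the search to particular jobs; the datasets and volumes those jobs touch are the next thing to compare.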
Re: Degraded I/O performance in 1.10?
Ted,

I'm eating some humble pie while typing this.

I found an old article by Cheryl Watson, published in August 1988, that described IO Service Units, I presume with IOSERV=TIME, as 1 IO Service Unit = 8.32 msec of connect time (about 1/2 revolution of most DASD devices). Being 1988, the DASD would have been 3380, which spun at 3600 RPM, giving an average latency of 8 1/3 msec. 3390s spun at 4200 RPM.

8.32 msec does divide evenly by 128 microseconds (128 * 65 = 8320). I would be surprised if the proximity to average latency is anything more than a coincidence though.

Ron

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Ted MacNEIL
Sent: Thursday, August 13, 2009 7:23 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: [IBM-MAIN] Degraded I/O performance in 1.10?

Ted, I'd like to see the documentation. The Channel measurement block records connect time and SRM in turn converts that to IO service units. Are you saying that 8.3ms was equivalent to 1 IO service Unit?

Gotta look over 20 years ago. IOSRV=COUNT / IOSRV=TIME was an option, a long time ago. I believe XA, but (as always) I could be wrong. And, back then, it was documented as 8.3ms. Or was it IOSRVC?

-
Too busy driving to stop for gas!
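The arithmetic in the message above is easy to check. The sketch below uses only the figures quoted in the thread (3600 RPM for the 3380, 4200 RPM for the 3390, 8.32 ms, and the 128-microsecond unit); the function names are made up for the example.

```python
TICK_US = 128  # measurement granularity mentioned in the thread, in microseconds


def revolution_ms(rpm: float) -> float:
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_latency_ms(rpm: float) -> float:
    """Average rotational latency is half a revolution."""
    return revolution_ms(rpm) / 2


print(round(revolution_ms(3600), 2))       # 16.67  (3380)
print(round(average_latency_ms(3600), 2))  # 8.33   (3380)
print(round(revolution_ms(4200), 2))       # 14.29  (3390)
print(divmod(8320, TICK_US))               # (65, 0): 8.32 ms is exactly 65 ticks
```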
Re: Degraded I/O performance in 1.10?
Ron Hawkins wrote:
Being 1988, the DASD would have been 3380, which spun at 3600 RPM, giving an average latency of 8 1/3 msec. 3390s spun at 4200 RPM. 8.32 msec does divide evenly by 128 microseconds (128 * 65 = 8320). I would be surprised if the proximity to average latency is anything more than a coincidence though.

I would be surprised if it isn't, although I'd use a value a little smidgen higher.

The typical I/O requires some setup to get IOS to handle the request; it needs to be queued, wait until the device is available, position the heads, search or seek the record, transfer the data, and clean up. Processors in the late eighties were fast enough that only the search or seek processing took any significant time compared to processing time. If the disks were favorably positioned at the time of the request, there would be no overhead, versus maximum overhead if the disk had just passed the requested record. So half the latency represents an average; I'd also add a correction factor for arm positioning, but if you're the only one running, after the first I/O that also becomes negligible.

Gerhard Postpischil
Bradford, VT
Re: Degraded I/O performance in 1.10?
Gerhard Postpischil gerh...@valley.net wrote in message news:4a855dee.4060...@valley.net...

Ron Hawkins wrote:
Being 1988, the DASD would have been 3380, which spun at 3600 RPM, giving an average latency of 8 1/3 msec. 3390s spun at 4200 RPM. 8.32 msec does divide evenly by 128 microseconds (128 * 65 = 8320). I would be surprised if the proximity to average latency is anything more than a coincidence though.

I would be surprised if it isn't, although I'd use a value a little smidgen higher. The typical I/O requires some setup to get IOS to handle the request; it needs to be queued, wait until the device is available, position the heads, search or seek the record, transfer the data, and clean up. Processors in the late eighties were fast enough that only the search or seek processing took any significant time compared to processing time. If the disks were favorably positioned at the time of the request, there would be no overhead, versus maximum overhead if the disk had just passed the requested record. So half the latency represents an average; I'd also add a correction factor for arm positioning, but if you're the only one running, after the first I/O that also becomes negligible.

Gerhard Postpischil
Bradford, VT

Please, this is, as often in this group, far off-topic. Is there anybody who can say something on-topic, meaning answer David's question? We are going to 1.10 soon and are very interested in this thread's topic.

Kees.
Re: Degraded I/O performance in 1.10?
You're right. It spun at 70 revolutions per second. The 3380 spun at 60 rps, so its revolution took 16.67 ms.

The average latency of a disk drive was useful for calculating connect time when every I/O probably involved a real seek (disconnect time) and a real partial revolution for the search loop to find the correct record (which was all connect time). But with today's hardware, caching, RAID, channel speed, controller buffering, etc., the connect time component should consist almost totally of data transfer. 1/2 revolution's worth of data transfer indicates that the average amount of data to be transferred per I/O is 1/2 of a full track.

Since EXCP tells SMF to add one to its I/O counters not for every I/O request but rather for every block being transferred, RMF's reported connect time for these I/Os should vary widely if BUFNO is varied widely, say from one to ten, while the EXCP count reported by SMF would be constant.

I don't doubt the validity of the IBM number at the time it was published (aeons ago). I doubt its validity for today's hardware. I am only trying to guess why IBM recommended that number aeons ago in the face of its obvious inapplicability today.

Bill Fairchild
Software Developer
Rocket Software
275 Grove Street * Newton, MA 02466-2272 * USA
Tel: +1.617.614.4503 * Mobile: +1.508.341.1715
Email: bi...@mainstar.com
Web: www.rocketsoftware.com

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Ron Hawkins
Sent: Thursday, August 13, 2009 8:56 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

Bill,

My memory, and follow-up calculation, says that a 3390 rotated every 14.2ms, not 16.67ms. Even so, it would hardly seem a good move to multiply or divide a metric based on transfer time by the average latency of a disk drive. I don't see the relationship.

Ron
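The BUFNO argument above can be illustrated with a toy model. It assumes, as a simplification that is not stated in the thread, that each I/O request transfers exactly BUFNO blocks and that connect time is pure data transfer at a fixed per-block cost. The 0.25 ms per block is an arbitrary illustrative figure, and the function name is invented.

```python
import math


def io_profile(total_blocks: int, bufno: int, block_transfer_ms: float) -> dict:
    """Toy model: EXCP count tracks blocks, connect time per I/O tracks BUFNO."""
    io_requests = math.ceil(total_blocks / bufno)
    total_connect_ms = total_blocks * block_transfer_ms
    return {
        "excp_count": total_blocks,    # one count per block, per the thread
        "io_requests": io_requests,    # fewer, larger requests as BUFNO grows
        "connect_ms_per_io": total_connect_ms / io_requests,
    }


for bufno in (1, 10):
    print(bufno, io_profile(1000, bufno, 0.25))
# 1 {'excp_count': 1000, 'io_requests': 1000, 'connect_ms_per_io': 0.25}
# 10 {'excp_count': 1000, 'io_requests': 100, 'connect_ms_per_io': 2.5}
```

In this model the EXCP count stays at 1000 while connect time per I/O grows tenfold, which is the behaviour the message predicts.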
Re: Degraded I/O performance in 1.10?
I'll let you talk to IBM, since I don't do I/O performance measurements any more. I believe their number was perfectly correct at one time, but not now (see my post in reply to Ron Hawkins for details). If I were doing I/O performance measurement and tuning today, I would most definitely not use that number. Since you are using the number, you should verify its accuracy and, if it is not accurate any more, ask IBM yourself or else find a more modern analysis of average I/O service time.

Bill Fairchild

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Ted MacNEIL
Sent: Thursday, August 13, 2009 6:52 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

It was probably a good value to use aeons ago when it took a real SLED 3390 16.67 ms. to spin around once, so 8.3 ms. was 1/2 revolution. Today, however, is aeons later as far as the hardware is concerned, especially channel speed when delivering data from controller cache instead of straight from the platter.

Talk to IBM! I didn't make up the number. It's just what it is.

-
Too busy driving to stop for gas!
Re: Degraded I/O performance in 1.10?
The connect time estimate of 8.3 ms. is apparently 1/2 revolution of a 3380. Over 20 years ago (before 1989) was before the 3390 was first introduced, so a 3380's values would still be a correct value in whatever year that value was published.

Whatever is reported by RMF will always be an integral multiple of 128 microseconds after rounding.

Bill Fairchild

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Ted MacNEIL
Sent: Thursday, August 13, 2009 9:23 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

Ted, I'd like to see the documentation. The Channel measurement block records connect time and SRM in turn converts that to IO service units. Are you saying that 8.3ms was equivalent to 1 IO service Unit?

Gotta look over 20 years ago. IOSRV=COUNT / IOSRV=TIME was an option, a long time ago. I believe XA, but (as always) I could be wrong. And, back then, it was documented as 8.3ms. Or was it IOSRVC?

-
Too busy driving to stop for gas!
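The remark about 128-microsecond multiples suggests one possible reading of the 8.32 ms figure. The sketch below assumes round-to-nearest, which the thread does not specify, and the function name is made up. Under that assumption both 8.3 ms and the 3380's 8.333 ms half-revolution land on 8.32 ms.

```python
TICK_US = 128  # granularity quoted in the thread, in microseconds


def round_to_tick(time_us: int) -> int:
    """Round a time in microseconds to the nearest multiple of 128 us.

    Round-to-nearest is an assumption; the thread only says "after rounding".
    """
    return ((time_us + TICK_US // 2) // TICK_US) * TICK_US


print(round_to_tick(8300))  # 8320
print(round_to_tick(8333))  # 8320
print(round_to_tick(8320))  # 8320
```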
Re: Degraded I/O performance in 1.10?
Vernooij, CP - SPLXM wrote:
[...]
Please, this is, as often in this group, far Off-Topic. Is there anybody who can say something On-Topic, meaning answer Davids question? We are going to 1.10 soon and are very interested in this threads Topic.

IMHO it is on topic (mainframes) in the IBM-MAIN list context, and it is off-topic when considering the thread's topic. In other words, I think it is justified to keep the discussion on the forum, but maybe it would be a good idea to change the message topic.

Just my $0.02

BTW: I don't like topic deviations to recollections of S/360 models ;-)

Regards
--
Radoslaw Skorupka
Lodz, Poland
Re: Degraded I/O performance in 1.10?
Herman,

Thanks for the response. The files in question are VSAM. I will re-check the migration guide for info.

Dave Jousma

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Stocker, Herman
Sent: Thursday, August 13, 2009 1:13 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

Hi David,

Look into increasing the VSAM index buffers. Some catalog changes have occurred that may be the cause of your slow response. Also look at SMF and Logrec buffering.

Regards,
Herman Stocker

---snip---
All,

I realize this is a really open-ended question. We completed our 1.8 to 1.10 upgrade in June, with no known problems. Everything seems to be running fine. However, I have various people (mostly developers) occasionally complaining that they think the system is slower since the upgrade. Of course, the upgrade gets blamed for everything. By slower, they are referring to their batch jobs, those that do a lot of I/O.

Interestingly, several people, who do not work in the same area (and most likely do not talk to each other), asked if file buffering has changed somehow with the upgrade. I tell them not that I am aware of, and ask them for specifics to research. In most cases I compare the jobs running before and after the upgrade, and the EXCPs all seem to be in line.

So, all I can say is that there is this gut feeling that something isn't quite right, but I can't put a finger on it. Has anyone else noticed anything, or have ideas on what to look for?

Dave Jousma
---/snip---
Re: Degraded I/O performance in 1.10?
Herman,

Do you have any links to this info? I only find the changes to CA sizes in the migration guide.

Dave Jousma

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Jousma, David
Sent: Friday, August 14, 2009 9:40 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

Herman,

Thanks for the response. The files in question are VSAM. I will re-check the migration guide for info.

Dave Jousma
Re: Degraded I/O performance in 1.10?
Sorry Dave, unlike a number of the listers, I do not keep references after I have used them.

Regards, Herman Stocker

---snip---
Herman, Do you have any links to this info? I only find the changes to CA-sizes in the migration guide.
---/snip---
Re: Degraded I/O performance in 1.10?
Gerhard,

Nothing wrong with what you said, but IOSERV uses connect time, which is handshake and transfer and represents work being done by the CEC. If everything else you mentioned were to be included, then why not use the sum of Connect, Disconnect and Pend (Service Time) to calculate IO Service Units?

Ron

> I would be surprised if it isn't, although I'd use a value a little smidgen higher. The typical I/O requires some setup to get IOS to handle the request, it needs to be queued, wait until the device is available, position the heads, search or seek the record, transfer the data, and clean up. Processors in the late eighties were fast enough so that only the search or seek processing took any significant time compared to processing time. If the disks were favorably positioned at the time of request, there would be no overhead, vs. maximum overhead if it just passed the requested record. So half the latency represents an average; I'd also add a correction factor for arm positioning, but if you're the only one running, after the first I/O that also becomes negligible.
>
> Gerhard Postpischil, Bradford, VT
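To make the distinction in Ron's question concrete, here is a minimal Python sketch of service time as he defines it (Connect + Disconnect + Pend) next to connect time alone. The class name and the sample millisecond values are hypothetical and only illustrate the arithmetic; they are not measurements from this thread.

```python
from dataclasses import dataclass


@dataclass
class IoTimes:
    """Average per-I/O time components in milliseconds (hypothetical values)."""
    pend_ms: float
    disconnect_ms: float
    connect_ms: float

    def service_time_ms(self) -> float:
        # Service time as described in the message: Connect + Disconnect + Pend.
        return self.connect_ms + self.disconnect_ms + self.pend_ms

    def connect_share(self) -> float:
        # Fraction of the service time that is connect time.
        return self.connect_ms / self.service_time_ms()


# Made-up sample: most of the service time here is not connect time.
sample = IoTimes(pend_ms=0.2, disconnect_ms=1.5, connect_ms=0.3)
print(f"service time  = {sample.service_time_ms():.2f} ms")
print(f"connect share = {sample.connect_share():.0%}")
```

With numbers like these, a metric based on connect time alone would see only a small part of what the job actually waits for, which is the gap Ron's question points at.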
Re: Degraded I/O performance in 1.10?
> If I were doing I/O performance measurement and tuning today, I would most definitely not use that number.

Why not? That is what it is -- constant. I'm pretty sure it's derived from the equation 128 mics * 6500 = 8.32 ms.

> Since you are using the number, you should verify its accuracy and, if not accurate any more, ask IBM yourself or else find a more modern analysis of average I/O service time.

The number is good for the 'quick and dirty'. I never said that Ron's suggestion for the analysis of I/O from RMF (etc) was wrong. Nor did I say I was using the number, myself. I was just disputing the comment that EXCPs were only block counts. That depends on the setting (TIME or COUNT). And, I believe COUNT is still the default.

-
Too busy driving to stop for gas!
Re: Degraded I/O performance in 1.10?
Bill,

With XA I don't think that RPS was ever included in connect time. I admit I only started working on XA in 1984, but everything I had from back then by Beretvas and Freisenborg uses disconnect time to estimate if there is a seek problem, based on RPS being counted in disconnect time. Of course this was focused on 3880 controllers. I have no idea if it was different for earlier models that required reconnect for handling TIC.

Ron

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Bill Fairchild
Sent: Friday, August 14, 2009 6:24 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: [IBM-MAIN] Degraded I/O performance in 1.10?

You're right. It spun at 70 revolutions per second. The 3380 spun at 60 rps, so its revolution took 16.67 ms.

The average latency of a disk drive was useful for calculating connect time when every I/O probably involved a real seek (disconnect time) and a real partial revolution for the search loop to find the correct record (which was all connect time). But with today's hardware, caching, RAID, channel speed, controller buffering, etc., the connect time component should consist almost totally of data transfer. 1/2 revolution's worth of data transfer indicates the average amount of data to be transferred per I/O is 1/2 of a full track. Since EXCP tells SMF to add one to its I/O counters not for every I/O request but rather for every block being transferred, RMF's reported connect time for these I/Os should vary widely if BUFNO is varied widely, say from one to ten, while the EXCP count reported by SMF would be constant.

I don't doubt the validity of the IBM number at the time it was published (aeons ago). I doubt its validity for today's hardware. I am only trying to guess why IBM recommended that number aeons ago in the face of its obvious inapplicability today.

Bill Fairchild
Software Developer
Rocket Software
275 Grove Street * Newton, MA 02466-2272 * USA
Tel: +1.617.614.4503 * Mobile: +1.508.341.1715
Email: bi...@mainstar.com
Web: www.rocketsoftware.com
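The rotation arithmetic in Bill's message can be checked with a short Python sketch. Only the 60 and 70 revolutions-per-second figures come from the thread; the function names are made up for illustration, and "average latency" is taken to mean half a revolution, as the messages assume.

```python
def revolution_ms(revs_per_second: float) -> float:
    """Time for one full revolution, in milliseconds."""
    return 1000.0 / revs_per_second


def average_latency_ms(revs_per_second: float) -> float:
    """Average rotational latency, taken as half a revolution."""
    return revolution_ms(revs_per_second) / 2.0


# 60 rev/s is the 3380 figure quoted above; 70 rev/s is the other drive mentioned.
for rps in (60, 70):
    print(f"{rps} rev/s: revolution {revolution_ms(rps):.2f} ms, "
          f"average latency {average_latency_ms(rps):.2f} ms")
```

At 60 rev/s this gives about 16.67 ms per revolution and about 8.33 ms for half of one, which is close to the 8.3 ms value being debated.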
Re: Degraded I/O performance in 1.10?
In a message dated 8/14/2009 9:01:30 A.M. Central Daylight Time, ron.hawkins1...@sbcglobal.net writes:

> handshake and transfer and represents work being done by the CEC. If everything else you mentioned was to be included then why not use the sum of Connect, Disconnect and Pend (Service Time) to calculate IO Service Units?

Guess I'd go for more of a macro level approach first.

1) Open a PMR with IBM. They may be able to suggest remedial maintenance.
2) Look at the rudimentary RMF (or RMFPP) reports for channel and controller %Busy. It may be that during the conversion the PROD config lost paths.
3) Get help. http://www.perfassoc.com or http://www.watsonwalker.com offer tuning services.
Re: Degraded I/O performance in 1.10?
Ron,

It's been so long that I had forgotten about RPS. My comments about a connected search loop became obsolete with the advent of RPS. Then the average value of 1/2 rotation was to compute the disconnect time waiting for RPS to cause a reconnect to the channel, assuming that the sector value had been computed correctly. Some more milliseconds of disconnect time were added in to account for average seek. After RPS' advent, connect time was 100% due to data transfer. Today it is different thanks to FICON and controller microcode.

At one time, all the handshaking necessary to get the I/O started was lumped into pend time. With RPS, connect time was reliably used for calculating work done, and pend and disconnect time were attributed to queueing and thus non-repeatable. Another component of queueing that is not visible and usually ignored is I/O interrupt pending time, caused by not having a CPU available to field an interrupt as soon as the I/O ends. This component is still with us.

After 20+ years our memories have trouble recalling all the details. Like you, I would want to see the original quote, its context, and the year when it was published.

Bill Fairchild
Software Developer
Rocket Software
275 Grove Street * Newton, MA 02466-2272 * USA
Tel: +1.617.614.4503 * Mobile: +1.508.341.1715
Email: bi...@mainstar.com
Web: www.rocketsoftware.com
Re: Degraded I/O performance in 1.10?
Ted,

Isn't the statement "I'm pretty sure it's derived from the equation 128 mics * 6500 = 8.32 ms" a little arse about? That's how service units are derived, but Connect time and EXCP counts are not derived, they are recorded. And connect time is definitely not constant.

With FICON the attribution of connect time varies from vendor to vendor, and transfer time varies with path activity such that connect time can be double accounted. I'd go as far as to say that in going from ESCON to FICON connect time became one of the more unreliable IO metrics. I still use connect time for some things, but I agree with Bill that IO service units derived from Connect Time are somewhat useless.

I think you should restate what you are disputing, or show me which EXCP count field is recorded incorrectly when IOSERV=TIME is used. EXCP counts are EXCP counts. Connect Time is Connect Time. The only thing that IOSERV changes is whether IO service units are derived from EXCP counts or Connect Time. If for some reason you ignored block count fields and created a block count from IO service units then that result would change, but who would do that in the first place?

Finally, my Guru has spoken! I recalled that Barry calculated both EXCP and IOTM from Service Units in the MXG Type72 record, and while checking that I found this interesting note:

/* NOTE: PRIOR TO MVS/ESA 5.2, IO SERVICE UNITS COULD BE BASED ON  */
/* EITHER EXCP COUNT OR IO CONNECT TIME, AND MXG CALCULATED TWO    */
/* VARIABLES, PGPEXCP AND PGPIOTM TO GIVE THE RAW IO UNITS. AS     */
/* THERE WAS NO FLAG IN TYPE72 TO IDENTIFY WHICH UNITS WERE USED,  */
/* BOTH VARIABLES WERE CALCULATED KNOWING ONLY ONE WAS VALID.      */
/* WHEN DEVICE CONNECT TIME WAS USED FOR SERVICE UNITS, A SERVICE  */
/* UNIT WAS DEFINED AS 65 CONNECT TIME UNITS, AND A CONNECT TIME   */
/* UNIT IS 128 MICROSECONDS, HENCE THE 8320E-6 FACTOR IN PGPIOTM.  */
/* BUT BEGINNING WITH MVS/ESA 5.2, IO SERVICE UNITS CAN ONLY BE    */
/* BASED ON EXCP COUNT, SO PGPIOTM IS FORCED MISSING FOR 5.2+.     */

So based on that it would seem that IOSERV=TIME is no longer honoured and IO Service Units are always based on EXCP count. It also corrects my 6500 * 128 calculation - it should be 65.

Ron
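The correction from 6500 to 65 is easy to verify. Below is a small Python sketch using only the two figures from the MXG note (128 microseconds per connect time unit, 65 units per service unit); the constant and function names are invented for this illustration.

```python
CONNECT_TIME_UNIT_US = 128        # one connect time unit, per the MXG note
UNITS_PER_SERVICE_UNIT = 65       # connect time units per IO service unit, per the MXG note


def service_unit_ms(units_per_su: int = UNITS_PER_SERVICE_UNIT,
                    unit_us: int = CONNECT_TIME_UNIT_US) -> float:
    """Connect time represented by one IO service unit, in milliseconds."""
    return units_per_su * unit_us / 1000.0


print(f"65 x 128 us   = {service_unit_ms():.2f} ms")       # matches the 8320E-6 factor
print(f"6500 x 128 us = {service_unit_ms(6500):.2f} ms")   # the mistaken multiplier
```

With 65 the result is 8.32 ms, which agrees with the 8320E-6 factor in the note; with 6500 it would be 832 ms, so the multiplier in the earlier equation cannot have been right.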
Re: Degraded I/O performance in 1.10?
> So based on that it would seem that IOSERV=TIME is no longer honoured and IO Service Units are always based on EXCP count. It also corrects my 6500 * 128 calculation - it should be 65.

I honestly don't know, but the last doc I looked at was circa 1.7 and the distinction of COUNT/TIME was still there, with nothing saying 'no longer honoured'. When I get to a PC I'm going to look it up in the current INITTuna.

-
Too busy driving to stop for gas!
Re: Degraded I/O performance in 1.10?
On Fri, 14 Aug 2009 18:40:58 +, Ted MacNEIL wrote:

>> Ron Hawkins wrote:
>> So based on that it would seem that IOSERV=TIME is no longer

I think you mean IOSRVC, a parameter in IEAIPSxx.

>> honoured and IO Service Units are always based on EXCP count. It also corrects my 6500 * 128 calculation - it should be 65.

> I honestly don't know, but the last doc I looked at was circa 1.7 and the distinction of COUNT/TIME was still there, with nothing saying 'no longer honoured'.

This note appears in the Summary of Changes to the Initialization and Tuning Reference for z/OS 1.6:

<quote>
Beginning with z/OS V1R3, workload management (WLM) compatibility mode is no longer available. Information about WLM compatibility mode has been removed throughout this document, including descriptions of the IEAICSxx parmlib member, the IEAIPSxx member, and many options of the IEAOPTxx member.
</quote>

IIRC, IEAIPSxx is not used when in goal mode.

--
Tom Marchant
Re: Degraded I/O performance in 1.10?
Back to the original question/problem.

I'm assuming that your programmers are not complaining that the problem is that the number of I/Os or EXCPs has gone up, because they could probably check those figures for themselves in the actual JOB output, but that it feels to them like jobs that do a lot of I/O seem to be taking longer to run. This could be any of several issues related to your parmlib settings or WLM settings where you are penalizing high I/O, or could be a hardware issue that coincided with your OS upgrade.

I couldn't even count the number of problems that I have searched on during and after upgrades that turned out to be something that the site's CE decided to implement during the outage. So don't limit your searching to z/OS 1.10 possibilities, as it could very well be a hardware issue that you had very little control over. Check to be sure that your WLM settings have not changed in an unwarranted manner. This may not be an issue of everything being bad, just that some jobs are now taking longer while a lot of others are running faster.

I think you should probably err on the side of caution and assume that they have a point until you can prove otherwise. They won't believe you anyway without proof. If you were allowed to function without proof, you would be one of them. :)

Have you checked to be sure that your PAV settings are still there? You may have lost your dynamic PAV in the quest for HyperPAV. Also, you may want to see if your CE (IBM or other) has made changes to your RAID. It's possible that you may have lost some cache, or some of the features are not set as they were previously.

Is it only certain datasets, or certain volumes (or subsets of volumes) that appear to be affected? For instance, is it only a few VSAM files that may exhibit the perceived problem? What has changed (if anything) about their location? Once you can quantify something concrete, it will make the job much easier. Once you locate some common threads you can start to zoom in on where the issue is presenting itself and figure out what may have changed.

It's also completely possible that there may not be a problem, but programmers (being what they are) will need you to prove that nothing has changed. If you check everything and see absolutely no difference in the jobs, then you can move into that response.

If you need to contact me offline about this, feel free to do so and let me know what I can do to help.

Brian
Re: Degraded I/O performance in 1.10?
Have you checked RMF to compare CPU usage and I/O performance before and after the upgrade, specifically the I/O response time? What about the CPU usage of individual jobs?

----- Original Message -----
From: Jousma, David david.jou...@53.com
Newsgroups: bit.listserv.ibm-main
To: IBM-MAIN@bama.ua.edu
Sent: Thursday, August 13, 2009 12:45 PM
Subject: Degraded I/O performance in 1.10?

All,

I realize this is a really open ended question. We completed our 1.8 to 1.10 upgrade in June, with no known problems. Everything seems to be running fine. However, I have various people (mostly developers) occasionally complaining that they think the system is slower since the upgrade. Of course, the upgrade gets blamed for everything. By slower, they are referring to their batch jobs, those that do a lot of I/O.

Interestingly, several people, who do not work in the same area (and most likely do not talk to each other), asked if file buffering has changed somehow with the upgrade. I tell them, not that I am aware of, and ask them for specifics to research, and in most cases I compare the jobs running before and after the upgrade, and the EXCPs all seem to be in line.

So, all I can say is that there is this gut feeling that something isn't quite right, but I can't put a finger on it. Has anyone else noticed anything, or have ideas on what to look for?

Dave Jousma
Assistant Vice President, Mainframe Services
david.jou...@53.com
1830 East Paris, Grand Rapids, MI 49546 MD RSCB1G
p 616.653.8429 f 616.653.8497
Re: Degraded I/O performance in 1.10?
Hi David,

Look into increasing the VSAM index buffers. Some catalog changes have occurred that may be the cause of your slow response. Also look at SMF and Logrec buffering.

Regards, Herman Stocker
Re: Degraded I/O performance in 1.10?
David,

For BSAM and QSAM looking at the EXCP will not tell you if buffering has changed because it is the count of blocks processed. Set BUFNO to 1 or 8 and you will still get the same EXCP count. If you think there is a change in SSCH for some datasets then try looking at the type 42 subtype 6 SMF records for dataset level IO metrics.

Ron
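Ron's point about BSAM and QSAM EXCP counts can be illustrated with a toy model in Python. It rests on a loudly simplifying assumption that is not from the thread: each channel program (SSCH) is taken to transfer up to BUFNO blocks. Real access-method chaining behaviour may differ, and the function names are hypothetical.

```python
import math


def excp_block_count(blocks: int) -> int:
    """EXCP count as described in the message: one per block processed."""
    return blocks


def channel_programs(blocks: int, bufno: int) -> int:
    """Toy estimate of SSCH count, ASSUMING each one moves up to `bufno` blocks."""
    return math.ceil(blocks / bufno)


blocks = 10_000  # hypothetical dataset size in blocks
for bufno in (1, 5, 8):
    print(f"BUFNO={bufno}: EXCP count {excp_block_count(blocks)}, "
          f"estimated SSCHs {channel_programs(blocks, bufno)}")
```

Under that assumption the EXCP count stays at 10,000 for every BUFNO while the estimated SSCH count falls as BUFNO grows, which is why comparing EXCP counts before and after the upgrade would not reveal a buffering change.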
Re: Degraded I/O performance in 1.10?
> For BSAM and QSAM looking at the EXCP will not tell you if buffering has changed because it is the count of blocks processed. Set BUFNO to 1 or 8 and you will still get the same EXCP count.

I can't remember when exactly, but you have been able to change EXCP to service rather than blocks for a long time. So, each 'EXCP' becomes 8.3 ms of connect time, under that option. I can't remember the exact option (or even release of the OS), but it's been around for a long time. In that case, a BUFNO of 1 or 8 makes a difference.

-
Too busy driving to stop for gas!
Re: Degraded I/O performance in 1.10?
Ted,

You are talking about the IOSRVC statement in the IPS which is used to calculate IO Service Units. It doesn't change the EXCP count, it changes whether EXCP count or IO connect time are used as the basis for IO Service Units. SMF30BLK and SMF30TEP if IOSRVC is set to TIME.

I have to admit to not having looked at Type 30 records closely for a very, very long time. I noticed that there is an SMF30AIS field that is described as the SSCH count for the Address that was introduced in OS/390 2.4. That may be a good starting place to look for differences. The MXG variable is also called SMF30AIS.

I'm not sure where you get 8.3ms of connect time for one EXCP. That's about 3MB in an EXCP on FE4, and pretty close to a half track block with BUFNO=5 on ESCON.

Ron
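Ron's back-of-envelope figures for 8.3 ms of connect time can be reproduced with a short Python sketch. The link rates and block size below are assumptions added for illustration, not values stated in the thread: a nominal 400 MB/s for FICON Express4 (FE4), a nominal 17 MB/s for ESCON, and 27,998 bytes for a half-track block on 3390 geometry.

```python
CONNECT_S = 8.3e-3                 # the 8.3 ms of connect time under discussion
FE4_BYTES_PER_S = 400e6            # ASSUMED nominal FICON Express4 rate
ESCON_BYTES_PER_S = 17e6           # ASSUMED nominal ESCON rate
HALF_TRACK_BLOCK_BYTES = 27_998    # ASSUMED half-track block size (3390 geometry)


def bytes_moved(seconds: float, bytes_per_second: float) -> float:
    """Data transferred in `seconds` if the link runs at its nominal rate."""
    return seconds * bytes_per_second


fe4_bytes = bytes_moved(CONNECT_S, FE4_BYTES_PER_S)
escon_bytes = bytes_moved(CONNECT_S, ESCON_BYTES_PER_S)
escon_blocks = escon_bytes / HALF_TRACK_BLOCK_BYTES

print(f"FE4:   {fe4_bytes / 1e6:.2f} MB in 8.3 ms")
print(f"ESCON: {escon_bytes / 1e3:.1f} KB in 8.3 ms, "
      f"about {escon_blocks:.2f} half-track blocks")
```

Under these assumed rates the result is roughly 3.3 MB on FE4 and about five half-track blocks on ESCON, in line with the "about 3MB" and "BUFNO=5" remarks in the message.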
Re: Degraded I/O performance in 1.10?
> I'm not sure where you get 8.3ms of connect time for one EXCP.

I get that from the IBM documentation when I changed from COUNT to TIME aeons ago.

-
Too busy driving to stop for gas!
Re: Degraded I/O performance in 1.10?
Should say SMF30BLK and SMF30TEP do not change if IOSRVC is set to TIME. -Original Message- From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Ron Hawkins Sent: Thursday, August 13, 2009 3:12 PM To: IBM-MAIN@bama.ua.edu Subject: Re: [IBM-MAIN] Degraded I/O performance in 1.10? Ted, You are talking about the IOSRVC statement in the IPS which is used to calculate IO Service Units. It doesn't change the EXCP count, it changes whether EXCP count or IO connect time are used as the basis for IO Service Units. SMF30BLK and SMF30TEP if IOSRVC is set to TIME. I have to admit to not having looked at Type 30 records closely for a very, very long time. I noticed that there is an SMF30AIS field that is described as the SSCH count for the Address that was introduced in OS/390 2.4. That may be a good starting place to look for differences. The MXG variable is also called SMF30AIS. I'm not sure where you get 8.3ms of connect time for one EXCP. That's about 3MB in an EXCP on FE4, and pretty close to a half track block with BUFNO=5 on ESCON. Ron -Original Message- From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Ted MacNEIL Sent: Thursday, August 13, 2009 1:37 PM To: IBM-MAIN@bama.ua.edu Subject: Re: [IBM-MAIN] Degraded I/O performance in 1.10? For BSAM and QSAM looking at the EXCP will not tell you if buffering has changed because it is the count of blocks processed. Set BUFNO to 1 or 8 and you will still get the same EXCP count. I can't remember when exactly. But, you have been able to change EXCP to service rather than blocks for a long time. So, each 'EXCP' becomes 8.3 ms of connect time, under that option. I can't remember the exact option (or even release of the OS), but it's been around for a long time. In that case, 1, or 8, BUFNO makes a difference. - Too busy driving to stop for gas! 
Re: Degraded I/O performance in 1.10?
It was probably a good value to use aeons ago when it took a real SLED 3390 16.67 ms. to spin around once, so 8.3 ms. was 1/2 revolution. Today, however, is aeons later as far as the hardware is concerned, especially channel speed when delivering data from controller cache instead of straight from the platter.

Bill Fairchild
Software Developer
Rocket Software
275 Grove Street * Newton, MA 02466-2272 * USA
Tel: +1.617.614.4503 * Mobile: +1.508.341.1715
Email: bi...@mainstar.com
Web: www.rocketsoftware.com

-Original Message-
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Ted MacNEIL
Sent: Thursday, August 13, 2009 5:43 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Degraded I/O performance in 1.10?

> I'm not sure where you get 8.3ms of connect time for one EXCP.

I get that from the IBM documentation when I changed from COUNT to time aeons ago.

- Too busy driving to stop for gas!
Re: Degraded I/O performance in 1.10?
> It was probably a good value to use aeons ago when it took a real SLED 3390 16.67 ms. to spin around once, so 8.3 ms. was 1/2 revolution. Today, however, is aeons later as far as the hardware is concerned, especially channel speed when delivering data from controller cache instead of straight from the platter.

Talk to IBM! I didn't make up the number. It's just what it is.

- Too busy driving to stop for gas!
Re: Degraded I/O performance in 1.10?
Ted,

I'd like to see the documentation. The Channel Measurement Block records connect time, and SRM in turn converts that to IO Service Units. Are you saying that 8.3ms was equivalent to 1 IO Service Unit? If that's the case it still seems strange, as I thought they would at least use something that divides exactly by 128 microseconds.

Ron

-Original Message-
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Ted MacNEIL
Sent: Thursday, August 13, 2009 4:52 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: [IBM-MAIN] Degraded I/O performance in 1.10?

> It was probably a good value to use aeons ago when it took a real SLED 3390 16.67 ms. to spin around once, so 8.3 ms. was 1/2 revolution. Today, however, is aeons later as far as the hardware is concerned, especially channel speed when delivering data from controller cache instead of straight from the platter.

Talk to IBM! I didn't make up the number. It's just what it is.

- Too busy driving to stop for gas!
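The divisibility point can be checked directly. This is a small Python sketch, assuming only the 128-microsecond granularity mentioned in the message; the function name and the two comparison values (8.192 ms and 8.32 ms) are picked for illustration because they are the nearest whole multiples of 128 microseconds.

```python
from fractions import Fraction

UNIT_US = 128  # connect-time granularity mentioned above, in microseconds


def units_of_128us(ms: str) -> Fraction:
    """Exact number of 128-microsecond units in a duration given in ms."""
    return Fraction(ms) * 1000 / UNIT_US


# 8.3 is the disputed figure; the other two are illustrative neighbours.
for ms in ("8.3", "8.192", "8.32"):
    n = units_of_128us(ms)
    kind = "whole" if n.denominator == 1 else "not whole"
    print(f"{ms} ms = {float(n)} units ({kind})")
```

Using `Fraction` on the decimal strings keeps the arithmetic exact, so the whole-or-not test is not affected by floating-point rounding.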
Re: Degraded I/O performance in 1.10?
Bill,

My memory, and follow up calculation, says that a 3390 rotated every 14.2ms, not 16.67ms. Even so, it would hardly seem a good move to multiply or divide a metric based on transfer time by the avg latency of a disk drive. I don't see the relationship.

Ron

-Original Message-
From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf Of Bill Fairchild
Sent: Thursday, August 13, 2009 4:40 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: [IBM-MAIN] Degraded I/O performance in 1.10?

It was probably a good value to use aeons ago when it took a real SLED 3390 16.67 ms. to spin around once, so 8.3 ms. was 1/2 revolution. Today, however, is aeons later as far as the hardware is concerned, especially channel speed when delivering data from controller cache instead of straight from the platter.

Bill Fairchild
Software Developer
Rocket Software
275 Grove Street * Newton, MA 02466-2272 * USA
Tel: +1.617.614.4503 * Mobile: +1.508.341.1715
Email: bi...@mainstar.com
Web: www.rocketsoftware.com
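To put the two rotation figures side by side, here is a short Python sketch that derives the half-revolution (average latency) time and the implied spindle speed from each. It takes the 16.67 ms and 14.2 ms values exactly as quoted in the thread and does not claim either is the real 3390 specification; the function names are made up for the example.

```python
def half_revolution_ms(rotation_ms: float) -> float:
    """Average rotational latency: half of one full revolution."""
    return rotation_ms / 2.0


def implied_rpm(rotation_ms: float) -> float:
    """Spindle speed implied by a given time per revolution."""
    return 60_000.0 / rotation_ms


# Rotation times as quoted in the thread, not verified device specifications.
for label, rotation in (("16.67 ms", 16.67), ("14.2 ms", 14.2)):
    print(f"{label}: half revolution {half_revolution_ms(rotation):.2f} ms, "
          f"implied speed {implied_rpm(rotation):.0f} rpm")
```

Only the 16.67 ms figure gives a half revolution near 8.3 ms; with 14.2 ms the half revolution is closer to 7.1 ms.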
Re: Degraded I/O performance in 1.10?
> Ted, I'd like to see the documentation. The Channel Measurement Block records connect time, and SRM in turn converts that to IO Service Units. Are you saying that 8.3ms was equivalent to 1 IO Service Unit?

Gotta look back over 20 years. IOSRV=COUNT or IOSRV=TIME was an option, a long time ago. I believe XA, but (as always) I could be wrong. And, back then, it was documented as 8.3ms.

Or was it IOSRVC?

- Too busy driving to stop for gas!