Re: z/OS v1.7 JES2: StcInRdr vs. IntRdr
IIRC, JES2 runs the number of INTRDR processors (tasks) specified on the INTRDR init statement. I'm not sure what happens when more SYSOUT=(*,INTRDR) data sets than INTRDR PCEs are trying to submit jobs. I guess that the OPEN will just wait for an INTRDR PCE to become available to handle the request.

No matter: INTRDRs don't start the batch jobs (and it's only batch jobs, so STCINRDR is out of scope). They only write the JCL to the JES2 spool, queueing the jobs on the conversion queue. Some time later the conversion PCEs pick up the JCL and place the converted JCL onto the spool, queueing the jobs on the execution queue. Again some time later, those jobs will eventually be picked up by an initiator that *is already running* when using JES-managed initiators. No new address spaces are created. If using WLM-managed initiators, WLM decides whether more of them shall be started, depending on current system load.

We often have bursts of batch jobs submitted from our scheduler when day-end processing is scheduled on our test system. The execution queue has some 150+ jobs waiting to be executed, but this does not slow down the system in any special manner.

-- Peter Hunkeler CREDIT SUISSE

-- For IBM-MAIN subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO Search the archives at http://bama.ua.edu/archives/ibm-main.html
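The flow Peter describes (internal reader, then conversion queue, then execution queue, then an initiator that is already running) can be sketched as a toy queue model. This is only an illustration, not how JES2 is implemented: the function name `run_burst`, the one-tick job duration, and all the numbers are assumptions made up for the sketch. It shows that a burst simply backs up on the execution queue while the number of running jobs never exceeds the fixed initiator count.

```python
from collections import deque


def run_burst(num_jobs, converted_per_tick, num_initiators, ticks):
    """Toy model of a job burst with a fixed set of JES-managed initiators.

    Assumptions (not from JES2 itself): time advances in ticks, the
    converters move a fixed number of jobs per tick, and every job
    runs for exactly one tick.
    """
    conversion_q = deque(range(num_jobs))  # JCL already written to spool
    execution_q = deque()
    running = 0
    finished = 0
    max_running = 0
    peak_backlog = 0

    for _ in range(ticks):
        # Jobs started in the previous tick complete now.
        finished += running
        running = 0

        # Conversion moves jobs from the conversion queue to the execution queue.
        for _ in range(min(converted_per_tick, len(conversion_q))):
            execution_q.append(conversion_q.popleft())

        # Idle initiators select work; no new initiators are ever created.
        while execution_q and running < num_initiators:
            execution_q.popleft()
            running += 1

        max_running = max(max_running, running)
        peak_backlog = max(peak_backlog, len(execution_q))

    return {
        "finished": finished,
        "max_running": max_running,
        "peak_execution_backlog": peak_backlog,
    }


result = run_burst(num_jobs=150, converted_per_tick=10, num_initiators=5, ticks=40)
print(result)
```

In this sketch the backlog grows while jobs are converted faster than they are selected, but the work in flight stays capped by the initiator count, which matches the observation that 150+ queued jobs do not by themselves slow the system.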
Re: z/OS v1.7 JES2: StcInRdr vs. IntRdr
I don't know if you can specify the number of conversion tasks allowed or not; last time I tried to care, I couldn't specify a number for conversion tasks, but I could specify up to 10 INTRDR's.

The number of conversion tasks is specified on the PCEDEF statement (CNVTNUM=). I think it's been there for a long time (JES2 V3 at least).

-- Peter Hunkeler Credit Suisse
Re: z/OS v1.7 JES2: StcInRdr vs. IntRdr
Neil, you mentioned that you were getting $HASP050: 90% JNUM messages during your problem time periods. Others have suggested PCE settings and some other ways to increase the number of internal readers, in order to increase throughput. Personally, I'd also look at ways to avoid JNUM 90% and to increase the total number of jobs that can be in the system at any given time. Can you tell us your settings that control JNUM? Can you increase those values, without exceeding maximums (or third-party software limitations on job numbers, ranges, format of JOBID, etc)? That's where I'd start looking to solve this issue.

Regards, Ulrich Krueger

On Wed, 25 Jun 2008 18:23:59 -0400, Neil Duffee [EMAIL PROTECTED] wrote:

Hey there. I'm grasping at straws and am hoping someone remembers their JES2 internals. I've looked in the JES2 Innita Tuna? manual (Ch 2. Controlling JES2 processes) without success and can't find a RedBook that helps. Perhaps someone remembers or can point me to a Fine Manual. (I'll ETR otherwise.)

Background: z/OS v1.7, DB2 v7. During our (peak) registration periods, we experience occasional, un-explainable slow-downs in 1-3 minute bursts on the order of 3-5 in a 2-3 day period. To date, no particular culprit has been positively identified. Aside from 100% CPU 20+ un-dispatched tasks, one reported symptom is an increasing number of DB2 threads (from OmegaMon) waiting for Stored Procedure start-ups ie. for WLM to start another address space. (@15 TCBs each) Sure enough, once the dust settles, there can be 10+ WLM address spaces that slowly disappear as idle.

This line of inquiry (among others) focuses on JES2's internal readers. We suspect processes generating e-mail to students with a 1-1 ratio of jobs to messages ie. 1 job=1 e-message, using SYSOUT=(*,INTRDR). (We're also pursuing multi-step jobs since $HASP050: 90% JNUM has already been encountered.)
In a given scenario, we could have 200+ jobs with e-mail (bulk to a large class) directed at INTRDR while WLM is trying to start 1-5 Stored Procedure address spaces via STCINRDR.

So, the question is, presuming it's already working on INTRDR, how does JES2 contend with this load? Are all the jobs in INTRDR converted then JES2 switches to STCINRDR? Does STCINRDR have precedence for JES2 and INTRDR is interrupted at the next JOB card? Are they simultaneous with their own TCBs? Curious minds would like to know. (or even hear speculation...)

As mentioned before, if there's no satisfactory consensus, I'll pursue an ETR and relay the response. Tks much folx.
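To put rough numbers on the JNUM concern and on the multi-step idea mentioned above, here is a small arithmetic sketch. The job number range, the in-use counts, and the notion of packing one e-mail per job step are all assumptions for illustration; they are not values from Neil's system, and the utilization formula is a simple ratio rather than JES2's own accounting. The 255-step ceiling is the JCL limit on EXEC steps in a single job.

```python
import math

MAX_STEPS_PER_JOB = 255  # JCL limit on EXEC steps in one job


def job_number_utilization(in_use, low, high):
    """Fraction of a job number range (low..high inclusive) currently in use.

    Simple ratio for illustration; the range bounds are hypothetical.
    """
    total = high - low + 1
    return in_use / total


def jobs_needed(messages, steps_per_job):
    """Jobs required if each e-mail becomes one step of a multi-step job."""
    if not 1 <= steps_per_job <= MAX_STEPS_PER_JOB:
        raise ValueError("steps_per_job must be between 1 and 255")
    return math.ceil(messages / steps_per_job)


# Hypothetical range of 1..9999 job numbers.
print(job_number_utilization(9000, 1, 9999) >= 0.90)  # crosses the 90% mark
print(jobs_needed(200, 1))    # one job per message
print(jobs_needed(200, 50))   # fifty messages per job
```

Under these assumptions a bulk mailing of 200 messages consumes 200 job numbers at one job per message but only 4 when 50 messages share a job, which is why multi-step jobs relieve pressure on job numbers regardless of how the internal readers behave.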
Re: z/OS v1.7 JES2: StcInRdr vs. IntRdr
In a message dated 6/26/2008 12:14:42 P.M. Central Daylight Time, [EMAIL PROTECTED] writes:

Personally, I'd also look at ways to avoid JNUM 90% and to increase the total number of jobs that can be in the system at any given time. Can you tell us your settings that control JNUM? Can you increase those values, without exceeding maximums (or third-party software limitations on job numbers, ranges, format of JOBID, etc)? That's where I'd start looking at to solve this issue.

Well, I guess a good place to start would be offloading the SMTP services to another machine or using UDP (Lionel's got a good write-up in his XMITIP gem at http://www.lbdsoftware.com). For DB2 you have to watch threads like a hawk. When they spill over to the common pool, you get big-time bottlenecks (hangs) while it thrashes it out with everything else. Then there's just bad SQL. What's-his-name (Platinum) quotes it as the cause of 70% of performance problems.
Re: z/OS v1.7 JES2: StcInRdr vs. IntRdr
If you believe that the slowdown is in JES, have you researched the output from JMonitor and/or the JDHistory, etc.? These usually go a long way in explaining what JES is doing and what its problem is.

Jack Kelly 202-502-2390 (Office)
Re: z/OS v1.7 JES2: StcInRdr vs. IntRdr
Neil Duffee wrote: ...

This line of inquiry (among others) focuses on JES2's internal readers. We suspect processes generating e-mail to students with a 1-1 ratio of jobs to messages ie. 1 job=1 e-message, using SYSOUT=(*,INTRDR). (We're also pursuing multi-step jobs since $HASP050: 90% JNUM has already been encountered.) In a given scenario, we could have 200+ jobs with e-mail (bulk to a large class) directed at INTRDR while WLM is trying to start 1-5 Stored Procedure address spaces via STCINRDR.

So, the question is, presuming it's already working on INTRDR, how does JES2 contend with this load? Are all the jobs in INTRDR converted then JES2 switches to STCINRDR? Does STCINRDR have precedence for JES2 and INTRDR is interrupted at the next JOB card? Are they simultaneous with their own TCBs? Curious minds would like to know. (or even hear speculation...)

I'm assuming your JES2 is also at z/OS 1.7 level. I suspect your problem doesn't lie in INTRDR processing. As of z/OS 1.7, JES2 internal readers are processed in the address space that allocates them; you can't even specify how many there are. So to "their own TCBs", the answer is yes.

Bob
Re: z/OS v1.7 JES2: StcInRdr vs. IntRdr
At 18:23 -0400 on 06/25/2008, Neil Duffee wrote about z/OS v1.7 JES2: StcInRdr vs. IntRdr:

So, the question is, presuming it's already working on INTRDR, how does JES2 contend with this load? Are all the jobs in INTRDR converted then JES2 switches to STCINRDR? Does STCINRDR have precedence for JES2 and INTRDR is interrupted at the next JOB card? Are they simultaneous with their own TCBs? Curious minds would like to know. (or even hear speculation...)

If you think you might have contention in JES2, you might want to try going to Poly-JES. This is basically starting a 2nd copy of JES2 on the CPU as another member of your JES2 Multi-Access Spool system. You submit the INTRDR jobs with an /*EQU MEMBER2 card and they will execute on the 2nd JES2, using its INTRDRs. The jobs that are submitted can either /*EQU back to the main JES2 or execute on MEMBER2.

When not handling this work, MEMBER2 is essentially idle and has little impact on the main JES2 member (set the MEMBER2 spool hold settings to quickly release the checkpoint record so it does not impact the main JES2 member). If you know when you will be doing the flood submission of e-mail jobs, you can start MEMBER2 beforehand and shut it down when you are done, so it has no impact when it has nothing to do.
Re: z/OS v1.7 JES2: StcInRdr vs. IntRdr
-snip---

Hey there. I'm grasping at straws and am hoping someone remembers their JES2 internals. I've looked in the JES2 Innita Tuna? manual (Ch 2. Controlling JES2 processes) without success and can't find a RedBook that helps. Perhaps someone remembers or can point me to a Fine Manual. (I'll ETR otherwise.)

Background: z/OS v1.7, DB2 v7. During our (peak) registration periods, we experience occasional, un-explainable slow-downs in 1-3 minute bursts on the order of 3-5 in a 2-3 day period. To date, no particular culprit has been positively identified. Aside from 100% CPU 20+ un-dispatched tasks, one reported symptom is an increasing number of DB2 threads (from OmegaMon) waiting for Stored Procedure start-ups ie. for WLM to start another address space. (@15 TCBs each) Sure enough, once the dust settles, there can be 10+ WLM address spaces that slowly disappear as idle.

This line of inquiry (among others) focuses on JES2's internal readers. We suspect processes generating e-mail to students with a 1-1 ratio of jobs to messages ie. 1 job=1 e-message, using SYSOUT=(*,INTRDR). (We're also pursuing multi-step jobs since $HASP050: 90% JNUM has already been encountered.) In a given scenario, we could have 200+ jobs with e-mail (bulk to a large class) directed at INTRDR while WLM is trying to start 1-5 Stored Procedure address spaces via STCINRDR.

So, the question is, presuming it's already working on INTRDR, how does JES2 contend with this load? Are all the jobs in INTRDR converted then JES2 switches to STCINRDR? Does STCINRDR have precedence for JES2 and INTRDR is interrupted at the next JOB card? Are they simultaneous with their own TCBs? Curious minds would like to know. (or even hear speculation...)

As mentioned before, if there's no satisfactory consensus, I'll pursue an ETR and relay the response. Tks much folx.

--unsnip-

Neil, what's the observed CPU utilization of your JES2? If it's not really high, I'd suggest you look elsewhere.
It's been a long time since I looked in this area, but IIRC, STC's take a slightly different path through JES2 processing. A more likely culprit might be AS-Create; even more likely, IMHO, an ENQ/DEQ contention issue.

JES2 uses something called a PCE, a Process Control Element, to represent multiple INTRDR's, rather than TCB's. They're managed internally by JES2 and typically run fairly quickly. I'm not sure, but I think conversion is done under a PCE as well, so the CPU time involved would be attributed to JES2. How many INTRDR's do you have defined? I don't know if you can specify the number of conversion tasks allowed or not; last time I tried to care, I couldn't specify a number for conversion tasks, but I could specify up to 10 INTRDR's.

Without a much broader picture of your system, it's going to be hard to make any definitive diagnosis. Are you running RMF? I like RMF, with a 10-minute recording interval. If you'll do that, over a period where you're affected by this issue, and send me the reports, beginning two periods before and ending two periods after, I'll try and give you some pointers. (ZIP the reports, please.) :-)
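The reporting window requested above (10-minute intervals, starting two intervals before the slowdown and ending two after) can be worked out mechanically. The sketch below assumes that recording intervals are aligned to clock boundaries counted from midnight, which depends on how RMF is configured at a given site; the function name `report_window` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def report_window(start, end, interval_minutes=10, pad_intervals=2):
    """Return (first, last) timestamps of the reporting window.

    Assumes intervals are aligned to clock boundaries from midnight.
    The window covers every interval touched by [start, end], plus
    pad_intervals full intervals on each side.
    """
    interval = timedelta(minutes=interval_minutes)
    midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)

    # Round the start down and the end up to interval boundaries.
    first_n = (start - midnight) // interval
    last_n = -((midnight - end) // interval)  # ceiling division

    first = midnight + (first_n - pad_intervals) * interval
    last = midnight + (last_n + pad_intervals) * interval
    return first, last


# Hypothetical two-minute slowdown on the afternoon of 25 Jun 2008.
window = report_window(datetime(2008, 6, 25, 14, 7), datetime(2008, 6, 25, 14, 9))
print(window[0], "to", window[1])
```

For a burst entirely inside one 10-minute interval, this yields five intervals of reports: the affected one plus two on either side.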