Re: EREP Symptom and/or Software Records

2017-06-18 Thread Edward Finnell
From what we've gleaned so far, EREP is just 'holding the mirror': some
sort of WAIT is causing the CATALOG REDRIVE, which is symptomatic of a
tuning opportunity.  From a sysprog perspective I'm going to review WLM, DB2
buffer use and device busy. Craig Mullins' DB2 series is great and suggests
that bad SQL is the root cause of a large percentage of DB2 problems.
 
 
In a message dated 6/18/2017 1:40:12 A.M. Central Daylight Time,  
haresystemssupp...@comcast.net writes:

It's been a few years (retired / consulting now) since I had the duty of
looking at EREP records but, as many others have said, it's about
what seems "normal" in your environment, although I've had instances of 0C4s
that were occurring regularly, though handled by the piece of software in
question without killing anything.


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: EREP Symptom and/or Software Records

2017-06-18 Thread Tim Hare
It's been a few years (retired / consulting now) since I had the duty of
looking at EREP records but, as many others have said, it's about what
seems "normal" in your environment, although I've had instances of 0C4s that
were occurring regularly, though handled by the piece of software in question
without killing anything.

As for searching for APARs, you used to be able to copy those symptom strings 
into the search field such as:

PIDS/5695DF105 RIDS/IGG0CLA9 RIDS/IGG0CLX0#L PRCS/00F6 PRCS/ 

and it was a pretty reliable way to zero in on something.  I don't know if 
IBM's support site search still works that way, but if it does it saves quite a 
lot of time.
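
As an illustration only (this is not an IBM tool), a symptom string like the
one above can be split into keyword/value pairs and trimmed before it is
pasted into a search field. The sketch below is Python; the function names,
and the choice to drop empty values and the site-specific JOBN/ entry, are
assumptions made for the example, not anything the support site requires.

```python
def parse_symptoms(symptom_string):
    """Split a symptom string into (keyword, value) pairs."""
    pairs = []
    for token in symptom_string.split():
        keyword, sep, value = token.partition("/")
        if sep:  # ignore anything that is not KEYWORD/value
            pairs.append((keyword, value))
    return pairs


def search_terms(symptom_string, drop_keywords=("JOBN",)):
    """Rebuild a search string, skipping empty values and local-only keywords."""
    kept = [
        f"{keyword}/{value}"
        for keyword, value in parse_symptoms(symptom_string)
        if value and keyword not in drop_keywords
    ]
    return " ".join(kept)


# Symptom string as quoted elsewhere in this thread (one PRCS/ value is empty there).
example = "PIDS/5695DF105 RIDS/IGG0CLA9 RIDS/IGG0CLX0#L PRCS/00F6 PRCS/ JOBN/DBP3DBM1"
print(search_terms(example))
```

Dropping JOBN/ keeps the search generic, since the job name is specific to
the site and varies from record to record.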



Re: EREP Symptom and/or Software Records

2017-06-16 Thread Attila Fogarasi
The most likely cause is SYSZTIOT or SYSZVVDS contention, so the 91A abends
are symptoms of that latent problem (which potentially affects your SLA).
Catalog added a contention management enhancement in z/OS 1.12 that issues
abend 91A for excessive waits: when the wait time threshold is reached,
the abend "resolves" the serialization.  Whether hundreds per day is trivial
or significant depends on your specific configuration, but it is worthy
of analysis.  There are lots of possible causes, particularly for DB2, so it
is pointless to speculate further.




Re: EREP Symptom and/or Software Records

2017-06-16 Thread Peter Hunkeler

>Catalog processing is not my area of expertise, but I think that this symptom
>record is the result of a catalog service task being ABTERMed with code 91A,
>possibly by a catalog analysis task.  I don't remember offhand if a 91A is
>also used to terminate a service task when the suspended requesting task
>gets ABTERMed.




So this seems to be part of some CATALOG health checking and recovery
processing. A good thing, basically, but if you have hundreds of those a day,
it may indicate some latent problem somewhere in your system setup, in
some software calling CATALOG services, or even in CATALOG itself.


As Jim suggested, you may want to take one or a few dumps and try to find out
why you see so many of them. DAE will suppress duplicate dumps, i.e. dumps
whose symptoms (symptom string) indicate they are likely to have the same cause.
You might want to change the SLIP to ACTION=NOSUP,MATCHLIM=5 so that DAE will
not suppress duplicates until the number of dumps you want has been taken
(MATCHLIM=n). Without the MATCHLIM, each occurrence would initiate a dump,
which could kill your performance.
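
To make the interplay concrete, here is a toy model in Python of the
behaviour described above: DAE suppressing dumps whose symptom string it has
already seen, and MATCHLIM disabling the trap after n matches. It is only a
sketch of the description in this post, not of the real SLIP or DAE logic;
`simulate_dumps` and its parameters are invented for illustration.

```python
def simulate_dumps(events, matchlim=None, dae_suppress=True):
    """Toy model: return the indexes of the events for which a dump is taken.

    events       -- list of symptom strings, one per abend occurrence
    matchlim     -- stop matching after this many matches (None = no limit)
    dae_suppress -- True models duplicate suppression, False models NOSUP
    """
    seen = set()
    matches = 0
    dumped = []
    for index, symptoms in enumerate(events):
        if matchlim is not None and matches >= matchlim:
            break  # assumption: the trap no longer matches once the limit is hit
        matches += 1
        if dae_suppress and symptoms in seen:
            continue  # duplicate symptom string, dump suppressed
        seen.add(symptoms)
        dumped.append(index)
    return dumped


same_abend = ["91A in CATALOG"] * 8  # eight occurrences with identical symptoms
print(simulate_dumps(same_abend))                                   # suppression on
print(simulate_dumps(same_abend, matchlim=5, dae_suppress=False))   # NOSUP + MATCHLIM=5
print(simulate_dumps(same_abend, dae_suppress=False))               # NOSUP, no limit
```

In this model the first call dumps once, the second dumps five times and
stops, and the third dumps on every occurrence, which is the case the post
warns about.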


--
Peter Hunkeler




Re: EREP Symptom and/or Software Records

2017-06-15 Thread Jim Mulder
Catalog processing is not my area of expertise, but I think that this
symptom record is the result of a catalog service task being ABTERMed with
code 91A, possibly by a catalog analysis task.  I don't remember offhand if
a 91A is also used to terminate a service task when the suspended requesting
task gets ABTERMed.
I would suggest looking at the syslog to see if there are any catalog
messages at the times of these symrecs which might explain the situation.
Absent that, I would do
SLIP MOD,ID=X91A,DISABLE
to get a dump of a 91A abend, and look at that.
You may need to open a PMR to catalog for assistance with dump analysis.
I would at least look at the SYSTRACE in the dump to see who initiated the
91A ABTERM.


Jim Mulder z/OS Diagnosis, Design, Development, Test  IBM Corp. 
Poughkeepsie NY


Re: EREP Symptom and/or Software Records

2017-06-15 Thread Lizette Koehler
Cheryl,

So, IBM supplies many standard SLIP suppressions, where the ACTION is NODUMP.
These cover abends that various events in z/OS will issue routinely (91A, 133,
D37, B37, E37): dumps are not really needed because "everyone" just knows
what they are.  I consider these an annoyance and just part of the process.

As for Logrec, IBM uses the philosophy "just in case, create a logrec record",
sometimes with storage, sometimes without.

This is in case you need to go back and look for symptoms about an event.  If
the Logrec data is kept longer than a day, you would be able to go back
and see whether there is system control block information that could be
helpful in diagnosing the issue.

What I usually go by are the following guidelines:
1) How often is the event occurring



Re: EREP Symptom and/or Software Records

2017-06-15 Thread Turner Cheryl L
I understand why you may have thought that, but no, I understand it is not the
SLIP that spawns the records.  But couldn't it be said that the SLIP parms are
indicating IBM's view of the severity of the event? I am so new to this that
heck, I may not even be asking the questions right.  For that I'm sorry.

For example.  Here is one symptom record in particular we are constantly seeing 
(there are others but let's use this as an example): 

PIDS/5695DF105 RIDS/IGG0CLA9 RIDS/IGG0CLX0#L PRCS/00F6 PRCS/ 
 PRCS/ JOBN/DBP1DBM1

The JOBN changes but they all seem to be DB2 related tasks. All the other
information is the same.
  
 SYSTEM ENVIRONMENT:
   CPU MODEL:   2964      DATE:  166  17
   CPU SERIAL:  0207C7    TIME:  01:06:27.72
   SYSTEM:      MBI2      BCP:   MVS
   RELEASE LEVEL OF SERVICE ROUTINE:      HBB77A0
   SYSTEM DATA AT ARCHITECTURE LEVEL:     10
   COMPONENT DATA AT ARCHITECTURE LEVEL:  10
   SYSTEM DATA:   ||
 COMPONENT INFORMATION:
   COMPONENT ID:             5695DF105
   COMPONENT RELEASE LEVEL:  220
   SERVICE RELEASE LEVEL:    UA82137
   DESCRIPTION OF FUNCTION:  CATPROB DATA

PRIMARY SYMPTOM STRING:   
PIDS/5695DF105 RIDS/IGG0CLA9 RIDS/IGG0CLX0#L PRCS/00F6
PRCS/ JOBN/DBP3DBM1   
  
SYMPTOM          SYMPTOM DATA   EXPLANATION
---------------  -------------  --------------------
PIDS/5695DF105   5695DF105      COMPONENT IDENTIFIER
RIDS/IGG0CLA9    IGG0CLA9       ROUTINE IDENTIFIER
RIDS/IGG0CLX0#L  IGG0CLX0#L     ROUTINE IDENTIFIER
PRCS/00F6        00F6           RETURN CODE
PRCS/                           RETURN CODE
JOBN/DBP3DBM1    DBP3DBM1       JOB NAME
  
THE SYMPTOM RECORD DOES NOT CONTAIN A SECONDARY SYMPTOM STRING.   
FREE FORMAT COMPONENT INFORMATION: 

And then there appears to be a snap dump of storage on each one.

Nothing on IBMLINK matching anything that I can think to search on from the 
fields.  In the syslog we see IBM slip trap x91A taken about the time of each 
record.  
2017166 01:06:27.72  0284  IEA989I SLIP TRAP ID=X91A MATCHED.  
JOBNAME=CATALOG , ASID=0086.

And there are sometimes 100s of this particular symptom record on a given
LPAR per day.
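
A minimal sketch, in Python, of how the IEA989I messages could be counted per
hour from a saved copy of the syslog, to see when the trap matches cluster.
The line layout is assumed from the single IEA989I line quoted above (rejoined
onto one line); a real syslog has more variation, and the function name is
made up for the example.

```python
import re
from collections import Counter

# Layout assumed from the one line quoted in this post:
# yyyyddd hh:mm:ss.th ... IEA989I SLIP TRAP ID=xxxx MATCHED.
IEA989I = re.compile(
    r"(?P<day>\d{7}) (?P<hour>\d{2}):\d{2}:\d{2}\.\d{2}.*?"
    r"IEA989I SLIP TRAP ID=(?P<trap>\w+) MATCHED\."
)


def count_slip_matches(lines):
    """Count IEA989I messages per (Julian date, hour, trap ID)."""
    counts = Counter()
    for line in lines:
        m = IEA989I.search(line)
        if m:
            counts[(m["day"], m["hour"], m["trap"])] += 1
    return counts


sample = [
    # First line is the message quoted above; the other two are invented.
    "2017166 01:06:27.72  0284  IEA989I SLIP TRAP ID=X91A MATCHED.  JOBNAME=CATALOG , ASID=0086.",
    "2017166 01:17:03.10  0284  IEA989I SLIP TRAP ID=X91A MATCHED.  JOBNAME=CATALOG , ASID=0086.",
    "2017166 02:00:00.00  0284  SOME OTHER MESSAGE",
]
for key, n in sorted(count_slip_matches(sample).items()):
    print(key, n)
```

Lining the hourly counts up against batch schedules or DB2 activity is one
way to start narrowing down what is driving the redrives.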
Slip settings are:
ID=X91A,NONPER,ENABLED   
ACTION=NODUMP,SET BY CONS INTERNAL,RBLEVEL=ERROR,COMP=91A

91A  
 
 Explanation:  A request to abnormally end the catalog address space (CAS)   
 service task was issued either through the MODIFY CATALOG,RESTART command,  
 or through catalog analysis task processing.
 
 System Action:  The system re-drives the catalog request currently in   
 process.

We are not issuing a MODIFY CATALOG RESTART command at the time of any of the
logrecs being cut.  So might there be something wrong with the catalog process
that all these redrives are necessary?  Is it normal behavior?  So many
questions and I'm clueless, unfortunately.

So what I guess I was trying to wrap my head around is:  if there isn't a need
to take a dump, etc. (as specified in the SLIP setting) then why have logic to
cut 100's of symptom records at all for that particular issue?  And if we're
cutting 100's of records - is it really a problem? And like Ed said, it's
noise, and I don't know enough to know whether it's a problem or not, or
sometimes how to go about diagnosing it.  So I was hoping to get some help
(which I have) in how to handle these and others going forward.

Thanks for your and others' responses, though. It's much appreciated and I'm
taking it all in as much as I can.


AW: Re: EREP Symptom and/or Software Records

2017-06-15 Thread Peter Hunkeler
>But I still can't get my head around, why cut 100's of symptom/software 
>records a day at all for a particular problem, if we're just going to ignore 
>them - abend or not. But I'll try to let that not keep me awake at night.



I may well be wrong with my interpretation of the above statement and a similar
one in your initial post. Anyway, here I go...


I seem to understand that you got the impression that it is all those SLIPs 
that are responsible for the logrec entries, that is, the logrec records are 
written because of a SLIP. This is not the case. A problem arises in the 
software, and this may lead to an ABEND (SVC 13) being issued either by the 
software explicitly, or by some service routine that was called. Logrec records 
are a consequence of this.


SLIPs are set to perform an action when events, such as ABENDs, and a lot more, 
occur. The logrec records are written independently.


--
Peter Hunkeler





Re: EREP Symptom and/or Software Records

2017-06-15 Thread Edward Gould
> On Jun 14, 2017, at 8:47 AM, Turner Cheryl L  wrote:
> 
> I was hoping you'd chime in John (and I appreciate the responses from Skip, 
> Lizette and EdG, as well).  Since I am the person that does the maintenance 
> and OS upgrades now, I was taking it upon myself to be a bit proactive (or 
> creating busy work?) and looking at the EREP reports for potential problems 
> where there may be PTFs available but maybe we just didn't have them on at 
> the time due to our maintenance windows.  We are still in the process of fine 
> tuning which reports to generate and we are unloading the reports to GDGs .
> 
> So I will summarize the advice as this:  Look at them, as you have time.  
> Decide if any of them are a true problem or something worth investigating 
> further, check out IBMLINK and/or look for a way to fix it. Open a PMR to 
> IBM/vendor if really unsure.
> 
> But I still can't get my head around, why cut 100's of symptom/software 
> records a day at all for a particular problem, if we're just going to ignore 
> them - abend or not. But I'll try to let that not keep me awake at night.   
> 
> Thanks everyone.

I don’t know about others but I like a clean ship. It's a few-minutes-a-day
routine. If you spot anything that looks unusual then it's worth a look. Unusual
is different for each person. If I saw more than 2, that was unusual to me.
The research was done with the daily look at any HIPER APARs. The number of
HIPERs was small (usually). I hated to call IBM for a problem that was already
addressed. Generally talking with IBM was not an issue except for things like
CBPDO and other system distributions. The CBPDO calls turned into longish calls
and were on average the least productive. Other places at IBM were superb and
it was 5 minutes at most, unless we forget the PSF people, who were thorough and
understood traces better than I. When it came to level 3 the phone calls were
longish but, except for a couple of instances, relatively painless. When I got
into a disagreement with level 2, I just called the duty manager and explained
the issue and he seemed to always come out on my side.
This clean ship, as I called it, showed in the fact we almost never had
an outage due to software. We were on the bleeding edge with DASD and always
had to keep DFDSS up to date. That was the other reason I kept the system up to
date.

Ed




Re: EREP Symptom and/or Software Records

2017-06-15 Thread Turner Cheryl L
Thank you Bruce.  A coworker had downloaded the Logrec Viewer many moons ago, 
but I wasn't aware of it or that we had it.  Your post reminded him to tell me 
about it and confirm it still works :)

We are not currently exploiting PFA.




Re: EREP Symptom and/or Software Records

2017-06-14 Thread Bruce Hewson
Hello Cheryl,

Check out the Logrec Viewer exec:


https://www.ibm.com/systems/z/os/zos/tools/downloads/logrec-viewer.html


Also, you can monitor the PFA_LOGREC_ARRIVAL_RATE health check to tell you when
you get a burst of LOGREC events.
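
To illustrate what "a burst" could mean in practice, here is a small Python
sketch that flags hours whose record count stands well above the recent
norm. This is not how PFA computes its check (its modeling is not described
in this thread); the function name, the window, the factor and the minimum
count are all arbitrary assumptions for the example.

```python
from statistics import median


def find_bursts(hourly_counts, window=24, factor=3.0, min_count=10):
    """Return the indexes of hours whose count exceeds factor * median of
    the preceding `window` hours and is at least `min_count`."""
    bursts = []
    for i, count in enumerate(hourly_counts):
        history = hourly_counts[max(0, i - window):i]
        if not history:
            continue  # nothing to compare the first hour against
        baseline = median(history)
        if count >= min_count and count > factor * baseline:
            bursts.append(i)
    return bursts


# Invented hourly LOGREC record counts; the seventh hour (index 6) stands out.
counts = [4, 5, 3, 6, 4, 5, 40, 5, 4]
print(find_bursts(counts))
```

Comparing against a median of recent hours, rather than a fixed number,
reflects the point made elsewhere in the thread that "normal" differs from
one environment to another.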

Regards
Bruce

On Tue, 13 Jun 2017 16:24:34 -0500, Turner Cheryl L  
wrote:

>Greetings.
>Our former sysprog, who paid attention to the finer system details, has 
>left the building for greener pastures.  So now we seem to have to step up our 
>game.  However, I'm not sure what to do or how.
>
>We are running several EREP reports to see what software or symptom records 
>are being cut per LPAR (mostly just HISTORY reports for now).  We are finding 
>that a lot of records are being cut at the time an IBM supplied SLIP trap is 
>taken (for example X13E, X47B, X91A).  Some of these records can exceed 
>hundreds on a given day.
>
>What should we/I be doing? Reporting them to IBM? I just don't understand why 
>IBM would set the SLIP yet cut a symptom or software record too.  We can't be 
>the only shop seeing these. Yet I've tried to research a few on IBMLINK but 
>can't find any hits for known problems.  Or maybe there is a way to turn off 
>the creation of the software/symptom record?  Though I can't wrap my head 
>around that either, thinking why are they then being cut at all if it's not 
>anything to look into?
>
>Any schooling you can give, would be most appreciated! But please, be gentle. 
>I'm out of my element. Many thanks to you all.
>



Re: EREP Symptom and/or Software Records

2017-06-14 Thread Edward Finnell
Hopefully DAE has kicked in and you are just getting symptom records. In IPCS
DAE (3.5) you can get a little insight into who's waving their hands.
 
http://www-01.ibm.com/support/docview.wss?uid=isg1OA08663
 
 
In a message dated 6/14/2017 8:47:57 A.M. Central Daylight Time,  
cheryl.l.tur...@irs.gov writes:

But I still can't get my head around, why cut 100's of symptom/software
records a day at all for a particular problem, if we're just going to ignore
them - abend or not. But I'll try to let that not keep me awake at night.
 




Re: EREP Symptom and/or Software Records

2017-06-14 Thread John Eells

Turner Cheryl L wrote:

I was hoping you'd chime in John (and I appreciate the responses from Skip, 
Lizette and EdG, as well).  Since I am the person that does the maintenance and 
OS upgrades now, I was taking it upon myself to be a bit proactive (or creating 
busy work?) and looking at the EREP reports for potential problems where there 
may be PTFs available but maybe we just didn't have them on at the time due to 
our maintenance windows.  We are still in the process of fine tuning which 
reports to generate and we are unloading the reports to GDGs .

So I will summarize the advice as this:  Look at them, as you have time.  
Decide if any of them are a true problem or something worth investigating 
further, check out IBMLINK and/or look for a way to fix it. Open a PMR to 
IBM/vendor if really unsure.

But I still can't get my head around, why cut 100's of symptom/software records 
a day at all for a particular problem, if we're just going to ignore them - 
abend or not. But I'll try to let that not keep me awake at night.



Well, I think we likely don't know whether they are truly problems or 
not, so we cut the records so we have first failure data capture for an 
actual problem.  Back when first we started to do that, there were 
probably only a few of these records a day.  But systems...well..."got 
bigger and faster."


I'd suggest getting familiar with what's out there, as a side research 
project, and then deciding for yourself whether there is preventive 
value in spending time on things like this.  There might be, if you find 
unusual records or patterns of activity, but doing this by IEB-EYEBALL 
takes an awful lot of time so post-processing them is probably a better 
way to do that.  The value in doing the research at all is in being able 
to skip over things you don't care about when you have to look at the 
data to try to find a problem.
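
As a rough sketch of the kind of post-processing meant here, the Python below
groups symptom strings that differ only in job name and lists the ones seen
most often. It assumes the primary symptom strings have already been pulled
out of the EREP report by some other means; the function names and the
threshold are hypothetical.

```python
from collections import Counter


def normalize(symptom_string, ignore=("JOBN",)):
    """Drop keywords that vary from record to record so that otherwise
    identical symptom strings compare equal."""
    tokens = [
        token for token in symptom_string.split()
        if token.partition("/")[0] not in ignore
    ]
    return " ".join(tokens)


def summarize(symptom_strings, threshold=100):
    """Count records per normalized symptom string and return those seen
    at least `threshold` times, most frequent first."""
    counts = Counter(normalize(s) for s in symptom_strings)
    return [(s, n) for s, n in counts.most_common() if n >= threshold]


# Symptom strings as quoted in this thread, differing only in JOBN.
records = [
    "PIDS/5695DF105 RIDS/IGG0CLA9 RIDS/IGG0CLX0#L PRCS/00F6 PRCS/ JOBN/DBP1DBM1",
    "PIDS/5695DF105 RIDS/IGG0CLA9 RIDS/IGG0CLX0#L PRCS/00F6 PRCS/ JOBN/DBP3DBM1",
    "PIDS/5695DF105 RIDS/IGG0CLA9 RIDS/IGG0CLX0#L PRCS/00F6 PRCS/ JOBN/DBP3DBM1",
]
for symptoms, n in summarize(records, threshold=2):
    print(n, symptoms)
```

Anything that falls below the threshold, or that shows up for the first
time, is then a much shorter list to look at by eye.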


Likewise, you will find that if you clear out the list of dumps to be 
suppressed that many of the dumps that result won't be worth pursuing. 
(RTM, in particular, was or is often a victim.)


Anyway, that's my opinion.

--
John Eells
IBM Poughkeepsie
ee...@us.ibm.com



Re: EREP Symptom and/or Software Records

2017-06-14 Thread Turner Cheryl L
I was hoping you'd chime in John (and I appreciate the responses from Skip, 
Lizette and EdG, as well).  Since I am the person that does the maintenance and 
OS upgrades now, I was taking it upon myself to be a bit proactive (or creating 
busy work?) and looking at the EREP reports for potential problems where there 
may be PTFs available but maybe we just didn't have them on at the time due to 
our maintenance windows.  We are still in the process of fine tuning which 
reports to generate and we are unloading the reports to GDGs .

So I will summarize the advice as this:  Look at them, as you have time.  
Decide if any of them are a true problem or something worth investigating 
further, check out IBMLINK and/or look for a way to fix it. Open a PMR to 
IBM/vendor if really unsure.

But I still can't get my head around, why cut 100's of symptom/software records 
a day at all for a particular problem, if we're just going to ignore them - 
abend or not. But I'll try to let that not keep me awake at night.   

Thanks everyone.

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@listserv.ua.edu] On Behalf 
Of John Eells
Sent: Wednesday, June 14, 2017 9:10 AM
To: IBM-MAIN@listserv.ua.edu
Subject: Re: EREP Symptom and/or Software Records

Turner Cheryl L wrote:
> Greetings.
> Our former sysprog, who paid attention to the finer system details, has 
> left the building for greener pastures.  So now we seem to have to step up 
> our game.  However, I'm not sure what to do or how.
>
> We are running several EREP reports to see what software or symptom records 
> are being cut per LPAR (mostly just HISTORY reports for now).  We are finding 
> that a lot of records are being cut at the time an IBM supplied SLIP trap is 
> taken (for example X13E, X47B, X91A).  Some of these records can exceed 
> hundreds on a given day.
>
> What should we/I be doing? Reporting them to IBM? I just don't understand why 
> IBM would set the SLIP yet cut a symptom or software record too.  We can't be 
> the only shop seeing these. Yet I've tried to research a few on IBMLINK but 
> can't find any hits for known problems.  Or maybe there is a way to turn off 
> the creation of the software/symptom record?  Though I can't wrap my head 
> around that either, thinking why are they then being cut at all if it's not 
> anything to look into?
>
> Any schooling you can give, would be most appreciated! But please, be gentle. 
> I'm out of my element. Many thanks to you all.

If you care to research these, look up the abends and see whether you really 
care.  Using 13E as an example:

Explanation: The task that created a subtask issued a DETACH macro for that 
subtask, specifying STAE=NO, before the subtask ended.
This may or may not be an error, depending on the intent of the user. 
Consequently, the system does not abnormally end the task issuing the DETACH 
macro.

With no proof to back my assumptions, I nonetheless suspect that, if you look 
at these, you will find that most or all of those are from application programs 
that summarily shot their subtasks rather than signaling them to complete or 
waiting for them to complete.

You can discuss such things with the application owner, but you might find most 
of them reluctant to "fix" the problem when what they are doing is working from 
their perspective and the net ill effect is "noise" in EREP reports.

I will observe, though, that the system programmer action for ABEND13E could 
use some updates...

--
John Eells
IBM Poughkeepsie
ee...@us.ibm.com




Re: EREP Symptom and/or Software Records

2017-06-14 Thread John Eells

Turner Cheryl L wrote:

Greetings.
Our former sysprog, who paid attention to the finer system details, has 
left the building for greener pastures.  So now we seem to have to step up our 
game.  However, I'm not sure what to do or how.

We are running several EREP reports to see what software or symptom records are 
being cut per LPAR (mostly just HISTORY reports for now).  We are finding that 
a lot of records are being cut at the time an IBM supplied SLIP trap is taken 
(for example X13E, X47B, X91A).  Some of these records can exceed hundreds on a 
given day.

What should we/I be doing? Reporting them to IBM? I just don't understand why 
IBM would set the SLIP yet cut a symptom or software record too.  We can't be 
the only shop seeing these. Yet I've tried to research a few on IBMLINK but 
can't find any hits for known problems.  Or maybe there is a way to turn off the 
creation of the software/symptom record?  Though I can't wrap my head around 
that either, thinking why are they then being cut at all if it's not anything 
to look into?

Any schooling you can give, would be most appreciated! But please, be gentle. 
I'm out of my element. Many thanks to you all.


If you care to research these, look up the abends and see whether you 
really care.  Using 13E as an example:


Explanation: The task that created a subtask issued a DETACH macro for 
that subtask, specifying STAE=NO, before the subtask ended.
This may or may not be an error, depending on the intent of the user. 
Consequently, the system does not abnormally end the task issuing the 
DETACH macro.

With no proof to back my assumptions, I nonetheless suspect that, if you 
look at these, you will find that most or all of those are from 
application programs that summarily shot their subtasks rather than 
signaling them to complete or waiting for them to complete.
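
As a loose analogy only, the "signal and wait" alternative can be sketched 
in Python. Python threads are not z/OS subtasks, and nothing below uses 
ATTACH or DETACH; the names worker, run_and_stop, and the stop event are 
hypothetical. The point is just the shape: the parent asks the child to 
finish and then waits for it, instead of removing it while it is still 
running.

```python
import threading


def worker(stop: threading.Event, log: list[str]) -> None:
    """Hypothetical child task: works until the parent signals it to stop."""
    # Event.wait() returns True once the event is set, False on timeout.
    while not stop.wait(timeout=0.01):
        pass  # placeholder for one unit of real work
    # The child gets a chance to clean up because it ended on its own terms.
    log.append("cleaned up")


def run_and_stop() -> list[str]:
    """Hypothetical parent: start the child, signal it, wait for it to end."""
    stop = threading.Event()
    log: list[str] = []
    child = threading.Thread(target=worker, args=(stop, log))
    child.start()
    stop.set()    # signal the child to complete
    child.join()  # wait for it to complete before moving on
    return log


if __name__ == "__main__":
    print(run_and_stop())
```

Whether an application on z/OS is worth changing to behave this way is the 
judgment call described in the next paragraph.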


You can discuss such things with the application owner, but you might 
find most of them reluctant to "fix" the problem when what they are 
doing is working from their perspective and the net ill effect is 
"noise" in EREP reports.


I will observe, though, that the system programmer action for ABEND13E 
could use some updates...


--
John Eells
IBM Poughkeepsie
ee...@us.ibm.com



Re: EREP Symptom and/or Software Records

2017-06-13 Thread Lizette Koehler
So Logrec has two types of sysprogs using it:

the ones that want to know everything that is going on and read it hourly (I 
used to do that)

the ones that only look at it when it is associated with a problem (I do that 
now).

If your shop has a lot of issues with software and/or hardware, then reviewing 
logrec is helpful.  There is a tool called LOGGER that helps with reviewing 
logrec online.  I am not sure where that REXX is but I think you should be 
able to find it in an IBM Tools area.


If you do not have a lot of software or hardware issues, then you might want to 
revisit what should be done about logrec.

I have it in an offloaded GDG and rarely look at it.  When I had physical 
cartridge tape, I needed to see how bad the errors were on the tapes.  Now that 
I have virtual tape, I hardly look at it at all.

Lizette



-Original Message-
>From: Turner Cheryl L <cheryl.l.tur...@irs.gov>
>Sent: Jun 13, 2017 2:24 PM
>To: IBM-MAIN@LISTSERV.UA.EDU
>Subject: EREP Symptom and/or Software Records
>
>Greetings.
>Our former sysprog, who paid attention to the finer system details, has 
>left the building for greener pastures.  So now we seem to have to step up our 
>game.  However, I'm not sure what to do or how.
>
>We are running several EREP reports to see what software or symptom records 
>are being cut per LPAR (mostly just HISTORY reports for now).  We are finding 
>that a lot of records are being cut at the time an IBM supplied SLIP trap is 
>taken (for example X13E, X47B, X91A).  Some of these records can exceed 
>hundreds on a given day.
>
>What should we/I be doing? Reporting them to IBM? I just don't understand why 
>IBM would set the SLIP yet cut a symptom or software record too.  We can't be 
>the only shop seeing these. Yet I've tried to research a few on IBMLINK but 
>can't find any hits for known problems.  Or maybe there is a way to turn off 
>the creation of the software/symptom record?  Though I can't wrap my head 
>around that either, thinking why are they then being cut at all if it's not 
>anything to look into?
>
>Any schooling you can give, would be most appreciated! But please, be gentle. 
>I'm out of my element. Many thanks to you all.
>



Re: EREP Symptom and/or Software Records

2017-06-13 Thread Edward Gould
> On Jun 13, 2017, at 4:24 PM, Turner Cheryl L  wrote:
> 
> Greetings.
> Our former sysprog, who paid attention to the finer system details, has 
> left the building for greener pastures.  So now we seem to have to step up 
> our game.  However, I'm not sure what to do or how.
> 
> We are running several EREP reports to see what software or symptom records 
> are being cut per LPAR (mostly just HISTORY reports for now).  We are finding 
> that a lot of records are being cut at the time an IBM supplied SLIP trap is 
> taken (for example X13E, X47B, X91A).  Some of these records can exceed 
> hundreds on a given day.
> 
> What should we/I be doing? Reporting them to IBM? I just don't understand why 
> IBM would set the SLIP yet cut a symptom or software record too.  We can't be 
> the only shop seeing these. Yet I've tried to research a few on IBMLINK but 
> can't find any hits for known problems.  Or maybe there is a way to turn off 
> the creation of the software/symptom record?  Though I can't wrap my head 
> around that either, thinking why are they then being cut at all if it's not 
> anything to look into?

I was doing this in the 1990s. I would look through the report to see if there 
were any duplicates. If there were, I would fire up IBMLink and do a search 
on the modules and abend code. I almost always got a hit. I would research the 
APAR, look up the PTF number, and it would go in ASAP (scheduled IPL 
every weekend). Sometimes a system dump would show up as well; I kept an eye on 
those too.
It kept our system up and running, and no strange things ever happened.
I also kept an eagle eye on PSF abends, as we were ECS and always needed the 
latest software available. I was also on a first-name basis with Level 2/3 for 
PSF issues. It was a lot of work because IMO PSF was bug-ridden when we got it.
Ed
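
The duplicate check described above can be sketched as a simple tally. This 
is a minimal sketch, assuming the abend code and module name have already 
been pulled out of the report by some other means; the record values and 
program names (PGMA, PGMB, PGMC) are made up, and nothing here reads real 
EREP output.

```python
from collections import Counter


def recurring(records: list[tuple[str, str]], threshold: int = 2):
    """Return (abend, module) pairs seen at least `threshold` times,
    most frequent first, together with their counts."""
    counts = Counter(records)
    return [(pair, n) for pair, n in counts.most_common() if n >= threshold]


if __name__ == "__main__":
    # Hypothetical (abend code, module) pairs transcribed from a report.
    sample = [
        ("13E", "PGMA"),
        ("47B", "PGMB"),
        ("13E", "PGMA"),
        ("13E", "PGMA"),
        ("47B", "PGMB"),
        ("91A", "PGMC"),
    ]
    for (abend, module), n in recurring(sample):
        print(f"{abend} in {module}: {n} occurrences")
```

Pairs that keep recurring are the ones worth a search; a one-off is less 
likely to match a known problem.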
> 
> Any schooling you can give, would be most appreciated! But please, be gentle. 
> I'm out of my element. Many thanks to you all.



Re: EREP Symptom and/or Software Records

2017-06-13 Thread Jesse 1 Robinson
Those IBM supplied SLIP traps are generally there to suppress irrelevant SVC 
dumps:

SL SET,C=13E,ID=X13E,A=NODUMP,END

Imagine what would happen if you got a dump for each S13E in addition to the 
LOGREC record? The truth is that some abends relating to 'improper' termination 
cleanup are just part of MVS noise. They seldom mean anything at all, but they 
are after all abends, so they are acknowledged and recorded. But unless you (or 
a vendor) need to pursue the cause, nothing else needs to be done. 

There are also NODUMP or NOSVCD SLIP traps for S013, Sx37, and other common 
'user error' abends. If you need a dump for such an error, you can override the 
SLIP trap, but in most cases, the cause--and the solution--are obvious. 
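
For anyone who wants to list which traps suppress dumps, here is a rough 
Python sketch that picks apart a SLIP SET command like the one above. It 
assumes the simple KEYWORD=value form, treats C/COMP and A/ACTION as the 
same keyword, and the function names are hypothetical. It is not a full 
SLIP parser.

```python
# Keyword abbreviations this sketch normalizes (C -> COMP, A -> ACTION).
ALIASES = {"C": "COMP", "A": "ACTION"}

# Actions named in this thread as dump suppressors.
NO_DUMP_ACTIONS = {"NODUMP", "NOSVCD"}


def split_operands(command: str) -> list[str]:
    """Split on commas that are not inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in command:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_slip_set(command: str) -> dict[str, str]:
    """Return the KEYWORD=value operands of a SLIP SET command as a dict."""
    result = {}
    # The first operand is the verb itself ("SL SET" or "SLIP SET").
    for operand in split_operands(command)[1:]:
        key, sep, value = operand.partition("=")
        if sep:  # skip bare operands such as END
            key = key.strip().upper()
            result[ALIASES.get(key, key)] = value.strip()
    return result


def suppresses_dump(command: str) -> bool:
    """True if any action on the trap is NODUMP or NOSVCD."""
    action = parse_slip_set(command).get("ACTION", "")
    actions = action.strip("()").split(",")
    return any(a.strip() in NO_DUMP_ACTIONS for a in actions)


if __name__ == "__main__":
    cmd = "SL SET,C=13E,ID=X13E,A=NODUMP,END"
    trap = parse_slip_set(cmd)
    print(trap["ID"], trap["COMP"], suppresses_dump(cmd))
```

Running something like this over the SLIP commands in your own parmlib 
member would give a quick list of abend codes whose LOGREC entries are 
expected to arrive without a dump.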

.
.
J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler 
SHARE MVS Program Co-Manager
323-715-0595 Mobile
626-543-6132 Office ⇐=== NEW
robin...@sce.com


-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Turner Cheryl L
Sent: Tuesday, June 13, 2017 2:25 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: (External):EREP Symptom and/or Software Records

Greetings.
Our former sysprog, who paid attention to the finer system details, has 
left the building for greener pastures.  So now we seem to have to step up our 
game.  However, I'm not sure what to do or how.

We are running several EREP reports to see what software or symptom records are 
being cut per LPAR (mostly just HISTORY reports for now).  We are finding that 
a lot of records are being cut at the time an IBM supplied SLIP trap is taken 
(for example X13E, X47B, X91A).  Some of these records can exceed hundreds on a 
given day.

What should we/I be doing? Reporting them to IBM? I just don't understand why 
IBM would set the SLIP yet cut a symptom or software record too.  We can't be 
the only shop seeing these. Yet I've tried to research a few on IBMLINK but 
can't find any hits for known problems.  Or maybe there is a way to turn off the 
creation of the software/symptom record?  Though I can't wrap my head around 
that either, thinking why are they then being cut at all if it's not anything 
to look into?

Any schooling you can give, would be most appreciated! But please, be gentle. 
I'm out of my element. Many thanks to you all.




EREP Symptom and/or Software Records

2017-06-13 Thread Turner Cheryl L
Greetings.
Our former sysprog, who paid attention to the finer system details, has 
left the building for greener pastures.  So now we seem to have to step up our 
game.  However, I'm not sure what to do or how.

We are running several EREP reports to see what software or symptom records are 
being cut per LPAR (mostly just HISTORY reports for now).  We are finding that 
a lot of records are being cut at the time an IBM supplied SLIP trap is taken 
(for example X13E, X47B, X91A).  Some of these records can exceed hundreds on a 
given day.

What should we/I be doing? Reporting them to IBM? I just don't understand why 
IBM would set the SLIP yet cut a symptom or software record too.  We can't be 
the only shop seeing these. Yet I've tried to research a few on IBMLINK but 
can't find any hits for known problems.  Or maybe there is a way to turn off the 
creation of the software/symptom record?  Though I can't wrap my head around 
that either, thinking why are they then being cut at all if it's not anything 
to look into?

Any schooling you can give, would be most appreciated! But please, be gentle. 
I'm out of my element. Many thanks to you all.
