Re: CEEDUMP possible following 'new' failure
Lots of questions ...

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Bill Woodger
Sent: Sunday, October 23, 2016 10:08 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

> Do you need to reacquire the storage, or does the LE dump routine hang around
> for the second time through?

My logic is such that I terminate (neatly, with CSA and exit cleanup) at that point, so I never have a second time through. One would have to experiment.

> Would it be possible to load the LE dump routine instead of doing the initial
> GETMAIN? And your own routine?

That occurred to me, but I did not try it. It would potentially be a better solution, because one would not be guessing at the amount of storage required. I'm a commercial product developer, not a z/OS researcher, so "works perfectly" is good enough for me. Also, this is just an interim solution until Mr. R beats the LE people into putting their stuff into LPA. There is no "my routine" -- my recovery code is all resident in the main load module.

> Do you have issues with something else looking for storage while you are
> processing the first one?

Nope. There quite intentionally are no new's, GETMAINs, or malloc()'s in the recovery code. And very quickly it starts doing delete's like there was no tomorrow (which there isn't, at least not for that instance).

> Is the dump output in the dataset where the four blank lines were, and are
> they still there, or perhaps some "artefact" of failure?

I did not think to look explicitly for the four blank lines, but the dump looked "perfect" at first glance. It has the traceback, which is the main thing I was interested in. Perhaps the four blank lines are "between sections." Before, the sections were missing, so the blank lines looked out of place; now the sections are there, so the blank lines look appropriate.
-- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: CEEDUMP possible following 'new' failure
Do you need to reacquire the storage, or does the LE dump routine hang around for the second time through?

Would it be possible to load the LE dump routine instead of doing the initial GETMAIN? And your own routine?

Do you have issues with something else looking for storage while you are processing the first one?

Is the dump output in the dataset where the four blank lines were, and are they still there, or perhaps some "artefact" of failure?
Re: CEEDUMP possible following 'new' failure
I know this is a slightly aged thread, but I wanted to share an interim remediation with anyone who has this problem. I was able to resolve it by doing a GETMAIN (vanilla SP=0) at startup and freeing the area just before attempting to invoke the LE dump. I started out with 60Ki bytes and that resolved the problem. I did not experiment to see if a somewhat smaller area would do the trick.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jim Mulder
Sent: Thursday, October 06, 2016 4:33 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

From some internal discussion after this issue was raised today, our intention is that LE will move the CEEDUMP modules to SCEELPA in the next release of z/OS.

Jim Mulder
z/OS Diagnosis, Design, Development, Test
IBM Corp.
Re: CEEDUMP possible following 'new' failure
Hi Peter,

Yes, I am aware of that, but we also have hang conditions where the jobs are not out of memory, and it might help in those situations. Many of these situations occur when UNIX System Services are in the game.

Thanks.

-----Original Message-----
From: Peter Relson <rel...@us.ibm.com>
To: IBM-MAIN <IBM-MAIN@LISTSERV.UA.EDU>
Sent: Tue, Oct 11, 2016 1:44 pm
Subject: Re: CEEDUMP possible following 'new' failure

The "callrtm command" will do no better than anything else that requires private storage of the address space to run. It is nothing more than a targeted cancel. Out of private storage is out of private storage.

Peter Relson
z/OS Core Technology Design
Re: CEEDUMP possible following 'new' failure
The "callrtm command" will do no better than anything else that requires private storage of the address space to run. It is nothing more than a targeted cancel. Out of private storage is out of private storage.

Peter Relson
z/OS Core Technology Design
Re: CEEDUMP possible following 'new' failure
Hi Barbara,

No, we did not have Compuware in this environment. But the callrtm command seems like something we need to look at. Thanks for that.

Denis.

-----Original Message-----
From: Barbara Nitz <nitz-...@gmx.net>
To: IBM-MAIN <IBM-MAIN@LISTSERV.UA.EDU>
Sent: Mon, Oct 10, 2016 9:22 am
Subject: Re: CEEDUMP possible following 'new' failure

>I cannot remember exactly, but what happened was that in IMS the STOP REGION
>command was issued and the address space was not listed anymore in IMS
>(Display active showed it was gone).
>It was visible in JES but nothing could be done about it, it did neither
>accept cancel nor force.

Do you by any chance run Compuware Xpediter or something under IMS? A few jobs ago, we had the same problem with our IMS regions, and it turned out that Compuware code was hindering termination. (I took a dump and looked.)

In some cases I was successful by using the (unsupported) callrtm program that is now a (supported) operator command. I specified the ASID and the bottommost TCB and ran that program a number of times. In about 70% of the cases it succeeded in terminating the hung region; in the remaining 30%, a complete IMS restart was needed (to enable the user whose ID was blocked to work again).

Barbara
Re: CEEDUMP possible following 'new' failure
>I cannot remember exactly, but what happened was that in IMS the STOP REGION
>command was issued and the address space was not listed anymore in IMS
>(Display active showed it was gone).
>It was visible in JES but nothing could be done about it, it did neither
>accept cancel nor force.

Do you by any chance run Compuware Xpediter or something under IMS? A few jobs ago, we had the same problem with our IMS regions, and it turned out that Compuware code was hindering termination. (I took a dump and looked.)

In some cases I was successful by using the (unsupported) callrtm program that is now a (supported) operator command. I specified the ASID and the bottommost TCB and ran that program a number of times. In about 70% of the cases it succeeded in terminating the hung region; in the remaining 30%, a complete IMS restart was needed (to enable the user whose ID was blocked to work again).

Barbara
Re: CEEDUMP possible following 'new' failure
> It was visible in JES but nothing could be done about it, it did neither
> accept cancel nor force. Fault Analyzer showed that the last thing that
> happened in the address space was trying to load some z/OS routines for
> termination (if it was not memory termination then it must have been task
> termination) and failed to load those routines because of an out of storage
> condition.

You might consider the possibility that JES2 or Fault Analyzer is in error if this is true. If force is not accepted, then the address space no longer exists as far as the system is concerned. What exactly do you mean by "accept ... force"?

It is true that if memory termination fails because of some system error, an address space could land in limbo. But that would not have anything to do with insufficient memory in the address space. And if the master address space (where memterm runs) has insufficient storage, then some authorized program has screwed things up and you'll likely need to re-IPL (and find/fix that program).

> If task termination, which is an operating system function, requires storage
> in an address space with no storage left, it should ensure that there is
> always enough room for task termination.

Unfortunately, in the general case this is provably impossible, just as it is provably impossible in the general case to detect infinite looping. I would guess that a huge percentage of customers have chosen a prudent approach with their user exits to prevent the vast majority of these cases: by limiting region size somewhat, not allowing someone to allocate all available storage as user-region storage (whether below or above 16M, or even above 2G).

Peter Relson
z/OS Core Technology Design
Re: CEEDUMP possible following 'new' failure
I have this note from a few years ago that concurs:

- Early SAS notes recommended REGION=0M, but with early V9.1, only for diagnostics AFTER an out-of-memory condition, a specific REGION was recommended in http://support.sas.com/kb/18401: "The thought is that if we have exhausted the full region and abnormal termination occurs as a result, there is not sufficient ceiling within the address space to properly clean up. This can lead to damaged libraries, potential hanging and looping within the address space."

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Peter Hunkeler
Sent: Sunday, October 9, 2016 9:44 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: AW: Re: CEEDUMP possible following 'new' failure

>Fault Analyzer showed that the last thing that happened in the address space
>was trying to load some z/OS routines for termination (if it was not memory
>termination then it must have been task termination) and failed to load those
>routines because of an out of storage condition.

I have in mind that, at least in earlier times, it was recommended *not* to code REGION=0M. With REGION=0M the code can eat up all storage in the address space. Of course, authorized software may easily overcome any REGION setting (I guess).

--
Peter Hunkeler
AW: Re: CEEDUMP possible following 'new' failure
>Fault Analyzer showed that the last thing that happened in the address space
>was trying to load some z/OS routines for termination (if it was not memory
>termination then it must have been task termination) and failed to load those
>routines because of an out of storage condition.

I have in mind that, at least in earlier times, it was recommended *not* to code REGION=0M. With REGION=0M the code can eat up all storage in the address space. Of course, authorized software may easily overcome any REGION setting (I guess).

--
Peter Hunkeler
Re: CEEDUMP possible following 'new' failure
Hi Jim,

I cannot remember exactly, but what happened was that in IMS the STOP REGION command was issued and the address space was not listed anymore in IMS (Display active showed it was gone). It was visible in JES but nothing could be done about it; it accepted neither cancel nor force. Fault Analyzer showed that the last thing that happened in the address space was trying to load some z/OS routines for termination (if it was not memory termination then it must have been task termination), and it failed to load those routines because of an out-of-storage condition.

So the expectation of everyone for this situation is: task termination should be possible regardless of whether there was an IEFUSI reserving the 512K below or not. If task termination, which is an operating system function, requires storage in an address space with no storage left, it should ensure that there is always enough room for task termination.

Thanks.

-----Original Message-----
From: Jim Mulder <d10j...@us.ibm.com>
To: IBM-MAIN <IBM-MAIN@LISTSERV.UA.EDU>
Sent: Fri, Oct 7, 2016 8:39 pm
Subject: Re: CEEDUMP possible following 'new' failure

> this reminds me of some hanging IMS jobs that could neither be
> cancelled nor forced because the routines for memterm could not be
> loaded because memory was exhausted. Only BMC tooling allowed us to get
> rid of them.
> The suggestion in the PMR was to code an IEFUSI to reserve 512K
> below to allow memterm to happen in any case.
>
> Could you please raise another internal discussion on why IEFUSI has to
> be coded at all in order to allow memterm to happen?
> Why can't z/OS just ensure that there is always enough storage
> available in the address space for memterm?

Since memterm does not access the storage of the address space being terminated, there is no connection between IEFUSI and memterm. There is no requirement for any available storage in the address space being memtermed. Task termination, yes. Memory termination, no.

Jim Mulder
z/OS Diagnosis, Design, Development, Test
IBM Corp. Poughkeepsie NY
Re: CEEDUMP possible following 'new' failure
On 10/7/2016 1:38 PM, Jim Mulder wrote:

> Since memterm does not access the storage of the address space being
> terminated, there is no connection between IEFUSI and memterm. There is no
> requirement for any available storage in the address space being memtermed.
> Task termination, yes. Memory termination, no.

I think there have been cases where an address space determined it was in trouble and wanted to MEMTERM itself, but was unable to issue a CALLRTM MEMTERM for *itself* due to storage being exhausted. But, as Jim said, a FORCE command runs in *MASTER* and does not need any storage in the address space.

Greg
Re: CEEDUMP possible following 'new' failure
> this reminds me of some hanging IMS jobs that could neither be
> cancelled nor forced because the routines for memterm could not be
> loaded because memory was exhausted. Only BMC tooling allowed us to get
> rid of them.
> The suggestion in the PMR was to code an IEFUSI to reserve 512K
> below to allow memterm to happen in any case.
>
> Could you please raise another internal discussion on why IEFUSI has to
> be coded at all in order to allow memterm to happen?
> Why can't z/OS just ensure that there is always enough storage
> available in the address space for memterm?

Since memterm does not access the storage of the address space being terminated, there is no connection between IEFUSI and memterm. There is no requirement for any available storage in the address space being memtermed. Task termination, yes. Memory termination, no.

Jim Mulder
z/OS Diagnosis, Design, Development, Test
IBM Corp. Poughkeepsie NY
Re: CEEDUMP possible following 'new' failure
Hi,

This reminds me of some hanging IMS jobs that could neither be cancelled nor forced, because the routines for memterm could not be loaded because memory was exhausted. Only BMC tooling allowed us to get rid of them. The suggestion in the PMR was to code an IEFUSI to reserve 512K below, to allow memterm to happen in any case.

Could you please raise another internal discussion on why IEFUSI has to be coded at all in order to allow memterm to happen? Why can't z/OS just ensure that there is always enough storage available in the address space for memterm?

Thanks.

-----Original Message-----
From: Jim Mulder <d10j...@us.ibm.com>
To: IBM-MAIN <IBM-MAIN@LISTSERV.UA.EDU>
Sent: Thu, Oct 6, 2016 10:33 pm
Subject: Re: CEEDUMP possible following 'new' failure

From some internal discussion after this issue was raised today, our intention is that LE will move the CEEDUMP modules to SCEELPA in the next release of z/OS.

Jim Mulder
z/OS Diagnosis, Design, Development, Test
IBM Corp. Poughkeepsie NY

> So, when will CEE.SCEELPA be z/OS standard? :)
>
> > -----Original Message-----
> > From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU]
> > On Behalf Of Jim Mulder
> > Sent: Thursday, October 06, 2016 10:48 AM
> > To: IBM-MAIN@LISTSERV.UA.EDU
> > Subject: Re: CEEDUMP possible following 'new' failure
> >
> > > The remaining problem is that I am not getting any diagnostic
> > > information, in other words, exactly *which* new failed -- which will
> > > of course make any bug of this sort in the field hard to find. I call
> > > CEEDUMP to get a call trace and it produces an *empty* four-line
> > > dataset. On the console I get
> > >
> > > IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> > > INSUFFICIENT STORAGE WAS AVAILABLE.
> > > CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24,
> > > REASON CODE 26080021, DDNAME *LNKLST*
> >
> > I would suggest putting the CEEDUMP-related modules in LPA. Our
> > intention in z/OS is that modules involved in the production of
> > SYSABEND/SYSUDUMP/SYSMDUMP/IEATDUMP/SDUMP should be in LPA, so that
> > they don't need to get loaded into exhausted REGION-constrained storage
> > while trying to take a dump of REGION-constrained storage exhaustion.
> > (And I say "our intention" because we do sometimes find cases where we
> > did not do what we intended).
AW: Re: CEEDUMP possible following 'new' failure
> From some internal discussion after this issue was raised today, our
> intention is that LE will move the CEEDUMP modules to SCEELPA in the next
> release of z/OS.

I didn't look it up and didn't care so far, but now that you mention it, I'm astonished those modules are not currently part of LE's LPA library. Could you please have another internal discussion :-) and provide a list of the LE recovery and dump modules which are not currently in LPA but should be?

--
Peter Hunkeler
Re: CEEDUMP possible following 'new' failure
Awesome! Thanks,

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jim Mulder
Sent: Thursday, October 06, 2016 1:33 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

From some internal discussion after this issue was raised today, our intention is that LE will move the CEEDUMP modules to SCEELPA in the next release of z/OS.
Re: CEEDUMP possible following 'new' failure
From some internal discussion after this issue was raised today, our intention is that LE will move the CEEDUMP modules to SCEELPA in the next release of z/OS.

Jim Mulder
z/OS Diagnosis, Design, Development, Test
IBM Corp. Poughkeepsie NY

> So, when will CEE.SCEELPA be z/OS standard? :)
>
> > -----Original Message-----
> > From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU]
> > On Behalf Of Jim Mulder
> > Sent: Thursday, October 06, 2016 10:48 AM
> > To: IBM-MAIN@LISTSERV.UA.EDU
> > Subject: Re: CEEDUMP possible following 'new' failure
> >
> > > The remaining problem is that I am not getting any diagnostic
> > > information, in other words, exactly *which* new failed -- which will
> > > of course make any bug of this sort in the field hard to find. I call
> > > CEEDUMP to get a call trace and it produces an *empty* four-line
> > > dataset. On the console I get
> > >
> > > IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> > > INSUFFICIENT STORAGE WAS AVAILABLE.
> > > CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24,
> > > REASON CODE 26080021, DDNAME *LNKLST*
> >
> > I would suggest putting the CEEDUMP-related modules in LPA. Our
> > intention in z/OS is that modules involved in the production of
> > SYSABEND/SYSUDUMP/SYSMDUMP/IEATDUMP/SDUMP should be in LPA, so that
> > they don't need to get loaded into exhausted REGION-constrained storage
> > while trying to take a dump of REGION-constrained storage exhaustion.
> > (And I say "our intention" because we do sometimes find cases where we
> > did not do what we intended).
Re: CEEDUMP possible following 'new' failure
So, when will CEE.SCEELPA be z/OS standard? :)

> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU]
> On Behalf Of Jim Mulder
> Sent: Thursday, October 06, 2016 10:48 AM
> To: IBM-MAIN@LISTSERV.UA.EDU
> Subject: Re: CEEDUMP possible following 'new' failure
>
> > The remaining problem is that I am not getting any diagnostic
> > information, in other words, exactly *which* new failed -- which will
> > of course make any bug of this sort in the field hard to find. I call
> > CEEDUMP to get a call trace and it produces an *empty* four-line
> > dataset. On the console I get
> >
> > IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> > INSUFFICIENT STORAGE WAS AVAILABLE.
> > CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24,
> > REASON CODE 26080021, DDNAME *LNKLST*
>
> I would suggest putting the CEEDUMP-related modules in LPA. Our
> intention in z/OS is that modules involved in the production of
> SYSABEND/SYSUDUMP/SYSMDUMP/IEATDUMP/SDUMP should be in LPA, so that
> they don't need to get loaded into exhausted REGION-constrained storage
> while trying to take a dump of REGION-constrained storage exhaustion.
> (And I say "our intention" because we do sometimes find cases where we
> did not do what we intended).
>
> Jim Mulder
> z/OS Diagnosis, Design, Development, Test
> IBM Corp. Poughkeepsie NY
Re: CEEDUMP possible following 'new' failure
Ah! Most excellent. Thank you.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Jim Mulder
Sent: Thursday, October 06, 2016 10:48 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

> The remaining problem is that I am not getting any diagnostic information,
> in other words, exactly *which* new failed -- which will of course make any
> bug of this sort in the field hard to find. I call CEEDUMP to get a call
> trace and it produces an *empty* four-line dataset. On the console I get
>
> IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> INSUFFICIENT STORAGE WAS AVAILABLE.
> CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, REASON
> CODE 26080021, DDNAME *LNKLST*

I would suggest putting the CEEDUMP-related modules in LPA. Our intention in z/OS is that modules involved in the production of SYSABEND/SYSUDUMP/SYSMDUMP/IEATDUMP/SDUMP should be in LPA, so that they don't need to get loaded into exhausted REGION-constrained storage while trying to take a dump of REGION-constrained storage exhaustion. (And I say "our intention" because we do sometimes find cases where we did not do what we intended).
Re: CEEDUMP possible following 'new' failure
The reserve seems to be used as the new stack segment, and anything else can still gobble it up. Gets a U4008 with 1004, not a 1024, apparently. A larger reserve may help if you still have things acquiring storage. But then, you didn't get a U4008. Does the production of an LE dump acquire storage? I'd suspect so.

Interesting here: you said POSIX(ON), but I have no idea what you meant by "dubbed"?

"When an error occurs that would cause a CEEDUMP to be taken, and this is a POSIX application, Language Environment writes this dump to the current directory. Output from CEE3DMP is written to one of the following (in top-down order):

1. The directory found in _CEE_DMPTARG, if found.
2. The current working directory, if this is not the root (/), if the directory is writable, and if the dump pathname (made up of the cwd pathname plus the dump file name) does not exceed 1024 characters.
3. The directory found in environment variable TMPDIR, if the temporary directory is not /tmp.
4. /tmp."

You are probably looking in the right place: it started producing, and died. Sliding off-topic, is "writable" an American word?

Following the CSV message, it seems a pretty internal thing, unless you have preceding messages, which I guess you don't.
Re: CEEDUMP possible following 'new' failure
> The remaining problem is that I am not getting any diagnostic information,
> in other words, exactly *which* new failed -- which will of course make any
> bug of this sort in the field hard to find. I call CEEDUMP to get a call
> trace and it produces an *empty* four-line dataset. On the console I get
>
> IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> INSUFFICIENT STORAGE WAS AVAILABLE.
> CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, REASON
> CODE 26080021, DDNAME *LNKLST*

I would suggest putting the CEEDUMP-related modules in LPA. Our intention in z/OS is that modules involved in the production of SYSABEND/SYSUDUMP/SYSMDUMP/IEATDUMP/SDUMP should be in LPA, so that they don't need to get loaded into exhausted REGION-constrained storage while trying to take a dump of REGION-constrained storage exhaustion. (And I say "our intention" because we do sometimes find cases where we did not do what we intended).

Jim Mulder
z/OS Diagnosis, Design, Development, Test
IBM Corp. Poughkeepsie NY
Re: CEEDUMP possible following 'new' failure
No, no, I am not trying to improve a program that is using too much memory. I am trying to write recovery code that will help diagnose future allocation failures, whatever their cause. LE provides leak-analysis tools and I use them. I know where the storage is going:

    static const unsigned int size_to_allocate = 5000;
    for ( int i = 0; i < INT_MAX; i++ ) {
        char *foo = new char[size_to_allocate];  /* intentionally leaked */
    }

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Bernd Oppolzer
Sent: Thursday, October 06, 2016 9:34 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

Some suggestions:

- Try REPORT (the LE option) to see where the storage is used (below, above, user heap, LE below- or anyheap) and how much storage is used before you get in trouble; does it depend on the amount of input data? REPORT will also show if you can do any better by playing with the LE options.

- If it's not an easy case (storage acquired below but should be above/any), there will be some processing which acquires storage and does not release it; you will have to identify that processing and repair it.

- There are some tools on the market which allow you to find program parts which do such things (for example CEL4MCHK, that is, the LE memory-checking heap manager).

- I wrote a routine that takes a snapshot of the heap at a certain point in time, and then again later; it compares the two snapshots and shows the areas that have been acquired in the meantime and not freed. I was very successful in finding memory leaks with this tool at my current customer's site.
Re: CEEDUMP possible following 'new' failure
Mmm. I hear you. May be the right answer. I am kind of in love with the "character" CEE3DMP because it is easy for customers to send and not screw up, easy to view without uploading to the mainframe and firing up IPCS, and it seems to point quickly to about 99 out of 100 problems. I do call some terminate-type function; CEE3DMP() is not "fatal" -- it returns to the caller.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Don Poitras
Sent: Thursday, October 06, 2016 8:57 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

In article <01a301d21fe6$1dbcd760$59368620$@mcn.org> you wrote:

> I have been wrestling with the issue of recovery from a failure of 'new'
> (kind of like a GETMAIN for those of you who are not C people; just like
> malloc() for those of you who are C but not C++ people) in XLC/LE C++ code.
> (Yes, I know, the right answer is "don't do too many 'new's" but this is
> error recovery code. Stuff happens, or IBM would not have invented ESTAE.
> "More region size" is not the answer -- this is an intentional test of
> storage exhaustion. More storage would just make it slower to fail.)
> First I got past the non-standard behavior of XLC in that, rather than
> blowing up per the standard, XLC just returns NULL (0) for a failed new.
> Got the LANGLVL(NEWEXCP) in place.
> So I am catching 'new' exceptions. The next problem was that it was
> impossible to do anything meaningful after catching the 'new' exception
> because storage was exhausted. So I read somewhere that one had to specify
> #pragma RUNOPTS( STORAGE(,,,32K) ) to reserve some storage for the storage
> exhaustion case. I specified 48K just to be on the safe side. This made
> things better: I go through my error recovery, clean things up, and end
> gracefully.
> The remaining problem is that I am not getting any diagnostic information,
> in other words, exactly *which* new failed -- which will of course make any
> bug of this sort in the field hard to find. I call CEEDUMP to get a call
> trace and it produces an *empty* four-line dataset. On the console I get
>
> IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> INSUFFICIENT STORAGE WAS AVAILABLE.
> CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24,
> REASON CODE 26080021, DDNAME *LNKLST*
>
> Any suggestions?
> Charles

Don't call CEEDUMP. Call abort() or something like:

    {int *ptr; ptr = 0; *ptr = 1;}

Add another runopts:

    #pragma runopts(TERMTHDACT(UADUMP))

and make sure you have a SYSMDUMP DD with a dataset to use later with IPCS. IP VERBX LEDATA 'ceedump nthreads(*)' will give you tracebacks for all threads, including the one that caused the abend.

--
Don Poitras - SAS Development - SAS Institute Inc. - SAS Campus Drive
sas...@sas.com (919) 531-5637 Cary, NC 27513
Re: CEEDUMP possible following 'new' failure
STC, so sort of batch. Not UNIX command line, but POSIX(ON) and dubbed. > almost anything else asking for more memory could fail, couldn't it? Yes, in my model of how this works, LE GETMAINs a bunch of storage for heap. When I do a new, it sub-allocates from that heap. If the heap is exhausted LE does another GETMAIN. Lather, rinse, repeat. When storage is gone, it's gone, pretty much no matter who you are. I'm not sure *exactly* what the reserved storage is used for. What is the "trigger" or "authorization" for a particular request to be able to be satisfied from the reserved storage? Is that where the failed load is attempting to get its storage from? I am going to guess not, so reserving a megabyte won't make any difference. Maybe. And yes, exactly, I meant CEE3DMP(). Charles -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Bill Woodger Sent: Thursday, October 06, 2016 8:59 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: CEEDUMP possible following 'new' failure Non-batch, I assume. Whilst your "news" are sucking up memory, almost anything else asking for more memory could fail, couldn't it? Not just one of yours? -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: CEEDUMP possible following 'new' failure
Some suggestions:
- try REPORT (LE option) to see where the storage is used (below, above, User heap, LE below- or anyheap) and how much storage is used before you get in trouble; does it depend on the amount of input data? REPORT will also show if you can do any better by playing with the LE options.
- if it's not an easy case (storage acquired below, but should be above / any), there will be some processing which acquires storage and does not release it; you will have to identify that processing and repair it.
- There are some tools on the market which allow you to find program parts which do such things (for example: CEL4MCHK, that is: the LE memory-checking heap manager).
- I wrote a routine that takes a snapshot of the heap at a certain point in time, and then again later; it then compares the two snapshots and shows the areas that have been acquired in the meantime and not freed. I was very successful in finding memory leaks with this tool at my current customer's site. (The culprit was a PL/1 program, BTW, but it could have been C++, too ... the tool finds them all). If you are interested in knowing more about this, contact me offline.
Kind regards Bernd
On 06.10.2016 at 17:27, Charles Mills wrote: I have been wrestling with the issue of recovery from a failure of 'new' (kind of like a GETMAIN for those of you who are not C people; just like malloc() for those of you who are C but not C++ people) in XLC/LE C++ code. (Yes, I know, the right answer is "don't do too many 'new's" but this is error recovery code. Stuff happens, or IBM would not have invented ESTAE. "More region size" is not the answer -- this is an intentional test of storage exhaustion. More storage would just make it slower to fail.) First I got past the non-standard behavior of XLC in that rather than blowing up per the standard, XLC just returns NULL (0) for a failed new. Got the LANGLVL(NEWEXCP) in place. So I am catching 'new' exceptions. 
The next problem was that it was impossible to do anything meaningful after catching the 'new' exception because storage was exhausted. So I read somewhere that one had to specify #pragma RUNOPTS( STORAGE(,,,32K) ) to reserve some storage for the storage exhaustion case. I specified 48K just to be on the safe side. This made things better: I go through my error recovery, clean things up, and end gracefully. The remaining problem is that I am not getting any diagnostic information, in other words, exactly *which* new failed -- which will of course make any bug of this sort in the field hard to find. I call CEEDUMP to get a call trace and it produces an *empty* four-line dataset. On the console I get IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE INSUFFICIENT STORAGE WAS AVAILABLE. CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, REASON CODE 26080021, DDNAME *LNKLST* Any suggestions? Charles -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: CEEDUMP possible following 'new' failure
Sounds to me like 48K is not enough to allow the CEEDUMP process to complete. 48K seems unreasonably low to me; I would make it 128K or even more to permit the CEEDUMP process to run. I would research the CEEDUMP documentation to see if the minimum storage requirements for it to succeed are published, and if not, open a PMR with IBM to ask for it. Also, is that "storage exhaustion reserve" above the line or below the line? It might make a difference where it is reserved. HTH Peter -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Charles Mills Sent: Thursday, October 06, 2016 11:27 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: CEEDUMP possible following 'new' failure I have been wrestling with the issue of recovery from a failure of 'new' (kind of like a GETMAIN for those of you who are not C people; just like malloc() for those of you who are C but not C++ people) in XLC/LE C++ code. (Yes, I know, the right answer is "don't do too many 'new's" but this is error recovery code. Stuff happens, or IBM would not have invented ESTAE. "More region size" is not the answer -- this is an intentional test of storage exhaustion. More storage would just make it slower to fail.) First I got past the non-standard behavior of XLC in that rather than blowing up per the standard, XLC just returns NULL (0) for a failed new. Got the LANGLVL(NEWEXCP) in place. So I am catching 'new' exceptions. The next problem was that it was impossible to do anything meaningful after catching the 'new' exception because storage was exhausted. So I read somewhere that one had to specify #pragma RUNOPTS( STORAGE(,,,32K) ) to reserve some storage for the storage exhaustion case. I specified 48K just to be on the safe side. This made things better: I go through my error recovery, clean things up, and end gracefully. 
The remaining problem is that I am not getting any diagnostic information, in other words, exactly *which* new failed -- which will of course make any bug of this sort in the field hard to find. I call CEEDUMP to get a call trace and it produces an *empty* four-line dataset. On the console I get IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE INSUFFICIENT STORAGE WAS AVAILABLE. CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, REASON CODE 26080021, DDNAME *LNKLST* Any suggestions? Charles -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: CEEDUMP possible following 'new' failure
Non-batch, I assume. Whilst your "news" are sucking up memory, almost anything else asking for more memory could fail, couldn't it? Not just one of yours? Do you mean CEE3DMP? CEEDUMP is just for setting the options for an LE dump. -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: CEEDUMP possible following 'new' failure
In article <01a301d21fe6$1dbcd760$59368620$@mcn.org> you wrote: > I have been wrestling with the issue of recovery from a failure of 'new' > (kind of like a GETMAIN for those of you who are not C people; just like > malloc() for those of you who are C but not C++ people) in XLC/LE C++ code. > (Yes, I know, the right answer is "don't do too many 'new's" but this is > error recovery code. Stuff happens, or IBM would not have invented ESTAE. > "More region size" is not the answer -- this is an intentional test of > storage exhaustion. More storage would just make it slower to fail.) > First I got past the non-standard behavior of XLC in that rather than > blowing up per the standard, XLC just returns NULL (0) for a failed new. Got > the LANGLVL(NEWEXCP) in place. > So I am catching 'new' exceptions. The next problem was that it was > impossible to do anything meaningful after catching the 'new' exception > because storage was exhausted. So I read somewhere that one had to specify > #pragma RUNOPTS( STORAGE(,,,32K) ) to reserve some storage for the storage > exhaustion case. I specified 48K just to be on the safe side. This made > things better: I go through my error recovery, clean things up, and end > gracefully. > The remaining problem is that I am not getting any diagnostic information, > in other words, exactly *which* new failed -- which will of course make any > bug of this sort in the field hard to find. I call CEEDUMP to get a call > trace and it produces an *empty* four-line dataset. On the console I get > IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE > INSUFFICIENT STORAGE WAS AVAILABLE. > CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, REASON > CODE 26080021, DDNAME *LNKLST* > Any suggestions? > Charles Don't call CEEDUMP. Call abort() or something like: {int *ptr; ptr = 0; *ptr = 1;} Add another runopts: #pragma runopts(TERMTHDACT(UADUMP)) and make sure you have a SYSMDUMP DD with a dataset to use later with IPCS. 
IP VERBX LEDATA 'ceedump nthreads(*)' will give you tracebacks for all threads including the one that caused the abend. -- Don Poitras - SAS Development - SAS Institute Inc. - SAS Campus Drive sas...@sas.com (919) 531-5637, Cary, NC 27513 -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
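[Pulling Don's recipe together in one place: the two runtime options come straight from this thread, but the surrounding wrapper code and names are only an illustrative sketch (not compiled or tested on z/OS), not a definitive implementation.]

```cpp
/* Reserve storage for the out-of-storage path and request an unformatted
   system dump at abnormal termination, so LE does not have to load its
   CEEDUMP formatting modules (the CEEMENU3 fetch that fails above). */
#pragma runopts(STORAGE(,,,48K), TERMTHDACT(UADUMP))

#include <stdlib.h>

/* Hypothetical helper, called from the error-recovery path once CSA and
   exit cleanup are done. */
static void terminate_with_dump(void) {
    abort();  /* abnormal termination; with TERMTHDACT(UADUMP) this drives
                 a dump to the SYSMDUMP DD rather than a formatted CEEDUMP */
}

int main(void) {
    (void)&terminate_with_dump;  /* invoked from real recovery code */
    return 0;
}
```

[The job then needs a SYSMDUMP DD pointing at a suitable dump dataset, and IP VERBX LEDATA 'CEEDUMP NTHREADS(*)' in IPCS formats the tracebacks for all threads, as Don describes.]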
Re: CEEDUMP possible following 'new' failure
This looks very much like the assembler equivalent of an ABEND 80A: the below-the-line region space has been exhausted. Most likely the recovery process used too much below-the-line storage, so no more modules can be loaded. James -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Charles Mills Sent: Thursday, October 06, 2016 11:27 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: CEEDUMP possible following 'new' failure I have been wrestling with the issue of recovery from a failure of 'new' (kind of like a GETMAIN for those of you who are not C people; just like malloc() for those of you who are C but not C++ people) in XLC/LE C++ code. (Yes, I know, the right answer is "don't do too many 'new's" but this is error recovery code. Stuff happens, or IBM would not have invented ESTAE. "More region size" is not the answer -- this is an intentional test of storage exhaustion. More storage would just make it slower to fail.) First I got past the non-standard behavior of XLC in that rather than blowing up per the standard, XLC just returns NULL (0) for a failed new. Got the LANGLVL(NEWEXCP) in place. So I am catching 'new' exceptions. The next problem was that it was impossible to do anything meaningful after catching the 'new' exception because storage was exhausted. So I read somewhere that one had to specify #pragma RUNOPTS( STORAGE(,,,32K) ) to reserve some storage for the storage exhaustion case. I specified 48K just to be on the safe side. This made things better: I go through my error recovery, clean things up, and end gracefully. The remaining problem is that I am not getting any diagnostic information, in other words, exactly *which* new failed -- which will of course make any bug of this sort in the field hard to find. I call CEEDUMP to get a call trace and it produces an *empty* four-line dataset. On the console I get IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE INSUFFICIENT STORAGE WAS AVAILABLE. 
CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, REASON CODE 26080021, DDNAME *LNKLST* Any suggestions? Charles -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
CEEDUMP possible following 'new' failure
I have been wrestling with the issue of recovery from a failure of 'new' (kind of like a GETMAIN for those of you who are not C people; just like malloc() for those of you who are C but not C++ people) in XLC/LE C++ code. (Yes, I know, the right answer is "don't do too many 'new's" but this is error recovery code. Stuff happens, or IBM would not have invented ESTAE. "More region size" is not the answer -- this is an intentional test of storage exhaustion. More storage would just make it slower to fail.) First I got past the non-standard behavior of XLC in that rather than blowing up per the standard, XLC just returns NULL (0) for a failed new. Got the LANGLVL(NEWEXCP) in place. So I am catching 'new' exceptions. The next problem was that it was impossible to do anything meaningful after catching the 'new' exception because storage was exhausted. So I read somewhere that one had to specify #pragma RUNOPTS( STORAGE(,,,32K) ) to reserve some storage for the storage exhaustion case. I specified 48K just to be on the safe side. This made things better: I go through my error recovery, clean things up, and end gracefully. The remaining problem is that I am not getting any diagnostic information, in other words, exactly *which* new failed -- which will of course make any bug of this sort in the field hard to find. I call CEEDUMP to get a call trace and it produces an *empty* four-line dataset. On the console I get IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE INSUFFICIENT STORAGE WAS AVAILABLE. CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, REASON CODE 26080021, DDNAME *LNKLST* Any suggestions? Charles -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN