Frank,

Error handling becomes much easier once you have a proper infrastructure for reporting errors. Years ago I convinced our CICS systems programmers to add the following line to the standard CICS JCL:
//PTRACE DD SYSOUT=*,DCB=(RECFM=V,BLKSIZE=137,DSORG=PS)

This DD is pointed to by the definition of an extrapartition TDQ, PTRC, so when a program issues WRITEQ TD Q(PTRC) the output goes to the above SYSOUT. This is where we dump our program traces, informational messages and basically anything else we want. [Eventually this SYSOUT became so important in debugging various application issues that it now even gets archived along with the CICS jobs.]

Over the years we've built various automation rules: for example, if a message has a specific prefix it is echoed to MSGUSR, and some messages even become alerts and go to the console.

Then we added a "standard" subroutine that deciphers both the EIBFN and EIBRESP fields into a human-readable format and logs them to the TDQ along with the whole EIB block for further analysis. So now the application code does just this:

           EXEC CICS LINK PROGRAM(ACTION)
                COMMAREA(ACTION-COMMAREA)
                LENGTH (LENGTH OF ACTION-COMMAREA)
                NOHANDLE
           END-EXEC.
      *
           IF EIBRESP NOT = DFHRESP(NORMAL)
              PERFORM SHOW-CICS-ERROR
              STRING MY-PGM-NAME ' - Unable to LINK to ='
                     ACTION BLANK-PAD
                     DELIMITED BY SIZE INTO MESSAGE-TO-LOG
              PERFORM LOG-AND-QUIT
           END-IF

where SHOW-CICS-ERROR is the place where we call the above "standard" subroutine.

Notice that in this case we perform paragraph LOG-AND-QUIT, which terminates the task normally. In other cases, however, the application may want to LOG-AND-DIE [i.e. ABEND with a specific abend code], or just LOG-MESSAGE and keep going [retry in 1 sec, send input to an "error queue", etc.].

This very approach is also used for abend handling, and there we always produce a transaction dump, because CICS itself won't dump when an abend is HANDLEd.
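
A minimal sketch of how the three logging paragraphs described above might look. Only the paragraph names and the PTRC queue come from the description; the program name LOGDEMO, the abend code 'LOGD', the 120-byte message area and the paragraph bodies are assumptions made for illustration, not the actual site code.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOGDEMO.
      * Sketch only. LOGDEMO, abend code 'LOGD' and the 120-byte
      * message area are hypothetical; the record length must fit
      * whatever the PTRC TDQ definition allows.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  MY-PGM-NAME        PIC X(8)   VALUE 'LOGDEMO'.
       01  MESSAGE-TO-LOG     PIC X(120) VALUE SPACES.
       PROCEDURE DIVISION.
       MAIN-LOGIC.
           STRING MY-PGM-NAME ' - sample informational message'
               DELIMITED BY SIZE INTO MESSAGE-TO-LOG
           PERFORM LOG-AND-QUIT
           GOBACK.
      *
      * Write one record to the extrapartition TDQ and keep going.
      * NOHANDLE so that a logging failure cannot itself abend.
       LOG-MESSAGE.
           EXEC CICS WRITEQ TD QUEUE('PTRC')
                FROM(MESSAGE-TO-LOG)
                LENGTH(LENGTH OF MESSAGE-TO-LOG)
                NOHANDLE
           END-EXEC
           MOVE SPACES TO MESSAGE-TO-LOG.
      *
      * Log the message, then end the task normally.
       LOG-AND-QUIT.
           PERFORM LOG-MESSAGE
           EXEC CICS RETURN END-EXEC.
      *
      * Log the message, then abend with a specific abend code.
       LOG-AND-DIE.
           PERFORM LOG-MESSAGE
           EXEC CICS ABEND ABCODE('LOGD') END-EXEC.
```

Keeping the WRITEQ TD in a single paragraph means the caller only chooses the disposition (keep going, quit, or die) after building MESSAGE-TO-LOG.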

HTH,
-Victor-

On Wed, 16 Feb 2011 16:30:18 -0700, Frank Swarbrick <frank.swarbr...@efirstbank.com> wrote:

>>>> On 2/16/2011 at 8:00 AM, in message
><listserv%201102160900125806.0...@bama.ua.edu>, Tom Marchant
><m42tom-ibmm...@yahoo.com> wrote:
>> On Tue, 15 Feb 2011 17:53:30 -0700, Frank Swarbrick wrote:
>>
>>>In my 15 years of application programming I've always hated the need
>>>to check 'status result' fields for conditions that, well, should not occur.
>>
>>
>> The point of those return codes is that those conditions *do* occur.
>> Checking them is part of the job of programming. The reason z/OS
>> is such a robust system is because of all that error checking.
>
>I'm not suggesting (god forbid!) that status codes not be checked. I am suggesting that status codes not be used at all; rather exceptions should be generated that cause everything to come to a halt, with some useful messages and a useful dump. That way the users of routines need not be bothered with conditions that they have no possibility of handling, other than notifying the user of the unexpected condition and terminating.
>
>>>Most (not all, of course!) of the time it's going to be the case that
>>>the application is simply going to have some code to check for
>>>expected status conditions, and if the status is not expected print
>>>out an error message and exit (hopefully with a 'non-successful'
>>>exit code of some sort) or force an abend. So why not just allow
>>>the invoked routine to abend itself?
>>
>> The invoked routine can only protect itself. It can not issue a message
>> that includes the context in which it is called. That context is necessary
>> when the code is to be changed to deal with the situation.
>
>Running under LE, in any case, you get both. You know the routine that recognised the condition and you know the routine that called the routine where the condition occurred. An LE dump, especially with symbolics, is a thing of elegance and beauty.
>I can't speak for non LE dumps.
>
>>>For batch I will then have a global USRHDLR that will query the
>>>operator as to what they want to do
>>
>> How is the operator going to know what to do? Will you provide
>> sufficient documentation so that he can make a meaningful decision?
>> If you can provide proper documentation, you can also code it to not
>> require operator intervention. IMO, asking the operator is the lazy
>> way out. In 1970, I had to have a very good reason to ask for an
>> operator reply, especially from a Cobol program, and we only had
>> three jobs running on the system at a time. Today it is not unusual
>> to have hundreds.
>
>The operator probably will not know what to do. They will call the on-call application support person who will analyze the issue and instruct the operator on what to do.
>
>Let me give a very specific example. A change to a program has been made to access a new file. The JCL with the new DD for whatever reason was not implemented. When the program runs and attempts to open the file the open fails. What kind of automatic recovery can resolve this issue? None, of course. Therefore the application must do *something*. So what are the somethings?
>
>1) Issue a message to the console and terminate with a return code indicating the job did not complete successfully.
>2) Issue a message and abend.
>3) Issue a message and don't set a return-code, thus making it appear the job completed successfully. (Not recommended!! <g>)
>4) Do any of the above but after issuing the message wait for an operator reply.
>
>The only reason I'm suggesting operator intervention prior to the job completing (abnormally) is that you can give an opportunity to actually continue, if it makes sense to continue, or to give the opportunity
>
>Let me give an example of why you may want to have the OPTION to continue in the case of a particular condition. We have a job that should not run until after 3:30pm.
>If the program detects that this rule has been violated (this was put in place before we had a scheduler!) it queries the operator whether to continue or to abort. My point is, there are some situations where a decision needs to be made, and that decision cannot be made by a computer. It must be made by a human who has analyzed the situation. So rather than having to code all of the logic to write the message, receive the response, and act on the response in each program, why not simply signal an exception and have the global exception handler write the exception, wait for a response, and act on the response?
>
>>>By doing all of this it seems to me the applications will need a lot
>>>less checking for errors from which the application cannot recover anyway.
>>
>> It sounds to me as if you are saying that you want to simplify your
>> code by removing meaningful error messages and replacing them
>> with a dump when unexpected situations occur. IMO, this is a
>> mistake.
>
>I'm not saying remove meaningful error messages. I am saying signal an appropriate exception condition which is related to a meaningful error message.
>
>>>what point is there of having the routine return back to your code
>>>when all your code is going to do is say "unexpected error in call
>>>to xxx" and terminate/abend.
>>
>> If that is all your code is going to do, without even giving the return
>> code that was issued or where the call was made, there isn't much
>> point. In the end you'll spend more time looking at dumps than you
>> would have spent coding it correctly in the first place.
>
>What? Of course it should have been coded properly in the first place. The point is to handle, without a lot of extra redundant coding, the "unexpected". I don't expect that I will forget to put a DD in my JCL.
>
>>>would this not make life simpler? And in most cases no less robust.
>>
>> Simpler to code, yes, because the code is less robust. And more
>> difficult to diagnose errors. It's ok.
>> I'm here to develop tools to make it easier for you to figure out
>> what went wrong.
>>
>> If you are looking for a justification for lazy programming, you won't
>> get it from me. I started in this business over 40 years ago as an
>> applications programmer, coding mostly in Cobol with a bit of
>> assembler. Validating data before using them and checking return
>> codes was always part of the job.
>
>I'm not looking to justify lazy programming. I am looking to simplify programming.
>
>> Today, I am doing software development in assembler and checking
>> for error conditions is still a big part of the job. When I neglect to
>> check for a condition, it usually causes problems.
>
>I guess I wasn't clear. I am certainly not saying that you should not check for conditions. I am saying that there are many cases where it makes sense for the routine that detects the condition to "generate an exception" that can be handled in a general manner, rather than having each program that uses the routine check to see if its call failed and then, well, do something so that the issue can be addressed. If the program can usefully continue to process when an exception occurs then certainly it should do so. My point is that there shouldn't have to be a lot of things like this...
>
>CALL ROUTINE USING PARM1, PARM2, PARM3, RC
>IF RC-SUCCESSFUL OR RC-EXPECTED-CONDITION
>    CONTINUE
>ELSE
>    DISPLAY 'An unexpected condition ' RC ' occurred in the call to ROUTINE' UPON CONSOLE
>    MOVE 16 TO RETURN-CODE
>    STOP RUN
>END-IF
>
>or worse
>EVALUATE TRUE
>WHEN RC-SUCCESSFUL
>    CONTINUE
>WHEN RC-EXPECTED-CONDITION
>    ...DO SOMETHING...
>WHEN RC = 1
>    DISPLAY 'Invalid PARM1 in call to ROUTINE: ' PARM1 UPON CONSOLE
>    MOVE 16 TO RETURN-CODE
>    STOP RUN
>WHEN RC = 2
>    DISPLAY 'Invalid PARM2 in call to ROUTINE: ' PARM2 UPON CONSOLE
>    MOVE 16 TO RETURN-CODE
>    STOP RUN
>...ETC, ETC...
>WHEN OTHER
>    DISPLAY 'An unexpected condition ' RC ' occurred in the call to ROUTINE' UPON CONSOLE
>    MOVE 16 TO RETURN-CODE
>    STOP RUN
>END-EVALUATE
>
>If I called ROUTINE with an invalid PARM1 wouldn't it be better if the routine that knows what is valid and what is not simply produced a useful error message and then terminated, perhaps with a dump that can be analyzed if the message itself doesn't tell the whole story?
>
>Frank
>
>--
>
>Frank Swarbrick
>Applications Architect - Mainframe Applications Development
>FirstBank Data Corporation - Lakewood, CO USA
>P: 303-235-1403
>
>----------------------------------------------------------------------
>For IBM-MAIN subscribe / signoff / archive access instructions,
>send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
>Search the archives at http://bama.ua.edu/archives/ibm-main.html