Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-17 Thread Sam Golob

Hi Folks,

I just wish to thank all the contributors to this thread.  I feel 
that every single contribution added to our general knowledge.  Thank 
you all.


This is what the IBM-Main forum is all about.

All the best of everything to all of you.

Sincerely, Sam

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Edward Gould
> On May 16, 2017, at 10:40 AM, Walt Farrell wrote:
> 
> On Tue, 16 May 2017 09:57:16 -0400, Sam Golob wrote:
> 
>>That having been said, the system doctor sometimes has to deal with
>> things that go wrong.  It's nice when the system is working as
>> designed.  But sometimes, the NON-CANCELABLE job or STC goes awry, and
>> it has to be restarted.  In such a case, as in the middle of a day's
>> production, you want to avoid an emergency IPL.  And so you need a tool
>> in the toolbox, to cancel the job or STC.  Sometimes the only solution
>> is to blow it away.
> 
> However, you should be prepared, when you use the tool, to have to IPL 
> anyway. And that should be clearly stated in any documentation for the tool.
> 
> -- 
> Walt
—SNIP———

In the early days of MVS we had constant issues with jobs going non-cancelable.
Our IBM SE wrote a super cancel (CALLRTM) command.
We *WERE* using it sparingly. It was a last gasp before an IPL.
Most of our issues had to do with allocation (Q4) getting hung.
We were IPLing once or twice in 3 days (sometimes as many as 8).
IBM rewrote allocation and just about eliminated the need for it. We did still 
use it from time to time and it did save IPLs.
Our other IBMer, who worked behind mountains of standalone dumps, was used to it 
and didn't raise it as an issue.
But it was understood by the group that using it was a last-gasp attempt, and 
then only after making sure there were no allocation hangups.

Ed



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Sam Golob

Hi Folks,

I want to point out that not all versions of CNCLPG on CBT File 826 
have BURN or KILL capability for JOBs or STCs.  The earlier versions of 
the program (included in the file) have less power.


You have the choice of installing one of the earlier versions of 
the program (1.10, 1.11, or 1.20) if you only want to make jobs 
non-cancelable or non-swappable (1.11 and 1.20).


My point is to let you know that in an emergency, the power is there.

All the best of everything to all of you.

Sincerely, Sam



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Sam Golob

Hi Folks,

I just want to tell you that I very much appreciate this 
discussion.  Very much depends on POINT OF VIEW, and when all the points 
of view get together, real progress is made, and everybody becomes wiser.


There are at least three separate points of view here on this forum:

1.  The systems programmers who have to set up and run the system 
software in data centers.


2.  The professional "system level" programmers who usually work for 
vendors.


3.  The IBM programmers who design and build the system software.

There may be other people here also, such as application 
programmers and "programmer toolmakers" and more types, as well.


Everybody has a separate point of view.  In summary, here they are:

People who run data centers have to make sure everything runs 
smoothly, and they have to deal with "the problems of the non-ideal 
world".  Something breaks--fix it.  Keep the system up.  Make sure the 
system levels are correctly set for what we are doing, and for what we need.


Professional "system level" programmers dig deep into the system.  
"Authority" is not what is usually on their mind, unless they are 
dealing with a security-related product.  For example, doing 
cross-memory programming is usually "a piece of cake" for them.  But 
changing some fields in another user's control blocks, which might be 
easy for THEM to do, is a nightmare from the system administrator's 
point of view, so you already see a difference in point of view between 
these two groups.


Finally, the IBM designers and programmers have a big 
responsibility of delivering a consistent and reliable system, but they 
may tend (depending on the individual person's actual experience) to be 
a bit separated from the system programmer's "real world" problems, and 
the things that actually come up in a real data center, day by day.


I am glad that my post is bringing these 3 points of view together, 
in a productive and fruitful way.  If I did not write about this topic, 
then some sysprog might be without a necessary tool in his/her toolbox.  
When the emergency came up, they would be as helpless as I was, many 
years ago.  On the other hand, we know that the tool can be used 
improperly, either by the right people or the wrong people.  So I had a 
quandary:  "To say, or NOT to say.  That was the question."


I opted to "say".  I remember the pain in my heart, when JES2 
couldn't be removed, and we had to IPL in the middle of the day.  It was 
easy to fix if we could just cancel JES2, and restart it.  I had already 
proven that to myself, at that time. But I was helpless and adrift.  We 
had to IPL.  NEVER AGAIN!  I won't let that happen to someone!  NEVER!!!


So there.  I trust we've all been helpful...

All the best of everything to everyone.

Sincerely, Sam



Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Walt Farrell
On Tue, 16 May 2017 09:57:16 -0400, Sam Golob wrote:

> That having been said, the system doctor sometimes has to deal with
>things that go wrong.  It's nice when the system is working as
>designed.  But sometimes, the NON-CANCELABLE job or STC goes awry, and
>it has to be restarted.  In such a case, as in the middle of a day's
>production, you want to avoid an emergency IPL.  And so you need a tool
>in the toolbox, to cancel the job or STC.  Sometimes the only solution
>is to blow it away.

However, you should be prepared, when you use the tool, to have to IPL anyway. 
And that should be clearly stated in any documentation for the tool.

-- 
Walt



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Elardus Engelbrecht
Paul Gilmartin wrote:

>I sense a gradual escalation here.

>Long ago, there was the CANCEL command so operators could terminate 
>troublesome jobs.
>But a designer felt that sometimes the programmer knows better, and provided 
>the non-cancellable attribute.
>Then a designer felt that sometimes the operator knows even better and 
>provided the FORCE command.
>Then a designer felt that sometimes the programmer knows even better and 
>provided the non-forcible attribute.
>Now someone feels that operators know better and is providing a WHACK facility.

Not good. All involved must look at why you need to WHACK it. 

For example, your job itself is waiting for a mount or is waiting for HSM to 
recall something, but there is a mount problem. Solve that, and you don't need 
all those fancy measures including a 222 abend. 

Ok, that is just one sample reason why 'WHACKING' or all those 'x who knows 
better' are not suitable.

One example we had a few weeks ago: a session was holding a CICS region. 
CPU% and region consumption climbed. Response times degraded on all CICS regions. 
Instead of having to WHACK the troublesome CICS region, the network people 
simply varied that session offline and everything returned to normal. No STCs 
were stopped at all.

So, normally, check all the normal avenues first, then escalate using more and 
more extreme measures, as per Paul's suggestion.

Groete / Greetings
Elardus Engelbrecht



Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Rivera, Dan
..and sometimes a Doctor uses a Sonic Screw Driver.

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Tom Marchant
Sent: Tuesday, May 16, 2017 10:11 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Re. Whacking a Job, or Getting rid of an Address Space

On Tue, 16 May 2017 09:57:16 -0400, Sam Golob wrote:

> the system doctor sometimes has to deal with things that go wrong.
>
>The
>doctor needs to have a scalpel.  Most doctors don't often need to use
>the scalpel.

Sometimes the "doctor" isn't really a doctor, but a boy scout who may have 
learned some rudimentary first aid.

And sometimes that thing that he thought was a scalpel is really a chain saw.

--
Tom Marchant



Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Tom Marchant
On Tue, 16 May 2017 09:57:16 -0400, Sam Golob wrote:

> the system doctor sometimes has to deal with
>things that go wrong. 
>
>The
>doctor needs to have a scalpel.  Most doctors don't often need to use
>the scalpel.

Sometimes the "doctor" isn't really a doctor, but a boy scout who may 
have learned some rudimentary first aid.

And sometimes that thing that he thought was a scalpel is really a 
chain saw.

-- 
Tom Marchant



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Paul Gilmartin
On Tue, 16 May 2017 07:59:56 -0400, Peter Relson wrote:

>Maybe it's me, but I found this post kind of inappropriate since it came 
>without caveats. One might think/hope that whoever defined a space as 
>non-cancelable or non-memtermable had a legitimate reason for doing so. 
>That likely isn't of course always true, but isn't that what you really 
>need to assume?
> 
I sense a gradual escalation here.

Long ago, there was the CANCEL command so operators could
terminate troublesome jobs.

But a designer felt that sometimes the programmer knows better,
and provided the non-cancellable attribute.

Then a designer felt that sometimes the operator knows even better
and provided the FORCE command.

Then a designer felt that sometimes the programmer knows even better
and provided the non-forcible attribute.

Now someone feels that operators know better and is providing
a WHACK facility.
...

Perhaps there should be a numeric attribute and a CANCEL command
argument, such that if the value supplied by the operator exceeds the
program's attribute, the CANCEL just works.

Floating point, of course.  Decimal floating point.

The operator will always have the nuclear option.

-- gil



Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Sam Golob

Hi Folks,

Of course, you're right, Peter.  Jobs or STCs are marked 
NON-CANCELABLE for a very good reason.  Under normal circumstances, they 
should not be cancelled, because it would endanger the system.  That's 
what the PPT is for.  Almost always, the safeguards that are there, are 
there for a very good reason. And I'm on management's side all the way.  
The idea is to keep the systems running as flawlessly and smoothly as 
possible.


That having been said, the system doctor sometimes has to deal with 
things that go wrong.  It's nice when the system is working as 
designed.  But sometimes, the NON-CANCELABLE job or STC goes awry, and 
it has to be restarted.  In such a case, as in the middle of a day's 
production, you want to avoid an emergency IPL.  And so you need a tool 
in the toolbox, to cancel the job or STC.  Sometimes the only solution 
is to blow it away.  The expensive multi-utility packages all contain 
such tools.  You wouldn't criticize Omegamon (TM) or RESOLVE (TM), would 
you?  But a shop which can't afford to buy them is sometimes stuck, and 
is forced to IPL and lose a lot of production time.  That's why I wrote 
CNCLPG 20 years after I had such an emergency, which I never forgot about.


Systems programmers do not live in an ideal world. Problems come 
up, in running the data center, which can be very unforeseen.  The 
doctor needs to have a scalpel.  Most doctors don't often need to use 
the scalpel.  But when you need it, and nothing else works, it's nice to 
know that it is there, sitting in the toolbox.  This is where I come 
from.  It's all for the purpose of keeping the shop running smoothly.


Thanks for listening.  All the best of everything to all of you.

Sincerely, Sam



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Peter Relson
Maybe it's me, but I found this post kind of inappropriate since it came 
without caveats. One might think/hope that whoever defined a space as 
non-cancelable or non-memtermable had a legitimate reason for doing so. 
That likely isn't of course always true, but isn't that what you really 
need to assume?

Unless you are willing to risk your system and its data by assuming that 
it is OK to cancel something that is non-cancelable or memterm something 
that is non-memtermable then the action taken by this tool is 
inappropriate.

And by including "and its data" I mean to include that you could 
conceivably break some data that you won't be able to fix by re-IPL. Not 
likely, but conceivable.

Peter Relson
z/OS Core Technology Design




Whacking a Job, or Getting rid of an Address Space

2017-05-14 Thread Sam Golob

Hi Folks,

 Hopefully this info will help get you out of a jam sometime.

Sam

GETTING RID OF AN ADDRESS SPACE (or WHACKING A JOB)

 In my career as a system doctor, I've had trouble, more than
once, in getting rid of an address space that was malfunctioning, and
starting over.  Sometimes the address space was marked "NON-CANCELABLE"
and I've even seen address spaces marked "NON-FORCIBLE".

 Mentioning this problem to fellow sysprogs, I've gotten answers
like: "You've got to learn how to use FORCE correctly."  Or they'd say
some similar nonsense.  Sometimes they're right.  But a bunch of times,
there are a couple of bits in the way.  And if you can't get past them,
you can't get rid of the job or other address space.  I've seen this
situation force an IPL in the middle of the day.  (NO GOOD!!!)

 So what do you do?  There are two free APF-authorized TSO commands
which can help you.

 One is called CSCF, and it is on CBT File 954.  The other is
called CNCLPG, and it is on CBT File 826 (Updates Page).  CSCF can
get rid of the main offending bits.  CNCLPG (with the KILL option)
can do that, and then whack the job or address space.

 Both of these commands do multiple functions.  But to get rid of
a job or system task, you first need to change its status to CANCELABLE
or FORCIBLE, and then you need to CANCEL it or FORCE it. Sometimes,
you can just "whack it".  To do so, use the KILL subcommand of the
CNCLPG command (Updates page of www.cbttape.org).

 The KILL subcommand of CNCLPG will do a CALLRTM TYPE=MEMTERM
operation on the address space, but before it does so, it turns off the
ASCBNOMT and ASCBNOMD bits in the ASCB.  ASCBNOMT is what makes a job
"NON-FORCIBLE", and turning ASCBNOMD off makes it FORCIBLE even if the
error was a DAT error.  THEN the KILL subcommand does the CALLRTM MEMTERM.
In that way, KILL makes sure that nothing will get in the way of the
"FORCE" operation, and the address space will be duly "whacked".  Then
you can start it over.
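
The order of operations in KILL can be sketched as a small Python simulation. This is only a model under stated assumptions, not the CNCLPG source: the AddressSpace class, the memterm and kill functions, and the two mask values are all made up for illustration, and the real bit positions in the ASCB are not shown here.

```python
# Illustrative simulation only: nothing here touches a real z/OS system.
# The mask values are placeholders, NOT the real ASCB bit positions.
ASCBNOMT = 0x08  # hypothetical mask: address space may not be MEMTERMed
ASCBNOMD = 0x04  # hypothetical mask: no MEMTERM when the error is a DAT error


class AddressSpace:
    """Toy stand-in for an address space and one ASCB flag byte."""

    def __init__(self, jobname, asid, flags):
        self.jobname = jobname
        self.asid = asid
        self.flags = flags
        self.active = True


def memterm(space, dat_error=False):
    """Model a MEMTERM request; the protective bits can refuse it."""
    if space.flags & ASCBNOMT:
        return False  # marked non-forcible
    if dat_error and space.flags & ASCBNOMD:
        return False  # protected in the DAT-error case
    space.active = False
    return True


def kill(space, dat_error=False):
    """Model KILL: turn both bits off first, THEN request the MEMTERM."""
    space.flags &= ~(ASCBNOMT | ASCBNOMD) & 0xFF
    return memterm(space, dat_error)
```

In this model a plain memterm on a protected address space is refused, while kill on the same address space succeeds because the bits are cleared before the request is made. Other flag bits in the byte are left alone.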

 One note of caution:  You have to whack or alter the correct
address space.  If you don't, you can cause havoc.

 WHY?  Both CNCLPG and CSCF have to run the CSCB chain. This is
a chain representing all the active jobs, system tasks, and TSU's in
the system.  Sometimes there are many address spaces with the SAME
name.  And there can be more than one address space with the SAME
ASID (I bet you didn't know that).  So in order to make sure you are
altering the correct address space, you have to specify BOTH the ASID
and the JOBNAME when you run CNCLPG.

 How do you get that information in the first place?

 Run CNCLPG with the DISPLAY command.

 The DISPLAY command will show all matches and all occurrences.

 So if you run CNCLPG jobname DISP, you will see all the CSCB
entries matching your jobname, and you can specify the one with the
correct ASID by using the ASID(hex) parameter together with the
jobname parameter.

 Do this first, and you won't be sorry later.  Do DISP several
times, until you see only one entry--the entry that you want to alter.
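
The "narrow it down until only one entry is left" rule can be sketched in Python. This is a hedged illustration, not how CNCLPG is actually coded: CscbEntry, display, and select_target are hypothetical names, and a real CSCB carries far more than a jobname and an ASID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CscbEntry:
    """Hypothetical, simplified view of one entry on the CSCB chain."""
    jobname: str
    asid: int


def display(chain, jobname, asid=None):
    """Return every entry matching the jobname and, if given, the ASID."""
    return [
        entry for entry in chain
        if entry.jobname == jobname and (asid is None or entry.asid == asid)
    ]


def select_target(chain, jobname, asid):
    """Return the single matching entry, or refuse if it is not unique."""
    matches = display(chain, jobname, asid)
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match, found {len(matches)}")
    return matches[0]
```

Calling display with only a jobname corresponds to the first DISP; adding the ASID narrows the list. select_target refuses to hand back anything unless exactly one entry remains, which is the condition to reach before altering or whacking anything.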

 Best of everything.  Use this in good health.
