Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-17 Thread Sam Golob

Hi Folks,

I just wish to thank all the contributors to this thread.  I feel 
that every single contribution added to our general knowledge.  Thank 
you all.


This is what the IBM-Main forum is all about.

All the best of everything to all of you.

Sincerely, Sam

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Edward Gould
> On May 16, 2017, at 10:40 AM, Walt Farrell  wrote:
> 
> On Tue, 16 May 2017 09:57:16 -0400, Sam Golob  wrote:
> 
>>That having been said, the system doctor sometimes has to deal with
>> things that go wrong.  It's nice when the system is working as
>> designed.  But sometimes, the NON-CANCELABLE job or STC goes awry, and
>> it has to be restarted.  In such a case, as in the middle of a day's
>> production, you want to avoid an emergency IPL.  And so you need a tool
>> in the toolbox, to cancel the job or STC.  Sometimes the only solution
>> is to blow it away.
> 
> However, you should be prepared, when you use the tool, to have to IPL 
> anyway. And that should be clearly stated in any documentation for the tool.
> 
> -- 
> Walt
--SNIP--

In the early days of MVS we had constant issues with jobs becoming non-cancelable.
Our IBM SE wrote a super cancel (callrtm) command.
We *WERE* using it sparingly. It was a last-gasp attempt to avoid an IPL.
Most of our issues had to do with allocation (Q4) getting hung.
We were IPLing once or twice in 3 days (sometimes as many as 8).
IBM rewrote allocation and just about eliminated the need for it. We did still 
use it from time to time and it did save IPLs.
Our other IBMer, who worked behind mountains of standalone dumps, was used to it 
and didn’t raise it as an issue.
But it was understood by the group that using it was a last-gasp attempt, and 
then only after making sure there were no allocation hangups.

Ed



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Sam Golob

Hi Folks,

I want to point out that not all versions of CNCLPG on CBT File 826 
have BURN or KILL capability for JOBs or STCs.  The earlier versions of 
the program (included in the file) have less power.


You have the choice of installing one of the earlier versions of 
the program (1.10, 1.11, or 1.20) if you only want to make jobs 
non-cancelable or non-swappable (1.11 and 1.20).


My point is to let you know that in an emergency, the power is there.

All the best of everything to all of you.

Sincerely, Sam



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Sam Golob

Hi Folks,

I just want to tell you that I very much appreciate this 
discussion.  Very much depends on POINT OF VIEW, and when all the points 
of view get together, real progress is made, and everybody becomes wiser.


There are at least three separate points of view here on this forum:

1.  The systems programmers who have to set up and run the system 
software in data centers.


2.  The professional "system level" programmers who usually work for 
vendors.


3.  The IBM programmers who design and build the system software.

There may be other people here also, such as application 
programmers and "programmer toolmakers" and more types, as well.


Everybody has a separate point of view.  In summary, here they are:

People who run data centers have to make sure everything runs 
smoothly, and they have to deal with "the problems of the non-ideal 
world".  Something breaks--fix it.  Keep the system up.  Make sure the 
system levels are correctly set for what we are doing, and for what we need.


Professional "system level" programmers dig deep into the system.  
"Authority" is not what is usually on their mind, unless they are 
dealing with a security-related product.  For example, doing 
cross-memory programming is usually "a piece of cake" for them.  But 
changing some fields in another user's control blocks, which might be 
easy for THEM to do, is a nightmare from the system administrator's 
point of view, so you already see a difference in point of view between 
these two groups.


Finally, the IBM designers and programmers have a big 
responsibility of delivering a consistent and reliable system, but they 
may tend (depending on the individual person's actual experience) to be 
a bit separated from the system programmer's "real world" problems, and 
the things that actually come up in a real data center, day by day.


I am glad that my post is bringing these 3 points of view together, 
in a productive and fruitful way.  If I did not write about this topic, 
then some sysprog might be without a necessary tool in his/her toolbox.  
When the emergency came up, they would be as helpless as I was, many 
years ago.  On the other hand, we know that the tool can be used 
improperly, either by the right people or the wrong people.  So I had a 
quandary:  "To say, or NOT to say.  That was the question."


I opted to "say".  I remember the pain in my heart, when JES2 
couldn't be removed, and we had to IPL in the middle of the day.  It was 
easy to fix if we could just cancel JES2, and restart it.  I had already 
proven that to myself, at that time. But I was helpless and adrift.  We 
had to IPL.  NEVER AGAIN!  I won't let that happen to someone!  NEVER!!!


So there.  I trust we've all been helpful...

All the best of everything to everyone.

Sincerely, Sam



Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Walt Farrell
On Tue, 16 May 2017 09:57:16 -0400, Sam Golob  wrote:

> That having been said, the system doctor sometimes has to deal with
>things that go wrong.  It's nice when the system is working as
>designed.  But sometimes, the NON-CANCELABLE job or STC goes awry, and
>it has to be restarted.  In such a case, as in the middle of a day's
>production, you want to avoid an emergency IPL.  And so you need a tool
>in the toolbox, to cancel the job or STC.  Sometimes the only solution
>is to blow it away.

However, you should be prepared, when you use the tool, to have to IPL anyway. 
And that should be clearly stated in any documentation for the tool.

-- 
Walt



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Elardus Engelbrecht
Paul Gilmartin wrote:

>I sense a gradual escalation here.

>Long ago, there was the CANCEL command so operators could terminate 
>troublesome jobs.
>But a designer felt that sometimes the programmer knows better, and provided 
>the non-cancellable attribute.
>Then a designer felt that sometimes the operator knows even better and 
>provided the FORCE command.
>Then a designer felt that sometimes the programmer knows even better and 
>provided the non-forcible attribute.
>Now someone feels that operators know better and is providing a WHACK facility.

Not good. All involved must look at why you need to WHACK it.

For example, your job itself is waiting for a mount or is waiting for HSM to 
recall something, but there is a mount problem. Solve that, and you don't need 
all those fancy measures including a 222 abend. 

Ok, that is just one sample reason why 'WHACKING' or all those 'x who knows 
better' are not suitable.

One example we had a few weeks ago: a session was holding a CICS region. 
CPU% and region usage climbed. Response times degraded on all CICS regions. 
Instead of having to WHACK the troublesome CICS region, the network people 
simply varied that session offline and everything returned to normal. No STCs 
were stopped at all.

So, normally, check all the normal avenues first, then escalate using more and 
more extreme measures, as per Paul's suggestion.

Groete / Greetings
Elardus Engelbrecht



Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Rivera, Dan
...and sometimes a Doctor uses a Sonic Screwdriver.

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Tom Marchant
Sent: Tuesday, May 16, 2017 10:11 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: Re. Whacking a Job, or Getting rid of an Address Space

On Tue, 16 May 2017 09:57:16 -0400, Sam Golob wrote:

> the system doctor sometimes has to deal with things that go wrong.
>
>The
>doctor needs to have a scalpel.  Most doctors don't often need to use
>the scalpel.

Sometimes the "doctor" isn't really a doctor, but a boy scout who may have 
learned some rudimentary first aid.

And sometimes that thing that he thought was a scalpel is really a chain saw.

--
Tom Marchant



Re: Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Tom Marchant
On Tue, 16 May 2017 09:57:16 -0400, Sam Golob wrote:

> the system doctor sometimes has to deal with
>things that go wrong. 
>
>The
>doctor needs to have a scalpel.  Most doctors don't often need to use
>the scalpel.

Sometimes the "doctor" isn't really a doctor, but a boy scout who may 
have learned some rudimentary first aid.

And sometimes that thing that he thought was a scalpel is really a 
chain saw.

-- 
Tom Marchant



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Paul Gilmartin
On Tue, 16 May 2017 07:59:56 -0400, Peter Relson wrote:

>Maybe it's me, but I found this post kind of inappropriate since it came 
>without caveats. One might think/hope that whoever defined a space as 
>non-cancelable or non-memtermable had a legitimate reason for doing so. 
>That likely isn't of course always true, but isn't that what you really 
>need to assume?
> 
I sense a gradual escalation here.

Long ago, there was the CANCEL command so operators could
terminate troublesome jobs.

But a designer felt that sometimes the programmer knows better,
and provided the non-cancellable attribute.

Then a designer felt that sometimes the operator knows even better
and provided the FORCE command.

Then a designer felt that sometimes the programmer knows even better
and provided the non-forcible attribute.

Now someone feels that operators know better and is providing
a WHACK facility.
...

Perhaps there should be a numeric attribute and a CANCEL command
argument, such that if the value supplied by the operator exceeds the
program's attribute, the CANCEL just works.

Floating point, of course.  Decimal floating point.

The operator will always have the nuclear option.
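
For illustration only, the numeric-attribute idea above can be sketched in a 
few lines of Python. Everything here is hypothetical: the function name, the 
job names, the resistance values, and the use of decimal infinity for the 
"nuclear option" are assumptions made for the sketch, not any real z/OS 
interface.

```python
from decimal import Decimal

# Hypothetical model of the proposal. None of these names correspond to a
# real z/OS command, macro, or control block.
NUCLEAR = Decimal("Infinity")  # assumed encoding of the "nuclear option"


def cancel_allowed(operator_level: Decimal, job_resistance: Decimal) -> bool:
    """Return True when the operator's value strictly exceeds the job's attribute."""
    return operator_level > job_resistance


# Illustrative resistance values, chosen arbitrarily for the example.
jobs = {
    "BATCH01": Decimal("0"),    # ordinary, cancelable job
    "JES2": Decimal("99.5"),    # heavily protected address space
}

print(cancel_allowed(Decimal("1"), jobs["BATCH01"]))   # operator outranks the job
print(cancel_allowed(Decimal("50"), jobs["JES2"]))     # operator does not outrank it
print(cancel_allowed(NUCLEAR, jobs["JES2"]))           # infinity beats any finite value
```

The strict comparison means a job can only be protected against the nuclear 
value by being assigned infinity itself, which is the escalation problem in 
miniature.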

-- gil



Re. Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Sam Golob

Hi Folks,

Of course, you're right, Peter.  Jobs or STCs are marked 
NON-CANCELABLE for a very good reason.  Under normal circumstances, they 
should not be cancelled, because it would endanger the system.  That's 
what the PPT is for.  Almost always, the safeguards that are there, are 
there for a very good reason. And I'm on management's side all the way.  
The idea is to keep the systems running as flawlessly and smoothly as 
possible.


That having been said, the system doctor sometimes has to deal with 
things that go wrong.  It's nice when the system is working as 
designed.  But sometimes, the NON-CANCELABLE job or STC goes awry, and 
it has to be restarted.  In such a case, as in the middle of a day's 
production, you want to avoid an emergency IPL.  And so you need a tool 
in the toolbox, to cancel the job or STC.  Sometimes the only solution 
is to blow it away.  The expensive multi-utility packages all contain 
such tools.  You wouldn't criticize Omegamon (TM) or RESOLVE (TM), would 
you?  But a shop which can't afford to buy them is sometimes stuck, and 
is forced to IPL and lose a lot of production time.  That's why I wrote 
CNCLPG 20 years after I had such an emergency, which I never forgot about.


Systems programmers do not live in an ideal world. Problems come 
up in running the data center that can be entirely unforeseen.  The 
doctor needs to have a scalpel.  Most doctors don't often need to use 
the scalpel.  But when you need it, and nothing else works, it's nice to 
know that it is there, sitting in the toolbox.  This is where I come 
from.  It's all for the purpose of keeping the shop running smoothly.


Thanks for listening.  All the best of everything to all of you.

Sincerely, Sam



Re: Whacking a Job, or Getting rid of an Address Space

2017-05-16 Thread Peter Relson
Maybe it's me, but I found this post kind of inappropriate since it came 
without caveats. One might think/hope that whoever defined a space as 
non-cancelable or non-memtermable had a legitimate reason for doing so. 
That likely isn't of course always true, but isn't that what you really 
need to assume?

Unless you are willing to risk your system and its data by assuming that 
it is OK to cancel something that is non-cancelable or memterm something 
that is non-memtermable then the action taken by this tool is 
inappropriate.

And by including "and its data" I mean to include that you could 
conceivably break some data that you won't be able to fix by re-IPL. Not 
likely, but conceivable.

Peter Relson
z/OS Core Technology Design

