Testing SFM policy

2016-06-06 Thread Jesse 1 Robinson
I'd like to test my Sysplex Failure Management policy. The question is how to 
make a system stop responding long enough to trigger expulsion from the 
sysplex. I'm thinking of issuing QUIESCE on a member. Have not used that in 
decades. Will it cause lack of XCF heartbeat? I can just try it unless someone 
has a better suggestion.

The last time this happened for reals was when a system ran clean out of SQA on 
account of a bad dog product. That's pretty hard to recreate. All I want is to 
go through the pain and agony of partitioning to test message handling and auto 
SAD.

.
.
.
J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler
SHARE MVS Program Co-Manager
323-715-0595 Mobile
626-302-7535 Office
robin...@sce.com


--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Testing SFM policy

2016-06-06 Thread Mark Jacobs - Listserv
Since Quiesce will put the system in a restartable wait state, I'd think 
that the XCF heartbeat would stop too.


Mark Jacobs


--

Mark Jacobs
Time Customer Service
Technology and Product Engineering

The standard you walk past is the standard you accept.
Lt. Gen. David Morrison




Re: Testing SFM policy

2016-06-06 Thread Rob Schramm
Deactivate? Stop? Force all CHPIDs, paths, and devices offline? Copy a bunch of
zeros into low storage and wait for the havoc to begin.

Free CSA: an almost guaranteed IPL.

I am sure the creative members of IBM-MAIN can come up with a plethora of
normally undesirable actions to exercise SFM.

Rob Schramm




-- 

Rob Schramm
The Art of Mainframe, Inc



Re: Testing SFM policy

2016-06-06 Thread Ed Jaffe

On 6/6/2016 4:32 PM, Jesse 1 Robinson wrote:

> I'd like to test my Sysplex Failure Management policy. The question is how to
> make a system stop responding long enough to trigger expulsion from the
> sysplex. I'm thinking of issuing QUIESCE on a member.


If you quiesce a system, it will indeed be fenced/stopped by SFM.

FWIW, when we fence a system that way, we don't get a SAD or an auto re-IPL.

--
Edward E Jaffe
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
http://www.phoenixsoftware.com/



Re: Testing SFM policy

2016-06-06 Thread Vernooij, CP (ITOPT1) - KLM
No system escapes from the System Reset button on the HMC. This is how we 
tested our CF failure procedures.

Kees.



Re: Testing SFM policy

2016-06-07 Thread Mike Myers

Mark and Skip:

I can assure you that QUIESCE will stop the XCF heartbeat. I have used 
it often in training to demonstrate the loss of heartbeat and triggering 
the sysplex to respond to the loss of a member. With an active SFM 
policy, it will trigger your desired action.


The advantage of using QUIESCE is that you can back out: if SFM is not active,
or if you beat it to the punch, a PSW restart reactivates the system and you
do not lose the member.


Mike Myers
Senior z/OS Systems Programmer and Instructor
Mentor Services Corporation
Goldsboro, NC
(919) 341-5210



Re: Testing SFM policy

2016-06-07 Thread Jesse 1 Robinson
I will proceed with QUIESCE once our automation SME is ready to test. Key goal 
is to capture the partitioning message on another member of the sysplex and 
blast out some alerts. In a recent failure at oh-dark-thirty on a Saturday 
morning, Ops did not notice the wait state failure (SQA exhausted). We also 
have a new SAD volume to test: single Mod-54. Thanks for all the advice. 
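
For the alerting piece, here is a minimal sketch of the message-matching logic in Python. It assumes the partitioning messages of interest are IXC101I and IXC105I and that syslog lines arrive as plain text; confirm both against your own syslog. The function name, the sample lines, and the system name SYSB are hypothetical, and a real automation product would use its own message table rather than a script like this.

```python
import re

# Assumption: these two XCF message IDs announce sysplex partitioning.
# Group 1 captures the message ID, group 2 the system name after "FOR".
PARTITION_MSG = re.compile(r"\b(IXC101I|IXC105I)\b.*?\bFOR\s+(\S+)")


def find_partition_alerts(lines):
    """Return (message_id, system_name) for every partitioning message seen."""
    alerts = []
    for line in lines:
        match = PARTITION_MSG.search(line)
        if match:
            alerts.append((match.group(1), match.group(2)))
    return alerts


# Hypothetical syslog excerpt as seen on a surviving member of the sysplex.
sample = [
    "IEF403I SOMEJOB - STARTED",
    "IXC101I SYSPLEX PARTITIONING IN PROGRESS FOR SYSB REQUESTED BY XCFAS",
    "IXC105I SYSPLEX PARTITIONING HAS COMPLETED FOR SYSB",
]

for message_id, system_name in find_partition_alerts(sample):
    print(f"ALERT: {message_id} reports partitioning of {system_name}")
```

Matching on the message ID rather than the full text keeps the rule stable if the wording after the system name varies by reason.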




Re: Testing SFM policy

2016-06-07 Thread Jesse 1 Robinson
Testing was semi-successful. On one hand, QUIESCE stopped the system. Missing 
heartbeat was detected by SFM and system was partitioned out. However, I was 
also trying to test SAD and AutoIPL. Nothing happened in that arena, so--as I 
do when all else fails--I RTFM. Found this:

"For restartable wait states, Loadwait will ignore the AutoIPL policy unless a 
matching WSAT entry is found that has one or both flags on. If a bit is found 
on, then the corresponding SAD or re-IPL will be performed. As of this writing, 
the WSAT contains no entries matching any restartable wait state and reason 
codes, so a restartable wait state request will not result in any AutoIPL 
action."

Because the QUIESCE wait state (x'CCC') is restartable, AutoIPL and SAD are 
ignored. I need to find another way to kill the system. Ironically we've had a 
few AutoIPLs over the years, mostly (all?) involving virtual storage exhaustion 
of some kind. None of them intentional.  





Re: Testing SFM policy

2016-06-07 Thread Tom Brennan
When I saw your original post, the first thing I thought of was a
program that sets the PSW so it can't be interrupted, then loops for a
time (while checking the clock) before exiting. ATTACH a few of these
tasks and maybe you could take over all CPUs and better simulate the
bad dog. Just a theory though.




Re: Testing SFM policy

2016-06-07 Thread Gerhard Adam
You can write a simple assembler program to load a WAIT PSW that is
disabled for interrupts, or you can replace the RESTART NEW PSW in the
PSA with one that has the wait bit on and is disabled for interrupts.

If you use the RESTART NEW PSW, then a PSW RESTART should result in
your system going into a wait state.

Both should kill your system.

Adam



Re: Testing SFM policy

2016-06-07 Thread Jesse 1 Robinson
For completeness I'm posting the 'Wait state action table (WSAT)', which 
determines the DIAGxx action(s) to take for various wait states. I'm surprised 
at how short this table is. 

Entries are of the form frrrrwww, where
  f     represents flags
  rrrr  represents the reason code
  www   represents the wait state code

The '0010'b flag indicates that SADMP is to be IPLed.

The '0001'b flag indicates that z/OS® is to be IPLed.

Both flags on ('0011'b) indicates that SADMP is to be IPLed, followed by z/OS.

The '1000'b flag indicates that any reason code (for this wait state code)
should be considered a match.

The entries coded into the WSAT as of this writing are as follows:
X'40A2'
X'1017C0A2'
X'201800A2'
X'301840A2'
X'200010B5'
X'200020B5'
X'A007'
X'A009'
X'A037'
X'A039'
X'A056'
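
To make the table easier to read, here is a rough Python sketch that decodes the entries. It assumes the eight-digit entries split into one flag digit, four reason-code digits, and three wait-state digits, and that the four-digit entries are one flag digit plus three wait-state digits. That split is read off the entries as posted, not stated by the manual excerpt, and the function name is hypothetical.

```python
def decode_wsat_entry(entry):
    """Split a WSAT entry such as X'301840A2' into flags, reason, wait state."""
    digits = entry.strip().upper()
    if digits.startswith("X'") and digits.endswith("'"):
        digits = digits[2:-1]  # drop the X'...' wrapper

    if len(digits) == 8:
        # Assumed layout: f rrrr www
        flag_digit, reason, wait_state = digits[0], digits[1:5], digits[5:]
    elif len(digits) == 4:
        # Assumed layout for the short entries as posted: f www
        flag_digit, reason, wait_state = digits[0], None, digits[1:]
    else:
        raise ValueError(f"unexpected WSAT entry length: {entry!r}")

    flags = int(flag_digit, 16)
    return {
        "wait_state": wait_state,
        "reason": reason,
        "any_reason": bool(flags & 0b1000),  # '1000'b: match any reason code
        "sadmp": bool(flags & 0b0010),       # '0010'b: IPL SADMP
        "reipl": bool(flags & 0b0001),       # '0001'b: IPL z/OS
    }


for entry in ("X'1017C0A2'", "X'301840A2'", "X'A007'"):
    print(entry, decode_wsat_entry(entry))
```

Flag bits that the excerpt does not describe are ignored by this sketch, so an entry whose flag digit uses other bits will decode with all three indicators off.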




Re: Testing SFM policy

2016-06-07 Thread Jim Mulder
> You can write a simple assembler program to load a WAIT PSW that is
> disabled for interrupts, or you can replace the RESTART NEW PSW in the
> PSA with one that has the wait bit on and is disabled for interrupts.
> 
> If you use the RESTART NEW PSW, then a PSW RESTART should result in
> your system going into a wait state.
> 
> Both should kill your system.

  That will not drive AutoIPL.  To drive AutoIPL, the wait state
must be loaded using the WTO macro with the WSPARM keyword specified. 

Jim Mulder   z/OS System Test   IBM Corp.  Poughkeepsie,  NY





Re: Testing SFM policy

2016-06-07 Thread Jim Mulder
> Testing was semi-successful. On one hand, QUIESCE stopped the 
> system. Missing heartbeat was detected by SFM and system was 
> partitioned out. However, I was also trying to test SAD and AutoIPL.
> Nothing happened in that arena, so--as I do when all else fails--I 
> RTFM. Found this:
> 
> "For restartable wait states, Loadwait will ignore the AutoIPL 
> policy unless a matching WSAT entry is found that has one or both 
> flags on. If a bit is found on, then the corresponding SAD or re-IPL
> will be performed. As of this writing, the WSAT contains no entries 
> matching any restartable wait state and reason codes, so a 
> restartable wait state request will not result in any AutoIPL action."
> 
> Because the QUIESCE wait state (x'CCC') is restartable, AutoIPL and 
> SAD are ignored. I need to find another way to kill the system. 
> Ironically we've had a few AutoIPLs over the years, mostly (all?) 
> involving virtual storage exhaustion of some kind. None of them 
> intentional. 

  To zap the WSAT such that all restartable wait states drive 
AutoIPL with SADMP and ReIPL:

//D10JHM1P JOB 'D1003P,D10,?',MULDER,CLASS=J,MSGCLASS=H, 
// MSGLEVEL=(1,1),NOTIFY=D10JHM1 
//STEP1 EXEC PGM=AMASPZAP,PARM='IGNIDRFULL' 
//SYSPRINT DD SYSOUT=* 
//SYSLIB   DD DSN=SYS1.NUCLEUS,DISP=SHR 
//SYSIN DD  * 
 NAME  IEANUC01 BLWWSATC 
 VER 0030 0FFF 
 REP 0030 3FFF 
/* 
// 

To add an entry to the WSAT which causes the QUIESCE CCC wait code 
to drive AutoIPL with SADMP and ReIPL:
 
//D10JHM1P JOB 'D1003P,D10,?',MULDER,CLASS=J,MSGCLASS=H, 
// MSGLEVEL=(1,1),NOTIFY=D10JHM1 
//STEP1 EXEC PGM=AMASPZAP,PARM='IGNIDRFULL' 
//SYSPRINT DD SYSOUT=* 
//SYSLIB   DD DSN=SYS1.NUCLEUS,DISP=SHR 
//SYSIN DD  * 
 NAME  IEANUC01 BLWWSATC 
 VER 0020 0013 
 REP 0020 0014 
 VER 0078  
 REP 0078 3CCC 
/* 
// 

  Then reIPL to activate the changed WSAT. 
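
Before applying either zap, it may be worth dumping the CSECT to see what is 
currently at the offsets involved. The VER statements already guard against a 
mismatch, but a dump lets you look first. The sketch below uses the AMASPZAP 
DUMP control statement against the same module and CSECT named above. The job 
card is hypothetical, and the layout of the table on your maintenance level is 
something to confirm from the output rather than assume.

```jcl
//* Hypothetical job card; adjust to local standards.
//WSATDUMP JOB (ACCT),'DUMP WSAT',CLASS=A,MSGCLASS=H
//* Read-only step: DUMP prints the CSECT in hex and changes nothing.
//STEP1    EXEC PGM=AMASPZAP
//SYSPRINT DD   SYSOUT=*
//SYSLIB   DD   DSN=SYS1.NUCLEUS,DISP=SHR
//SYSIN    DD   *
 DUMP  IEANUC01 BLWWSATC
/*
//
```

Running the same job again after the zap, and keeping both listings, gives a 
before-and-after record and makes it easier to back the change out later.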



Jim Mulder   z/OS System Test   IBM Corp.  Poughkeepsie,  NY




Re: Testing SFM policy

2016-06-07 Thread Jesse 1 Robinson
Followed Jim's instructions to add a WSAT entry for QUIESCE wait state. IPLed 
it in, then did another QUIESCE command. Got SAD and AutoIPL. SAD title:

AUTOIPL WAIT STATE CODE 0CCC

Thanks to everyone. 

.
.
.
J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler 
SHARE MVS Program Co-Manager
323-715-0595 Mobile
626-302-7535 Office
robin...@sce.com


