Re: z10 power problem notification

Knutson, Sam Mon, 08 Dec 2008 05:25:53 -0800

Redundantly reporting on multiple LPARs for hardware errors is not bad either.


Many customers, including us, have true sandbox LPARs where automation may be 
limited or operators may be conditioned not to pay attention to it.

I think that for true hardware failures, reporting on every applicable LPAR is
a good thing.  It makes it likely that at least one of those LPARs will include
automation prepared to handle the message, and that the message will be visible
to operations staff.  It also helps operations understand that this is a
hardware problem with processor x when the message comes out on n of n LPARs
and all the impacted LPARs reside on the same processor.

        Best Regards, 

                Sam Knutson, GEICO 
                System z Performance and Availability Management 
                mailto:[EMAIL PROTECTED] 
                (office)  301.986.3574 
                (cell) 301.996.1318              

"Think big, act bold, start simple, grow fast..." 


-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[EMAIL PROTECTED]] On Behalf Of Jim Mulder
Sent: Monday, December 08, 2008 12:45 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: z10 power problem notification

  You are correct, those things are the job of the service processor/HMC.
In the "good old days", we did not have a service processor with the
capability to perform those functions, so the operating system had to do it.
It now makes more sense to use the service processor, so that we do not need
to provide those functions in at least 4 operating systems (MVS, VM, VSE, and
Linux), and so that multiple LPARs are not all reporting the same condition.
We do not like to put model dependent code into the operating system
when that can be avoided.
 
> It seems odd to me to go down the path of ill-tested SNMP or emails from the
> service processor, when there is a much more robust and *architected*
> mechanism already there. Sure, the guest OS can fail because of the hardware
> failure, but that'll get noticed PDQ!

  If you are suggesting that some customers might find it convenient to
have the option of having some processor issues redundantly reported via
MVS WTOs because they already have an automation infrastructure designed
around WTOs, or because the MVS Syslog or Operlog is a convenient place for
them to look for such things, or to correlate them with other operating
system events, I can't disagree with that (especially since that might be
able to work with already existing MVS code, albeit unused for the past
decade AFAIK).  You may have a reasonable request to make to the service
processor folks in Endicott.

Jim Mulder   z/OS System Test   IBM Corp.  Poughkeepsie,  NY


