From the tn3270 sessions hanging to the phone call to me - 2-3 minutes. From then till we decided we had to IPL - maybe 15-20 minutes. But 30 minutes (maybe 45-60 till all the apps were back up) on a major online system is a lot. It was 35 minutes from the message capping the virtual storage at 8TB till the IPL time from Q CPLEVEL. So no, not long considering the size. And yes, I suspect it would have abended with PGT004 eventually.

And yes, if CP unceremoniously chopped my wrong size from 9.7TB to 8TB, why could it not do the same to either a user specified system limit or a "this is the biggest machine this CP can run in this configuration"...

Lee

Gentry, Stephen wrote:
What Lee doesn't mention is how long he waited before doing the IPL.
Had he waited to see what happens maybe VM would have finally come
around, so to speak. We all have different thresholds of pain. I think I
would have done what Lee did, long day, not really wanting to wait
around to see if VM recovers, just IPL.  Lee, did you have access to the
HMC and thus the SAD screen to see what was going on? Sort of my last
line of defense if I can't get logged in.  Granted all it will tell you
is if you have CPU or I/O utilization, but at least you have something
to go to IBM with.
Maybe a SYSTEM CONFIG file option, like MAX_USER_SIZE: if it's set, then
guest machine size is verified against it; if not, available PAGE area
and SPOOL size are checked (calculated), and if the guest exceeds that
size then the guest doesn't start or a severe warning is issued.
Steve
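
[Editor's sketch of the check Steve proposes. MAX_USER_SIZE is his suggested option name, not an existing SYSTEM CONFIG statement, and everything below is hypothetical: the function name, the choice to add PAGE and SPOOL together for the fallback limit, and the sample sizes. It only illustrates the decision logic, not how CP would do it.]

```python
def check_guest_size(guest_mb, page_mb, spool_mb, max_user_size_mb=None):
    """Return 'ok' or 'reject' for a guest's defined storage size (in MB).

    Hypothetical logic: if a MAX_USER_SIZE-style limit is configured, use it;
    otherwise fall back to the PAGE plus SPOOL space that could back the guest.
    """
    if max_user_size_mb is not None:
        limit_mb = max_user_size_mb
    else:
        # Assumption: "calculated" means the sum of PAGE and SPOOL capacity.
        limit_mb = page_mb + spool_mb
    return "ok" if guest_mb <= limit_mb else "reject"


# Made-up capacities: 500 GB of PAGE and 100 GB of SPOOL, expressed in MB.
PAGE_MB = 500 * 1024
SPOOL_MB = 100 * 1024

print(check_guest_size(9728, PAGE_MB, SPOOL_MB))         # intended 9728M
print(check_guest_size(9728 * 1024, PAGE_MB, SPOOL_MB))  # mistyped 9728G
```

[With these made-up capacities the intended size passes and the mistyped one is refused. Whether the guest should be refused or only draw a severe warning is the policy question raised above.]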

-----Original Message-----
From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On
Behalf Of Schuh, Richard
Sent: Tuesday, September 15, 2009 12:59 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: VM lockup due to storage typo

Maybe CP couldn't know that the guest would do something bad, but it
should know that it has opened itself to the possibility that the guest
could, in normal operation, cause the problem. One of Alan's first
precepts of information security and integrity is that the guest cannot
be allowed to harm the CP. This clearly violates that.

Regards, Richard Schuh
-----Original Message-----
From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Tom Duerbusch
Sent: Tuesday, September 15, 2009 9:19 AM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: VM lockup due to storage typo

CP wouldn't know at IPL time that the guest would (not could, but would) cause such harm.

Just because you say you can use xxx GB, doesn't mean you would actually use them.

When page fills, it overflows to spool.
When spool fills, CP abends on the next pageout.

Tom Duerbusch
THD Consulting

Marcy Cortes <marcy.d.cor...@wellsfargo.com> 9/15/2009 11:02 AM >>>
See a thread on this list with subject "Sanity check?" from Oct 2007 for what happened when I did the same thing ;)

You probably filled page space.

I still think IBM should refuse to IPL a guest that will cause such harm.


Marcy


-----Original Message-----
From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Lee Stewart
Sent: Tuesday, September 15, 2009 8:39 AM
To: IBMVM@LISTSERV.UARK.EDU
Subject: [IBMVM] VM lockup due to storage typo

Does anyone have an idea of how we might have gotten out of this without an IPL?

The VM LPAR has 175G of memory and a flock of Linux Oracle guests... Several guests needed more memory added, so the directory was updated and, one by one, the guests were shut down, logged off and logged back on. So far, so good.

But... In changing the memory for many guests, and it being late at night after a long day, while meaning to set a guest's memory to 9728M, it got set to 9728G. When that guest was cycled we saw the message on the console that its memory was limited to 8TB (HCPLGN093E), then the VM system appeared to freeze.
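
[Editor's note: a quick sketch of the arithmetic behind the typo, using only the sizes quoted in this message. It assumes the M/G/T suffixes are binary multiples (M = 2**20 bytes, and so on); the helper name size_bytes is made up for the illustration.]

```python
# Assumption: size suffixes are binary multiples (M = 2**20 bytes, etc.).
UNITS = {"M": 2**20, "G": 2**30, "T": 2**40}


def size_bytes(spec: str) -> int:
    """Convert a size such as '9728M' or '175G' to a byte count."""
    return int(spec[:-1]) * UNITS[spec[-1].upper()]


intended = size_bytes("9728M")          # what was meant
typed = size_bytes("9728G")             # what was entered
capped = min(typed, size_bytes("8T"))   # the 8TB cap from the console message
lpar = size_bytes("175G")               # real memory in the LPAR

print(typed // intended)        # 1024
print(intended / 2**30)         # 9.5
print(round(capped / lpar, 1))  # 46.8
```

[So the one-letter slip asked for 1024 times the intended 9.5G, and even after the cap the guest was defined with roughly 47 times the real memory of the whole LPAR.]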

We couldn't get in via TCP/IP, or the HMC Operating System Messages screen, or the HMC Integrated 3270.

Finally had to IPL. Even that was weird, as I'd have expected the Load Normal to shut down first; it just IPLed. We did NoAutolog, fixed the typo and all came back up OK...

I suspect CP was scrambling, paging everything in the world out as Linux tried to initialize that 8TB of memory... But I'm surprised I couldn't even get into the HMC consoles (to kill just that one guest as opposed to all of them).

Any thoughts?
Lee
--

Lee Stewart, Senior SE
Sirius Computer Solutions
Phone: (303) 996-7122
Email: lee.stew...@siriuscom.com
Web:   www.siriuscom.com



