Re: Strange behaviour.
Thanks to everyone for your responses. It does look like it was a paging problem of some sort. Regards Bob
Re: Strange behaviour.
Time to call IBM service. The processor may have called home.

Jim

On 5/26/2010 10:05 AM, Bob Atton wrote:
> We have had a very strange and unusual message today on our z/VM 5.4 system. I had a report that our zLinux guests were not accessible (putty, ping, ssh all no good). I logged on to MAINT OK but soon after when I had issued a command I got the following messages.
>
> 10:29:47 * MSG FROM MAINT : DMSITM319T Machine check interrupt was encountered; MCIC = X'4002AFBD 400B '
> 10:29:47 * MSG FROM MAINT : DMSITM319T Disabled wait entered, please re-IPL CMS.
> HCPGIR450W CP entered; disabled wait PSW 000A 817C009E
>
> CMS would not IPL but I could IPL 190. When I tried to start the zLinux guests they initially failed with the messages above because they IPL CMS but when I IPLed 190 and ran the profile etc to start linux I saw
>
> HCPMCV1459E The virtual machine is placed in check-stop state due to a system malfunction with CPU 00.
>
> After a while we IPLed the VM system and all appears to be OK. Has anyone else seen such strange behaviour?
>
> Regards
> Bob

--
James Bohnsack
(972) 596-6377 home/office
(972) 342-5823 cell
Re: Strange behaviour.
On Wednesday, 05/26/2010 at 10:06 EDT, Bob Atton bob.j.at...@rrd.com wrote:
> We have had a very strange and unusual message today on our z/VM 5.4 system. I had a report that our zLinux guests were not accessible (putty, ping, ssh all no good). I logged on to MAINT OK but soon after when I had issued a command I got the following messages.
>
> 10:29:47 * MSG FROM MAINT : DMSITM319T Machine check interrupt was encountered; MCIC = X'4002AFBD 400B '

40 = Instruction-processing damage
02 = Backed up

That means that the CPU had an error and 'backed up' to the most recent internally consistent checkpoint, avoiding damage to memory, registers, timers, or the PSW. The condition is fatal.

> CMS would not IPL but I could IPL 190. When I tried to start the zLinux guests they initially failed with the messages above because they IPL CMS but when I IPLed 190 and ran the profile etc to start linux I saw
>
> HCPMCV1459E The virtual machine is placed in check-stop state due to a system malfunction with CPU 00.
>
> After a while we IPLed the VM system and all appears to be OK.

I'm fuzzy on how CPU sparing and recovery work, and I don't know why you couldn't IPL an NSS. Certainly you should check the HMC for hardware messages. It likely called home.

Alan Altmark
z/VM Development
IBM Endicott
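
For anyone who wants to repeat the byte-by-byte reading of the MCIC above, here is a minimal sketch in Python. It decodes only the first two bytes (X'4002') and knows only four bits. The bit names and positions are my recollection of the machine-check interruption code layout in the z/Architecture Principles of Operation, so treat them as assumptions and check them against the manual. The function name decode_mcic_prefix and the MCIC_BITS table are hypothetical, made up for this illustration.

```python
# Partial table of MCIC bits, using IBM bit numbering (bit 0 is the
# leftmost, most significant bit). Assumed from the z/Architecture
# Principles of Operation; deliberately incomplete.
MCIC_BITS = {
    0: "system damage",
    1: "instruction-processing damage",
    2: "system recovery",
    14: "backed up",
}


def decode_mcic_prefix(hex_prefix):
    """Return the names of known bits set in the leading hex digits of an MCIC."""
    digits = "".join(hex_prefix.split())  # tolerate embedded blanks
    nbits = len(digits) * 4
    value = int(digits, 16)
    names = []
    for bit, name in sorted(MCIC_BITS.items()):
        # IBM bit n corresponds to the mask 1 << (nbits - 1 - n).
        if bit < nbits and value & (1 << (nbits - 1 - bit)):
            names.append(name)
    return names


if __name__ == "__main__":
    # First two bytes of the MCIC reported in DMSITM319T.
    print(decode_mcic_prefix("4002"))
```

Only the leading bytes are decoded because the MCIC in the message is shown incomplete; the remaining validity bits would need the full table from the manual.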
Re: Strange behaviour.
On Wednesday, 05/26/2010 at 10:06 EDT, Bob Atton bob.j.at...@rrd.com wrote:
> We have had a very strange and unusual message today on our z/VM 5.4 system. I had a report that our zLinux guests were not accessible (putty, ping, ssh all no good). I logged on to MAINT OK but soon after when I had issued a command I got the following messages.
>
> 10:29:47 * MSG FROM MAINT : DMSITM319T Machine check interrupt was encountered; MCIC = X'4002AFBD 400B '
> 10:29:47 * MSG FROM MAINT : DMSITM319T Disabled wait entered, please re-IPL CMS.
> HCPGIR450W CP entered; disabled wait PSW 000A 817C009E
>
> CMS would not IPL but I could IPL 190.

D'oh: One other thing, check the operator's console for paging errors (paging volume or spool).

> When I tried to start the zLinux guests they initially failed with the messages above because they IPL CMS but when I IPLed 190 and ran the profile etc to start linux I saw
>
> HCPMCV1459E The virtual machine is placed in check-stop state due to a system malfunction with CPU 00.

The machine check above is what you would get if the guest had page 0/1 resident when the paging error occurred. You can get this if CP cannot page in guest page 0 or 1 in order to store PSWs and logout data in order to present said machine check.

I do wish CP was just a tad more specific than simply system malfunction.

Alan Altmark
z/VM Development
IBM Endicott
Re: Strange behaviour.
> I do wish CP was just a tad more specific than simply system malfunction.

Hey, if it's good enough for Windows...