Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
And for anyone who has trouble IPLing CMS (or other OSs) after the trace command, we found that doing a SYSTEM CLEAR and then the IPL cleared that issue.

Lee

Lee Stewart ● VM System Support ● Visa ● Phone: 6(750)4601 - +1-303-389-4601 ● lstew...@visa.com

-----Original Message-----
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Bill Bitner
Sent: Friday, April 13, 2018 9:37 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP

Rick,

As we discussed on the phone, but for the benefit of others: my last append forgot to point out that the TRACE needs to be done for all the virtual processors in the virtual machine. Otherwise, only the base VMDBK is protected. This can be done by using the "CPU" command:

CPU ALL CMD TRACE PROG 28 NOTERM NOPRINT RUN

I apologize for giving both of us a fright, and to those following along.

Regards,
Bill
___
Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
bitn...@us.ibm.com
"Making systems practical and profitable for customers through virtualization and its exploitation." - z/VM

--
For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit http://wiki.linuxvm.org/
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
Rick,

As we discussed on the phone, but for the benefit of others: my last append forgot to point out that the TRACE needs to be done for all the virtual processors in the virtual machine. Otherwise, only the base VMDBK is protected. This can be done by using the "CPU" command:

CPU ALL CMD TRACE PROG 28 NOTERM NOPRINT RUN

I apologize for giving both of us a fright, and to those following along.

Regards,
Bill
___
Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
bitn...@us.ibm.com
"Making systems practical and profitable for customers through virtualization and its exploitation." - z/VM
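For readers comparing the two forms of the mitigation, here is a minimal Python sketch that only assembles the CP command text quoted in the message above. The function name `trace_mitigation_command` and its parameters are hypothetical. It does not issue anything to CP, and other messages in this thread report trouble with the trace active, so treat it as a reading aid rather than a recommendation.

```python
def trace_mitigation_command(all_cpus: bool = True, prog_code: str = "28") -> str:
    """Assemble the CP command text from the thread (hypothetical helper).

    all_cpus=True wraps the TRACE in "CPU ALL CMD" so it applies to every
    virtual processor; all_cpus=False returns the bare TRACE command, which
    per the message above protects only the base VMDBK.
    """
    trace = f"TRACE PROG {prog_code} NOTERM NOPRINT RUN"
    return f"CPU ALL CMD {trace}" if all_cpus else trace


if __name__ == "__main__":
    print(trace_mitigation_command())                # form for all virtual CPUs
    print(trace_mitigation_command(all_cpus=False))  # bare form, base CPU only
```

The only difference between the two strings is the `CPU ALL CMD` prefix, which is the correction Bill is making here.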
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
The circumvention of using the TRACE PROG 28 did not work for us. We had some servers that failed and could not reboot, most likely related to the trace being active. We are booting Linux back to the older kernel level that was not experiencing the problem: 2.6.32-696.18.7.el6.s390x. There seems to be something in those later kernel patches that makes the servers susceptible to this problem. I am also pushing to get at least one LPAR IPLed with VM65414 installed.

Rick Barlow

On Thu, Apr 12, 2018 at 10:53 AM, Bill Bitner wrote:
> The real solution is to apply VM65414's PTF which is available. A
> mitigation would be to do a CP TRACE PROG 28 NOTERM NOPRINT RUN for the
> virtual machines through the console or if logged off and on, through a
> directory command statement.
>
> Thank you for understanding the communication challenges in this space. I'm
> sure you'll let us know if you need more info. :-)
>
> Regards,
> Bill
> ___
> Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
> bitn...@us.ibm.com
> "Making systems practical and profitable for customers through
> virtualization and its exploitation." - z/VM
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
Thanks for the info, Bill.

Thanks,
Terri Glowaniak
Mainframe Systems Engineer
terri.glowan...@regions.com
(205) 261-6883 (W)

-----Original Message-----
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Bill Bitner
Sent: Thursday, April 12, 2018 9:54 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP

Rick, Terri, Greg, and others,

Apologies: I should have given out the PTF numbers and a little more background in my earlier append. APAR VM65396 PTF UM34851 is the one that introduced the prog check problem. We discovered the problem February 26th, and marked VM65396 in error on March 1st, with VM65414 as the PE fix to it. The correction to that error is included in APAR VM65414 PTF UM34853, which closed on March 23rd. For other reasons, both of these APARs are Security/Integrity APARs, so see my earlier append on the IBM Z Security portal. However, the functional problem, other than being introduced by VM65396, is not connected to the security/integrity aspects of these APARs.

The functional problem is related to a virtual processor coming out of SIE for a fast path operation, as opposed to the typical normal exit from SIE. (For those unfamiliar with SIE, think of it as dispatching or running a virtual processor.) Fast path exits are far more prevalent in a CMS environment than in other guest operating systems, but they are possible there as well. In all cases, fast path exits are a subset, mostly a very small subset, of all exits. There is a scenario where taking the fast path exit can erroneously cause z/VM to present the 028 program check. The error further involves the upper part of a register not being cleared. If this register's upper half contains zeroes, it would not trigger the error condition, but over time, it appears the upper half changes and the program checks start appearing. So from that perspective it's a timing/workload dependent problem.

This is a simplification, but I hope it's enough for you to appreciate the aspects you need to be aware of.

The real solution is to apply VM65414's PTF, which is available. A mitigation would be to do a CP TRACE PROG 28 NOTERM NOPRINT RUN for the virtual machines through the console or, if logged off and on, through a directory command statement.

Thank you for understanding the communication challenges in this space. I'm sure you'll let us know if you need more info. :-)

Regards,
Bill
___
Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
bitn...@us.ibm.com
"Making systems practical and profitable for customers through virtualization and its exploitation." - z/VM
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
Rick, Terri, Greg, and others,

Apologies: I should have given out the PTF numbers and a little more background in my earlier append. APAR VM65396 PTF UM34851 is the one that introduced the prog check problem. We discovered the problem February 26th, and marked VM65396 in error on March 1st, with VM65414 as the PE fix to it. The correction to that error is included in APAR VM65414 PTF UM34853, which closed on March 23rd. For other reasons, both of these APARs are Security/Integrity APARs, so see my earlier append on the IBM Z Security portal. However, the functional problem, other than being introduced by VM65396, is not connected to the security/integrity aspects of these APARs.

The functional problem is related to a virtual processor coming out of SIE for a fast path operation, as opposed to the typical normal exit from SIE. (For those unfamiliar with SIE, think of it as dispatching or running a virtual processor.) Fast path exits are far more prevalent in a CMS environment than in other guest operating systems, but they are possible there as well. In all cases, fast path exits are a subset, mostly a very small subset, of all exits. There is a scenario where taking the fast path exit can erroneously cause z/VM to present the 028 program check. The error further involves the upper part of a register not being cleared. If this register's upper half contains zeroes, it would not trigger the error condition, but over time, it appears the upper half changes and the program checks start appearing. So from that perspective it's a timing/workload dependent problem.

This is a simplification, but I hope it's enough for you to appreciate the aspects you need to be aware of.

The real solution is to apply VM65414's PTF, which is available. A mitigation would be to do a CP TRACE PROG 28 NOTERM NOPRINT RUN for the virtual machines through the console or, if logged off and on, through a directory command statement.

Thank you for understanding the communication challenges in this space. I'm sure you'll let us know if you need more info. :-)

Regards,
Bill
___
Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
bitn...@us.ibm.com
"Making systems practical and profitable for customers through virtualization and its exploitation." - z/VM
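To make the register detail in the message above concrete, here is a toy Python model of a 64-bit register whose upper half is or is not cleared. This is not z/VM code. The names `upper_half`, `clear_upper_half`, and `stale_upper_half` are invented for illustration, and the actual check inside CP is not described in the thread beyond Bill's summary.

```python
MASK32 = 0xFFFFFFFF  # selects the low 32 bits of a 64-bit value


def upper_half(reg64: int) -> int:
    """Return the upper 32 bits of a 64-bit register value."""
    return (reg64 >> 32) & MASK32


def clear_upper_half(reg64: int) -> int:
    """Return the value with its upper 32 bits zeroed (the missing step)."""
    return reg64 & MASK32


def stale_upper_half(reg64: int) -> bool:
    """Hypothetical stand-in for the error condition: leftover upper bits."""
    return upper_half(reg64) != 0


if __name__ == "__main__":
    clean = 0x0000000000001234  # upper half happens to be zero: no symptom
    dirty = 0xDEADBEEF00001234  # upper half changed over time: symptom appears
    for name, reg in (("clean", clean), ("dirty", dirty)):
        print(name, hex(upper_half(reg)), stale_upper_half(reg))
    print("after clearing:", stale_upper_half(clear_upper_half(dirty)))
```

The model shows why the symptom could be workload dependent: the same low-order value behaves differently depending on whatever was last left in the upper half.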
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
Thanks Bill. I was a little suspicious of VM65414. However, VM65396 has been on these LPARs since March (before I knew about the later PTF) and we did not see any problems until Linux got patched this week. I guess I will have to scramble to get another VM IPL. 8-\

Rick Barlow

On Thu, Apr 12, 2018 at 8:26 AM, Bill Bitner wrote:
> Rick, if you have VM65396 on, but not VM65414, that would be my guess.
> VM65414 corrected a problem introduced by VM65396 where a guest could
> erroneously be given a program check 28.
>
> ___
> Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
> bitn...@us.ibm.com
> "Making systems practical and profitable for customers through
> virtualization and its exploitation." - z/VM
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
Greg,

For information on vulnerabilities, please see the IBM Z Security Portal. Information on the portal can be found if you page down slightly on https://www.ibm.com/it-infrastructure/z/capabilities/system-integrity

Regards,
Bill
___
Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
bitn...@us.ibm.com
"Making systems practical and profitable for customers through virtualization and its exploitation." - z/VM
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
On Thu, 12 Apr 2018 08:13:43 -0400 Rick Barlow wrote:
> We have started seeing a rash of errors on some of our Linux virtual
> servers. The message we see on the virtual machine console is "Unknown
> program exception: 0028 [#1] SMP". It appears to only affect servers that
> were recently patched to kernel level "2.6.32-696.23.1.el6.s390x #1 SMP Sat
> Feb 10 11:11:31 EST 2018". It does not affect all of our servers that were
> recently patched. I suspect that it might be related to patches related to
> Spectre. I did a google search and did not get any hits on the message. I
> expect our Linux team will contact Red Hat support.

Program check 28 is an ALET-specification exception. It is very unlikely that Linux causes this exception. The only piece of code that uses the access register mode is the clock_gettime() function in the vdso code, and then only for the CPUCLOCK_VIRT clock source with a constant ALET.

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
VM65414 is PE'd... can't find anything on it now. It looks like PTF UM34768 is the latest out to fix this, and it came out April 4? I have VM65396 on, but have different Red Hat levels than Rick mentioned.

Thanks,
Terri Glowaniak
Mainframe Systems Engineer
terri.glowan...@regions.com
(205) 261-6883 (W)

-----Original Message-----
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Bill Bitner
Sent: Thursday, April 12, 2018 7:26 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP

Rick, if you have VM65396 on, but not VM65414, that would be my guess. VM65414 corrected a problem introduced by VM65396 where a guest could erroneously be given a program check 28.
___
Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
bitn...@us.ibm.com
"Making systems practical and profitable for customers through virtualization and its exploitation." - z/VM
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
Bill,

Has the dust settled now on the z/VM Spectre fixes? We put on VM65396 as well on a couple of systems then halted that until further fixes arrive.

On 4/12/2018 7:26 AM, Bill Bitner wrote:
Rick, if you have VM65396 on, but not VM65414, that would be my guess. VM65414 corrected a problem introduced by VM65396 where a guest could erroneously be given a program check 28.
___
Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
bitn...@us.ibm.com
"Making systems practical and profitable for customers through virtualization and its exploitation." - z/VM
Re: RHEL 6.9 servers getting Unknown program exception: 0028 SMP
Rick, if you have VM65396 on, but not VM65414, that would be my guess. VM65414 corrected a problem introduced by VM65396 where a guest could erroneously be given a program check 28.
___
Bill Bitner - z/VM Customer Focus and Care - 607-429-3286
bitn...@us.ibm.com
"Making systems practical and profitable for customers through virtualization and its exploitation." - z/VM
RHEL 6.9 servers getting Unknown program exception: 0028 SMP
We have started seeing a rash of errors on some of our Linux virtual servers. The message we see on the virtual machine console is "Unknown program exception: 0028 [#1] SMP". It appears to only affect servers that were recently patched to kernel level "2.6.32-696.23.1.el6.s390x #1 SMP Sat Feb 10 11:11:31 EST 2018". It does not affect all of our servers that were recently patched. I suspect that it might be related to the Spectre patches. I did a Google search and did not get any hits on the message. I expect our Linux team will contact Red Hat support.

Has anyone else seen this?

Thanks,
Rick Barlow
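A small Python sketch for sorting guests by the kernel levels named in this thread. `classify_release` and its labels are hypothetical, and the two release strings are simply the ones reported here. Running the newer level does not by itself mean a guest will fail, since not all of the patched servers were affected and the replies point to a z/VM APAR rather than the kernel.

```python
import platform

# Kernel releases quoted in the thread; nothing else is assumed about them.
REPORTED_FAILING = "2.6.32-696.23.1.el6.s390x"  # level where the 0028 message appeared
REPORTED_STABLE = "2.6.32-696.18.7.el6.s390x"   # older level some servers were booted back to


def classify_release(release: str) -> str:
    """Label a kernel release string against the two levels from the thread."""
    if release == REPORTED_FAILING:
        return "reported-failing-level"
    if release == REPORTED_STABLE:
        return "reported-stable-level"
    return "not-mentioned"


if __name__ == "__main__":
    # platform.release() returns the running kernel release, as `uname -r` does.
    current = platform.release()
    print(current, "->", classify_release(current))
```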