Re: Ghost users on z/VM 5.2
On Tue, 11 Sep 2007 10:56:21 -0400, Hooker, Don wrote:

> And Alan, I guess I'm glad to know we're not alone. I will look up VM64184. It is for z/VM 5.2?

Both z/VM 5.2 and 5.3.

APAR Identifier ...... VM64184      Last Changed ........ 07/08/30
HUNG USER AFTER VARY ON COMMAND ISSUED TO PAV DASD

Symptom ...... WS WAIT              Status ........... CLOSED  PER
Severity ................... 3      Date Closed ......... 07/08/29
Component .......... 568411202      Duplicate of ........
Reported Release ......... 520      Fixed Release ............ 999
Component Name VM CP                Special Notice ......... HIPER
Current Target Date ..07/08/15      Flags ....... RESTART/BOOT/IPL
SCP ...................
Platform ............

Status Detail: SHIPMENT - Packaged solution is available for shipment.

PE PTF List:

PTF List:
Release 520   : UM32093 available 07/08/30 (1000 )
Release 530   : UM32096 available 07/08/30 (1000 )

Parent APAR:
Child APAR list:

ERROR DESCRIPTION:
At a certain time, some DASDs that were PAV Alias devices were reconfigured as PAV Base devices. This DASD reconfiguration occurred while the current IPL was active and z/VM retained the old PAV Alias indications for the devices that were switched to PAV Bases and hence were set as both PAV Bases and Aliases. These indications are defined to be mutually exclusive and since they were both set, this caused the I/O scheduler problems with locks and hanged the VARY command.

LOCAL FIX:
.

PROBLEM SUMMARY:
USERS AFFECTED: All z/VM users of Parallel Access Volumes (PAV).
PROBLEM DESCRIPTION:
RECOMMENDATION: APPLY PTF

At one time on a D/T2105 controller devices 2200-22FF were defined as D/T3390 Model 9 DASD and devices 2200-2219 and 2280-2299 were configured as PAV Bases and 221A-227F and 229A-22FF were configured as PAV Aliases. The devices were all offline to z/VM, but the associated RDEVs were marked as PAV Aliases or PAV Bases as appropriate.
.
Subsequently the devices were reconfigured as D/T3390 Model 3 DASDs and devices 2200-2235 and 2280-22B5 were defined as PAV Bases and 2236-227F and 22B6-22FF were defined as PAV Aliases.
.
For a device that switches from being a PAV Alias to a PAV Base (Device 221A for example), the RDEV retains its old PAV Alias setting (RDEVPVAL) and also gets marked as a PAV Base (RDEVPVBA) when it goes through device initialization because of a VARY ON command. These indications are defined to be mutually exclusive and since they are both set, this causes the I/O scheduler (HCPIQM) to get into a deadlock situation with the RDEV lock for the device and the VARY command hangs.

PROBLEM CONCLUSION:
Device initialization code was modified to clear a PAV device's old state when it doesn't match the new state.

Specifically, if device initialization determines that a device is a PAV Alias device, then it will now check to see if the RDEV is currently marked as a PAV Base device. If so, then the device's old life (PAV Base) will be reset before indicating in the RDEV that the device is a PAV Alias.

If device initialization determines that the device is NOT a PAV Alias device, but it is currently marked that way in the RDEV, then the device's old life (PAV Alias) will be reset.

If device initialization determines that a device is a PAV Base device, then it will now check to see if the RDEV is currently marked as a PAV Alias device. If so, then the device's old life (PAV Alias) will be reset before indicating in the RDEV that the device is a PAV Base.

TEMPORARY FIX:
* HIPER *

FOR RELEASE VM/ESA CP/ESA R520 :
PREREQ: VM63855 VM64128
CO-REQ: NONE
IF-REQ: NONE

FOR RELEASE VM/ESA CP/ESA R530 :
PREREQ: VM64222
CO-REQ: NONE
IF-REQ: NONE

COMMENTS:
MODULES/MACROS: HCPRDI HCPSDV HCPVPA
SRLS: NONE
RTN CODES:
CIRCUMVENTION:
MESSAGE TO SUBMITTER:
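For readers who want to see which devices in the APAR's example were exposed, here is a minimal Python sketch. It is not CP code. The device ranges come straight from the problem description; the `Rdev` class, its two boolean fields (standing in for RDEVPVAL and RDEVPVBA), and the two `init_*` functions are hypothetical names that only model the behaviour the APAR text describes.

```python
from dataclasses import dataclass
from typing import Set


def dev_range(first: str, last: str) -> Set[int]:
    """Inclusive range of device numbers given as hex strings."""
    return set(range(int(first, 16), int(last, 16) + 1))


# Layouts copied from the APAR's problem description.
OLD_BASES = dev_range("2200", "2219") | dev_range("2280", "2299")
OLD_ALIASES = dev_range("221A", "227F") | dev_range("229A", "22FF")
NEW_BASES = dev_range("2200", "2235") | dev_range("2280", "22B5")
NEW_ALIASES = dev_range("2236", "227F") | dev_range("22B6", "22FF")

# Devices that were Aliases and are now Bases: the ones exposed to the hang.
SWITCHED = sorted(OLD_ALIASES & NEW_BASES)


@dataclass
class Rdev:
    """Hypothetical stand-in for an RDEV; only the two PAV flags are modeled."""
    devno: int
    pav_alias: bool = False  # plays the role of RDEVPVAL
    pav_base: bool = False   # plays the role of RDEVPVBA


def init_before_fix(rdev: Rdev, is_alias: bool, is_base: bool) -> None:
    """Behaviour described for the failure: new state set, stale state kept."""
    if is_alias:
        rdev.pav_alias = True
    if is_base:
        rdev.pav_base = True


def init_after_fix(rdev: Rdev, is_alias: bool, is_base: bool) -> None:
    """Behaviour described in PROBLEM CONCLUSION: reset the old life first."""
    if is_alias:
        rdev.pav_base = False   # old life as a Base is reset
        rdev.pav_alias = True
    else:
        rdev.pav_alias = False  # no longer an Alias: clear the stale flag
        if is_base:
            rdev.pav_base = True
        # A device that is neither Alias nor Base keeps pav_base as-is here;
        # the APAR text does not describe that case.


if __name__ == "__main__":
    print(f"{len(SWITCHED)} devices switched from Alias to Base")
    stale = Rdev(0x221A, pav_alias=True)
    init_before_fix(stale, is_alias=False, is_base=True)
    print("before fix:", stale)   # both flags set: the invalid state
    fixed = Rdev(0x221A, pav_alias=True)
    init_after_fix(fixed, is_alias=False, is_base=True)
    print("after fix: ", fixed)   # Base only
```

With the ranges above, `SWITCHED` holds 56 device numbers (221A-2235 and 229A-22B5), starting with the APAR's own example, device 221A.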
Ghost users on z/VM 5.2
Thanks for the replies. And thanks for the TRACK commands, Mike. I use TRACK all the time but mostly for viewing consoles. TRACK shows an active I/O to a VSWITCH connected to an OSAX. I am reluctant to do a HALT on that device as I do not know if it would impact other users of the VSWITCH. And Alan, I guess I'm glad to know we're not alone. I will look up VM64184. It is for z/VM 5.2? I saw an item on IBMLINK for a hung z/OS user and VSWITCH, but it was for z/VM 4.4. I will take a SNAPDUMP before our next IPL and open a PMR.
Ghost users on z/VM 5.2
It's been *so* long since I've seen users stuck in Logoff/Force Pending state for any length of time that I thought it had been fixed. We have a z/VSE guest that has been stuck in that state since Sunday. Has anybody else seen this on current levels of z/VM?

Many years ago, when it was a more frequent problem, I wrote an assembler program that massaged some bits in the (then) VMBLOCK to free it up. I did not take some things into account at the time, so sometimes it worked, but then other times... (expletive deleted)

Does anybody have any current tricks to free up a user in this state (short of a VM IPL)?
Re: Ghost users on z/VM 5.2
Might I humbly suggest going through the attached process before twiddling any bits? Getting doc for IBM is always a good idea so that hung users don't make a reappearance. Each time one appears, shoot the problem so we don't have to shoot the users. Even if it might be a good idea to shoot some users! ;-)

Mike Walter
Hewitt Associates
Any opinions expressed herein are mine alone and do not necessarily represent the opinions or policies of Hewitt Associates.

READ: HUNGUSER HELPME

When a userid becomes hung in a LOGOFF/FORCE PENDING state, the following alternatives may be tried -- some require more than class G privs:

- If you have the time, simply waiting 15 minutes for CP to perform housecleaning chores might free the userid, completing the LOGOFF or FORCE.

- Use the public domain utility TRACK to determine if the userid is awaiting completion of an I/O to a particular unresponsive device. Use the commands:
      TRACK hungid DEV CLASS * IO PENDing
      TRACK hungid DEV CLASS * IO ACTIVE
  Nota bene: As of 23 Feb 2006 TRACK can be obtained from: http://vm.marist.edu/track/code.html

- Before attempting anything that actually changes the hung userid (before muddying the waters), get a current system dump for IBM to diagnose later, if you can. Consider the communication time-outs that may occur and could affect other users. From a privclass A user:
      CP QUERY DUMP     (then ensure that it is going to disk)
  Then:
      CP WNG ALL This system may be non-responsive for a few minutes while diagnostic information is obtained.
  Then:
      CP SNAPDUMP

- Sometimes a simple message frees up the hungid without further ado. From a privclass A, B, or C userid, issue:
      CP WNG hungid Hello

- If the ID was awaiting I/O to a terminal, simply connecting from a working terminal may free the ID. From a free terminal, issue:
      CP LOGON hungid HERE

- For users logged on via TELNET terminals, issue:
      NETSTAT TELNET
  Find the matching tn3270 connection, and issue:
      NETSTAT DROP conn_num

- CPHX is reported to cancel pending CP commands: ATTACH, LOCATE, LOCATEVM, and VARY ONLINE|OFFLINE (see HELP for more detail). From a privclass A userid, issue:
      CP CPHX hungid

- If TRACK (above) showed an active I/O which cannot be remedied (e.g. by making a tape drive Ready), the I/O may be able to be cancelled. From a privclass A userid, issue:
      CP HALT rdev
  Due to queued I/Os or recalcitrant devices, HALT may need to be issued repeatedly until the following message is received:
      Halt was not initiated to tape because the device was not active

- If nothing freed the hung user, open a Problem Management Report (PMR) with IBM, and provide the SNAPDUMP for analysis.

.cm Last updated 2007/01/04 mrw

Hooker, Don
Sent by: The IBM z/VM Operating System IBMVM@LISTSERV.UARK.EDU
09/10/2007 04:38 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Ghost users on z/VM 5.2

> [...] Does anybody have any current tricks to free up a user in this state (short of VM IPL).
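The HUNGUSER HELPME checklist in the message above is ordered so that diagnostics and the dump come before anything that changes the hung userid. As a rough illustration of that ordering, here is a small Python sketch that only formats the commands for a given userid; it issues nothing to CP. The command strings are copied from the checklist. The `Step` type, the `changes_user` flag, and `render` are hypothetical names, and the split into "diagnostic" and "changes the user" is my reading of the checklist, not part of it.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    note: str            # why the step is tried
    commands: List[str]  # command templates copied from the checklist
    changes_user: bool   # True once the step acts on the hung user or device


# Order follows the checklist: look first, dump second, then act.
CHECKLIST = [
    Step("Look for pending or active I/O",
         ["TRACK {user} DEV CLASS * IO PENDing",
          "TRACK {user} DEV CLASS * IO ACTIVE"], False),
    Step("Capture a dump for IBM before changing anything",
         ["CP QUERY DUMP",
          "CP WNG ALL This system may be non-responsive for a few minutes "
          "while diagnostic information is obtained.",
          "CP SNAPDUMP"], False),
    Step("Send a message to the hung user",
         ["CP WNG {user} Hello"], True),
    Step("Reconnect from a working terminal",
         ["CP LOGON {user} HERE"], True),
    Step("Drop the user's TELNET connection",
         ["NETSTAT TELNET", "NETSTAT DROP {conn}"], True),
    Step("Cancel a pending CP command",
         ["CP CPHX {user}"], True),
    Step("Halt the stuck I/O (repeat if needed)",
         ["CP HALT {rdev}"], True),
]


def render(user: str, rdev: str = "rdev", conn: str = "conn_num") -> List[str]:
    """Return the checklist as printable lines with placeholders filled in."""
    lines = []
    for number, step in enumerate(CHECKLIST, 1):
        tag = "changes the user" if step.changes_user else "diagnostic"
        lines.append(f"{number}. {step.note} ({tag})")
        for template in step.commands:
            lines.append("     " + template.format(user=user, rdev=rdev, conn=conn))
    return lines


if __name__ == "__main__":
    # VSEGUEST and 0181 are made-up values for illustration only.
    print("\n".join(render("VSEGUEST", rdev="0181")))
```

The waiting step and the final "open a PMR" step are left out because they have no command to format.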
Re: Ghost users on z/VM 5.2
A Logoff/Force Pending state means that something unexpected happened. For years, whenever a new cause was discovered, IBM has added a test for that condition and code to handle it. That is why hung users have gotten fewer and fewer. Apparently you found another unexpected condition, so call IBM and send them the info to find and fix it. Maybe it is the last one and no one will ever have a hung user again. :)

Hooker, Don wrote:
> [...] Does anybody have any current tricks to free up a user in this state (short of VM IPL).

--
Stephen Frazier
Information Technology Unit
Oklahoma Department of Corrections
3400 Martin Luther King
Oklahoma City, Ok, 73111-4298
Tel.: (405) 425-2549
Fax: (405) 425-2554
Pager: (405) 690-1828
email: stevef%doc.state.ok.us