Tasks ENQ'ing exclusive on resource not getting control in FIFO order

2013-01-17 Thread Peter Hunkeler
It was my understanding that tasks enqueueing on a resource get control 
in FIFO order when there is contention. Today, we had a situation where this was 
not true.

Environment is as follows:

- 4 system parallel sysplex, z/OS V1.13, GRS STAR mode.
- a dozen or so jobs running the same program are active across all 4 systems 
at a time. More jobs are submitted as jobs end (more or less).
- the programs serialize using an exclusive ENQ on a resource, SCOPE=SYSTEMS

As expected, one job is running, and all others are waiting to get the resource 
assigned.
But suddenly, we noticed that jobs on two systems never got to run. 
They had been waiting for the resource for hours, while newer jobs got control 
one after the other. So resource assignment is clearly not FIFO. We then saw 
(in EJES) that once a job ended, all waiting jobs were active for a very short 
time, then one job continued to run while all others were waiting again.
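
The FIFO behavior expected here can be sketched as a toy model. This is only an illustration of the expectation, not of how GRS is implemented; the class name `FifoResource`, the helper `run_order`, and the job names are hypothetical.

```python
from collections import deque


class FifoResource:
    """Toy model of an exclusive resource whose waiters are served FIFO.

    Illustration only; this is an assumption about the expected ordering,
    not a description of GRS internals.
    """

    def __init__(self):
        self.owner = None
        self.waiters = deque()

    def enq(self, job):
        """Request exclusive control; return True if granted immediately."""
        if self.owner is None:
            self.owner = job
            return True
        self.waiters.append(job)  # queue behind all earlier requesters
        return False

    def deq(self, job):
        """Release control and hand the resource to the oldest waiter."""
        if self.owner != job:
            raise ValueError(f"{job} does not own the resource")
        self.owner = self.waiters.popleft() if self.waiters else None
        return self.owner


def run_order(jobs):
    """ENQ all jobs in the given order, then let each owner finish in turn."""
    res = FifoResource()
    for job in jobs:
        res.enq(job)
    order = []
    while res.owner is not None:
        order.append(res.owner)
        res.deq(res.owner)
    return order


# Hypothetical job names, one per system.
print(run_order(["JOB1@SYSA", "JOB2@SYSB", "JOB3@SYSC", "JOB4@SYSD"]))
```

Under this model the jobs run strictly in the order in which they issued ENQ, regardless of which system they run on.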

I have RTFM, and still think ENQ is FIFO. I have not found anything related to 
GRS STAR mode that contradicts.

I have not followed GRS news lately. What am I missing?

--
Peter Hunkeler

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Tasks ENQ'ing exclusive on resource not getting control in FIFO order

2013-01-17 Thread Bill Fairchild
It sounds as if the relative CPU speed and/or hypervisor dispatching of LPARs 
may be involved in which LPAR gets the resource.  If GRS, after signaling all 
enqueuing systems that the resource is available, then waits for a response 
from those systems, it may be giving the resource to whichever system responds 
first and ignoring the order in which the processors did the original ENQs.  
Perhaps a timestamp needs to be associated with each ENQ request and the global 
resource allocator made sensitive to the timestamp.  Or maybe the documentation 
needs to be updated to reflect that SCOPE=SYSTEMS ENQ under GRS works 
differently from SCOPE=SYSTEM with no GRS involved.
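
The difference between the two grant policies speculated about above can be sketched as follows. This models only the hypothesis, not actual GRS behavior; the function `grant_order`, the latency table, and the job and system names are all invented for illustration.

```python
def grant_order(waiters, latency, policy):
    """Return the order in which waiting jobs would be granted the resource.

    waiters: list of (enq_time, job, system) tuples
    latency: hypothetical per-system response delay (smaller = faster LPAR)
    policy:  "first_responder" -> fastest-responding system wins each round
             "timestamp"       -> oldest ENQ request wins each round
    """
    if policy == "first_responder":
        key = lambda w: (latency[w[2]], w[0])
    elif policy == "timestamp":
        key = lambda w: w[0]
    else:
        raise ValueError(f"unknown policy: {policy}")

    pending = list(waiters)
    order = []
    while pending:
        winner = min(pending, key=key)
        pending.remove(winner)
        order.append(winner[1])
    return order


# Assumed values: SYSA and SYSB respond quickly, SYSC and SYSD slowly.
latency = {"SYSA": 1, "SYSB": 1, "SYSC": 5, "SYSD": 5}
waiters = [(1, "J1", "SYSC"), (2, "J2", "SYSA"),
           (3, "J3", "SYSD"), (4, "J4", "SYSB")]

print(grant_order(waiters, latency, "first_responder"))  # ['J2', 'J4', 'J1', 'J3']
print(grant_order(waiters, latency, "timestamp"))        # ['J1', 'J2', 'J3', 'J4']
```

With the first-responder policy, jobs on the slower systems are served only after every waiter on the faster systems; if new jobs kept arriving on the fast systems, the slow ones could wait indefinitely. Ordering by ENQ timestamp restores FIFO.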

The relative processor speed certainly has been known to affect which of 
several sharing processors will next get access to a shared DASD volume using 
the RESERVE/RELEASE hardware function.

Bill Fairchild
Programmer
Rocket Software
408 Chamberlain Park Lane • Franklin, TN 37069-2526 • USA
t: +1.617.614.4503 •  e: bfairch...@rocketsoftware.com • w: 
www.rocketsoftware.com




Re: Tasks ENQ'ing exclusive on resource not getting control in FIFO order

2013-01-17 Thread Jim Mulder
GRS resource contention is intended to be processed FIFO, regardless
of Ring mode vs. Star mode.

Jim Mulder   z/OS System Test   IBM Corp.  Poughkeepsie,  NY

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Tasks ENQ'ing exclusive on resource not getting control in FIFO order

2013-01-17 Thread Robert A. Rosenberg
Have you checked to see if the system that ends up winning the ENQ is 
the same as the one which was running the job that just ended and did 
the DEQ? It seems to me that the DEQing system has the best chance of 
winning any race condition since it is the first to know of the DEQ 
and thus the first to be able to grab the ENQ lock.
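
The race suggested above can be sketched as a small simulation. It rests on loud assumptions: a replacement job is submitted on the same system as the job that just ended, and that replacement is already waiting when the grant decision is made. The function `simulate` and all job and system names are hypothetical; nothing here describes real GRS processing.

```python
from collections import deque


def simulate(initial, rounds, local_advantage):
    """Return the (job, system) pairs in the order they obtained the resource.

    initial: list of (job, system) in ENQ order; the first entry owns the
             resource at the start.
    rounds:  number of DEQ events to simulate.
    local_advantage: if True, a waiter on the DEQing system wins the race
             (the hypothesis); if False, the oldest waiter wins (FIFO).
    """
    queue = deque(initial)          # waiters in global ENQ order
    owner = queue.popleft()
    history = [owner]
    next_id = len(initial) + 1
    for _ in range(rounds):
        _, system = owner
        # Assumption: a replacement job is submitted on the same system
        # and is already queued when the grant decision is made.
        queue.append((f"JOB{next_id}", system))
        next_id += 1
        if local_advantage:
            local = [w for w in queue if w[1] == system]
            winner = local[0] if local else queue[0]
        else:
            winner = queue[0]
        queue.remove(winner)
        owner = winner
        history.append(owner)
    return history


initial = [("JOB1", "SYSA"), ("JOB2", "SYSB"),
           ("JOB3", "SYSC"), ("JOB4", "SYSD")]

print([job for job, _ in simulate(initial, 3, local_advantage=True)])
print([job for job, _ in simulate(initial, 3, local_advantage=False)])
```

With the local advantage enabled, the resource never leaves SYSA in this model and JOB2 through JOB4 wait forever, which resembles the reported symptom. With strict FIFO, JOB1 through JOB4 run in ENQ order.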







Re: Tasks ENQ'ing exclusive on resource not getting control in FIFO order

2013-01-17 Thread Hunkeler Peter (KIUP 4)
> Have you checked to see if the system that ends up winning the ENQ is 
> the same as the one which was running the job that just ended and did 
> the DEQ? It seems to me that the DEQing system has the best chance of 
> winning any race condition since it is the first to know of the DEQ 
> and thus the first to be able to grab the ENQ lock.

Yes, we saw symptoms like that, too, but if GRS contention resolution
is defined to be FIFO, then there is no race condition one system can
win.

I wanted to confirm I'm not missing something that changed in GRS
processing, before opening a PMR. Jim Mulder confirmed that it still
had to be FIFO. We now have to find out why we saw different behavior.
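
One way to document such behavior before opening a PMR is to compare request and grant times per job. The following is a minimal sketch, assuming the ENQ request time and the grant time of each job can be collected from some source; the record layout, the function `fifo_violations`, and the sample values are invented for illustration and are not taken from the incident.

```python
def fifo_violations(records):
    """Find pairs where a later requester got the resource before an earlier one.

    records: iterable of (job, enq_time, grant_time); grant_time is None
             for a job that is still waiting.
    Returns a list of (overtaken_job, overtaking_job) pairs.
    """
    recs = sorted(records, key=lambda r: r[1])  # order by ENQ request time
    violations = []
    for i, (job_a, enq_a, grant_a) in enumerate(recs):
        for job_b, enq_b, grant_b in recs[i + 1:]:
            if enq_b <= enq_a or grant_b is None:
                continue  # same request time, or B has not been granted yet
            if grant_a is None or grant_b < grant_a:
                violations.append((job_a, job_b))
    return violations


# Hypothetical sample: JOBA requested first but is still waiting.
sample = [
    ("JOBA", 100, None),
    ("JOBB", 110, 200),
    ("JOBC", 120, 150),
]
print(fifo_violations(sample))
# [('JOBA', 'JOBB'), ('JOBA', 'JOBC'), ('JOBB', 'JOBC')]
```

An empty result would mean the observed grants are consistent with FIFO; any pair listed names a job that was overtaken and the job that overtook it.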

--
Peter Hunkeler
