On 01/23/2012 01:21 PM, Bill Fairchild wrote:
In <45e5f2f45d7878458ee5ca679697335502e25...@usdaexch01.kbm1.loc>, on
01/23/2012 at 09:08 AM, "Staller, Allan" <allan.stal...@kbmg.com> said:

From the viewpoint of the Operating System, you now
have 3 times as much data behind the actuator on Mod-9's as on Mod-3's.
If the Operating System *thinks* the device is busy, the I/O is queued
off the UCB and never even tried until it comes to the head of the queue.

You MIGHT have up to three times as much data behind the actuator.  That 
depends on how fully loaded the three mod-3s to be merged onto the same 
single mod-9 are; i.e., it depends on which three mod-3s you choose to 
merge together.

If all data sets on all volumes are equally and randomly accessed, then you 
will have three times as much requirement to access the new mod-9 as any of the 
three mod-3s that were merged had.  However, most data centers have highly 
skewed access patterns: 80% of the actuators might have only 20% of the total 
I/O workload.  That means your volumes are almost certainly NOT equally and 
randomly accessed.  You have some volumes that are almost never accessed and 
some others that are accessed all the time.
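
As a rough back-of-the-envelope sketch of why the choice of which mod-3s to 
merge matters, the Python below simply adds the busy percentages of three 
candidate mod-3s to estimate the busy percentage of the merged mod-9.  The 
volume serials and percentages are invented, and the additive model ignores 
cache effects and any change in access pattern after the merge.

# Sketch only, not an IBM tool: estimate merged mod-9 busy % by adding the
# busy % of the three mod-3s chosen for the merge.  Numbers are made up.
mod3_busy = {"VOL001": 2.0, "VOL002": 31.0, "VOL003": 1.5,
             "VOL004": 34.0, "VOL005": 0.5, "VOL006": 33.0}

def merged_busy(volsers, busy=mod3_busy):
    """Additive estimate of the merged volume's busy %."""
    return sum(busy[v] for v in volsers)

# A skewed shop: merging the three hot volumes is very different from
# merging one hot volume with two nearly idle ones.
print(merged_busy(["VOL002", "VOL004", "VOL006"]))   # 98.0  - trouble
print(merged_busy(["VOL004", "VOL001", "VOL005"]))   # 36.5  - fine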

When z/OS starts an I/O on DASD device xxxx, z/OS turns on a flag bit in the 
UCB for that device indicating that this particular z/OS image has started 
an I/O on that device.  But if the device is shared, then another z/OS image 
may have already started an I/O on the same device, turned that same device's 
busy bit on in its own copy of the UCB for the device (which might be device 
yyyy on the other image), and not informed any of the other sharing z/OS images 
that it is now doing I/O on that shared device.  So when image A tests its 
private copy of the flag bit and finds it off, that does not necessarily mean 
that the device is unbusy.  Image A doesn't care, however; it starts the I/O 
and turns the bit on.

If the shared control unit attached to this device is not an IBM 2105 SHARK 
(vintage ca. 2000), a plug-compatible equivalent, or some successor technology, 
then image A's I/O will not really be started until image B's already-started 
I/O ends.  This will show up on image A as a spike in device pending time, not 
in IOSQ time.  The 2105 and newer technology can let multiple I/O requests from 
multiple sharing systems run simultaneously against the same device as long as 
there is no conflict between any of the simultaneous I/Os involving both reads 
and writes for the same range of tracks.
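
The following toy model (plain Python, emphatically not IOS internals) just 
illustrates where the wait shows up: if this image's own UCB busy bit is on, 
the request accumulates IOSQ time; if the bit is off but another sharing image 
already has an I/O running and the control unit cannot overlap them, the 
request accumulates device pending time.

# Toy illustration of the bookkeeping described above.  Each image sees only
# its own copy of the UCB busy bit; the control unit sees all images.
class UCB:
    def __init__(self):
        self.busy = False          # this image's private busy bit

def classify_wait(local_ucb, other_image_active, can_parallel=False):
    """Return where a new request on this image would accumulate wait time."""
    if local_ucb.busy:
        return "IOSQ"              # queued off the UCB, never even started
    if other_image_active and not can_parallel:
        return "device pending"    # started, but held off by the other image
    return "starts immediately"

ucb_a = UCB()
print(classify_wait(ucb_a, other_image_active=True))                    # device pending
print(classify_wait(ucb_a, other_image_active=True, can_parallel=True)) # starts immediately
ucb_a.busy = True
print(classify_wait(ucb_a, other_image_active=False))                   # IOSQ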

The only way to know what will probably happen is to do I/O measurement on your 
current mod-3 workload.  If you don't see much IOSQ time now, then you will see 
"not much" multiplied by three after merging.  How much counts as "not much" or 
as negligible is up to you to decide.  You might also get an idea of how to merge 
volumes together based on their individual IOSQ times; e.g., merge the one with 
the highest IOSQ time now with the two mod-3s that now have the lowest average 
IOSQ times, as sketched below.  After merging them, measure IOSQ time again.  
Only if you have "excessive" IOSQ time, where how much is excessive is up to you 
to decide, would you need to consider using PAV devices.
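
Here is a minimal sketch of that pairing heuristic, assuming you have per-volume 
IOSQ times from RMF or a similar monitor (the volsers and millisecond values 
below are made up): sort the mod-3s by current IOSQ time and group each 
remaining hottest volume with the two coolest remaining ones.

# Sketch of the pairing heuristic described above; input values are invented.
iosq = {"VOLA": 4.1, "VOLB": 0.1, "VOLC": 2.7, "VOLD": 0.2,
        "VOLE": 0.0, "VOLF": 1.9}

def plan_merges(iosq_ms):
    """Return groups of three mod-3 volsers, hottest matched with coolest."""
    ordered = sorted(iosq_ms, key=iosq_ms.get, reverse=True)
    groups = []
    while len(ordered) >= 3:
        hot = ordered.pop(0)                   # highest remaining IOSQ
        cool = [ordered.pop(), ordered.pop()]  # two lowest remaining
        groups.append([hot] + cool)
    return groups

print(plan_merges(iosq))   # [['VOLA', 'VOLE', 'VOLB'], ['VOLC', 'VOLD', 'VOLF']]

Measure again after the merge; the plan is only a starting point.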

Currently z/OS's I/O Supervisor has no knowledge of the real RAID architecture 
backing the virtual SLED, so many of the classic performance- and space-related 
bottlenecks can theoretically still occur.

Bill Fairchild

Note that the original question from Dennis McCarthy (Jan 20) was not about an arbitrary 3390-3 to 3390-9 migration but specifically about moving a VSAM file occupying 27 3390-3's to 10 3390-9's, so except for the last volume we ARE definitely talking about three times the data behind a logical volume; the usage and activity rate of the dataset, however, were not specified.

IOSQ time and the related response-time elongation are highly non-linear as device utilization approaches 100%.  You could see the negligible IOSQ time on each of three 3390-3's running at 34% utilization become astronomical if you merge their data onto a single 3390-9 without PAV and try to run a load that can't be satisfied even at 100% device busy.  Given PAVs, and assuming there is enough cache, enough separate physical drives, and enough internal bandwidth on the EMC box backing the logical volume, you can in effect exceed 100% logical volume busy (have the average number of active I/Os to the volume exceed 1.00) and still get acceptable response.
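
To put a rough number on that non-linearity, here is a simple M/M/1-style 
calculation (an illustrative assumption, not a measurement of any real 
configuration): the average queueing delay relative to service time is 
utilization/(1 - utilization), which is small at 34% busy and unbounded once 
the offered load exceeds what one device can serve.

# Rough M/M/1 illustration: three mod-3s each at 34% busy merged onto one
# mod-9 without PAV would offer 102% of one device's capacity.
def queue_wait_factor(utilization):
    """Average queueing delay relative to service time, M/M/1 model."""
    if utilization >= 1.0:
        return float("inf")        # offered load cannot be satisfied at all
    return utilization / (1.0 - utilization)

for u in (0.34, 0.68, 0.90, 0.95, 1.02):
    print(f"{u:4.0%} busy -> wait = {queue_wait_factor(u):.2f} x service time")

With PAVs the logical volume behaves more like a multi-server queue, which is 
why the average number of active I/Os can exceed 1.00 while response stays 
acceptable, provided the box behind it has the resources.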

--
Joel C. Ewing,    Bentonville, AR       jcew...@acm.org 
