Re: RMF and disk activity questions

Pommier, Rex R. Sat, 14 Aug 2010 20:32:51 -0700

Ron,

I sure appreciate your help.  The thing I neglected to mention is that when the 
backup isn't running, the disk is idling.  We don't run 24/7 (yeah, lucky us), 
so when the backups are running, batch (and online) is pretty much done for the 
night.


I'll check out the monitor 2 options Monday when I get back to work.

Thanks.

Rex

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf Of 
Ron Hawkins
Sent: Saturday, August 14, 2010 8:07 PM
To: [email protected]
Subject: Re: RMF and disk activity questions

Rex,

Perhaps an oversimplification, but if you had an IO rate of 5/sec at 200ms
response time for 20 minutes, and then 500/sec and 2ms response time for 10
minutes the average response time for the whole 30 minutes would be just shy
of 6ms.

        Avgrspms        =((5*200*1200)+(500*2*600))/((5*1200)+(500*600))
                        =5.88ms
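
The same arithmetic as a small Python sketch, in case you want to plug in your own numbers. The function name and the (rate, response, duration) tuple layout are just illustrative assumptions, not anything RMF produces. The point is that the interval average is weighted by I/O count, so a short burst of fast I/O swamps a long stretch of slow I/O.

```python
def weighted_avg_response_ms(phases):
    """Average response time weighted by I/O count.

    phases: iterable of (io_per_sec, resp_ms, duration_sec) tuples
            (a hypothetical layout chosen for this example).
    """
    total_ios = sum(rate * secs for rate, _, secs in phases)
    total_resp_ms = sum(rate * resp * secs for rate, resp, secs in phases)
    # Guard against an interval with no I/O at all.
    return total_resp_ms / total_ios if total_ios else 0.0


# 20 minutes at 5 IO/sec and 200ms, then 10 minutes at 500 IO/sec and 2ms.
phases = [(5, 200, 1200), (500, 2, 600)]
avg = weighted_avg_response_ms(phases)
print(round(avg, 2))  # 5.88
```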

I'd suggest starting an RMF II Background session to monitor these volumes
at intervals of one minute or less, using the delta option.

If the vendor needs to see RMF data as hard fact then this is just as good
as a monitor 1 report from RMFPP. A 5 second interval would give you a lot of
granularity as to the behavior from when the clone process starts and ends,
and you can fold the five second data into larger intervals if required.
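
A rough sketch of what folding short-interval data into larger intervals could look like, assuming each sample has already been reduced to an (I/O count, average response in ms) pair. That pair layout and the function name are hypothetical, not an RMF record format. As above, the folded response time has to be weighted by I/O count rather than taken as a plain mean of the samples.

```python
def fold_intervals(samples, factor):
    """Combine every `factor` consecutive samples into one larger interval.

    samples: list of (io_count, avg_resp_ms) per short interval
             (a hypothetical layout chosen for this example).
    Returns a list of (io_count, avg_resp_ms) for the folded intervals.
    """
    folded = []
    for start in range(0, len(samples), factor):
        chunk = samples[start:start + factor]
        ios = sum(count for count, _ in chunk)
        # Weight each sample's response time by its I/O count.
        resp = sum(count * ms for count, ms in chunk) / ios if ios else 0.0
        folded.append((ios, resp))
    return folded


# Four 5-second samples: two slow and quiet, two fast and busy.
samples = [(25, 200.0), (25, 200.0), (2500, 2.0), (2500, 2.0)]
ten_second = fold_intervals(samples, 2)
twenty_second = fold_intervals(samples, 4)
print(ten_second)
print(twenty_second)
```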

Ron




> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf Of
> Pommier, Rex R.
> Sent: Saturday, August 14, 2010 2:10 PM
> To: [email protected]
> Subject: Re: [IBM-MAIN] RMF and disk activity questions
>
> Thanks, Ron.  I don't know how I missed that explanation for TIME but I did.
> Obviously I was looking in the wrong place.
>
> As far as the disk symptoms, I agree wholeheartedly as to the cause (sibling
> pend).  Convincing the vendor that's the problem and figuring out what to do
> about it are the next steps.  The vendor is trying to convince us that the
> problem is simply that a few of the source and target LUNs for the cloning are
> on the same physical spindles.  While I can agree that is part of the problem,
> the problem is bigger than that, in that it doesn't matter what disk I try to
> hit while the clone is running, I get the horrendous response times.
>
> I guess my question about the difference between Omegamon and RMF interval
> reporting was that they were SO far apart.  When I saw a backup consuming 20+
> minutes of the 30 minute interval with 200+ millisecond response times, I
> would have thought that RMF would have shown larger numbers than 5-10
> millisecond over the reporting period.
>
> Rex
>
> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf Of
> Ron Hawkins
> Sent: Saturday, August 14, 2010 1:27 PM
> To: [email protected]
> Subject: Re: RMF and disk activity questions
>
> Rex,
>
> The meaning of Time is explained quite nicely at
>
> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ERBZRA91/5.1.3?SHELF=ERBZBK91&DT=20091215152130
> If you're having a problem reading the fine manual, it says "The time the
> interval began, where hh is hours, mm is the minutes, and ss is seconds."
>
> I think you have answered the other questions for yourself - the 30 minute
> interval.
>
> Omegamon and RMF get their data from the same source, RMF, so the only
> difference is that with Omegamon real time displays the values you are
> looking at are more like RMF Monitor II with Delta on, than the average of
> response time over 30 minutes.
>
> The symptoms you describe are exactly what one would expect to see from
> classic Sibling Pend. The IO activity for the resynch is hosing your MF
> sequential Pre-fetch. You may have already had a problem before the Open
> System Cloning started running, and this has tipped it over the edge. Time
> for some classic disk tuning activity.
>
> Ron
>
> > -----Original Message-----
> > From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf Of
> > Pommier, Rex R.
> > Sent: Friday, August 13, 2010 2:36 PM
> > To: [email protected]
> > Subject: [IBM-MAIN] RMF and disk activity questions
> >
> > Hello list,
> >
> > I have a couple questions, one general question on RMF reporting and the other
> > on a specific DASD problem I'm having.
> >
> > First the general question.  On the monitor 1 post processing reports, what
> > exactly does the "TIME" field represent?  I know this may sound like a silly
> > question, but here's where I'm coming from.  If I have RMF set to sync with
> > SMF at a 15 minute interval, RMF cuts records at the end of the 15 minute
> > interval for the preceding 15 minutes and passes this record to SMF for
> > safekeeping.  Along comes the post processor, which I will run for a 2 hour
> > interval, RTOD(1400,1600) for example.  The first interval report from the
> > post processor shows a TIME field of 14.00.00.  Does this report segment
> > represent the 15 minute interval beginning at 14:00 or the one ending at 14:00
> > (ie, when the RMF record was cut and handed off to SMF)?
> >
> > The preceding question came about because I'm trying to diagnose a problem
> > that we're having with our disk array performing cloning on the non-mainframe
> > side of the array that is killing performance.  I'm seeing wildly different
> > pictures coming out of Omegamon and RMF.  Here's the scenario.  The Oracle
> > DBA kicks off a clone of a database that is sitting on the same physical
> > spindles as my Z data.  At the same time I'm running a DFDSS job that dumps 40
> > or so 3390 mod 9 volumes to 3592 tape.  Under normal conditions (without the
> > Oracle junk happening), this dump job runs in about 2 hours, with each volume
> > taking 2-4 minutes on average.  When the cloning is taking place, the volume
> > dumps are taking 15-20 minutes per volume.  Watching Omegamon, it is recording
> > 200-250 millisecond response times on the disk volume being backed up, with
> > most of that time being spent in DISCONNECT.  Our disk vendor is aware of this
> > and is attempting to figure out how to fix it.  The problem I'm having is that
> > RMF monitor 1 DASD reports are showing average response times in the 5-10
> > millisecond range and I'm not seeing these huge response times that Omegamon
> > is showing.  Caveat, during this time frame, RMF was recording at 30 minute
> > intervals.
> >
> > Can somebody explain why I would be seeing good response times in RMF, even
> > though Omegamon and the clock are both showing that the disk response time is
> > in the tank?  I would have thought that even with the 30 minute interval, the
> > fact that a backup was running on the affected volume for 15 minutes of that
> > interval, with nothing else running on the volume while it wasn't getting
> > backed up, it would show poor response times.
> >
> > Thanks.
> >
> > Rex


