Re: Are there tasks that don't play by WLM's rules

Vernooy, C.P. - SPLXM Thu, 18 Oct 2007 00:41:49 -0700

"Barbara Nitz" <[EMAIL PROTECTED]> wrote in message
news:<[EMAIL PROTECTED]>...
> Mark and Shane,
> 
> >Do you have enough engines and capacity where bursts won't hurt other
> >workloads? If so, I might be inclined to run this work in SYSSTC (I
> >assume discretionary won't work for this since you mentioned "on
> >time") and then WLM doesn't have to manage it.
> 
> thanks for that suggestion. It was vetoed vehemently by the 'head WLM
> developer' when I suggested it. (And given that the address spaces have
> a life time from less than a second to forever, there is also no real
> way to give them a response time goal based on how long they live.) But
> now that the two of you suggest this, too, I might broach the idea again
> with my colleague (and ignore the developer). Should things get really
> nasty, we may want to do that.
> 
> Let me give you some more information on this:
> 1. The lpar is not the lpar with the highest weight on that box. There
> is another sysplex on it that has a significantly higher weight. I
> mention this because our workload is basically driven by what the stock
> exchange does (completely outside our control), and in our experience
> the weights (and number of processors) determine how fast we are once
> business picks up significantly. During the stock crashes early in the
> year we ended up taking down and deactivating low importance lpars in
> advance just to handle the spikes. So I wouldn't really call that
> 'enough capacity' to spare, given the prohibitive software prices.
> 2. Varying the spare processor online to that lpar (giving it 3 cps)
> doesn't even show up in the PI. The reason, according to WLM, is that
> the delays are not caused by cpu, but that's not what RMF tells me.
> RMF MIII says that about 60% is processor delay, the rest falls into
> 'other'.
> 3. That application has firm ties in MQSeries, so it shouldn't really
> run higher than that. (I think.) MQS is in STCHIGH (Imp1, exvel 50%),
> sysplex wide service class.
> 4. When the first complaints came, I started looking at STCHIGH, which
> incidentally was also the SC 'the broker' (these 65 asids on average)
> runs in. 'The broker' only runs on that lpar, together with its own MQS
> and other infrastructure. The only other stuff there is a websphere
> application server that does not have that much load and runs in a
> lower importance SC.
> 5. First thing we did was put 'the brokers' into their own SC (Imp1,
> exvel 40%). And that's when the problem really showed up with PIs
> generally in the double digits, sometimes even more, and never near 1.
> (I even did a new SAS report to show all intervals with a PI>1 for all
> service classes.)
> -
> The ETR is still open, and I had sent WLM SMF99 records. The 'head WLM
> developer' promised to tell us if we can live with this situation when
> the box gets full (and the lpar gets full). Interestingly enough, the
> last update in the ETR says "<he> has sent his findings to <you>. Please
> contact <him> to discuss the situation." Now the question is: Why aren't
> *we* informed what 'the situation' is? This really has my suspicions up,
> especially as we still have a complaint open with that ETR.
> -
> With this information, any more ideas what to do?
> Best regards, Barbara
> --


Barbara,

I have studied the entire thread carefully for the last 30 minutes and I
am a little lost. What goes through my mind is:
- Is the WLM PI number the real problem, or is the performance actually
bad? A bad report about a well-running application is not the end of the
world, but I suppose you really do have performance problems with the
application.
- If they all pop up at about the same time, how would you expect a
2 CP LPAR to handle 65 tasks at (about) the same moment? WLM or no WLM,
I think you have a real problem here, even if the LPAR weight were high
enough.
- I thought about SYSSTC too, but your hesitation in relation to MQ
seems valid. Something else that might help: did you define them as
CPU CRITICAL? This prevents the DP of the address spaces from falling
below that of your other, less important work, and it also eliminates
the delay WLM needs to raise the DP to the necessary value.

ExVel goals have always been a problem for me too, even for simple batch.
My conclusion is that an exvel=40 only means that everyone gets 40% of
what they ask for. Big yelling jobs get 40% of what they call for;
well-behaved silent jobs also get 40% of what they politely request. This
makes it difficult to give both jobs a fair share of CPU resources.
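
To make the arithmetic concrete, here is a minimal sketch. It uses the
usual definitions: execution velocity is using samples divided by using
plus delay samples, times 100, and the PI for a velocity goal is the goal
divided by the achieved velocity. The function names and the burst
scenario (all 65 address spaces CPU-ready at the same moment on 2 CPs,
no I/O or other delays counted, no other work on the LPAR) are my own
assumptions for illustration, not measurements from your system.

```python
def execution_velocity(using_samples, delay_samples):
    """Execution velocity in percent: using / (using + delay) * 100."""
    total = using_samples + delay_samples
    if total == 0:
        raise ValueError("no using or delay samples in this interval")
    return 100.0 * using_samples / total


def velocity_pi(goal_velocity, achieved_velocity):
    """PI for a velocity goal: goal / achieved. A PI above 1 means the goal is missed."""
    if achieved_velocity <= 0:
        raise ValueError("achieved velocity must be positive")
    return goal_velocity / achieved_velocity


# Hypothetical burst: 65 address spaces all want CPU at once on 2 CPs.
ready_tasks = 65
cps = 2

# At any sample, at most one task per CP is "using"; the rest are delayed.
using = min(cps, ready_tasks)
delay = ready_tasks - using

v = execution_velocity(using, delay)
pi = velocity_pi(40, v)  # 40 = the exvel goal of the brokers' service class
print(f"velocity {v:.1f}%, PI {pi:.1f}")  # velocity 3.1%, PI 13.0
```

Under these simplified assumptions a PI in the double digits is what you
would expect during a burst, whatever WLM does with the dispatching
priorities. Real intervals mix bursts with quiet periods and other delay
types, so the reported numbers will differ.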

Regards,
Kees.

