Unless it uses dedicated CPUs, any VM system running in an LPAR can
have the same 'involuntary wait' problem as a VM running second level.
The point, in both cases, is that the system doesn't have 100% of a CPU
available but only the part that LPAR management (or the higher CP level)
lets it use. As Kris wrote, you can calculate relative CPU utilization by
dividing the CPU time actually used
- either by elapsed time, yielding an absolute CPU utilization percentage
that depends only on the actual work done and the power of the CPU.
This value can be reliably reproduced and is suitable for capacity
planning and trend analysis. On the other hand, it will not tell you how
close to saturation the VM runs (when running with low priority and/or
capped, a system using 10 percent CPU may actually be using all the CPU
power it is allowed to have).
- or by the total of *voluntary* wait time and actual CPU time, yielding a
'logical' CPU utilization percentage that tells you right away how close
to saturation the VM is running. This is nice, but since the value also
depends on the priority and capping for the logical partition (or the virtual
machine), and on the amount of contention for CPUs by other LPARs or
virtual machines, the value does not reflect the actual work done by the
VM system and is meaningless for capacity planning purposes.
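
The two ratios described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function names
and the sample figures are hypothetical, and the code does not reproduce
the actual IND LOAD logic.

```python
def absolute_utilization(cpu_seconds, elapsed_seconds):
    """CPU time divided by elapsed time, as a percentage."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 100.0 * cpu_seconds / elapsed_seconds


def logical_utilization(cpu_seconds, voluntary_wait_seconds):
    """CPU time divided by (CPU time + voluntary wait time), as a percentage."""
    denominator = cpu_seconds + voluntary_wait_seconds
    if denominator <= 0:
        raise ValueError("no CPU time or voluntary wait time in the interval")
    return 100.0 * cpu_seconds / denominator


# Hypothetical 60-second interval: the system used 30 seconds of CPU, never
# waited voluntarily, and spent the remaining 30 seconds in involuntary wait
# (held back by LPAR management or by the higher CP level).
elapsed = 60.0
cpu = 30.0
voluntary_wait = 0.0

print(absolute_utilization(cpu, elapsed))        # 50.0
print(logical_utilization(cpu, voluntary_wait))  # 100.0
```

With these made-up figures the same interval reads as 50% absolute and
100% logical, which is the kind of gap discussed in this thread.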
That said, my recollection is that IND LOAD logic (which initially really
did show logical utilization) was changed quite some time ago to now
show absolute utilization based on elapsed time, and so display the same
numbers as most performance monitors. But I may be wrong: check the
IND LOAD command logic; the difference between absolute and logical
load would really be a nice explanation for what you see!
Eginhard Jaeger
Franz Josef Pohlen wrote:
The system is a z9 BC with zVM 5.3 in LPAR 1 as the only running system on
the machine. No second level, just a normal VM/VSE system, nothing
special. My colleague told me he saw this behaviour at another site
some time ago; the cause was a defective channel for which VM kept
attempting recovery. Maybe one of the converter channels or an OSA has a
problem. We will try configuring one channel after the other offline and
see what happens.
kind regards
Franz Josef Pohlen
Kris Buelens wrote:
My guess is that this VM system is second level (probably under LPAR, but
a host VM would yield the same).
The involuntary CPU waits imposed by the first level (LPAR or CP) are not
known to IND LOAD; they are counted as busy. If the first-level host decides
your zVM gets e.g. 50% of the entire system, and this zVM wants to use 50%
or more, IND LOAD will display 100% busy because the zVM never places itself
in voluntary wait. Logically your zVM system is indeed 100% busy (it
cannot execute more instructions than it currently does); physically it
gets 50%. And if you sum up what is used within zVM, you will indeed arrive
at 50%.
I don't know the HMC monitor enough to know if you can see logical and
physical CPU busy for an LPAR.
When the entire system is not heavily loaded and capping isn't turned
on, the difference between logical and physical CPU usage is small.
Maybe this system got a bit more load, so that the involuntary waits
are no longer negligible.
2008/11/24 [EMAIL PROTECTED]:
There is no performance monitor. I wrote a REXX procedure
which issues an IND USER for each machine every minute and
calculates the CPU consumption of that machine from the
difference to the value of the minute before. IBM in Germany (in
the person of Hans Joachim Ebert, by now a retired VSE/CICS expert
from IBM Munich) compared the values from my procedure with a
real monitor, and they were nearly the same. The result of this
procedure is that the summed CPU consumption of the measured users
is not more than 50%.
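
The delta approach described above can be sketched as follows. This is a
Python illustration, not the REXX procedure itself: the guest names and
CPU figures are hypothetical, and obtaining the cumulative CPU time per
machine (for example from IND USER output) is assumed to happen elsewhere
and is not shown.

```python
def cpu_percent_per_guest(previous, current, interval_seconds):
    """Per-guest CPU usage between two samples, as a percentage of one CPU.

    previous/current map a guest name to its cumulative CPU seconds.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    usage = {}
    for guest, cpu_now in current.items():
        if guest not in previous:
            continue  # guest appeared after the first sample: no delta yet
        delta = cpu_now - previous[guest]
        if delta < 0:
            continue  # counter went backwards, e.g. guest logged off and on
        usage[guest] = 100.0 * delta / interval_seconds
    return usage


# Hypothetical cumulative CPU seconds, sampled 60 seconds apart.
minute_before = {"VSEPROD": 1000.0, "VSETEST": 200.0, "TCPIP": 50.0}
this_minute = {"VSEPROD": 1024.0, "VSETEST": 203.0, "TCPIP": 51.5}

usage = cpu_percent_per_guest(minute_before, this_minute, 60.0)
print(usage)                # {'VSEPROD': 40.0, 'VSETEST': 5.0, 'TCPIP': 2.5}
print(sum(usage.values()))  # 47.5
```

The sum is a percentage of elapsed time, so it corresponds to absolute
rather than logical utilization.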
The main purpose for this procedure was to get an estimate of the
cpu usage if customers have no monitor. So unfortunately I cannot
look into details.
kind regards
Franz Josef Pohlen
Rich Smrcina wrote:
Do you have a performance monitor that can help you narrow
down if a specific VSE machine is the culprit?
[EMAIL PROTECTED] wrote:
Hello listers,
I have installed zVSE 4.1.1 and zVM 5.3 (Level 0703) at a
customer where we replaced VM/VTAM with an additional 4th
zVSE machine running only VTAM for cross-domain
between three zVSE4 systems. Before the change of the
environment the customer was on zVSE 4.1.0.
The summed CPU consumption of the running guests is between
40 and 50%, but IND LOAD and the activity display on the HMC
permanently show between 85 and 100% CPU usage. This of course
creates massive performance problems.
I have no clue what process(es) may create this additional
load. We have restarted zVM but it didn't help. If you
put e.g. a batch job on the zVSE, the CPU load of the zVSE
rises as expected to 30-50% and the VM load as
indicated by IND LOAD is between 85 and 100%. Except for
the production VSE, the load of the other machines,
including VM tcpip, FTPSERVE and any others, is almost
negligible (in sum below 10%).
There are no error msgs from hardware which could explain
such a behaviour.
For documentation I have attached the user direct entries
of the vse systems, the system config, the autolog profile
and the cms profile of the VSE systems.
Has anybody experienced such behaviour of zVM, or has an
idea where to look for the error? I know that the IND LOAD
CPU value is averaged over a period of time, but I have
never seen a difference greater than 5 - 10% to the summed
CPU% of the running systems.
--
Kris Buelens,
IBM Belgium, VM customer support