When I see an imbalance like this:

PROC 0000-082% CP PROC 0006-057% CP
PROC 0001-059% CP PROC 0002-059% CP
PROC 0003-060% CP PROC 0004-059% CP
PROC 0005-059% CP

I immediately think that there may be problems with "Master Processor Only" work. IBM has been making great progress in getting rid of master-processor-only work. Unfortunately, that means the remaining items vary from release to release, and only IBM (and maybe Barton Robinson) really knows what the current cases are.

There is an ESAMAP report (I'm not at my PC, so I cannot check its name) that lists values for multiple (sometimes esoteric) variables. Comparing the ESAMAP reports for a good interval and a bad interval might turn up something interesting, but I would probably have to send the reports to Barton to get any more information.

However, 82% does not look that bad, so the problem may have nothing to do with Master Processor work.

Both ESAMON/ESAMAP and the Performance Toolkit have the User State Sampling numbers, and that is where I would look to see what the userid with the problem is waiting on. In ESAMON this is the ESAXACT screen. I cannot tell you the equivalent for PerfKit, but I'm sure someone can. This is really your first tool for user performance problems!

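If you want to watch for this kind of imbalance over time rather than eyeballing the display, here is a minimal Python sketch, not anything taken from ESAMON or PerfKit. It assumes the "PROC nnnn-ppp% CP" layout shown in the display quoted in this thread. The function names and the 15-point threshold are my own arbitrary choices for illustration.

```python
import re
from statistics import median

# Matches entries such as "PROC 0000-082% CP". The layout is assumed from the
# display quoted in this thread and may differ on other releases.
PROC_RE = re.compile(r"PROC\s+(\d{4})-(\d{3})%")


def proc_utilization(text):
    """Return {processor number: percent busy} for each PROC entry in text."""
    return {int(num): int(pct) for num, pct in PROC_RE.findall(text)}


def outliers(util, threshold=15):
    """Return {processor: points above the median} where the excess exceeds threshold.

    The threshold is a hypothetical cut-off, not a recommended value.
    """
    mid = median(util.values())
    return {cpu: pct - mid for cpu, pct in util.items() if pct - mid > threshold}


sample = """\
PROC 0000-082% CP PROC 0006-057% CP
PROC 0001-059% CP PROC 0002-059% CP
PROC 0003-060% CP PROC 0004-059% CP
PROC 0005-059% CP
"""

util = proc_utilization(sample)
print(outliers(util))
```

On the numbers from the original post this flags only processor 0, which sits 23 points above the median of 59%. It says nothing about why; for that you still need the user state sampling data.
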
Alan Ackerman
Alan (dot) Ackerman (at) Bank of America (dot) com

On Tue, 5 Oct 2010 16:22:47 +0200, Colin Allinson <cgallin...@amadeus.com> wrote:

>We have a very large VM system that runs in a partition on a Z9 with 7
>dedicated processors and 24G/4g stor/xstor
>
>We can, at peak times, get up to the upper 9x% level but this week we have
>been running in the 60-75% range
>
>Earlier this afternoon I had a user complaining about performance and I
>could see the following (notice PROC 0000) :-
>
>AVGPROC-062% 07
>XSTORE-000000/SEC MIGRATE-0000/SEC
>MDC READS-000001/SEC WRITES-000000/SEC HIT RATIO-100%
>PAGING-1/SEC STEAL-000%
>Q0-00004(00000) DORMANT-00253
>Q1-00018(00000) E1-00000(00000)
>Q2-00006(00000) EXPAN-002 E2-00000(00000)
>Q3-00048(00000) EXPAN-002 E3-00000(00000)
>PROC 0000-082% CP PROC 0006-057% CP
>PROC 0001-059% CP PROC 0002-059% CP
>PROC 0003-060% CP PROC 0004-059% CP
>PROC 0005-059% CP
>LIMITED-00000
>Ready; T=0.01/0.01 13:23:56
>
>I have seen this behaviour before when :-
>
>a) The system is very busy and we are approaching 100% keeping the
>scheduler over employed.
>b) We are getting streaming CP messages (i.e. device retries).
>c) Huge amounts of data are being spooled from some userid
>
>I have checked and, as far as I can tell, none of these were relevant at
>the time.
>
>Can anyone suggest any other likely conditions that would cause such a
>heavy overload on processor 0?
>
>btw: The situation improved (with an even balance of processor usage being
>restored) over a period of about 45 minutes.
>
>
>Colin Allinson
>VM Systems Support
>Amadeus Data Processing GmbH
>