Hi,

We're experiencing strange behaviour in our paging subsystem and are having great difficulty solving the problem; we hope some of you can help us.

Our VM system pages extremely slowly: a Linux guest that has been inactive for a long time is paged in from DASD by z/VM at approximately 1 MB/sec in an otherwise almost idle system. The overall system page rate is 400-800 pages/second, half of it page reads and half page writes. We have 12 paging disks (3390-9) distributed over 2 LCUs and attached with 6 FICON channels. According to our performance readings, neither the disks nor the channels are in any way a bottleneck.
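To put the two figures above on the same scale, here is a minimal sketch that converts between MB/sec and pages/sec. It assumes 4 KB pages and binary megabytes; the function names are made up for illustration and are not part of any z/VM tool.

```python
# Assumption: one page is 4096 bytes and 1 MB is 1024 * 1024 bytes.
PAGE_SIZE_BYTES = 4096
BYTES_PER_MB = 1024 * 1024


def mb_per_sec_to_pages_per_sec(mb_per_sec: float) -> float:
    """Convert a transfer rate in MB/sec to a page rate in pages/sec."""
    return mb_per_sec * BYTES_PER_MB / PAGE_SIZE_BYTES


def pages_per_sec_to_mb_per_sec(pages_per_sec: float) -> float:
    """Convert a page rate in pages/sec to a transfer rate in MB/sec."""
    return pages_per_sec * PAGE_SIZE_BYTES / BYTES_PER_MB


# The single-guest page-in rate quoted above.
print(f"1 MB/sec is about {mb_per_sec_to_pages_per_sec(1.0):.0f} pages/sec")

# The overall system page rate quoted above.
for rate in (400, 800):
    print(f"{rate} pages/sec is about {pages_per_sec_to_mb_per_sec(rate):.2f} MB/sec")
```

Under these assumptions the 1 MB/sec page-in corresponds to roughly 256 pages/sec, and the 400-800 pages/sec system rate to roughly 1.6-3.1 MB/sec.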
At the same time, main storage is poorly utilized: Performance Toolkit reports a storage utilization of 65-75%, and it reports strange values for "Total real storage" and "Total available" storage, namely 0kB.

What can be wrong?

Kind regards,
Klaus Johansen

Additional information:

We have a z/VM system with approximately 25 zPenguins in an LPAR with 2 IFLs, 12GB main storage and 3GB expanded storage. We have overcommitted storage 2:1.

We are fully aware that the storage size of a Linux guest should be kept as small as possible, because Linux uses all available memory for file cache etc. We have certainly seen z/VM page file cache out onto the paging volumes, which makes allocation of memory slow. We have not considered this a problem, since the guests are rarely "active" at the same time (the memory accessed by the active Linux guests should easily fit in main storage). But we didn't know that VM would page at only 1MB/sec.

The "User Wait States" screen in Performance Toolkit confirms that the guests are in page wait (97-100%) during these "1MB/sec page-ins".

The system is actually capable of paging faster: when all Linux guests are paging at the same time (we haven't had time to reschedule the daily cron job for log rotation), we see a momentary page rate of 7000 pages/sec. That is much better, but nothing extraordinary.

We have considered whether our SRM LDUBUF, SRM STOBUF etc. need tuning, but according to the "Performance book" these values mostly affect the scheduler and dispatcher in relation to the Q1-Q3 queues, and there seems to be no problem entering the dispatch list (furthermore, enabling "quick dispatch" for a slow paging guest has no effect).

Storage Utilization
Interval 15:39:59-15:40:59, on 2007/12/21 (CURRENT interval, select average for mean data)

Main storage utilization:                  XSTORE utilization:
Total real storage                 0kB     Total available                    0kB
Total available                    0kB     Att. to virt. machines             0kB
Offline storage frames       .......kB     Size of CP partition           3'072MB
SYSGEN storage size          .......kB     CP XSTORE utilization              99%
CP resident nucleus          .......kB     Low threshold for migr.        1'680kB
Shared storage               117'377MB     XSTORE allocation rate             0/s
FREE storage pages           .......kB     Average age of XSTORE blks       2855s
FREE stor. subpools           52'976kB     Average age at migration          ...s
Subpool stor. utilization           0%
Total DPA size                12'132MB     MDCACHE utilization:
Locked pages                  37'548kB     Min. size in XSTORE                0kB
Trace table                  .......kB     Max. size in XSTORE            3'072MB
Pageable                      12'095MB     Ideal size in XSTORE          77'692kB
Storage utilization                86%     Act. size in XSTORE           88'788kB
Tasks waiting for a frame            3     Bias for XSTORE                   1.00
Tasks waiting for a page           4/s     Min. size in main stor.            0kB
                                           Max. size in main stor.       12'288MB
V=R area:                                  Ideal size in main stor.       1'265MB
Size defined                     ...kB     Act. size in main stor.      234'228kB
FREE storage                     ...kB     Bias for main stor.               1.00
V=R recovery area in use          ...%     MDCACHE limit / user          32'544kB
V=R user                      ........     Users with MDCACHE inserts           2
                                           MDISK cache read rate              2/s
Paging / spooling activity:                MDISK cache write rate         ...../s
Page moves <2GB for trans.         0/s     MDISK cache read hit rate          2/s
Fast path page-in rate            89/s     MDISK cache read hit ratio         76%
Long path page-in rate             1/s
Long path page-out rate            0/s     VDISKs:
Page read rate                   179/s     System limit (blocks)           Unlim.
Page write rate                    0/s     User limit (blocks)             Unlim.
Page read blocking factor            6     Main store page frames           11312
Page write blocking factor         ...     Expanded stor. pages               302
Migrate-out blocking factor        ...     Pages on DASD                   426964
Paging SSCH rate                  20/s
SPOOL read rate                    0/s
SPOOL write rate                   0/s

CP owned disks: 1 of the 12 paging disks during a "1MB/sec page-in"
Interval 15:37:04-15:37:05, on 2007/12/21

Detailed Analysis for Device 9F01 ( CP OWNED )
Device type : 3390-9     Function pend.:   .2ms     Device busy   :  12%
VOLSER      : VSPPG7     Disconnected  :   .0ms     I/O contention:   0%
Nr. of LINKs:      0     Connected     :  3.9ms     Reserved      :   0%
Last SEEK   :   1525     Service time  :  4.1ms     SENSE SSCH    :    0
SSCH rate/s :   30.0     Response time :  4.1ms     Recovery SSCH :    0
Avoided/s   :     .0     CU queue time :   .0ms     Throttle del/s:  ...
Status: ONLINE

System Page/Spool I/O Details
Page reads/s  : 10.0     Total pages/s   : 87.5     PG serv. time: 1.7ms
Page writes/s : 77.5     I/Os avoided/s  :   .0     PG resp. time: 1.7ms
Spool reads/s :   .0     System I/Os /s  : 87.5     PG queue len.:  1.07
Spool writes/s:   .0     User interfer./s:   .0     Avail. bsize :     1

Path(s) to device 9F01:  61  65  78  79  6C  6F
Channel path status   :  ON  ON  ON  ON  ON  ON

Device            Overall CU-Cache Performance          Split
DIR ADDR VOLSER   IO/S %READ %RDHIT %WRHIT ICL/S BYP/S   IO/S %READ %RDHIT
 16 9F01 VSPPG7    5.3    15     25    100    .0    .0    5.3    15     25 (N)
                                                           .0     0      0 (S)
                                                           .0     0      0 (F)

MDISK Extent    Userid   Addr IO/s VSEEK Status LINK MDIO/s
+-------------------------------------------------------------------------+
!    1 - 10016  System   PAGE       RD/s   WR/s  MLOAD   Used   IO/S      !
!                        LOAD ====> 10.0   77.5    1.7    20%   87.5      !
+-------------------------------------------------------------------------+
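As a sanity check on the device screen for 9F01, here is a minimal sketch that estimates device busy from the SSCH rate and the service time, using the general utilization relation (busy fraction = I/O rate x service time). Whether Performance Toolkit derives its "Device busy" field this way is an assumption on my part, and the function name is hypothetical.

```python
def estimated_device_busy(ssch_per_sec: float, service_time_ms: float) -> float:
    """Estimate the busy fraction of a device as I/O rate times service time.

    Assumption: the device handles one I/O at a time, so the fraction of
    each second it is busy is (I/Os per second) * (seconds per I/O).
    """
    return ssch_per_sec * (service_time_ms / 1000.0)


# Values copied from the device 9F01 screen above.
ssch_rate = 30.0       # SSCH rate/s
service_time = 4.1     # Service time in ms

busy = estimated_device_busy(ssch_rate, service_time)
print(f"Estimated device busy: {busy * 100:.1f}%")
```

With the values from the screen the estimate comes out at about 12.3%, which agrees with the reported "Device busy" of 12% and with our reading that this paging disk is far from saturated.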