I'm sponsoring this case for Steve Lawrence. I've placed copies of the new man pages in the case directory.
Thanks,
Jerry

Template Version: @(#)sac_nextcase 1.70 03/30/10 SMI
This information is Copyright (c) 2010, Oracle and/or its affiliates.
All rights reserved.

1. Introduction
    1.1. Project/Component Working Name:
         zonestat
    1.2. Name of Document Author/Supplier:
         Author: Stephen Lawrence
    1.3  Date of This Document:
         26 July, 2010

4. Technical Description

zonestat Command for Zones CPU and Memory Observability

SUMMARY

This fast-track proposes the addition of a command line tool to
facilitate the observation of system resources consumed by Solaris
Zones. The tool is specifically designed to observe the following:

1. Memory and CPU utilization of zones.

2. Utilization of resource control limits.

3. Resource utilization versus physical resources and versus
   configured limits.

4. Various CPU-related resource partitioning schemes, such as
   processor sets, pools, fair share scheduler, and CPU caps.

5. Total utilization and the per-zone break-down.

6. Aggregate, average, and peak utilization over specified time
   periods.

This case seeks patch binding.

PROBLEM:

Solaris zones are tightly integrated with many Solaris resource
management features, but deciphering the configuration and utilization
is no small feat. An administrator must make use of many commands in
pursuit of this information, such as: zoneadm(1m), pooladm(1m),
poolstat(1), prctl(1m), prstat(1), priocntl(1m), rcapstat(1m). The
user must essentially follow the trail of crumbs to reconstruct the
configuration and compute the utilization.

The OpenSolaris community has developed a zonestat script for
monitoring zones, which makes use of the existing commands. This is an
excellent example of the kind of tool needed, and is used by
administrators for its intended purpose. It is also an example of the
hoops users must jump through to collect zone resource utilization
data. The script also inherits the shortcomings of the existing
Solaris commands.
Many commands must be invoked and tied together, and some information
is not readily available at all. For example, the most commonly used
tool to observe zone CPU utilization is prstat -Z. prstat polls /proc,
and will not account for CPU used by short-lived processes.

The individual problems in reconstructing the configuration and
utilization are as follows:

1. Each zone must be mapped to its configured limits.

2. If processor sets are in use, each zone must be mapped to the
   processor sets to which it has processes bound.

3. Memory utilization is available via a private system call
   (getvmusage()), which is expensive to invoke.

4. CPU utilization must be collected via extended accounting, which
   requires privilege to configure. The extended accounting log must
   also be managed. Extended accounting must be used to compute the
   CPU utilization because it contains all data associated with
   processes which have exited.

5. In the case of fair share scheduling, zones sharing processors
   must be tallied to compute relative share.

PROPOSAL:

This case proposes solving these problems by implementing:

1. A zones monitoring daemon: zonestatd(1m)

A new SMF service, svc:/system/zones-monitoring:default, will execute
a zones monitoring daemon. This daemon will monitor the system
configuration, manage extended accounting, and compute memory
utilization on behalf of clients.

zonestatd will respond to local connections (from both the global and
non-global zones) via a door, and provide the current configuration
and utilization data on request.

zonestatd will run only in the global zone. The daemon binary and
service will not be delivered to non-global zones.

In order to minimize system performance impact, zonestatd will only
compute utilization data when there is a client request.
For as long as clients are connected, CPU utilization will be
collected at a regular interval, in order to keep CPU utilization
charged correctly and to keep the extended accounting logs from
accumulating. The calculation of memory data will be throttled by the
same interval. This interval will be configurable via the
"config/sample_interval" property on the zones-monitoring SMF service.

When contacted from a non-global zone, zonestatd will provide filtered
utilization data. Since, in reality, each zone shares resources with
other zones, each zone will see the total resource utilization, its
own resource utilization, and the resource consumption by "everything
else".

Example of the zonestat(1) command (described below) invoked from a
non-global zone. All resource usage by the kernel and other zones is
reported as [system]:

    # zonestat -r physical-memory
    PHYSICAL-MEMORY  SYSTEM-MEMORY
    mem_default              2046M
        ZONE USED PCT CAP %CAP
     [total] 725M 35%   -    -
    [system] 408M 19%   -    -
      myzone 317M 15%   -    -

Reporting total resources and usage within a non-global zone may be
considered a security issue. However, this information is already
available today via the existing kstats and kstat-related commands
such as vmstat and mpstat when used from within a non-global zone.
The value in reporting this information is that it allows the
non-global zone user to observe the true quantity of resource which
is currently available. Without such information, they may attempt to
overuse resources, leading to performance degradation.

See also the attached zonestatd(1m) man page.

2. A zones monitoring library: libzonestat.so(3lib)

The library will implement connecting to zonestatd and fetching the
current utilization data. It will also implement the various usage
calculations, and provide utility functions to facilitate comparing
utilization data over intervals.

This library will be private, and consumed only by the command
described below.
A future case may propose making the library public (committed) for
external consumption by programmatic clients.

3. A zones monitoring CLI: zonestat(1)

The zonestat command will invoke the monitoring library to fetch the
current utilization data and report it in both human readable and
machine parseable formats.

The command output is a multi-line report, showing each system
resource, the aggregate utilization, system utilization, and per-zone
utilization.

The zonestat command invocation is as follows:

    zonestat [-z zonelist] [-r reslist] [-n namelist]
             [-T u | d | i] [-R reports] [-q] [-p [-P lines]]
             [-S cols] interval [duration [report]]

zonestat uses "interval count" style operands, similar to the existing
*stat commands. A third "report" argument can be added to allow users
to get summary information at a chosen interval. The operands can
also be specified as durations in seconds instead of as counts.

By default, the zonestat(1) command will report a summary of CPU,
physical-memory, and virtual-memory utilization. More detailed
resource-specific views are reported when specific resources are
requested.

The CLI options provide filtering based on zone and resource, and
sorting based on usage and configured limits. Parseable output and
timestamps can also be specified.

See the attached zonestat(1) man page for a detailed description of
the CLI invocation and output.
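
As a rough illustration of how the "interval duration report" operands
relate to one another, the sketch below works through the arithmetic
for the suffixed form used in Example 3 (30s 24h 1h). It is not
zonestat's parser: the helper names to_seconds and plan are
hypothetical, only the "s" and "h" suffixes that appear in this case
are handled, and the requirement that the duration and report period
be whole multiples of the interval is an assumption made to keep the
arithmetic simple. The man page remains the authority on the accepted
operand syntax.

```python
# Hypothetical helpers for illustration only; not part of zonestat
# or libzonestat.

# Only the suffixes that appear in this case are handled.
_UNIT_SECONDS = {"s": 1, "h": 3600}


def to_seconds(operand: str) -> int:
    """Convert an operand such as '30s' or '24h' to seconds."""
    value, unit = operand[:-1], operand[-1:]
    if unit not in _UNIT_SECONDS or not value.isdigit():
        raise ValueError(f"unsupported operand: {operand!r}")
    return int(value) * _UNIT_SECONDS[unit]


def plan(interval: str, duration: str, report: str) -> dict:
    """Work out how many samples and reports an invocation implies."""
    i, d, r = (to_seconds(x) for x in (interval, duration, report))
    # Assumption: duration and report period are whole multiples of
    # the sampling interval.
    if i <= 0 or d % i or r % i:
        raise ValueError("duration and report must be multiples of interval")
    return {
        "samples": d // i,             # total samples taken
        "samples_per_report": r // i,  # samples folded into each report
        "reports": d // r,             # periodic reports produced
    }


if __name__ == "__main__":
    # Operands from Example 3: sample every 30 seconds for 24 hours,
    # reporting once per hour.
    print(plan("30s", "24h", "1h"))
```

Under these assumptions, the Example 3 operands imply 2880 samples,
folded 120 at a time into 24 hourly reports.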

Example 1: Summary of CPU and memory usage over a 5 second interval:

    # zonestat 5 1
    SUMMARY
                    -----CPU------------- ----PHYSICAL--- ----VIRTUAL----
               ZONE USED %PART %CAP %SHRU  USED  PCT %CAP  USED  PCT %CAP
            [total] 9.74   30%    -     - 7576M  23%    - 11.6G  24%    -
           [system] 0.28  0.8%    -     - 6535M  19%    - 10.4G  21%    -
             global 9.10   28%    -     -  272M 0.8%    -  366M 0.7%    -
          kodiak-ab 0.32  1.0%    -     -  256M 0.7%    -  265M 0.5%    -
          kodiak-dp 0.00  0.0%    -     - 77.6M 0.2%    - 71.1M 0.1%    -
    kodiak-gjelinek 0.00  0.0%    -     - 58.7M 0.1%    - 59.3M 0.1%    -
         kodiak-edp 0.00  0.0%    -     - 53.0M 0.1%    - 58.9M 0.1%    -
     kodiak-johnlev 0.00  0.0%    -     - 51.9M 0.1%    - 57.4M 0.1%    -
      kodiak-jordan 0.00  0.0%    -     - 51.7M 0.1%    - 56.8M 0.1%    -
       kodiak-steve 0.00  0.0%    -     - 51.5M 0.1%    - 56.2M 0.1%    -
       kodiak-susan 0.00  0.0%    -     - 48.9M 0.1%    - 55.7M 0.1%    -
    kodiak-batschul 0.00  0.0%    -     - 48.5M 0.1%    - 49.5M 0.1%    -
     kodiak-garypen 0.00  0.0%    -     - 46.3M 0.1%    - 49.5M 0.1%    -
         kodiak-rie 0.00  0.0%    -     - 22.7M 0.0%    - 49.4M 0.1%    -

Example 2: Detailed usage of the default pset over a 5 second
interval:

    # zonestat -n default-pset 5 1
    Collecting data for first interval...
    Interval: 1, Duration: 0:00:01
    PROCESSOR-SET         TYPE ONLINE/CPUS MIN/MAX
    pset_default  default-pset         1/1     1/-
        ZONE USED %PCT CAP %CAP   SHRS %SHR %SHRU
     [total] 0.02 2.3%   -    -      2    -     -
    [system] 0.00 0.4%   -    -      -    -     -
      global 0.01 1.4%   -    -      1  50%  2.8%
         foo 0.00 0.1%   -    -      1  50%  0.3%
         aaa 0.00 0.1%   -    - no-fss    -     -

In this example, the quantity of CPUs in the default pset is listed,
followed by the utilization. It can be seen that the zones are using
CPU shares, although zone "aaa" is not configured to use the fair
share scheduler. This is likely a misconfiguration that the admin
will need to address.

Example 3: Usage reports over a 24 hour period:

Sample usage every 30 seconds for 24 hours, and produce machine
parseable high and average usage reports for each hour:

    # zonestat -p -r memory -q -R average,high 30s 24h 1h

In order to generate reports, the zonestat command will match zones
and processor sets by name.
This means that if a zone is booted and halted a number of times
during the life of a zonestat invocation, its utilization will be
aggregated under the zone's name, even though the id of the zone (and
possibly of the processor set) will change. If a zone or pset is
renamed, it will appear twice in the report, once under each name.

EXPORTED INTERFACES

    Interface                               Classification
    ----------------------------------------------------------------
    zonestat(1) invocation                  Committed
    zonestat(1) parseable output            Committed
    zonestat(1) human-readable output       Uncommitted
    libzonestat.so.1                        Consolidation Private
    zonestatd(1m)                           Consolidation Private
    svc:/system/zones-monitoring:default    Consolidation Private
        config/sample_interval              Committed
    /var/run/zones/zonestat_door            Consolidation Private

IMPORTED INTERFACES

    Interface                   Classification
    -----------------------------------------------
    /proc/psinfo                Committed
    libexacct.so                Committed
    libkstat.so                 Committed
    libpool.so                  Committed
    pset_info(2)                Committed
    getrctl(2)                  Committed
    acctctl(2)                  Consolidation Private
    swapctl(2)                  Consolidation Private
    getvmusage(2)               Consolidation Private
    zone_lockedmem(kstat)       Consolidation Private
    zone_swapresv(kstat)        Consolidation Private

REFERENCES

[1] PSARC/2002/174 Virtualization and Namespace Isolation in Solaris
[2] PSARC/2006/496 Improved Zones/RM Integration
[3] PSARC/2004/402 CPU Caps
[4] PSARC/2006/598 Swap resource control; locked memory RM
    improvements
[5] PSARC/2000/136 Administrative support for processor sets and
    extensions
[6] PSARC/2000/452 Revised Share Scheduler
[7] OpenSolaris Project: Zones Statistics
    http://hub.opensolaris.org/bin/view/Project+zonestat/WebHome

6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
               ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open