I'm sponsoring this case for Steve Lawrence.  I've placed copies
of the new man pages in the case directory.

Thanks,
Jerry

Template Version: @(#)sac_nextcase 1.70 03/30/10 SMI
This information is Copyright (c) 2010, Oracle and/or its affiliates. All 
rights reserved.
1. Introduction
    1.1. Project/Component Working Name:
         zonestat
    1.2. Name of Document Author/Supplier:
         Author:  Stephen Lawrence
    1.3  Date of This Document:
        26 July, 2010
4. Technical Description
zonestat Command for Zones Cpu and Memory Observability

SUMMARY

This fast-track proposes the addition of a command line tool to
facilitate the observation of system resources consumed by Solaris
Zones.  The tool is specifically designed to observe the following:

        1.  Memory and cpu utilization of zones.
        2.  Utilization of resource control limits.
        3.  Resource utilization versus physical resources and
            versus configured limits.
        4.  Various cpu-related resource partitioning schemes,
            such as processor sets, pools, fair share scheduler,
            and cpu-caps.
        5.  Total utilization and the per-zone break-down.
        6.  Aggregate, average, and peak utilization over specified
            time periods.

This case seeks patch binding.

PROBLEM:

Solaris zones are tightly integrated with many Solaris resource
management features, but deciphering the configuration and utilization
is no small feat.  An administrator must make use of many commands in
pursuit of this information, such as:

    zoneadm(1m), pooladm(1m), poolstat(1m), prctl(1), prstat(1m),
    priocntl(1), rcapstat(1).

The user must essentially follow the trail of crumbs to re-construct the
configuration and compute the utilization.  The OpenSolaris community
has developed a zonestat script for monitoring zones, which makes use of
the existing commands.  This is an excellent example of the kind of tool
needed, and is used by administrators for its intended purpose.  It is
also an example of the hoops users must jump through to collect zone
resource utilization data.  This tool also suffers from the
shortcomings of the existing Solaris commands.  Many commands must be
invoked and tied together, and some information is not readily available
at all.  For example, the most commonly used tool to observe zone cpu
utilization is prstat -Z.  prstat polls /proc, and will not account for
cpu used by short-lived processes.

The individual problems in reconstructing the configuration
and utilization are as follows:

    1.  Each zone must be mapped to its configured limits.
    2.  If processor sets are in use, each zone must be mapped to
        the processor sets to which it has processes bound.
    3.  Memory utilization is available via a private system call
        (getvmusage()), which is expensive to invoke.
    4.  Cpu utilization must be collected via extended accounting,
        which requires privilege to configure.  The extended accounting
        log must also be managed.  Extended accounting must be used to
        compute the cpu utilization because it will contain all data
        associated with processes which have exited.
    5.  In the case of fair share scheduling, zones sharing processors
        must be tallied to compute relative share.
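
The difference between polling and accounting described in item 4 can
be illustrated with a minimal sketch.  The function name and the record
layout below are hypothetical and are not the extended accounting
format; pid reuse is ignored for brevity.  The sketch charges each zone
for cpu time accrued by live processes plus the final totals of
processes that exited between two samples.

```python
def zone_cpu_seconds(prev_live, curr_live, exited):
    """Return cpu seconds used per zone between two samples.

    prev_live, curr_live: {pid: (zone, cumulative_cpu_seconds)}
    exited: list of (pid, zone, final_cpu_seconds) records for
            processes that exited between the two samples
            (hypothetical layout, for illustration only).
    """
    usage = {}

    def charge(zone, pid, total):
        # Only charge cpu time accrued since the previous sample.
        before = prev_live.get(pid, (zone, 0.0))[1]
        usage[zone] = usage.get(zone, 0.0) + (total - before)

    for pid, (zone, total) in curr_live.items():
        charge(zone, pid, total)
    for pid, zone, total in exited:
        charge(zone, pid, total)
    return usage


prev = {100: ("global", 10.0), 200: ("myzone", 4.0)}
curr = {100: ("global", 12.5)}
# pid 200 exited after using one more second.  pid 300 started and
# exited entirely between the samples, so polling never saw it.
exited = [(200, "myzone", 5.0), (300, "myzone", 0.75)]

usage = zone_cpu_seconds(prev, curr, exited)
polled_only = zone_cpu_seconds(prev, curr, [])
print(usage)
print(polled_only)
```

With the exit records included, "myzone" is charged for both exited
processes.  With polling alone, it is not charged at all.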

PROPOSAL:

This case proposes solving these problems by implementing:

    1.  A zones monitoring daemon: zonestatd(1m)
   
    A new SMF service svc:/system/zones-monitoring:default will execute
    a zones monitoring daemon.  This daemon will monitor the system
    configuration, manage extended accounting, and compute memory
    utilization on behalf of clients.  zonestatd will
    respond to local connections (from both the global and non-global
    zones) via a door, and provide the current configuration and
    utilization data on request.

    zonestatd will run only in the global zone.  The daemon binary
    and service will not be delivered to non-global zones.

    In order to minimize system performance impact, zonestatd will only
    compute utilization data when there is a client request.  Cpu
    utilization will be collected over the lifespan of clients at
    a regular interval, in order to keep cpu utilization charged
    correctly, and to keep the extended accounting logs from
    accumulating.  The calculation of memory data will be throttled by
    the same interval.  This interval will be configurable via the
    "config/sample_interval" property on the zones-monitoring SMF
    service.
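
    The throttling described above can be sketched as follows.  This
    is an illustration only, not the zonestatd implementation: the
    class name, the injected clock, and the 5 second interval are all
    assumptions.  The expensive computation runs only when a client
    asks, and at most once per sample interval.

```python
import time


class ThrottledSampler:
    """Recompute an expensive value at most once per interval."""

    def __init__(self, compute, interval, clock=time.monotonic):
        self._compute = compute
        self._interval = interval
        self._clock = clock
        self._stamp = None      # time of the last computation
        self._value = None      # cached result

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._interval:
            self._value = self._compute()
            self._stamp = now
        return self._value


calls = []
fake_now = [0.0]                # controllable clock for the example


def expensive_memory_scan():
    calls.append(fake_now[0])
    return {"myzone": 317}


sampler = ThrottledSampler(expensive_memory_scan, interval=5,
                           clock=lambda: fake_now[0])
sampler.get()                   # first request triggers a scan
fake_now[0] = 2.0
sampler.get()                   # within the interval: cached result
fake_now[0] = 5.0
sampler.get()                   # interval elapsed: scan again
print(len(calls))
```

    Three client requests result in two scans.  No scan is performed
    at all when there are no requests.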

    When contacted from a non-global zone, zonestatd will provide filtered
    utilization data.  Since, in reality, each zone shares resources
    with other zones, each zone will see the total resource utilization,
    its resource utilization, and the resource consumption by
    "everything else".

    Example of the zonestat(1) command (described below) invoked from
    a non-global zone.  All resource usage by the kernel and other zones
    is reported as [system]:

    # zonestat -r physical-memory
    PHYSICAL-MEMORY                SYSTEM-MEMORY 
    mem_default                            2046M
                                      ZONE  USED  PCT   CAP %CAP
                                   [total]  725M  35%     -    -
                                  [system]  408M  19%     -    -
                                    myzone  317M  15%     -    -

    Reporting total resources and usage within a non-global zone may be
    considered a security issue.  However, this information is available
    today via the existing kstats and kstat-related commands such as
    vmstat and mpstat when used from within a non-global zone.  The
    value in reporting this information is to allow the non-global zone
    user to observe the true quantity of resource which is currently
    available.  Without such information, they may attempt to overuse
    resources, leading to performance degradation.
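
    The filtered view can be sketched as follows.  The function name
    and data layout are hypothetical.  The figures for "myzone" and
    the total are taken from the example above; the figures for the
    other zones are invented for illustration.  Everything that does
    not belong to the calling zone is folded into [system].

```python
def filtered_view(total, per_zone, caller):
    """Collapse system-wide usage into the view shown to one zone.

    total:    total usage of the resource, kernel included
    per_zone: {zone_name: usage} for every running zone
    caller:   name of the non-global zone making the request
    """
    own = per_zone[caller]
    return {
        "[total]": total,
        # Kernel usage plus every other zone, reported as one row.
        "[system]": total - own,
        caller: own,
    }


view = filtered_view(
    725,
    {"global": 150, "otherzone": 90, "myzone": 317},
    "myzone",
)
print(view)
```

    The names and individual usage of other zones never appear in
    the result.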

    See also the attached zonestatd(1m)  man page.

    2.  A zones monitoring library: libzonestat.so(3lib)

    The library will implement connecting to zonestatd and fetching
    the current utilization data.  It will also implement the
    various usage calculations, and provide utility functions to
    facilitate comparing utilization data over intervals.  This
    library will be private, and consumed only by the command
    described below.

    A future case may propose making the library public (committed)
    for external consumption by programmatic clients.

    3.  A zones monitoring CLI: zonestat(1).

    The zonestat command will invoke the monitoring library to fetch
    the current utilization data and report it in both human
    readable and machine parseable formats. 

    The command output is a multi-line report, showing each system
    resource, the aggregate utilization, system utilization, and
    per-zone utilization.  The zonestat command invocation is as follows:

        zonestat [-z zonelist] [-r reslist] [-n namelist] [-T u | d | i]
              [-R reports] [-q] [-p [-P lines]] [-S cols]
              interval [duration [report]]

    zonestat uses "interval count" style operands, similar to the
    existing *stat commands.  A third "report" argument can be added to
    allow users to get summary information at a chosen interval. The
    operands can also be specified as duration in seconds instead of as
    counts.
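
    One possible reading of these operands is sketched below.  It is
    not the zonestat parser.  The sketch assumes that a bare integer
    is seconds for the interval and a count of intervals for the
    other operands, and that suffixed values such as "30s" and "24h"
    are durations.  The "m" suffix and the function name are
    assumptions.  See the man page for the authoritative syntax.

```python
import re

# "s" and "h" appear in this document; "m" is an assumption.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def operand_seconds(text, interval=None):
    """Convert one operand to seconds.

    A suffixed value is a duration.  A bare integer is seconds when
    parsing the interval itself (interval=None), and a count of
    intervals otherwise.
    """
    match = re.fullmatch(r"(\d+)([smh]?)", text)
    if match is None:
        raise ValueError("bad operand: %r" % (text,))
    number, unit = int(match.group(1)), match.group(2)
    if unit:
        return number * _UNIT_SECONDS[unit]
    if interval is None:
        return number
    return number * interval


interval = operand_seconds("30s")
duration = operand_seconds("24h", interval)
report = operand_seconds("1h", interval)
print(interval, duration, report)
# Number of reports, and samples per report.
print(duration // report, report // interval)
```

    Under these assumptions, "30s 24h 1h" yields 24 reports of 120
    samples each, and "5 1" yields a single 5 second sample.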

    By default, the zonestat(1) command will report a summary of cpu,
    physical-memory, and virtual-memory utilization.  More detailed
    resource-specific views are reported when specific resources are
    requested.

    The CLI options provide filtering based on zone and resource, and
    sorting based on usage and configured limits.  Parseable output and
    timestamps can also be specified.

    See the attached zonestat(1) man page for a detailed description
    of the CLI invocation and output.

    Example 1: Summary of cpu and memory usage over a 5 second interval:
    # zonestat 5 1
    SUMMARY
                        -----CPU------------- ----PHYSICAL--- ----VIRTUAL----
                   ZONE USED %PART %CAP %SHRU USED  PCT  %CAP  USED  PCT %CAP
                [total] 9.74   30%    -     - 7576M  23%    - 11.6G  24%    -
               [system] 0.28  0.8%    -     - 6535M  19%    - 10.4G  21%    -
                 global 9.10   28%    -     -  272M 0.8%    -  366M 0.7%    -
              kodiak-ab 0.32  1.0%    -     -  256M 0.7%    -  265M 0.5%    -
              kodiak-dp 0.00  0.0%    -     - 77.6M 0.2%    - 71.1M 0.1%    -
        kodiak-gjelinek 0.00  0.0%    -     - 58.7M 0.1%    - 59.3M 0.1%    -
             kodiak-edp 0.00  0.0%    -     - 53.0M 0.1%    - 58.9M 0.1%    -
         kodiak-johnlev 0.00  0.0%    -     - 51.9M 0.1%    - 57.4M 0.1%    -
          kodiak-jordan 0.00  0.0%    -     - 51.7M 0.1%    - 56.8M 0.1%    -
           kodiak-steve 0.00  0.0%    -     - 51.5M 0.1%    - 56.2M 0.1%    -
           kodiak-susan 0.00  0.0%    -     - 48.9M 0.1%    - 55.7M 0.1%    -
        kodiak-batschul 0.00  0.0%    -     - 48.5M 0.1%    - 49.5M 0.1%    -
         kodiak-garypen 0.00  0.0%    -     - 46.3M 0.1%    - 49.5M 0.1%    -
             kodiak-rie 0.00  0.0%    -     - 22.7M 0.0%    - 49.4M 0.1%    -

    Example 2:  Detailed usage of the default pset over a 5 second
    interval:
    
    # zonestat -n default-pset  5 1
    Collecting data for first interval...
    Interval: 1, Duration: 0:00:01
    PROCESSOR-SET   TYPE           ONLINE/CPUS     MIN/MAX
    pset_default    default-pset           1/1         1/-
                          ZONE  USED  %PCT  CAP %CAP   SHRS  %SHR %SHRU
                        [total]  0.02 2.3%     -    -      2     -     -
                       [system]  0.00 0.4%     -    -      -     -     -
                         global  0.01 1.4%     -    -      1   50%  2.8%
                            foo  0.00 0.1%     -    -      1   50%  0.3%
                            aaa  0.00 0.1%     -    - no-fss     -     - 

    In this example, the quantity of cpus in the default pset is
    listed, followed by the utilization.  It can be seen that the zones
    are using cpu shares, although zone "aaa" is not configured to use
    the fair share scheduler.  This is likely a mis-configuration that
    the admin will need to address.
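
    The %SHR and %SHRU columns in this example appear consistent
    with the calculation sketched below.  This is inferred from the
    sample output, not taken from the implementation, and the
    function name and data layout are hypothetical.  A zone's share
    percentage is its shares over the total shares in the pset, and
    its share utilization is its cpu percentage over its share
    percentage.

```python
def share_report(zones):
    """zones: {name: (pct_of_pset_cpu, shares or None)}.

    Returns {name: (share_pct, share_used_pct)} for zones running
    under the fair share scheduler.  Other zones map to None.
    """
    total = sum(s for _, s in zones.values() if s is not None)
    report = {}
    for name, (cpu_pct, shares) in zones.items():
        if shares is None or total == 0:
            report[name] = None             # shown as "no-fss"
            continue
        share_pct = 100.0 * shares / total
        # Fraction of the zone's entitlement actually consumed.
        used_pct = 100.0 * cpu_pct / share_pct if share_pct else None
        report[name] = (share_pct, used_pct)
    return report


zones = {"global": (1.4, 1), "foo": (0.1, 1), "aaa": (0.1, None)}
report = share_report(zones)
for name, row in report.items():
    print(name, row)
```

    For "global" this gives 50% and 2.8%, matching the output above.
    For "foo" it gives 0.2% rather than the displayed 0.3%,
    presumably because the displayed inputs are already rounded.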

    Example 3: Usage reports over a 24 hour period:

    Sample usage every 30 seconds for 24 hours, and produce
    machine parseable high and average usage reports for each hour:

    # zonestat -p -r memory -q -R average,high 30s 24h 1h

    In order to generate reports, the zonestat command will match zones
    and processor sets by name.  This means if a zone is booted and
    halted a number of times during the life of a zonestat invocation, the
    utilization will be aggregated using the zone's name, even though
    the id of the zone (and possibly the processor set) will change.
    If a zone or pset is renamed, it will appear twice in the report,
    once as each name.
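
    Matching by name can be sketched as follows.  The function name
    and the sample tuples are hypothetical.  How the actual command
    treats intervals during which a zone is halted is not specified
    here, so this sketch simply averages over the samples in which
    the zone was present.

```python
from collections import defaultdict


def summarize(samples):
    """samples: iterable of (zone_name, zone_id, used) tuples.

    Returns {zone_name: {"average": ..., "high": ...}}, keyed by
    name so that a zone keeps one row even if its id changes.
    """
    by_name = defaultdict(list)
    for name, _zone_id, used in samples:
        by_name[name].append(used)
    return {
        name: {"average": sum(vals) / len(vals), "high": max(vals)}
        for name, vals in by_name.items()
    }


samples = [
    ("myzone", 3, 300), ("myzone", 3, 340),
    ("myzone", 7, 320),         # rebooted: new id, same name
    ("renamed", 7, 310),        # renamed: reported as a second row
]
summary = summarize(samples)
for name, row in summary.items():
    print(name, row)
```

    The rebooted zone is reported as a single row, while the renamed
    zone appears under both names.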


EXPORTED INTERFACES

        Interface                               Classification
        ----------------------------------------------------------------
        zonestat(1) invocation                  Committed
        zonestat(1) parseable output            Committed
        zonestat(1) human-readable output       Uncommitted

        libzonestat.so.1                        Consolidation Private
        zonestatd(1m)                           Consolidation Private
        svc:/system/zones-monitoring:default    Consolidation Private
            config/sample_interval              Committed
        /var/run/zones/zonestat_door            Consolidation Private


IMPORTED INTERFACES

        Interface                       Classification
        -----------------------------------------------
        /proc/psinfo                    Committed 
        libexacct.so                    Committed
        libkstat.so                     Committed
        libpool.so                      Committed
        pset_info(2)                    Committed
        getrctl(2)                      Committed
        acctctl(2)                      Consolidation Private
        swapctl(2)                      Consolidation Private
        getvmusage(2)                   Consolidation Private
        zone_lockedmem(kstat)           Consolidation Private
        zone_swapresv(kstat)            Consolidation Private

REFERENCES

   [1] PSARC/2002/174  Virtualization and Namespace Isolation in Solaris 
   [2] PSARC/2006/496  Improved Zones/RM Integration
   [3] PSARC/2004/402  CPU Caps
   [4] PSARC/2006/598  Swap resource control; locked memory RM
       improvements
   [5] PSARC/2000/136  Administrative support for processor sets and
       extensions
   [6] PSARC/2000/452  Revised Share Scheduler (2000/452)
   [7] OpenSolaris Project: Zones Statistics
       http://hub.opensolaris.org/bin/view/Project+zonestat/WebHome


6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
                ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open
