On Sat, Nov 11, 2006 at 09:02:48PM -0800, Gary Winiger wrote:
> 
>       First off, sorry for the stutter in the spec update mail.
> 
> > The project team didn't supply a summary of the changes, so I'll be
> > asking for one in a follow on.
> 

I've addressed your comments further below.  Here are my summary of changes
and my summary of the case discussion:

SUMMARY OF CHANGES

1.  Change to the proposed uncommitted kstat names and statistics.
    From the form:

        zone:{zoneid}:vm

        with statistics:
                zonename
                swap_reserved
                max_swap_reserved
                locked_memory
                max_locked_memory

    To the form:

        caps:{zoneid}:swapresv_zone_{zoneid}
        caps:{zoneid}:lockedmem_zone_{zoneid}
        caps:{zoneid}:lockedmem_project_{projid}

        with statistics:
                zonename
                usage
                value

    This sets up a generic scheme for adding kstats to project and
    zone rctls.  A kstat is created per rctl, instead of per zone.
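
    As an illustration of the per-rctl naming scheme, here is a minimal
    sketch.  The function name and its arguments are hypothetical, chosen
    only for this example; they are not part of the proposal or of any
    existing kstat interface.

```python
def caps_kstat_name(zoneid: int, resource: str, scope: str, ident: int) -> str:
    """Build a kstat name of the form caps:{zoneid}:{resource}_{scope}_{id}.

    Hypothetical helper for illustration only.
    zoneid   -- zone that owns the kstat (used as the kstat instance)
    resource -- short rctl resource tag, e.g. "lockedmem"
    scope    -- "zone" or "project", the entity the rctl is set on
    ident    -- the zone id or project id of that entity
    """
    if scope not in ("zone", "project"):
        raise ValueError("scope must be 'zone' or 'project'")
    return f"caps:{zoneid}:{resource}_{scope}_{ident}"


# One kstat per rctl: a zone-level cap and a project-level cap in zone 3.
print(caps_kstat_name(3, "lockedmem", "zone", 3))
print(caps_kstat_name(3, "lockedmem", "project", 10))
```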

2.  Addition of zonecfg(1m) minimums for setting zone.max-swap.

    When setting zone.max-swap via zonecfg(1m), a minimum value will be
    enforced:

        global zone:     100M
        non-global zone: 50M

    Currently, this is about 20M more than is needed to boot after a
    default installation.

3.  Addition of zonecfg(1m) warnings when setting zone.max-swap and
    zone.max-lwps on the global zone.

        global:capped-memory> set swap=200M
        Warning: Setting capped swap on the global zone can impact
        system availability.

SUMMARY OF CASE DISCUSSION:

The case discussion has focused on the problem that the zone.max-swap
rctl on the global zone can affect system availability.  An identical
problem exists today with task/project/zone.max-lwps.

Solutions to this problem may involve one or more of:

        - Exempting project 0 in the global zone from zone.* rctls.
        - Preventing task/project.* rctls from being set on project 0
          in the global zone.
        - Modifying root's default project.
        - Adding a new privilege to exempt a process from rctls.
        - Updating system service manifests to drop the new privilege.

Solving this problem in a way that will prevent the global zone (on a
default system) from becoming unavailable due to a resource control setting
will require a significant change to the system.  I believe solving this
problem is outside the scope of the "zone.max-swap" case, and would be better
solved by another case which is not seeking patch binding.

To minimize this problem for zone.max-swap (and zone.max-lwps), I've instead
proposed zonecfg enhancements to assist the admin in configuring these rctls
safely.

> 
> >   1. This case proposes adding the following resource control:
> > 
> >     INTERFACE                               COMMITMENT      BINDING
> >     "zone.max-swap"                          Committed        Patch
> > 
> >      This control will limit the swap reserved by processes and tmpfs
> >      mounts within the global zone and non-global zones.  This resource
> >      control serves to address the referenced RFE[6].
> 
>       There was some considerable discussion on the global zone aspect
>       of this part of the proposal.  Perhaps I missed in the spec how
>       the new proposal mitigates the risk of the global zone not being
>       able to administer the system.
> 
> > DETAIL:
> > 
> >   1. "zone.max-swap" resource control.
> > 
> >      Limits swap consumed by user process address space mappings and
> >      tmpfs mounts within a zone.
> 
> >      While a low zone.max-swap setting for the global zone can lead to
> >      a difficult-to-administer global zone, the same problem exists
> >      today when configuring the zone.max-lwps resource control on the
> >      global zone, or when all system swap is reserved.  The zonecfg(1m)
> >      enhancements detailed below will help administrators configure
> >      zone.max-swap safely.
> 
>       Perhaps I misunderstood the interaction between project 0
>       and zone.max-lwps in the global zone.  If a max-lwps is set
>       is project 0 bound by it?

Currently yes.  zone.* rctls bound all processes in the global zone, regardless
of project.  This is the issue that my "other" proposal is attempting to
address.

>       Perhaps a short summary of the offline discussion on project 0
>       and the project teams feeling that the discussions conclusions
>       might not be patch qualified.  I realize the need for this project
>       to have a patch binding.

I've added this summary above.

> 
> >   2. "swap" and "locked" properties for zonecfg(1m) "capped_memory"
> >      resource.
> 
> >      To prevent administrators from configuring a low swap limit that
> >      will prevent a system from booting, zonecfg will not allow a
> >      swap limit to be configured to less than:
> > 
> >     Global zone:     100M
> >     Non-global zone: 50M.
> > 
> >      These numbers are based on the swap needed to boot a zone after a
> >      default installation.
> >  
> >      Also, if zone.max-swap is configured (via zonecfg(1m)) on the
> >      global zone, a warning will be printed:
> > 
> >     global:capped-memory> set swap=200M
> >     Warning: Setting capped swap on the global zone can impact
> >     system availability.
> > 
> >      Similar warnings will be printed for setting other rctls on the
> >      global zone which can affect availability, such as zone.max-lwps.
> 
>       I don't doubt that 100M and 50M are currently reasonable numbers,
>       however, how will they be tracked (computed/changed) in future.

Good question.  These are essentially "virtual system requirements".

For the global zone, we could compute this number "on demand", such as
during system boot, instead of using a hard-coded value.  Unfortunately,
the resource management and zones services execute before X is started,
so if these services watermark the minimum swap value, they will
under-estimate.  We could hack the milestone services to cache the
amount of reserved swap when they complete.  We could then use that
value (plus some buffer) as the minimum.
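
The "cached value plus some buffer" idea could look roughly like the
sketch below.  The function name, the 25% buffer, and the use of the
100M figure as a floor are all assumptions made for illustration;
nothing here is an existing interface.

```python
MIB = 1024 * 1024


def global_swap_minimum(reserved_bytes: int,
                        buffer_percent: int = 25,
                        floor_bytes: int = 100 * MIB) -> int:
    """Derive a minimum zone.max-swap value for the global zone.

    reserved_bytes -- swap reserved when the milestone completed
                      (assumed to have been cached by some service)
    buffer_percent -- headroom added on top of the cached value
                      (25 is an arbitrary illustrative choice)
    floor_bytes    -- never go below this hard-coded fallback
    """
    # Integer arithmetic keeps the result exact for byte counts.
    padded = reserved_bytes + reserved_bytes * buffer_percent // 100
    return max(padded, floor_bytes)


# A system that had 200M reserved at the milestone gets a 250M minimum.
print(global_swap_minimum(200 * MIB) // MIB)
```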

For non-global zones, there is a bit of a "chicken-and-egg" problem.
It is impossible to know how much swap a zone will need before booting
it.  Of course, under-estimating for a non-global zone is not
catastrophic.  We could just raise this minimum when somebody files a bug
saying that it is too low.

We should probably also not "hard enforce" these minimums in zonecfg,
but rather warn verbosely.  This will allow admins that "really want it"
to get it.  It will also allow us to amply over-estimate the minimums,
which is safer.

In general, it wouldn't be bad if the various milestones recorded the
resource utilization (swap, rss, lwps, processes, etc) when they are
reached.

Ugh, it looks like the zones and X services start "after" multi-user server.
There is no actual milestone service that maps to the "all" milestone.
So much for snapshotting swap usage via milestone start scripts.  I'd
rather not hard-code this resource utilization snapshot into startd.

Choosing better minimums can be addressed by:
        - Setting the global zone swap minimum to the release's system
          memory requirements.  This number should be available.
        - Leaving the non-global zone minimum at 50M.
        - Making a violation of the zonecfg minimum a "very verbose
          warning" instead of an error.
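
As a rough sketch of the "warn instead of error" behavior: the function
names, the suffix parsing, and the wording of the below-minimum warning
are hypothetical and exist only for this example.  The 100M/50M values
and the global zone warning text come from the proposal above.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}

MIN_SWAP_GLOBAL = 100 * UNITS["M"]      # global zone minimum from the proposal
MIN_SWAP_NONGLOBAL = 50 * UNITS["M"]    # non-global zone minimum

GLOBAL_WARNING = ("Warning: Setting capped swap on the global zone can "
                  "impact system availability.")


def parse_size(text: str) -> int:
    """Convert a value such as '200M' to bytes (bare numbers are bytes)."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def check_swap_cap(is_global: bool, value: str) -> list:
    """Return warnings for a proposed swap cap; never reject the value."""
    warnings = []
    size = parse_size(value)
    minimum = MIN_SWAP_GLOBAL if is_global else MIN_SWAP_NONGLOBAL
    if size < minimum:
        # Hypothetical wording; the point is that this warns, not fails.
        warnings.append(
            "Warning: swap cap of %s is below the recommended minimum of "
            "%dM; the zone may fail to boot." % (value, minimum // UNITS["M"]))
    if is_global:
        warnings.append(GLOBAL_WARNING)
    return warnings


for message in check_swap_cap(True, "200M"):
    print(message)
```

Because the check only returns warnings, an admin who "really wants it"
can still apply a value below the minimum.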

-Steve


> 
> Gary..
_______________________________________________
zones-discuss mailing list
zones-discuss@opensolaris.org