On Sat, Nov 11, 2006 at 09:02:48PM -0800, Gary Winiger wrote:
>
> First off, sorry for the stutter in the spec update mail.
>
> > The project team didn't supply a summary of the changes, so I'll be
> > asking for one in a follow on.
>
I've addressed your comments way below.  Here is my change summary and
case discussion summary:

SUMMARY OF CHANGES

1. Change to the proposed uncommitted kstat names and statistics.

   From the form:

       zone:{zoneid}:vm

   with statistics:

       zonename
       swap_reserved
       max_swap_reserved
       locked_memory
       max_locked_memory

   To the form:

       caps:{zoneid}:swaprsev_zone_{zoneid}
       caps:{zoneid}:lockedmem_zone_{zoneid}
       caps:{zoneid}:lockedmem_project_{projid}

   with statistics:

       zonename
       usage
       value

   This sets up a generic scheme for adding kstats to project and zone
   rctls.  A kstat is created per rctl, instead of per zone.

2. Addition of zonecfg(1m) minimums for setting zone.max-swap.

   When setting zone.max-swap via zonecfg(1m), a minimum value will be
   enforced:

       global zone:      100M
       non-global zone:   50M

   Currently, this is about 20M more than is needed to boot after a
   default installation.

3. Addition of zonecfg(1m) warnings when setting zone.max-swap and
   zone.max-lwps on the global zone.

       global:capped-memory> set swap=200M
       Warning: Setting capped swap on the global zone can impact
       system availability.

SUMMARY OF CASE DISCUSSION:

The case discussion has focused on the problem that the zone.max-swap
rctl on the global zone can affect system availability.  An identical
problem exists today with task/project/zone.max-lwps.

Solutions to this problem may involve one or more of:

    - Exempting project 0 in the global zone from zone.* rctls.
    - Preventing task/project.* rctls from being set on project 0 in
      the global zone.
    - Modifying root's default project.
    - Adding a new privilege to exempt a process from rctls.
    - Updating system service manifests to drop the new privilege.

Solving this problem in a way that will prevent the global zone (on a
default system) from becoming unavailable due to a resource control
setting will require a significant change to the system.
I believe solving this problem is outside the scope of the
"zone.max-swap" case, and would be better solved by another case which
is "not" seeking patch binding.  To minimize this problem for
zone.max-swap (and zone.max-lwps), I've instead proposed zonecfg
enhancements to assist the admin in configuring these rctls safely.

>
> > 1. This case proposes adding the following resource control:
> >
> >       INTERFACE           COMMITMENT     BINDING
> >       "zone.max-swap"     Committed      Patch
> >
> >    This control will limit the swap reserved by processes and tmpfs
> >    mounts within the global zone and non-global zones.  This resource
> >    control serves to address the referenced RFE[6].
>
> There was some considerable discussion on the global zone aspect
> of this part of the proposal.  Perhaps I missed in the spec how
> the new proposal mitigates the risk of the global zone not being
> able to administer the system.
>
> > DETAIL:
> >
> > 1. "zone.max-swap" resource control.
> >
> >    Limits swap consumed by user process address space mappings and
> >    tmpfs mounts within a zone.
>
> >    While a low zone.max-swap setting for the global zone can lead to
> >    a difficult-to-administer global zone, the same problem exists
> >    today when configuring the zone.max-lwps resource control on the
> >    global zone, or when all system swap is reserved.  The zonecfg(1m)
> >    enhancements detailed below will help administrators configure
> >    zone.max-swap safely.
>
> Perhaps I misunderstood the interaction between project 0
> and zone.max-lwps in the global zone.  If a max-lwps is set
> is project 0 bound by it?

Currently yes.  zone.* rctls bound all processes in the global zone,
regardless of project.  This is the issue that my "other" proposal is
attempting to address.

> Perhaps a short summary of the offline discussion on project 0
> and the project teams feeling that the discussions conclusions
> might not be patch qualified.  I realize the need for this project
> to have a patch binding.

I've added this summary above.
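
For illustration only, here is a minimal Python sketch of the per-rctl
kstat naming scheme described in item 1 of the change summary.  The
helper names (zone_cap_kstat, project_cap_kstat) are hypothetical and
not part of any Solaris interface; only the module:instance:name layout
and the three statistic names are taken from the summary.

```python
# Sketch only: builds kstat identifiers in the proposed
# "caps:{zoneid}:{rctl}_zone_{zoneid}" style.  Helper names are
# hypothetical; they are not part of any Solaris API.

# Statistics carried by every per-rctl kstat in the proposal.
CAP_STATISTICS = ("zonename", "usage", "value")


def zone_cap_kstat(zoneid: int, rctl: str) -> str:
    """Return module:instance:name for a zone-level rctl kstat."""
    return f"caps:{zoneid}:{rctl}_zone_{zoneid}"


def project_cap_kstat(zoneid: int, projid: int, rctl: str) -> str:
    """Return module:instance:name for a project-level rctl kstat.

    The instance is still the zone id; the project id appears only
    in the kstat name.
    """
    return f"caps:{zoneid}:{rctl}_project_{projid}"


if __name__ == "__main__":
    print(zone_cap_kstat(0, "lockedmem"))
    print(project_cap_kstat(0, 10, "lockedmem"))
```

Because the name is derived from the rctl rather than from the zone,
adding a kstat for another rctl would only need a new rctl string, which
is the "generic scheme" point made in the summary.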
>
> > 2. "swap" and "locked" properties for zonecfg(1m) "capped_memory"
> >    resource.
>
> >    To prevent administrators from configuring a low swap limit that
> >    will prevent a system from booting, zonecfg will not allow a
> >    swap limit to be configured to less than:
> >
> >       Global zone:      100M
> >       Non-global zone:   50M.
> >
> >    These numbers are based on the swap needed to boot a zone after a
> >    default installation.
> >
> >    Also, if zone.max-swap is configured (via zonecfg(1m)) on the
> >    global zone, a warning will be printed:
> >
> >       global:capped-memory> set swap=200M
> >       Warning: Setting capped swap on the global zone can impact
> >       system availability.
> >
> >    Similar warnings will be printed for setting other rctls on the
> >    global zone which can affect availability, such as zone.max-lwps.
>
> I don't doubt that 100M and 50M are currently reasonable numbers,
> however, how will they be tracked (computed/changed) in future.

Good question.  These are essentially "virtual system requirements".

For the global zone, we could compute this number "on demand", such as
during system boot, instead of using a hard-coded value.  Unfortunately,
the resource management and zones services execute before X is started,
so if these services watermark the minimum swap value, they will
under-estimate.  We could hack the milestone services to cache the
amount of reserved swap when they complete.  We could then use that
value (plus some buffer) as the minimum.

For non-global zones, there is a bit of a "chicken-and-egg" problem.  It
is impossible to know how much swap a zone will need before booting it.
Of course, under-estimating for a non-global zone is not catastrophic.
We could just up this minimum when somebody files a bug that it is too
low.

We should probably also not "hard enforce" these minimums in zonecfg,
but rather warn verbosely.  This will allow admins that "really want it"
to get it.  It will also allow us to amply over-estimate the minimums,
which is safer.
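
To make the "warn instead of hard enforce" idea concrete, here is a
rough Python sketch of the check.  It is not zonecfg code: the function
names, the K/M/G/T suffix parsing, and the wording of the below-minimum
warning are assumptions for illustration.  Only the 100M/50M minimums
and the global zone warning text come from the proposal.

```python
from typing import List

MIB = 1024 * 1024
# Assumed scale suffixes for this sketch.
SCALE = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}

# Minimums from the proposal.
MIN_SWAP_GLOBAL = 100 * MIB
MIN_SWAP_NONGLOBAL = 50 * MIB

GLOBAL_WARNING = ("Warning: Setting capped swap on the global zone can "
                  "impact system availability.")


def parse_size(text: str) -> int:
    """Convert a string such as '200M' to bytes (hypothetical helper)."""
    text = text.strip().upper()
    if text and text[-1] in SCALE:
        return int(text[:-1]) * SCALE[text[-1]]
    return int(text)


def swap_warnings(is_global: bool, swap: str) -> List[str]:
    """Return warnings for a requested swap cap; never reject it."""
    warnings = []
    minimum = MIN_SWAP_GLOBAL if is_global else MIN_SWAP_NONGLOBAL
    if parse_size(swap) < minimum:
        # Hypothetical wording: the proposal only asks for a verbose warning.
        warnings.append(
            f"Warning: swap cap {swap} is below the suggested minimum of "
            f"{minimum // MIB}M; the zone may not be able to boot.")
    if is_global:
        warnings.append(GLOBAL_WARNING)
    return warnings


if __name__ == "__main__":
    for line in swap_warnings(True, "200M"):
        print(line)
```

Since the function only returns warnings, the caller still applies the
requested value, which is what lets the minimums be over-estimated
without locking anyone out.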

In general, it wouldn't be bad if the various milestones recorded the
resource utilization (swap, rss, lwps, processes, etc.) when they are
reached.

Ugh, looks like zones and X services start "after" multi-user server.
There is no actual milestone service that maps to the "all" milestone.
So much for snapshotting swap usage via milestone start scripts.  I'd
rather not hard code this resource utilization snapshot into startd.

Choosing better minimums can be addressed by:

    - Setting the global zone swap minimum to the release's system
      memory requirements.  This number should be available.
    - Leaving the non-global zone minimum at 50M.
    - Making a violation of the zonecfg minimum a "very verbose
      warning" instead of an error.

-Steve

>
> Gary..
_______________________________________________
zones-discuss mailing list
zones-discuss@opensolaris.org