Re: [zones-discuss] PSARC/2006/598 Swap resource control; locked memory RM improvements
> Comments inline. I've snipped stuff not relevant to comments.
>
>> 4. prstat(1m) output changes to report swap reserved.
>>
>>    INTERFACE            COMMITMENT    BINDING
>>    prstat(1m) output    Uncommitted   Patch
>>
>> This case proposes changing the SIZE column of "prstat -Z" zone output lines to SWAP. The swap reported will be the total swap consumed by the zone's processes and tmpfs mounts. This value will assist administrators in monitoring the swap reserved by each zone, allowing them to choose a reasonable zone.max-swap setting.
>>
>> The SIZE column will also be changed to SWAP for prstat options -a, -T, and -J, for users, tasks, and projects.
>
> The reason for not changing this column in the default output would be helpful.

I have a separate private interface used by prstat(1m) to get aggregate swap reserved by users, tasks, projects, and zones. Default prstat output is per-process, and the information is accessed via /proc. Currently, per-process (or per-address-space) swap reservation is not counted or made available via /proc.

From proc(4):

    typedef struct psinfo {
            ...
            size_t pr_size;    /* size of process image in Kbytes */
            ...

"size of process image" is pretty meaningless. If we can change pr_size to be swap reserved by process, then we could change SIZE to SWAP for all prstat(1m) output. Would such a change to psinfo_t be reasonable?

>> Currently a global or non-global zone can consume all swap resources available on the system, limiting the usefulness of zones as an application container. zone.max-swap provides a mechanism to
>
> I would rephrase that as "the container of an application" to avoid confusion with the Solaris feature set called Containers. I assume that the former was meant more so than the latter, even though Containers are Solaris' implementation of an application container.

I'm not sure what you mean, but ok. By "the Solaris feature set called Containers", do you mean zones + RM, or do you mean zones, xen, ldoms?

>> zone.max-swap will be configurable on both the global zone and non-global zones. The effect on processes in a zone reaching its zone.max-swap limit is the same as if all system swap is reserved. Callers of mmap(2) and sbrk(2) will receive EAGAIN. Writes to tmpfs will return ENOSPC, which is the same errno returned when a tmpfs mount reaches its "size" mount option. The "size" mount option limits the quantity of swap that a tmpfs mount can reserve.
>
> With S10 11/06, some zone limitations are now configurable, e.g. setting the system time clock. Similarly, the ability to modify a zone's swap limit could be given to the zone's root user, which might be valuable in some situations. This would be analogous to the 'basic' privilege level. It would allow an advisory limit to be placed on a zone - a limit that the zone admin could modify in unusual circumstances.
>
> I realize that this opens a can of worms in that most rctls are protected by the sys_res_config priv, which is not allowed in a zone even with 11/06. Further, it makes sense to consistently allow or forbid rctl-modification in zones. I just wanted to mention this idea so that it is not unintentionally overlooked.

Currently, no zone.* rctl is modifiable from a non-global zone. The established mechanism for a zone admin to set rctls within the zone is via project.* rctls set on projects within the zone. Granted, in the zone.max-swap case, we are not proposing a project.max-swap, due to implementation complexity and risk. With sufficient customer demand, we could investigate implementing project.max-swap in the future.

Currently no zone.* rctls allow basic rctl values to be set. The only project.* rctl which allows basic is project.max-contracts, and perhaps that is a bug. A basic rctl is an unprivileged rctl that only affects the process within the task, project, or zone which sets it. It is pretty useless, except for process.* rctls.

I'd be happy to address the general issues of privilege related to project and zone rctls as a separate case. A possible solution may be to redefine basic for project and zone rctls, and/or introduce more fine-grained privileges.

I agree that work is needed here.

>>    STATISTIC            DESCRIPTION
>>    zonename             The name of the zone with {zoneid}
>>    swap_reserved:       swap reserved by zone in bytes.
>
> Does swap_reserved include pages shared with other zones, e.g. text pages?

Each process mapping text reserves unique swap for that mapping. Even though the underlying physical page may be shared between processes/zones, each process needs its own swap reservation. This is because each process may cow the page, and then may need to page the private copy to disk.

>>    max_swap_reserved:   current zone.max-swap limit
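
The accounting described in this message (every mapping reserves its own swap, tmpfs writes count against the same total, and the zone hits EAGAIN or ENOSPC once zone.max-swap is reached) can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not the kernel implementation, the class and method names (ZoneSwapAccount, map_private, tmpfs_write) are hypothetical, and it ignores everything except the counter and the two errno values named above.

```python
import errno
import os

MB = 1024 * 1024


class ZoneSwapAccount:
    """Toy per-zone swap reservation counter (hypothetical, not kernel code)."""

    def __init__(self, max_swap_bytes):
        self.max_swap = max_swap_bytes  # stands in for zone.max-swap
        self.reserved = 0               # stands in for swap_reserved

    def _reserve(self, nbytes, err):
        # Refuse any reservation that would push the zone past its cap.
        if self.reserved + nbytes > self.max_swap:
            raise OSError(err, os.strerror(err))
        self.reserved += nbytes

    def map_private(self, nbytes):
        # Models mmap(2)/sbrk(2): callers see EAGAIN at the limit.
        self._reserve(nbytes, errno.EAGAIN)

    def tmpfs_write(self, nbytes):
        # Models a tmpfs write: callers see ENOSPC at the limit.
        self._reserve(nbytes, errno.ENOSPC)


if __name__ == "__main__":
    zone = ZoneSwapAccount(max_swap_bytes=10 * MB)
    zone.map_private(4 * MB)  # process A maps 4 MB of text
    zone.map_private(4 * MB)  # process B maps the same text: reserved again
    print(zone.reserved // MB, "MB reserved")
    try:
        zone.tmpfs_write(4 * MB)  # would exceed the 10 MB cap
    except OSError as e:
        print("tmpfs write failed:", errno.errorcode[e.errno])
```

Two processes mapping the same 4 MB of text leave 8 MB reserved in this model, even though the physical pages could be shared, which mirrors the answer to the swap_reserved question above.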
Re: [zones-discuss] 3 questions about zones and containers
This question was asked:

> 2. if a zone pool shares out resources dynamically how do I correlate that with my performance data? For example if a CPU were to be 'imported' by one zone from another, how do I know by looking at the performance data?

It was suggested to use poolstat, which supports an interval and a count. Could an example output be provided showing how this is interpreted?

Just a comment on some other ideas that might be useful. For validating variable processes, log into the zone and verify how many processors are indeed enabled by using "psrinfo -vp":

workzone1# psrinfo -vp
The physical processor has 1 virtual processor (0)
  x86 (AuthenticAMD family 15 model 5 step 1 clock 2193 MHz)
        AMD Opteron(tm) Processor 248
The physical processor has 1 virtual processor (1)
  x86 (AuthenticAMD family 15 model 5 step 1 clock 2193 MHz)
        AMD Opteron(tm) Processor 248
The physical processor has 1 virtual processor (2)
  x86 (AuthenticAMD family 15 model 5 step 1 clock 2193 MHz)
        AMD Opteron(tm) Processor 248
workzone1#

Also "prstat -Z -n 9,11 -R" will produce a display that will dynamically change as processing is executed. Use "/usr/bin/prstat -Z" to show zone process status.

global# /usr/bin/prstat -Z
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  2008 root     4000K 1168K cpu513  28    0   0:02:11 3.7% cpuhog.pl/1
  2018 root     4000K 1168K cpu1    32    0   0:02:11 3.7% cpuhog.pl/1
  2015 root     4000K 1168K cpu515  30    0   0:02:13 3.6% cpuhog.pl/1
  2020 root     4000K 1168K cpu3    29    0   0:02:13 3.6% cpuhog.pl/1
  2010 root     4000K 1168K run     17    0   0:02:11 3.5% cpuhog.pl/1
  2013 root     4000K 1168K run     28    0   0:02:11 3.5% cpuhog.pl/1
  2005 root     4008K 2320K run      8    0   0:02:11 3.5% cpuhog.pl/1
  2014 root     4000K 1168K cpu0    30    0   0:02:11 3.5% cpuhog.pl/1
  2007 root     4000K 1168K run     20    0   0:02:11 3.5% cpuhog.pl/1
  2016 root     4000K 1168K cpu512  28    0   0:02:12 3.5% cpuhog.pl/1
  2021 root     4000K 1168K run     17    0   0:02:11 3.4% cpuhog.pl/1
  2009 root     4000K 1168K run     14    0   0:02:14 3.3% cpuhog.pl/1
  2012 root     4000K 1168K run     16    0   0:02:08 3.3% cpuhog.pl/1
  2006 root     4000K 1304K run     18    0   0:02:13 3.3% cpuhog.pl/1
  2017 root     4000K 1168K run     25    0   0:02:10 3.3% cpuhog.pl/1
ZONEID    NPROC  SIZE   RSS MEMORY      TIME  CPU ZONE
     2       51  182M   93M   0.5%   0:37:27  59% workzone1
     4       51  182M   92M   0.5%   0:16:25  30% workzone2
     3       51  183M   93M   0.5%   0:16:30  10% workzone3
     0       61  359M  194M   1.1%   0:00:11 0.1% global
     1       34  116M   72M   0.4%   0:00:12 0.0% workzone4
Total: 248 processes, 659 lwps, load averages: 51.19, 40.28, 20.52
control -C
global#

Jeff Victor wrote:
> George Davis wrote:
>> Zone/Container Gurus,
>>
>> My customers' DBAs ask:
>>
>> 1. how do I collect historical performance data on a 'per zone' basis?
>
> With extended accounting. See acctadm(1M) and docs.sun.com.
>
>> 2. if a zone pool shares out resources dynamically how do I correlate that with my performance data? For example if a CPU were to be 'imported' by one zone from another, how do I know by looking at the performance data?
>
> poolstat(1M) tells you this.
>
>> 3. is it still true that you need to reboot a zone when adding a new disk?
>
> Don't know.
>
> --
> Jeff VICTOR    Sun Microsystems    jeff.victor @ sun.com
> OS Ambassador    Sr. Technical Specialist
> Solaris 10 Zones FAQ: http://www.opensolaris.org/os/community/zones/faq
> --
> ___
> zones-discuss mailing list
> zones-discuss@opensolaris.org

--
Michael Barto
Software Architect
LogiQwest Inc.
http://www.logiqwest.com/
[EMAIL PROTECTED]
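
For the question about keeping per-zone performance history, one low-tech option is to capture the zone summary block of "prstat -Z" at intervals and turn it into records. The sketch below is an illustration only: it assumes the eight-column ZONEID/NPROC/SIZE/RSS/MEMORY/TIME/CPU/ZONE layout shown in the prstat output in this message, the function name parse_zone_summary is hypothetical, and it works on captured text rather than running prstat itself.

```python
SAMPLE = """\
ZONEID    NPROC  SIZE   RSS MEMORY      TIME  CPU ZONE
     2       51  182M   93M   0.5%   0:37:27  59% workzone1
     4       51  182M   92M   0.5%   0:16:25  30% workzone2
     3       51  183M   93M   0.5%   0:16:30  10% workzone3
     0       61  359M  194M   1.1%   0:00:11 0.1% global
     1       34  116M   72M   0.4%   0:00:12 0.0% workzone4
Total: 248 processes, 659 lwps, load averages: 51.19, 40.28, 20.52
"""


def parse_zone_summary(text):
    """Return {zone_name: stats} from the ZONEID block of prstat -Z output."""
    zones = {}
    in_summary = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ZONEID":    # header line opens the zone block
            in_summary = True
            continue
        if fields[0] == "Total:":    # totals line closes the zone block
            in_summary = False
            continue
        if in_summary and len(fields) == 8:
            zoneid, nproc, size, rss, mem, cpu_time, cpu, name = fields
            zones[name] = {
                "zoneid": int(zoneid),
                "nproc": int(nproc),
                "size": size,        # kept as printed, e.g. "182M"
                "rss": rss,
                "memory_pct": float(mem.rstrip("%")),
                "time": cpu_time,
                "cpu_pct": float(cpu.rstrip("%")),
            }
    return zones


if __name__ == "__main__":
    for name, stats in parse_zone_summary(SAMPLE).items():
        print(name, stats["cpu_pct"], stats["rss"])
```

Per-process lines are skipped because they appear before the ZONEID header; if the captured text holds several refreshes, later samples simply overwrite earlier ones for the same zone name.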
Re: [zones-discuss] PSARC/2006/598 Swap resource control; locked memory RM improvements
On Thu 26 Oct 2006 at 11:50AM, Steve Lawrence wrote:
> "size of process image" is pretty meaningless. If we can change pr_size to be swap reserved by process, then we could change SIZE to SWAP for all prstat(1m) output. Would such a change to psinfo_t be reasonable?

You'd have to check in with Roger, I think (and doing so would probably be worth doing anyway). Adding a new field might be feasible.

>>> Currently a global or non-global zone can consume all swap resources available on the system, limiting the usefulness of zones as an application container. zone.max-swap provides a mechanism to
>>
>> I would rephrase that as "the container of an application" to avoid confusion with the Solaris feature set called Containers. I assume that the former was meant more so than the latter, even though Containers are Solaris' implementation of an application container.
>
> I'm not sure what you mean, but ok. By "the Solaris feature set called Containers", do you mean zones + RM, or do you mean zones, xen, ldoms?

Steve, I think the text is fine. This document isn't intended for consumption by customers, and the text is clear enough to anyone trying to absorb its meaning.

>> Similarly, the ability to modify a zone's swap limit could be given to the zone's root user, which might be valuable in some situations. This would be analogous to the 'basic' privilege level. It would allow an advisory limit to be placed on a zone - a limit that the zone admin could modify in unusual circumstances. I just wanted to mention this idea so that it is not unintentionally overlooked.
>
> Currently, no zone.* rctl is modifiable from a non-global zone. The established mechanism for a zone admin to set rctls within the zone is via project.* rctls set on projects within the zone. Granted, in the zone.max-swap case, we are not proposing a project.max-swap, due to implementation complexity and risk. With sufficient customer demand, we could investigate implementing project.max-swap in the future.

I think I'd agree that allowing a zone to modify its own zone.* rctls (perhaps only to lower them) is something we *could do* at some point. But I'm aware of neither an RFE for this nor stated customer demand. If someone wants this, then let's get that recorded as an RFE in the bug database, please.

Thanks,

        -dp

--
Daniel Price - Solaris Kernel Engineering - [EMAIL PROTECTED] - blogs.sun.com/dp