Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
Now THAT would have exposed the problem. :-) Actually, he already tried that and the result has been this discussion.

Regards,
Richard Schuh

> I'm sure that there are a couple other ways of preventing the
> problem, like IPL'ing the machine first and doing a Q V ALL
> to see what resources you really did ask for, could have
> stopped the problem, if the Systems Programmer did it.
Re: Storage Management Enhancement Ideas
The partition is the easy part, just follow the model of the DUMP space in spool. But how, if you cannot even get a command entered by a logged on user, are you going to be able to make use of it? I think that the problem of putting the collar on the runaway process or guest has to be solved first. Then, it might be possible to take corrective action from already existing userids such as OPERATOR, obviating the need for partitioning.

Regards,
Richard Schuh

> -Original Message-
> From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Alan Altmark
> Sent: Sunday, September 20, 2009 11:03 PM
> To: IBMVM@LISTSERV.UARK.EDU
> Subject: Re: Storage Management Enhancement Ideas
>
> On Sunday, 09/20/2009 at 04:26 EDT, Rob van der Heij wrote:
> > On Sat, Sep 19, 2009 at 6:21 PM, John P. Baker wrote:
> >
> > > I recommend that the idea of splitting page space into multiple pools be
> > > considered, where individual users can be assigned to different pools. For
> > > the purposes of discussion, let us consider the following enhancement:
> >
> > I don't like the idea to use only a subset of your paging capacity for
> > part of the workload. It's not just about space but also about
> > throughput. This is imho a very complicated approach to exclude some
> > (small) important users from an OOM killer. The real question is
> > whether you can do an OOM killer at all and achieve something useful
> > by doing so.
> >
> > Most performance tuning gets harder when you split resources and
> > consumers in different groups and manage them separately. Sharing is
> > easier with large numbers.
>
> And does not address the core issue: At some point, there is
> a shortage of resources. How should CP respond?
>
> o Deny the request?
> o Wait for the resources to be available?
> o Steal the resources from someone else?
>
> You can partition and reserve all the resource you want, but
> eventually you run out.
>
> Alan Altmark
> z/VM Development
> IBM Endicott
Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
I mentioned earlier some sort of preferred paging space for CP areas, kind of like the DUMP area and SPOOL. But either way, it still depends on a Systems Programmer, which was the weak link in this discussion. Recall that a Systems Programmer caused the problem by authorizing an 8 TB guest. And that Systems Programmer will never do that again, IMHO.

So setting up a preferred paging area, or paging pools, is just another thing that most of us will never do until we get shot in the foot. I bet that there are more VM systems running without a DUMP area than with one. And they are the smaller shops that may be able to handle an outage better than others.

The DIRMAINT exit to prevent this amount of storage from being authorized would have stopped it, that is, if the Systems Programmer did it.

Your VM performance monitor could have purged the machine and stopped it, if the Systems Programmer did it.

I'm sure that there are a couple of other ways of preventing the problem. IPL'ing the machine first and doing a Q V ALL to see what resources you really did ask for could have stopped the problem, if the Systems Programmer did it.

Perhaps we are just too dangerous to be around anymore. Time to hide us behind panels and such.

Tom Duerbusch
THD Consulting

>>> "John P. Baker" 9/19/2009 11:21 AM >>>
All,

Since we have now beat the issue of storage management to death, I would like to set forth some concrete ideas for consideration.

First, it has been pointed out that it may not currently be possible to LOGON to MAINT or OPERATOR or to some other service machine in order to diagnose the problem.

I recommend that the idea of splitting page space into multiple pools be considered, where individual users can be assigned to different pools. For the purposes of discussion, let us consider the following enhancement:

. In the SYSTEM CONFIG file
  o DEFBACKSTGPOOL pool-id-8
  o BACKSTGPOOL pool-id-8 volser-6
. In the CP directory
  o OPTION BACKSTGPOOL pool-name-8
. Extend the CLASS B CP QUERY command
  o QUERY BACKSTGPOOL user-id-8
  o QUERY DEFBACKSTGPOOL
. Extend the CLASS B CP SET command
  o SET BACKSTGPOOL user-id-8 {DEFAULT | pool-name-8}
. Extend the CLASS G CP QUERY command
  o QUERY BACKSTGPOOL

Each paging volume will be allocated to a specific backing storage pool.

A LOGON will be rejected if the backing storage pool does not exist.

The SET BACKSTGPOOL command will be rejected if the backing storage pool does not exist.

Second, provide a specification on whether a virtual machine requires full backing storage for its defined memory size.

. In the SYSTEM CONFIG file
  o DEFBACKSTG {SYSTEM | VMSIZE}
. In the CP directory
  o OPTION BACKSTG {DEFAULT | SYSTEM | VMSIZE}
. Extend the CLASS B CP QUERY command
  o QUERY BACKSTG user-id-8
  o QUERY DEFBACKSTG
. Extend the CLASS B CP SET command
  o SET BACKSTG user-id-8 {DEFAULT | SYSTEM | VMSIZE}
. Extend the CLASS G CP QUERY command
  o QUERY BACKSTG

If BACKSTG is set or defaulted to SYSTEM, page allocation will continue to operate as it does today.

If BACKSTG is set or defaulted to VMSIZE, there must be available within the backing storage pool sufficient space to accommodate the entirety of the specified VMSIZE; otherwise the LOGON, DEFINE STORAGE, or SET BACKSTG command will fail.

The SET BACKSTG command will force a virtual machine reset to occur.

These changes will address some of the issues raised. I am certain that other changes would be required, and that other ideas should be considered.

Please post your ideas. Don't hesitate to point out any problems.

John P. Baker
Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
Bill,

You may well be correct. Of course, that permits me to pose the question of how such a condition could effectively be avoided.

Ideas, anyone?

John P. Baker

-Original Message-
From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Bill Holder
Sent: Monday, September 21, 2009 11:32 AM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)

These are very interesting ideas, but I suspect (no way to prove, since no doc will be forthcoming) that the hang was not a paging issue, but rather a central storage fragmentation issue involving attempts to allocate four contiguous frames for region and segment tables. Don't let me throw cold water on the current discussion, though, I just wanted to point out that all of the interesting paging ideas probably wouldn't help the situation that triggered this entire discussion.

- Bill Holder, z/VM Development, IBM
Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
These are very interesting ideas, but I suspect (no way to prove, since no doc will be forthcoming) that the hang was not a paging issue, but rather a central storage fragmentation issue involving attempts to allocate four contiguous frames for region and segment tables. Don't let me throw cold water on the current discussion, though, I just wanted to point out that all of the interesting paging ideas probably wouldn't help the situation that triggered this entire discussion.

- Bill Holder, z/VM Development, IBM
Re: Storage Management Enhancement Ideas
Alan,

I disagree.

Yes, you still have the possibility of a resource shortage. However, partitioning provides the installation more flexibility in protecting critical resources.

As far as how CP should respond, if sufficient page space is unavailable within a particular backing storage pool according to the criteria set forth, then the request (LOGON, DEFINE STORAGE) should be denied.

John P. Baker

-Original Message-
From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Alan Altmark
Sent: Monday, September 21, 2009 2:03 AM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Storage Management Enhancement Ideas

And does not address the core issue: At some point, there is a shortage of resources. How should CP respond?

o Deny the request?
o Wait for the resources to be available?
o Steal the resources from someone else?

You can partition and reserve all the resource you want, but eventually you run out.

Alan Altmark
z/VM Development
IBM Endicott
Re: Storage Management Enhancement Ideas
On Sunday, 09/20/2009 at 04:26 EDT, Rob van der Heij wrote:
> On Sat, Sep 19, 2009 at 6:21 PM, John P. Baker wrote:
>
> > I recommend that the idea of splitting page space into multiple pools be
> > considered, where individual users can be assigned to different pools. For
> > the purposes of discussion, let us consider the following enhancement:
>
> I don't like the idea to use only a subset of your paging capacity for
> part of the workload. It's not just about space but also about
> throughput. This is imho a very complicated approach to exclude some
> (small) important users from an OOM killer. The real question is
> whether you can do an OOM killer at all and achieve something useful
> by doing so.
>
> Most performance tuning gets harder when you split resources and
> consumers in different groups and manage them separately. Sharing is
> easier with large numbers.

And does not address the core issue: At some point, there is a shortage of resources. How should CP respond?

o Deny the request?
o Wait for the resources to be available?
o Steal the resources from someone else?

You can partition and reserve all the resource you want, but eventually you run out.

Alan Altmark
z/VM Development
IBM Endicott
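
The three responses listed above can be set side by side in a small Python sketch. It is purely illustrative and says nothing about how CP actually behaves: the names ShortagePolicy and handle_request, the page counts, and the "take from the largest holder" rule for stealing are all assumptions made for the example.

```python
from enum import Enum


class ShortagePolicy(Enum):
    """The three possible responses to a shortage (hypothetical names)."""
    DENY = "deny"
    WAIT = "wait"
    STEAL = "steal"


def handle_request(free, wanted, policy, holdings):
    """Return (granted, free_after, note) for one request for pages.

    free     -- pages currently available
    wanted   -- pages requested
    policy   -- what to do when wanted > free
    holdings -- dict of user -> pages held; only STEAL modifies it
    """
    if wanted <= free:
        return True, free - wanted, "granted"
    if policy is ShortagePolicy.DENY:
        return False, free, "denied"
    if policy is ShortagePolicy.WAIT:
        return False, free, "queued until pages are released"
    # STEAL: take the shortfall from the largest holder (an assumed rule).
    shortfall = wanted - free
    victim = max(holdings, key=holdings.get, default=None)
    if victim is None or holdings[victim] < shortfall:
        return False, free, "nothing left to steal"
    holdings[victim] -= shortfall
    return True, 0, f"stole {shortfall} pages from {victim}"


holdings = {"LINUX01": 500, "LINUX02": 200}
for policy in ShortagePolicy:
    # Each policy gets its own copy so the runs do not affect each other.
    print(policy.name, handle_request(100, 300, policy, dict(holdings)))
```

Whichever policy is picked, every branch ends with some user not getting, or no longer having, the pages it wanted. Partitioning changes who that user is, not whether it happens.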
Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
On 9/20/09 4:26 AM, "Rob van der Heij" wrote:

> Most performance tuning gets harder when you split resources and
> consumers in different groups and manage them separately. Sharing is
> easier with large numbers.
>
> Rob

Although with SSD coming back into vogue, the idea of swap vs page (shades of HPO) might be worth considering again. If the goal is to get a very large number of pages out of the way quickly and/or to add some additional levels of paging hierarchy back into CP, I can see where that would have merit.
Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
Rob,

In many instances you would be correct. However, in this case, the decisions targeting a specific backing storage pool are made either at LOGON time or during a DEFINE STORAGE command. This is actually a very simple approach to the problem.

Also, once the backing storage pool placement decision is made, there should be no impact on the instruction path length.

John P. Baker

-Original Message-
From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Rob van der Heij
Sent: Sunday, September 20, 2009 4:26 AM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)

On Sat, Sep 19, 2009 at 6:21 PM, John P. Baker wrote:

I don't like the idea to use only a subset of your paging capacity for part of the workload. It's not just about space but also about throughput. This is imho a very complicated approach to exclude some (small) important users from an OOM killer. The real question is whether you can do an OOM killer at all and achieve something useful by doing so.

Most performance tuning gets harder when you split resources and consumers in different groups and manage them separately. Sharing is easier with large numbers.

Rob

--
Rob van der Heij
Velocity Software
http://www.velocitysoftware.com/
Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
On Sat, Sep 19, 2009 at 6:21 PM, John P. Baker wrote:

> I recommend that the idea of splitting page space into multiple pools be
> considered, where individual users can be assigned to different pools. For
> the purposes of discussion, let us consider the following enhancement:

I don't like the idea to use only a subset of your paging capacity for part of the workload. It's not just about space but also about throughput. This is imho a very complicated approach to exclude some (small) important users from an OOM killer. The real question is whether you can do an OOM killer at all and achieve something useful by doing so.

Most performance tuning gets harder when you split resources and consumers in different groups and manage them separately. Sharing is easier with large numbers.

Rob

--
Rob van der Heij
Velocity Software
http://www.velocitysoftware.com/
Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
Rich,

Something else that comes to mind is that page space spills into spool space when page space fills up.

It may be worth providing system configuration options (both a default and one for each backing storage pool) that would determine whether page over-allocation may spill into spool space.

John P. Baker

-Original Message-
From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Rich Smrcina
Sent: Saturday, September 19, 2009 1:19 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)

Nicely written

--
Rich Smrcina
Phone: 414-491-6001
http://www.linkedin.com/in/richsmrcina

Catch the WAVV! http://www.wavv.org
WAVV 2010 - Apr 9-14, 2010 Covington, KY
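
A rough Python sketch of what such a spill option might decide is shown here. Everything in it is hypothetical: allocate_slots, PageSpaceExhausted and the allow_spill flag merely stand in for the suggested configuration option and are not existing CP statements or interfaces.

```python
class PageSpaceExhausted(Exception):
    """Raised when a request cannot be satisfied under the chosen setting."""


def allocate_slots(wanted, page_free, spool_free, allow_spill):
    """Split a request for `wanted` slots between page and spool space.

    Returns (from_page, from_spool). Page space is always used first;
    spool space is touched only if allow_spill is True (the hypothetical
    per-pool or system-default option).
    """
    from_page = min(wanted, page_free)
    overflow = wanted - from_page
    if overflow == 0:
        return from_page, 0
    if not allow_spill:
        raise PageSpaceExhausted(
            f"{overflow} slots short and spilling into spool is disabled"
        )
    if overflow > spool_free:
        raise PageSpaceExhausted(
            f"{overflow} slots short and only {spool_free} spool slots free"
        )
    return from_page, overflow


# With spilling allowed, the overflow lands in spool space.
print(allocate_slots(200, page_free=150, spool_free=50, allow_spill=True))

# With spilling disabled, the same request is refused instead.
try:
    allocate_slots(200, page_free=150, spool_free=50, allow_spill=False)
except PageSpaceExhausted as err:
    print("refused:", err)
```

The point of making it a setting is that a pool holding critical service machines could refuse to spill and so keep spool space intact, while a general-purpose pool keeps today's behaviour.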
Re: Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
Nicely written

John P. Baker wrote:

All,

Since we have now beat the issue of storage management to death, I would like to set forth some concrete ideas for consideration.

First, it has been pointed out that it may not currently be possible to LOGON to MAINT or OPERATOR or to some other service machine in order to diagnose the problem.

I recommend that the idea of splitting page space into multiple pools be considered, where individual users can be assigned to different pools. For the purposes of discussion, let us consider the following enhancement:

· In the SYSTEM CONFIG file
  o DEFBACKSTGPOOL pool-id-8
  o BACKSTGPOOL pool-id-8 volser-6
· In the CP directory
  o OPTION BACKSTGPOOL pool-name-8
· Extend the CLASS B CP QUERY command
  o QUERY BACKSTGPOOL user-id-8
  o QUERY DEFBACKSTGPOOL
· Extend the CLASS B CP SET command
  o SET BACKSTGPOOL user-id-8 {DEFAULT | pool-name-8}
· Extend the CLASS G CP QUERY command
  o QUERY BACKSTGPOOL

Each paging volume will be allocated to a specific backing storage pool.

A LOGON will be rejected if the backing storage pool does not exist.

The SET BACKSTGPOOL command will be rejected if the backing storage pool does not exist.

Second, provide a specification on whether a virtual machine requires full backing storage for its defined memory size.

· In the SYSTEM CONFIG file
  o DEFBACKSTG {SYSTEM | VMSIZE}
· In the CP directory
  o OPTION BACKSTG {DEFAULT | SYSTEM | VMSIZE}
· Extend the CLASS B CP QUERY command
  o QUERY BACKSTG user-id-8
  o QUERY DEFBACKSTG
· Extend the CLASS B CP SET command
  o SET BACKSTG user-id-8 {DEFAULT | SYSTEM | VMSIZE}
· Extend the CLASS G CP QUERY command
  o QUERY BACKSTG

If BACKSTG is set or defaulted to SYSTEM, page allocation will continue to operate as it does today.

If BACKSTG is set or defaulted to VMSIZE, there must be available within the backing storage pool sufficient space to accommodate the entirety of the specified VMSIZE; otherwise the LOGON, DEFINE STORAGE, or SET BACKSTG command will fail.

The SET BACKSTG command will force a virtual machine reset to occur.

These changes will address some of the issues raised. I am certain that other changes would be required, and that other ideas should be considered.

Please post your ideas. Don’t hesitate to point out any problems.

John P. Baker

--
Rich Smrcina
Phone: 414-491-6001
http://www.linkedin.com/in/richsmrcina

Catch the WAVV! http://www.wavv.org
WAVV 2010 - Apr 9-14, 2010 Covington, KY
Storage Management Enhancement Ideas (was: VM lockup due to storage typo)
All,

Since we have now beat the issue of storage management to death, I would like to set forth some concrete ideas for consideration.

First, it has been pointed out that it may not currently be possible to LOGON to MAINT or OPERATOR or to some other service machine in order to diagnose the problem.

I recommend that the idea of splitting page space into multiple pools be considered, where individual users can be assigned to different pools. For the purposes of discussion, let us consider the following enhancement:

. In the SYSTEM CONFIG file
  o DEFBACKSTGPOOL pool-id-8
  o BACKSTGPOOL pool-id-8 volser-6
. In the CP directory
  o OPTION BACKSTGPOOL pool-name-8
. Extend the CLASS B CP QUERY command
  o QUERY BACKSTGPOOL user-id-8
  o QUERY DEFBACKSTGPOOL
. Extend the CLASS B CP SET command
  o SET BACKSTGPOOL user-id-8 {DEFAULT | pool-name-8}
. Extend the CLASS G CP QUERY command
  o QUERY BACKSTGPOOL

Each paging volume will be allocated to a specific backing storage pool.

A LOGON will be rejected if the backing storage pool does not exist.

The SET BACKSTGPOOL command will be rejected if the backing storage pool does not exist.

Second, provide a specification on whether a virtual machine requires full backing storage for its defined memory size.

. In the SYSTEM CONFIG file
  o DEFBACKSTG {SYSTEM | VMSIZE}
. In the CP directory
  o OPTION BACKSTG {DEFAULT | SYSTEM | VMSIZE}
. Extend the CLASS B CP QUERY command
  o QUERY BACKSTG user-id-8
  o QUERY DEFBACKSTG
. Extend the CLASS B CP SET command
  o SET BACKSTG user-id-8 {DEFAULT | SYSTEM | VMSIZE}
. Extend the CLASS G CP QUERY command
  o QUERY BACKSTG

If BACKSTG is set or defaulted to SYSTEM, page allocation will continue to operate as it does today.

If BACKSTG is set or defaulted to VMSIZE, there must be available within the backing storage pool sufficient space to accommodate the entirety of the specified VMSIZE; otherwise the LOGON, DEFINE STORAGE, or SET BACKSTG command will fail.

The SET BACKSTG command will force a virtual machine reset to occur.

These changes will address some of the issues raised. I am certain that other changes would be required, and that other ideas should be considered.

Please post your ideas. Don't hesitate to point out any problems.

John P. Baker
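
The admission rules in this proposal can be sketched in a few lines of Python. This is only an illustration of the proposed behaviour, not anything CP does today: the names BackingPool, admit_logon and LogonRejected, the pool names, and the use of 4 KB page counts as the unit are all assumptions made for the example.

```python
from dataclasses import dataclass


class LogonRejected(Exception):
    """Raised when the hypothetical admission check fails."""


@dataclass
class BackingPool:
    """A hypothetical backing storage pool built from paging volumes."""
    name: str
    total_pages: int          # capacity of all volumes in the pool
    reserved_pages: int = 0   # pages promised to BACKSTG VMSIZE guests

    @property
    def free_pages(self) -> int:
        return self.total_pages - self.reserved_pages


def admit_logon(pools, pool_name, vm_pages, backstg="SYSTEM"):
    """Apply the proposed LOGON rules and return the pool that was used.

    backstg="SYSTEM": allocate on demand, as today (nothing is reserved).
    backstg="VMSIZE": the pool must be able to back the whole guest.
    """
    pool = pools.get(pool_name)
    if pool is None:
        raise LogonRejected(f"backing storage pool {pool_name} does not exist")
    if backstg == "VMSIZE":
        if vm_pages > pool.free_pages:
            raise LogonRejected(
                f"pool {pool_name} cannot back {vm_pages} pages "
                f"({pool.free_pages} free)"
            )
        pool.reserved_pages += vm_pages
    elif backstg != "SYSTEM":
        raise ValueError(f"unknown BACKSTG setting: {backstg}")
    return pool


pools = {
    "SYSPOOL": BackingPool("SYSPOOL", total_pages=600_000),
    "GUESTS": BackingPool("GUESTS", total_pages=3_000_000),
}

# A small service machine gets its full size reserved in its own pool.
admit_logon(pools, "SYSPOOL", vm_pages=8_192, backstg="VMSIZE")

# An 8 TB guest is 2**31 pages of 4 KB; under VMSIZE it is refused at LOGON.
try:
    admit_logon(pools, "GUESTS", vm_pages=2**31, backstg="VMSIZE")
except LogonRejected as err:
    print("LOGON rejected:", err)
```

Under BACKSTG SYSTEM the same 8 TB guest would still be admitted, since nothing is reserved up front, so the check only protects pools whose guests are defined with VMSIZE.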