Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
You can try getting more control of the environment. We don't install all these standard 'Unix/Linux' packages on zLinux, because they don't fit in or give inaccurate data. CPU load, for example, we get from z/VM instead, and the organisation here has bought our arguments. We select appropriate things to monitor that are valid and work without loading the CPU too much.

Yes, that is a balance, and we always try to minimize things. And just as said in this forum: we really need to think differently. It is also true that we are now starting to get company from other virtual environments that run into problems with resources. So time is working for us :)

___
Tore Agblad
Volvo Information Technology
Infrastructure Mainframe Design Development, Linux servers
Dept 4352 DA1S
SE-405 08, Gothenburg Sweden
Telephone: +46-31-3233569
E-mail: tore.agb...@volvo.com
http://www.volvo.com/volvoit/global/en-gb/

-----Original Message-----
From: Linux on 390 Port [mailto:linux-...@vm.marist.edu] On Behalf Of Rob van der Heij
Sent: 20 August 2010 23:08
To: LINUX-390@VM.MARIST.EDU
Subject: Re: How to convince others. Was: Re: mono keep guest active - ban the blips.

On Fri, Aug 20, 2010 at 12:40 AM, Berry van Sleeuwen berry.vansleeu...@xs4all.nl wrote:

> Nagios is in use at the server side. Each client (our servers) has the Nagios client, with scripting instead of the Nagios plugins, and sec.

While parts of the Nagios user interface are pretty slick, it just does not scale. While the rather simple architecture does not help, the real problem appears to be in the admins who keep adding additional checks. You can do a lot of silly things on discrete servers with 5% avg utilization, but that does not mean it is a smart thing to do in a shared resource environment.

> Sec is in use for monitoring /var/log/messages; it makes the server go into Q3 and stay there, and it has quite some CPU load as well. Useful? I don't know, perhaps, but why burn so many cycles and keep busy all the time? I mean, how many messages can you write and consequently read? At least when we monitor the Linux console with PROP we won't have that much overhead.

It's probably polling with a very short delay while reading the open file. Obviously it could have used a much longer delay. Which still is pretty silly when nothing is happening in the system that writes data into the log file.

You could be worse off. We ran into a commercial product that used this to start a new log file at midnight:
- sleep until 23:59:59
- while time() 00:00 do ;
You can probably figure out why this process went into a busy wait for 24 hours ...

We have used SCIF to route the Linux console logging into a PROP-like service that checked for bad things and also allowed trusted processes to issue privileged commands on the Linux guests. That's cheaper and does not keep the Linux guest awake.

> The other part is scripting scheduled in cron to monitor the file systems and processes. These scripts tend to run at the same time on all servers and have some CPU load as well. I did notice mon_fsstat and such, which have only a minor impact on the Linux system even though they write records every minute. So in this case: useful, yes, but at a cost.

So if you have monitor data telling you almost nothing was written to disk, does it still make sense to frequently run commands to check whether the file systems filled up? Similar reasoning for checking installed software levels - if you know nobody issued privileged commands since last time, why check again?

Some of this really requires a different way of thinking. Not all the teams that currently deploy a few Linux servers can make that change. If they can't, it really hurts to let them dictate how one should manage an order of magnitude more servers...

--
Rob van der Heij
Velocity Software
http://www.velocitysoftware.com/

--
For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit http://wiki.linuxvm.org/
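Rob's closing question (why keep checking whether the file systems filled up when almost nothing was written?) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the Linux `/proc/diskstats` counters as a stand-in for the z/VM monitor data the thread has in mind, and the function names and the 2048-sector threshold are made up for the example.

```python
"""Sketch: run the (expensive) file system check only after real write activity."""

# In a /proc/diskstats line the 10th column is "sectors written".
SECTORS_WRITTEN_FIELD = 9


def sectors_written(diskstats_text: str) -> int:
    """Sum the 'sectors written' counter over every line of /proc/diskstats text.

    Whole disks and their partitions are both listed, so the sum counts some
    writes twice. That is acceptable here because only the change matters.
    """
    total = 0
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) > SECTORS_WRITTEN_FIELD:
            total += int(fields[SECTORS_WRITTEN_FIELD])
    return total


def needs_fs_check(previous: int, current: int, threshold: int = 2048) -> bool:
    """Return True when more than `threshold` sectors were written since last time.

    The default threshold is an arbitrary illustrative value, not a recommendation.
    """
    return (current - previous) > threshold
```

A scheduled script would remember the previous total, read `/proc/diskstats` once, and only go on to inspect the file systems when `needs_fs_check` returns True.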
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
You know it, I know it. But some people tend to believe only what they *think* they know. In this case, unfortunately, the monitoring team is regarded as the specialist and I'm 'only' a VM sysprog. I have proven *) on several occasions that the numbers are off, in some cases even way off, but still they are convinced the tooling on Linux is telling the truth.

It is hard to convince management that our VM numbers are more correct when so-called specialists narrow their view to a single guest. The blipping in particular is hard to explain when everybody else says they don't see anything wrong (nothing wrong, no problem, so stop complaining). So therefore my question: how do I convince them in a way I haven't thought of (yet)?

*) I once did an install in a small LPAR (small in CPU resources, that is; storage was enough). The LPAR had so few MIPS available that any Linux activity quickly drove the real CPU to 100%. One Linux guest was running an install. The other two Linux guests were idle or next to idle. The Performance Toolkit showed one server running at over 90% and the other two at 0.2%. The two idle Linux guests themselves, however, both reported running at 100% CPU, while only the one other guest was truly running at close to 100%.

As long as the LPAR isn't running at full load, the numbers stay more or less in line with the truth. But once CP is deciding who gets the resources, Linux is clueless as to what its actual resource usage is.

Berry.

-----Original Message-----
From: Linux on 390 Port [mailto:linux-...@vm.marist.edu] On Behalf Of Rich Smrcina
Sent: Friday 20 August 2010 1:39
To: LINUX-390@VM.MARIST.EDU
Subject: Re: How to convince others. Was: Re: mono keep guest active - ban the blips.

When your monitoring department looks at top, vmstat and sar to detect problems, don't forget the kernel numbers lie. Even the new steal timer is a little off.
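To make the discussion of Linux-side CPU numbers a little more concrete, here is a small sketch of how the steal counter mentioned in the quoted message can be read from inside a guest. It assumes the standard aggregate `cpu` line of `/proc/stat` (user, nice, system, idle, iowait, irq, softirq, steal); the function name is made up for the example. As the thread points out, even this counter is reported to be a little off, so it illustrates what the guest believes rather than replacing the z/VM figures.

```python
"""Sketch: split guest CPU time into busy / idle / steal from two /proc/stat samples."""


def cpu_shares(before: str, after: str) -> dict:
    """Return busy, idle and steal percentages between two 'cpu ...' lines."""

    def parse(line: str) -> list:
        parts = line.split()
        if len(parts) < 9 or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line with a steal column")
        # user nice system idle iowait irq softirq steal
        return [int(value) for value in parts[1:9]]

    delta = [new - old for old, new in zip(parse(before), parse(after))]
    total = sum(delta)
    if total <= 0:
        raise ValueError("samples must be taken at two different times")

    idle = delta[3] + delta[4]   # idle + iowait
    steal = delta[7]             # time the hypervisor gave to someone else
    busy = total - idle - steal
    return {
        "busy": 100.0 * busy / total,
        "idle": 100.0 * idle / total,
        "steal": 100.0 * steal / total,
    }
```

In a situation like the one described above, a guest that shows a large steal share was mostly waiting for CP to dispatch it, which is quite different from having consumed that CPU itself.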
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
> If only the monitor could 'know' that the machine was running this batch load at a certain time of day and had an absolute share and was running 100% for an extended period of time. It could be set up to not send out alerts based on all of these criteria. Wow! That would be a very nice feature.

Nagios 3 has that feature.
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
It's smart enough to know that *z/VM* has allocated it an absolute share?

On 08/20/2010 05:13 AM, David Boyes wrote:
>> If only the monitor could 'know' that the machine was running this batch load at a certain time of day and had an absolute share and was running 100% for an extended period of time. It could be set up to not send out alerts based on all of these criteria. Wow! That would be a very nice feature.
>
> Nagios 3 has that feature.

--
Rich Smrcina
Phone: 414-491-6001
http://www.linkedin.com/in/richsmrcina
Catch the WAVV! http://www.wavv.org
WAVV 2011 - April 15-19, 2011 Colorado Springs, CO
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
> It's smart enough to know that *z/VM* has allocated it an absolute share?

It does have the ability to set time of day/shift-based parameters. As to the z/VM part, come to OLF and see. 8-)
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
David, I'm confused now... will Nagios 3 be able to communicate with z/VM directly, or are you talking about a specific plugin using vmcp or something like that?

Sorry if I'm asking something obvious...

On Fri, Aug 20, 2010 at 11:12 AM, David Boyes dbo...@sinenomine.net wrote:
>> It's smart enough to know that *z/VM* has allocated it an absolute share?
>
> It does have the ability to set time of day/shift-based parameters. As to the z/VM part, come to OLF and see. 8-)
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
Forget it, David... I've figured it out now.

2010/8/20 Rogério Soares rogerio.soa...@gmail.com
> David, I'm confused now... will Nagios 3 be able to communicate with z/VM directly, or are you talking about a specific plugin using vmcp or something like that?
>
> Sorry if I'm asking something obvious...
>
> On Fri, Aug 20, 2010 at 11:12 AM, David Boyes dbo...@sinenomine.net wrote:
>>> It's smart enough to know that *z/VM* has allocated it an absolute share?
>>
>> It does have the ability to set time of day/shift-based parameters. As to the z/VM part, come to OLF and see. 8-)
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
On Fri, Aug 20, 2010 at 12:40 AM, Berry van Sleeuwen berry.vansleeu...@xs4all.nl wrote:

> Nagios is in use at the server side. Each client (our servers) has the Nagios client, with scripting instead of the Nagios plugins, and sec.

While parts of the Nagios user interface are pretty slick, it just does not scale. While the rather simple architecture does not help, the real problem appears to be in the admins who keep adding additional checks. You can do a lot of silly things on discrete servers with 5% avg utilization, but that does not mean it is a smart thing to do in a shared resource environment.

> Sec is in use for monitoring /var/log/messages; it makes the server go into Q3 and stay there, and it has quite some CPU load as well. Useful? I don't know, perhaps, but why burn so many cycles and keep busy all the time? I mean, how many messages can you write and consequently read? At least when we monitor the Linux console with PROP we won't have that much overhead.

It's probably polling with a very short delay while reading the open file. Obviously it could have used a much longer delay. Which still is pretty silly when nothing is happening in the system that writes data into the log file.

You could be worse off. We ran into a commercial product that used this to start a new log file at midnight:
- sleep until 23:59:59
- while time() 00:00 do ;
You can probably figure out why this process went into a busy wait for 24 hours ...

We have used SCIF to route the Linux console logging into a PROP-like service that checked for bad things and also allowed trusted processes to issue privileged commands on the Linux guests. That's cheaper and does not keep the Linux guest awake.

> The other part is scripting scheduled in cron to monitor the file systems and processes. These scripts tend to run at the same time on all servers and have some CPU load as well. I did notice mon_fsstat and such, which have only a minor impact on the Linux system even though they write records every minute. So in this case: useful, yes, but at a cost.

So if you have monitor data telling you almost nothing was written to disk, does it still make sense to frequently run commands to check whether the file systems filled up? Similar reasoning for checking installed software levels - if you know nobody issued privileged commands since last time, why check again?

Some of this really requires a different way of thinking. Not all the teams that currently deploy a few Linux servers can make that change. If they can't, it really hurts to let them dictate how one should manage an order of magnitude more servers...

--
Rob van der Heij
Velocity Software
http://www.velocitysoftware.com/
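For contrast with the midnight busy-wait described in this message, a blocking wait might look like the sketch below. It is not the commercial product's code (which the thread only paraphrases); the function names are made up, and it uses naive local datetimes, so daylight-saving changes are ignored.

```python
"""Sketch: wait for midnight with one blocking sleep instead of a busy loop."""

import time
from datetime import datetime, timedelta


def seconds_until_next_midnight(now: datetime) -> float:
    """Seconds from `now` to the next 00:00:00 (always in the future)."""
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


def wait_until(target: datetime, clock=datetime.now, sleep=time.sleep) -> None:
    """Block until `target` has passed.

    The test is "has the target passed?", never "is it exactly the target
    second?", so a late wake-up cannot turn into a 24-hour spin.
    """
    while True:
        remaining = (target - clock()).total_seconds()
        if remaining <= 0:
            return
        sleep(remaining)  # the guest stays idle while we wait
```

A log rotation job would compute the next midnight once, call `wait_until`, switch log files, and repeat. The `clock` and `sleep` parameters exist only so the logic can be exercised without actually waiting.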
How to convince others. Was: Re: mono keep guest active - ban the blips.
That's a good way to make things clear. Especially to management.

Here is a challenge. We are in the process of enrolling new machines into production. Part of that is that they want to force us to install a general monitoring tool (Nagios and local scripting). We noticed quite a dramatic increase in resource usage. CPU at least doubles and the guests all go to Q3. Upon our comments on wasting resources, poorer storage handling etc., management responds: "So then we have to buy storage." So we now have to write a business case for why we should NOT increase storage to handle the load.

What are convincing arguments? After a few years of discussing this over and over again I'm out of ideas.

Thanks, Berry.

On 17-08-10 23:35, Barton Robinson wrote:
> The reason these blips are so virtual unfriendly - think about poor old z/VM storage management. We need to steal some pages for some real work going on. Do we steal it from the server doing real transactions? Or from the one that is blipping? Oops, we can't tell the difference.
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
Are Nagios and local scripts waking up needlessly? Or are they doing legitimate work even if it is wasteful?

David Kreuter

-------- Original Message --------
Subject: How to convince others. Was: Re: mono keep guest active - ban the blips.
From: Berry van Sleeuwen berry.vansleeu...@xs4all.nl
Date: Thu, August 19, 2010 3:49 pm
To: LINUX-390@VM.MARIST.EDU

That's a good way to make things clear. Especially to management.

Here is a challenge. We are in the process of enrolling new machines into production. Part of that is that they want to force us to install a general monitoring tool (Nagios and local scripting). We noticed quite a dramatic increase in resource usage. CPU at least doubles and the guests all go to Q3. Upon our comments on wasting resources, poorer storage handling etc., management responds: "So then we have to buy storage." So we now have to write a business case for why we should NOT increase storage to handle the load.

What are convincing arguments? After a few years of discussing this over and over again I'm out of ideas.

Thanks, Berry.

On 17-08-10 23:35, Barton Robinson wrote:
> The reason these blips are so virtual unfriendly - think about poor old z/VM storage management. We need to steal some pages for some real work going on. Do we steal it from the server doing real transactions? Or from the one that is blipping? Oops, we can't tell the difference.
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
A 'general monitoring tool' is not a performance monitor. In an environment where efficient resource utilization is critical to the business, a means to monitor:
- the performance of the virtual machine environment
- the virtual machines running in that environment
- potentially systems outboard from the environment
is paramount to a successful implementation on System z. Additionally, you may want to perform chargeback and accounting based on internal procedures that may be in place.

Nagios doesn't provide the timing resolution or access to z/VM monitoring resources, so it loses.

On 08/19/2010 02:49 PM, Berry van Sleeuwen wrote:
> That's a good way to make things clear. Especially to management.
>
> Here is a challenge. We are in the process of enrolling new machines into production. Part of that is that they want to force us to install a general monitoring tool (Nagios and local scripting). We noticed quite a dramatic increase in resource usage. CPU at least doubles and the guests all go to Q3. Upon our comments on wasting resources, poorer storage handling etc., management responds: "So then we have to buy storage." So we now have to write a business case for why we should NOT increase storage to handle the load.
>
> What are convincing arguments? After a few years of discussing this over and over again I'm out of ideas.
>
> Thanks, Berry.
>
> On 17-08-10 23:35, Barton Robinson wrote:
>> The reason these blips are so virtual unfriendly - think about poor old z/VM storage management. We need to steal some pages for some real work going on. Do we steal it from the server doing real transactions? Or from the one that is blipping? Oops, we can't tell the difference.

--
Rich Smrcina
Phone: 414-491-6001
http://www.linkedin.com/in/richsmrcina
Catch the WAVV! http://www.wavv.org
WAVV 2011 - April 15-19, 2011 Colorado Springs, CO
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
Nagios is in use at the server side. Each client (our servers) has the Nagios client, with scripting instead of the Nagios plugins, and sec.

Sec is in use for monitoring /var/log/messages; it makes the server go into Q3 and stay there, and it has quite some CPU load as well. Useful? I don't know, perhaps, but why burn so many cycles and keep busy all the time? I mean, how many messages can you write and consequently read? At least when we monitor the Linux console with PROP we won't have that much overhead.

The other part is scripting scheduled in cron to monitor the file systems and processes. These scripts tend to run at the same time on all servers and have some CPU load as well. I did notice mon_fsstat and such, which have only a minor impact on the Linux system even though they write records every minute. So in this case: useful, yes, but at a cost.

Berry.

On 19-08-10 22:04, David Kreuter wrote:
> Are Nagios and local scripts waking up needlessly? Or are they doing legitimate work even if it is wasteful?
>
> David Kreuter
>
> -------- Original Message --------
> Subject: How to convince others. Was: Re: mono keep guest active - ban the blips.
> From: Berry van Sleeuwen berry.vansleeu...@xs4all.nl
> Date: Thu, August 19, 2010 3:49 pm
> To: LINUX-390@VM.MARIST.EDU
>
> That's a good way to make things clear. Especially to management.
>
> Here is a challenge. We are in the process of enrolling new machines into production. Part of that is that they want to force us to install a general monitoring tool (Nagios and local scripting). We noticed quite a dramatic increase in resource usage. CPU at least doubles and the guests all go to Q3. Upon our comments on wasting resources, poorer storage handling etc., management responds: "So then we have to buy storage." So we now have to write a business case for why we should NOT increase storage to handle the load.
>
> What are convincing arguments? After a few years of discussing this over and over again I'm out of ideas.
>
> Thanks, Berry.
>
> On 17-08-10 23:35, Barton Robinson wrote:
>> The reason these blips are so virtual unfriendly - think about poor old z/VM storage management. We need to steal some pages for some real work going on. Do we steal it from the server doing real transactions? Or from the one that is blipping? Oops, we can't tell the difference.
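The observation that the cron scripts run at the same moment on all servers suggests one cheap mitigation: give each guest its own fixed delay. The sketch below is a hypothetical helper, not something anyone in the thread describes using; it derives a stable per-host offset from the host name so that the checks spread over a window instead of stacking up on the shared CPUs.

```python
"""Sketch: spread identical cron jobs over a time window, one fixed slot per host."""

import hashlib


def stagger_offset(hostname: str, window_seconds: int = 300) -> int:
    """Return a delay in [0, window_seconds) that is always the same for a host.

    A hash is used instead of a random number so that a given guest keeps
    its slot from run to run, which keeps the resulting load predictable.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window_seconds
```

The monitoring script would sleep for `stagger_offset(socket.gethostname())` seconds before doing its work. This does not reduce the total CPU consumed, it only avoids every guest asking for it in the same second.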
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
True, it isn't. It's the replacement of an operator. The main issue here is that it needs to raise tickets and get reporting stats. For instance, raise a ticket at 100% CPU (and indeed, our ABS LIMITHARD machines do raise tickets when they are running their batch... sigh) or when a filesystem is at 100%. The reporting is, for instance, on CPU and filesystem usage. But indeed it can't provide insight into the performance of a guest, other than detecting thresholds. And it doesn't have to either; the monitoring department can look at top, vmstat or sar to detect that kind of problem should they need to (yeah right, then they know all about the entire environment).

Still, as for a case, this is a good point. We need to be able to address performance-related monitoring and Nagios can't do that. Or at least not within the scope of an entire LPAR.

Thanks, Berry.

On 19-08-10 22:12, Rich Smrcina wrote:
> A 'general monitoring tool' is not a performance monitor. In an environment where efficient resource utilization is critical to the business, a means to monitor:
> - the performance of the virtual machine environment
> - the virtual machines running in that environment
> - potentially systems outboard from the environment
> is paramount to a successful implementation on System z. Additionally, you may want to perform chargeback and accounting based on internal procedures that may be in place.
>
> Nagios doesn't provide the timing resolution or access to z/VM monitoring resources, so it loses.
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
If your batch runs regularly or consistently drive some virtual machines to 100%, this may not signal a loop condition (which, I would guess, is why the ticket is being raised). Techs may grow conditioned to this and either take longer to respond or just outright 'ignore' the tickets eventually, since the 'normal' course of action is to page for a condition that is unresolvable without a larger share, or redistribution of the load.

If only the monitor could 'know' that the machine was running this batch load at a certain time of day and had an absolute share and was running 100% for an extended period of time. It could be set up to not send out alerts based on all of these criteria. Wow! That would be a very nice feature.

When your monitoring department looks at top, vmstat and sar to detect problems, don't forget the kernel numbers lie. Even the new steal timer is a little off.

On 08/19/2010 05:51 PM, Berry van Sleeuwen wrote:
> True, it isn't. It's the replacement of an operator. The main issue here is that it needs to raise tickets and get reporting stats. For instance, raise a ticket at 100% CPU (and indeed, our ABS LIMITHARD machines do raise tickets when they are running their batch... sigh) or when a filesystem is at 100%. The reporting is, for instance, on CPU and filesystem usage. But indeed it can't provide insight into the performance of a guest, other than detecting thresholds. And it doesn't have to either; the monitoring department can look at top, vmstat or sar to detect that kind of problem should they need to (yeah right, then they know all about the entire environment).
>
> Still, as for a case, this is a good point. We need to be able to address performance-related monitoring and Nagios can't do that. Or at least not within the scope of an entire LPAR.
>
> Thanks, Berry.

--
Rich Smrcina
Phone: 414-491-6001
http://www.linkedin.com/in/richsmrcina
Catch the WAVV! http://www.wavv.org
WAVV 2011 - April 15-19, 2011 Colorado Springs, CO
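The wish expressed in this message (no alert when a guest with an absolute share runs flat out inside its known batch window) comes down to a small decision rule. The sketch below is only an illustration of that rule; it is not Nagios configuration or any Nagios API, and the class, function and parameter names, as well as the 95% threshold, are assumptions made for the example.

```python
"""Sketch: suppress the '100% CPU' ticket for capped guests inside a batch window."""

from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class BatchWindow:
    start: time
    end: time

    def contains(self, moment: time) -> bool:
        """True if `moment` falls inside the window (which may wrap past midnight)."""
        if self.start <= self.end:
            return self.start <= moment < self.end
        return moment >= self.start or moment < self.end


def should_alert(cpu_percent: float, when: datetime, has_absolute_share: bool,
                 window: BatchWindow, threshold: float = 95.0) -> bool:
    """Decide whether a high-CPU sample deserves a ticket."""
    if cpu_percent < threshold:
        return False  # nothing to report
    if has_absolute_share and window.contains(when.time()):
        return False  # expected: capped guest running its scheduled batch
    return True       # high CPU outside the expected pattern
```

Whether the guest has an absolute share is something only z/VM knows, so in practice that flag would have to be supplied from outside the Linux guest.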
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
Berry, to monitor some LPAR stats with Nagios, we set up a guest with a high privilege class and wrote some scripts that use the vmcp module to query and filter the information... I'm sure it's not the best way, but sometimes we need to improvise :-)

On Thu, Aug 19, 2010 at 7:51 PM, Berry van Sleeuwen berry.vansleeu...@xs4all.nl wrote:
> True, it isn't. It's the replacement of an operator. The main issue here is that it needs to raise tickets and get reporting stats. For instance, raise a ticket at 100% CPU (and indeed, our ABS LIMITHARD machines do raise tickets when they are running their batch... sigh) or when a filesystem is at 100%. The reporting is, for instance, on CPU and filesystem usage. But indeed it can't provide insight into the performance of a guest, other than detecting thresholds. And it doesn't have to either; the monitoring department can look at top, vmstat or sar to detect that kind of problem should they need to (yeah right, then they know all about the entire environment).
>
> Still, as for a case, this is a good point. We need to be able to address performance-related monitoring and Nagios can't do that. Or at least not within the scope of an entire LPAR.
>
> Thanks, Berry.
>
> On 19-08-10 22:12, Rich Smrcina wrote:
>> A 'general monitoring tool' is not a performance monitor. In an environment where efficient resource utilization is critical to the business, a means to monitor:
>> - the performance of the virtual machine environment
>> - the virtual machines running in that environment
>> - potentially systems outboard from the environment
>> is paramount to a successful implementation on System z. Additionally, you may want to perform chargeback and accounting based on internal procedures that may be in place.
>>
>> Nagios doesn't provide the timing resolution or access to z/VM monitoring resources, so it loses.
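The vmcp approach described in this message might look roughly like the sketch below. It is not the poster's script. It assumes the `vmcp` command from s390-tools is available, that the guest's privilege class allows CP `INDICATE LOAD`, and that the response contains a field of the form `AVGPROC-012%`; the sample text in the usage note is illustrative, not captured output.

```python
"""Sketch: ask CP for the system load through vmcp and pick out one number."""

import re
import subprocess
from typing import Optional

AVGPROC_PATTERN = re.compile(r"AVGPROC-(\d+)%")


def parse_avgproc(indicate_load_output: str) -> Optional[int]:
    """Return the average processor utilization percentage, or None if absent."""
    match = AVGPROC_PATTERN.search(indicate_load_output)
    return int(match.group(1)) if match else None


def query_avgproc() -> Optional[int]:
    """Run 'vmcp indicate load' and parse the response (needs a z/VM guest)."""
    result = subprocess.run(
        ["vmcp", "indicate", "load"],
        capture_output=True, text=True, check=True,
    )
    return parse_avgproc(result.stdout)
```

For example, `parse_avgproc("AVGPROC-012% 04")` yields 12. A plugin built this way runs on one suitably authorized guest and reports for the whole LPAR, so the other guests do not each need to wake up for it.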
Re: mono keep guest active - ban the blips.
As a workaround this was suggested: add the following to the Apache config:

    MonoMaxActiveRequests 0

On 8/18/10 10:19 AM, van Sleeuwen, Berry berry.vansleeu...@atosorigin.com wrote:
> It is not on SLES11 SP1; that contains the 2.0.1 version. Is there any way to get around this? As I mentioned, we tried the MONO_MANAGED_WATCHER=disable parameter, but either we did this the wrong way or it doesn't work as we expected.
>
> Regards, Berry.
>
> -----Original Message-----
> From: Linux on 390 Port [mailto:linux-...@vm.marist.edu] On Behalf Of Neale Ferguson
> Sent: Wednesday 18 August 2010 16:08
> To: LINUX-390@VM.MARIST.EDU
> Subject: Re: mono keep guest active - ban the blips.
>
> I've confirmed the behavior has been fixed in mono 2.4
>
> Neale
Re: How to convince others. Was: Re: mono keep guest active - ban the blips.
It'd be even cooler if your monitor could learn a virtual machine's normal or expected activity pattern by time of day / day of week and then signal things out of the ordinary. Like the batch activity that was supposed to have been running but took an unexpected low address protection exception and CPU dived to .5%, or the online server whose new code release put it into an occasional loop and chewed an engine for a while (real-world examples from, oh, the last 3 weeks :).

The business of triggering on error messages is always a reactive thing. You get a message, you have a big problem because a bad message went unnoticed for hours and something on down the line failed, and people play cleanup. You add paging automation around that message for the next time... All of this systems automation software could be a lot smarter...

Marcy

-Original Message-
From: Linux on 390 Port [mailto:linux-...@vm.marist.edu] On Behalf Of Rich Smrcina
Sent: Thursday, August 19, 2010 4:39 PM
To: LINUX-390@vm.marist.edu
Subject: Re: [LINUX-390] How to convince others. Was: Re: mono keep guest active - ban the blips.

If your batch runs regularly or consistently drive some virtual machines to 100%, this may not signal a loop condition (which, I would guess, is why the ticket is being raised). Techs may grow conditioned to this and either take longer to respond or eventually just 'ignore' the tickets outright, since the 'normal' course of action is to page for a condition that is unresolvable without a larger share or a redistribution of the load.

If only the monitor could 'know' that the machine was running this batch load at a certain time of day, had an absolute share, and was running at 100% for an extended period of time. It could be set up to not send out alerts based on all of these criteria. Wow! That would be a very nice feature.

When your monitoring department looks at top, vmstat and sar to detect problems, don't forget the kernel numbers lie. Even the new steal timer is a little off.

-- Rich Smrcina Phone: 414-491-6001 http://www.linkedin.com/in/richsmrcina Catch the WAVV! http://www.wavv.org WAVV 2011 - April 15-19, 2011 Colorado Springs, CO
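The "learn the normal pattern by time of day / day of week" idea can be sketched in a few lines of Python. This is a minimal illustration, not a description of any existing monitor: the class name, the parameters (tolerance, min_samples, floor) and the mean-plus-standard-deviation test are all assumptions chosen for brevity.

```python
from statistics import mean, pstdev


class ActivityBaseline:
    """Learn per-(weekday, hour) CPU usage and flag unusual readings."""

    def __init__(self, tolerance=3.0, min_samples=4, floor=1.0):
        self.samples = {}
        self.tolerance = tolerance      # allowed deviation, in standard deviations
        self.min_samples = min_samples  # below this, the slot is still learning
        self.floor = floor              # minimum spread in CPU %, avoids zero-width bands

    def learn(self, weekday, hour, cpu_pct):
        """Record one observed CPU percentage for a time slot."""
        self.samples.setdefault((weekday, hour), []).append(cpu_pct)

    def is_unusual(self, weekday, hour, cpu_pct):
        """True if the reading is far from what this slot normally shows."""
        history = self.samples.get((weekday, hour), [])
        if len(history) < self.min_samples:
            return False  # not enough history to judge
        spread = max(pstdev(history), self.floor)
        return abs(cpu_pct - mean(history)) > self.tolerance * spread
```

With a slot that normally runs batch at close to 100%, a reading of 100% raises nothing, while a reading of 0.5% in the same slot is flagged. The same 100% in a slot that normally idles is flagged as well, which covers both of the examples above.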
Re: mono keep guest active - ban the blips.
Neale,

Did I say that? Perhaps I wasn't too clear about that. I mean powertop shows me that when the guest wakes up, mono was responsible for the wakeup about 50% of the time. Or, to say it in Barton's words, 50% of the blips are from mono. Indeed, using top I guess we will never see mono since it doesn't use any CPU. That's why I didn't even bother to look at top; I already expected the machine suffered from a timer instead of real work.

Berry.

-Original Message-
From: Linux on 390 Port [mailto:linux-...@vm.marist.edu] On Behalf Of Neale Ferguson
Sent: Wednesday 18 August 2010 0:38
To: LINUX-390@VM.MARIST.EDU
Subject: Re: mono keep guest active - ban the blips.

I was referring to his observation that he was seeing 55-65% CPU. As for blipping, that's why I suggested he use strace to see what API is being used if there is blipping taking place. Unlike Java, we can't use oprofile to easily identify the method responsible (if it is blipping). I'll try it on my system, but I probably have a different level of mono. I wonder how hard it would be to add oprofile support to mono.

2010, at 18:19, Barton Robinson bar...@vm1.velocity-software.com wrote:

Yep, this is exactly the problem. These processes do not use much CPU, but they blip every 10 ms or so. You need to check the queue from the z/VM side to see if they are in Q3. If in Q3, then they are blipping (think I need to trademark that word). The reason these blips are so virtual unfriendly: think about poor old z/VM storage management. We need to steal some pages for some real work going on. Do we steal them from the server doing real transactions, or from the one that is blipping? Oops, we can't tell the difference.
Re: mono keep guest active
On Wed, Aug 18, 2010 at 4:47 AM, David Boyes dbo...@sinenomine.net wrote:

The approach that was used in the 100 Hz timer pop elimination code for Z is fairly elegant, but it relies in structure on some hardware features in the Z that would be hard to retrofit into Intel systems.

I think you're misinformed. The nohz timer was initially done only for s390. Since other platforms need this too, Martin generalized the code and it was included in the architecture-independent part of the Linux kernel. That's one of the areas where s390 is leading the flock.

Rob
Re: mono keep guest active - ban the blips.
My mistake! I have checked with the mono folks and gone through the code. It turns out that the culprit is pthread_cond_timedwait(), used to check for changes to the .config file. This has, apparently, been fixed in later releases of mono. What level are you on? You can verify that the culprit is as I suggest by running gdb as root with the -p pid option. It should stop in that API, and a backtrace (bt command) should show something like this:

0x020f1aa4 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0 0x020f1aa4 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x800dda72 in ?? ()
#2 0x800cf3cc in ?? ()
#3 0x80084522 in mono_thread_manage ()
#4 0x8001f57a in mono_main ()
#5 0x021b8598 in __libc_start_main () from /lib64/libc.so.6
#6 0x8001e68a in sigfillset ()

On 8/18/10 3:03 AM, van Sleeuwen, Berry berry.vansleeu...@atosorigin.com wrote:

Neale, Did I say that? Perhaps I wasn't too clear about that. I mean powertop shows me that when the guest wakes up, mono was responsible for the wakeup about 50% of the time. Or, to say it in Barton's words, 50% of the blips are from mono. Indeed, using top I guess we will never see mono since it doesn't use any CPU. That's why I didn't even bother to look at top; I already expected the machine suffered from a timer instead of real work. Berry.
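Why a timed wait used as a poll keeps a guest busy can be shown with a small Python sketch. The thread concerns the C call pthread_cond_timedwait(); Python's threading.Condition is swapped in here purely for illustration, and the function names and the stop_after limit are hypothetical.

```python
import threading


def polling_waiter(cond, state, interval, stop_after):
    """Wake every `interval` seconds to re-check a flag.

    Each timeout is one wakeup of the process even though nothing
    changed, which is the pattern behind the blips.
    """
    wakeups = 0
    with cond:
        while not state["changed"] and wakeups < stop_after:
            cond.wait(timeout=interval)  # returns on timeout as well as on notify
            wakeups += 1
    return wakeups


def blocking_waiter(cond, state):
    """Sleep until notified; an idle process causes no periodic wakeups."""
    wakeups = 0
    with cond:
        while not state["changed"]:
            cond.wait()  # no timeout: only a notify ends the sleep
            wakeups += 1
    return wakeups
```

With nothing ever changing, polling_waiter wakes once per interval until it hits stop_after, whereas blocking_waiter wakes at most once, when a notifier sets state["changed"] and calls cond.notify_all(). Seen from the hypervisor, the first guest never looks idle.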
Re: mono keep guest active - ban the blips.
Hi Neale,

Thanks for your investigations. So yet another package had to be installed on the machine :-). gdb indeed shows quite a similar result, though line #6 is different:

0x0212c860 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0 0x0212c860 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x80129558 in ?? ()
#2 0x80113cd8 in ?? ()
#3 0x800b956e in mono_thread_manage ()
#4 0x8001d6ca in mono_main ()
#5 0x021f9898 in __libc_start_main () from /lib64/libc.so.6
#6 0x8001bac2 in g_get_current_dir ()

Installed packages for mono (SLES11 SP1):

nl24...@nlzlx121:~ rpm -qa | grep mono
mono-nunit-2.0.1-1.19.1
mono-data-sqlite-2.0.1-1.19.1
mono-data-2.0.1-1.19.1
mono-web-2.0.1-1.19.1
mono-winforms-2.0.1-1.19.1
apache2-mod_mono-2.0-1.26
mono-core-2.0.1-1.19.1

Regards, Berry.
Re: mono keep guest active - ban the blips.
I've confirmed the behavior has been fixed in mono 2.4.

Neale
Re: mono keep guest active - ban the blips.
It is not on SLES11 SP1; that release contains version 2.0.1. Is there any way to get around this? As I mentioned, we tried the MONO_MANAGED_WATCHER=disable parameter, but either we did this the wrong way or it doesn't work as we expected.

Regards, Berry.

-Original Message-
From: Linux on 390 Port [mailto:linux-...@vm.marist.edu] On Behalf Of Neale Ferguson
Sent: Wednesday 18 August 2010 16:08
To: LINUX-390@VM.MARIST.EDU
Subject: Re: mono keep guest active - ban the blips.

I've confirmed the behavior has been fixed in mono 2.4

Neale

This e-mail and the documents attached are confidential and intended solely for the addressee; it may also be privileged. If you receive this e-mail in error, please notify the sender immediately and destroy it. As its integrity cannot be secured on the Internet, the Atos Origin group liability cannot be triggered for the message content. Although the sender endeavours to maintain a computer virus-free network, the sender does not warrant that this transmission is virus-free and will not be liable for any damages resulting from any virus transmitted. On all offers and agreements under which Atos Origin supplies goods and/or services of whatever nature, the Terms of Delivery from Atos Origin exclusively apply. The Terms of Delivery shall be promptly submitted to you on your request. Atos Origin Nederland B.V. / Utrecht KvK Utrecht 30132762
Re: mono keep guest active - ban the blips.
On 8/18/2010 at 10:19 AM, van Sleeuwen, Berry berry.vansleeu...@atosorigin.com wrote:

It is not on SLES11 SP1, there it contains the 2.0.1 version.

You need to download and install the SLES11 Mono Extension. That contains 2.4 packages, including apache2-mod_mono-addon-2.4-4.2.s390x.rpm. Note that the SLES11 Mono Extension is an extra-cost item, even for System z customers (unlike the HA Extension, which is no extra cost).

Mark Post
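Checking whether an installed mono level already has the fix (2.4 or later, per this thread) could be scripted along these lines. A sketch only: the function names are hypothetical, and the package name mono-core is taken from the rpm listing earlier in the thread, so it may differ once the Mono Extension packages are installed.

```python
import subprocess


def parse_version(text):
    """Turn '2.0.1' into (2, 0, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_fix(version_text, fixed_in=(2, 4)):
    """True if the version is at or above the level reported as fixed."""
    return parse_version(version_text) >= fixed_in


def installed_version(package="mono-core"):
    """Ask rpm for the version of a package (assumes an RPM-based system)."""
    out = subprocess.run(["rpm", "-q", "--qf", "%{VERSION}", package],
                         capture_output=True, text=True, check=True)
    return out.stdout
```

Comparing tuples of integers rather than strings avoids ranking a level such as 2.10 below 2.4.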
Re: mono keep guest active
Run oprofile and see where this mod is spending its time. strace is also an option to see what API it's using (select with a timeout, probably).

BTW (not related to your problem), I have submitted a set of fixes to the mono folks that will make a huge set of methods available that currently aren't. (The s390x port had only been using a simplistic vtable lookup, whereas the other platforms moved to IMT, a method trampolining scheme. A lot of the new APIs require this mechanism.)

On 8/17/10 9:03 AM, Mike Friesenegger mfrieseneg...@novell.com wrote:

Hello Berry, I do not know the answer to your question but I know some that might. Let me run this by them and get back to you. Mike
Re: mono keep guest active
Yes, this is a problem. We call it virtual hostile. Rob van der Heij has been doing a tremendous amount of research in this area for the last 4 years, and we've been trying to educate our customers (and IBM) on what this means. Back in 2001 there was the Linux timer, which had the same problem. Got that fixed. This is the same problem. Only originally, because the CPU was so slow, it was seen as a CPU problem. With much faster CPUs now, this is a storage problem. There are ways to alleviate the storage problem in our research. The list of virtually hostile software is quite long.

van Sleeuwen, Berry wrote:

Hi listers,

We have configured a SLES11 SP1 with apache and mono. When we start the httpd, the server is active all the time, keeping it in Q3 all the time. We have determined that indeed the mono module is the cause for the wakeup of the guest. Powertop shows that 50%-65% of the time mono was responsible for wakeup-from-idle, and when we remove mono the guest drops from queue.

We have been looking at some options at this time. First we changed KeepAlive to Off in server-tuning.conf, but no luck. Next we created a new configuration for mono in the /etc/apache2/conf.d directory to replace the default mod_mono.conf.

# note, this config has been created using an online tool to create configfiles... we added LoadModule and MONO_MANAGED_WATCHER.
LoadModule mono_module /usr/lib64/apache2/mod_mono.so
Alias /sds /srv/www/htdocs/sds
MonoServerPath sds /usr/bin/mod-mono-server2
MonoSetEnv sds MONO_IOMAP=all;MONO_MANAGED_WATCHER=disable
MonoApplications sds /sds:/srv/www/htdocs/sds
<Location /sds>
  Allow from all
  Order allow,deny
  MonoSetServerAlias sds
  SetHandler mono
  SetOutputFilter DEFLATE
  SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
</Location>
<IfModule mod_deflate.c>
  AddOutputFilterByType DEFLATE text/html text/plain text/xml text/javascript
</IfModule>

I had found a reference saying that MONO_MANAGED_WATCHER should be set to disable to prevent mono from watching (polling) for filesystem updates. But this also has no effect, though I don't know for sure if this config is really what it should be.

None of this has yet given us a guest that drops out of queue; it still remains in Q3. Any ideas what we can do to reduce the activity of this guest?

With kind regards,
Berry van Sleeuwen
Flight Forum 3000 5657 EW Eindhoven +31 (0)6 22564276
Atos Origin http://www.atosorigin.com/ MO CF SC Mainframe Services
Re: mono keep guest active
The non-hostile list is quite short, unfortunately. For the most part Oracle is not hostile and queue drops nicely. Getting vendors, including IBM, to:
1. acknowledge the problem is hard.
2. once acknowledged, repairing (woops, I mean adding a feature) doesn't happen quickly or, for that matter, often.

In my view it is not criminal or heretical for code to acknowledge its virtual surroundings. But lots of apps people think otherwise. People, we just want all our virtual machine children to play and share nicely. Give up when you do not have actual work; you will get your turn when needed, really you will. Is that too much to ask?

David Kreuter

Original Message
Subject: Re: mono keep guest active
From: Barton Robinson bar...@vm1.velocity-software.com
Date: Tue, August 17, 2010 11:11 am
To: LINUX-390@VM.MARIST.EDU

Yes, this is a problem. We call it virtual hostile. Rob van der Heij has been doing a tremendous amount of research in this area for the last 4 years, we've been trying to educate our customers (and IBM) on what this means. Back in 2001, there was the Linux timer, had the same problem. Got that fixed. This is the same problem. Only originally because the CPU was so slow, it was seen as a CPU problem. With much faster CPUs now, this is a storage problem. There are ways to alleviate the storage problem in our research. The list of virtually hostile software is quite long.
Re: mono keep guest active
I have the source to mod_mono and the right to commit to the Mono source tree. If we can identify what is waking up, then I can make the change(s) to make it friendlier.

On 8/17/10 11:47 AM, David Kreuter dkreu...@vm-resources.com wrote:

The non-hostile list is quite short, unfortunately. For the most part Oracle is not hostile and queue drops nicely. Getting vendors, including IBM, to:
1. acknowledge the problem is hard.
2. once acknowledged, repairing (woops, I mean adding a feature) doesn't happen quickly or, for that matter, often.

In my view it is not criminal or heretical for code to acknowledge its virtual surroundings. But lots of apps people think otherwise. People, we just want all our virtual machine children to play and share nicely. Give up when you do not have actual work; you will get your turn when needed, really you will. Is that too much to ask?

David Kreuter
Re: mono keep guest active
mod_mono itself is just a stub that kicks off the xsp_server app, so I assume you're seeing the process called mono doing the damage. In that case oprofile is not going to help. strace may produce useful information that we may be able to track back to a specific method.

On 8/17/10 8:56 AM, van Sleeuwen, Berry berry.vansleeu...@atosorigin.com wrote: Hi listers, We have configured a SLES11 SP1 system with Apache and mono. When we start httpd the server is active all the time, which keeps it in Q3. We have determined that the mono module is indeed the cause of the guest's wakeups. Powertop shows that 50%-65% of the time mono was responsible for the wakeup-from-idle, and when we remove mono the guest drops from queue.
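One way to act on the strace suggestion: capture a trace while nobody is connected, with something like `strace -f -tt -p <pid> -o mono.trace`, and then tally which system calls keep recurring. The Python sketch below is only an illustration. The file name `mono.trace` and the helper `tally_syscalls` are made up for this example, and the line shape it assumes (optional pid, optional timestamp, then `name(args) = result`) is the usual `strace -f -tt` layout, not something taken from the system discussed here.

```python
import re
from collections import Counter

# Assumed line shape from `strace -f -tt`, for example:
#   1476 16:56:01.123456 futex(0x1, FUTEX_WAIT, 0, {0, 10000000}) = -1 ETIMEDOUT (...)
# The pid may also appear as "[pid  1476]" or be missing altogether.
LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"          # optional pid prefix
    r"(?:\d{2}:\d{2}:\d{2}(?:\.\d+)?\s+)?"    # optional -t / -tt timestamp
    r"(?P<name>\w+)\("                        # system call name
)


def tally_syscalls(lines):
    """Count system calls by name, and separately those that ended in a timeout.

    A call that keeps timing out is a hint (not proof) of a polling loop.
    """
    counts = Counter()
    timeouts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # signal lines, exit lines, "<... resumed>" fragments
        name = match.group("name")
        counts[name] += 1
        # Heuristic: strace prints ETIMEDOUT or "(Timeout)" for timed-out waits.
        if "ETIMEDOUT" in line or "(Timeout)" in line:
            timeouts[name] += 1
    return counts, timeouts
```

To use it on a capture, something like `with open("mono.trace") as f: counts, timeouts = tally_syscalls(f)` followed by `timeouts.most_common(5)` shows the most frequent timed-out calls, which is where I would start looking.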
Re: mono keep guest active
I'm looking at my system, which has mod_mono in the Apache config file, and it's barely registering on top for CPU, though it's quite memory hungry:

1476 wwwrun 15 0 59756 28m  6652 S 0.0 5.7 24:58.73 mono
1477 wwwrun 15 0 10264 2980 1404 S 0.0 0.6  0:00.00 httpd2-prefork
1478 wwwrun 16 0 10264 2996 1420 S 0.0 0.6  0:00.00 httpd2-prefork
1479 wwwrun 15 0 10264 2992 1420 S 0.0 0.6  0:00.00 httpd2-prefork
1480 wwwrun 16 0 10128 2824 1352 S 0.0 0.6  0:00.00 httpd2-prefork
1481 wwwrun 15 0 10128 2760 1300 S 0.0 0.5  0:00.00 httpd2-prefork
3058 wwwrun 15 0 10128 2756 1300 S 0.0 0.5  0:00.00 httpd2-prefork
3078 wwwrun 17 0 10264 2984 1404 S 0.0 0.6  0:00.00 httpd2-prefork
3079 wwwrun 15 0 10264 2976 1404 S 0.0 0.6  0:00.00 httpd2-prefork
3080 wwwrun 15 0 10264 2976 1404 S 0.0 0.6  0:00.00 httpd2-prefork

The system's been up for:

4:56pm up 39 days 4:37, 1 user, load average: 0.00, 0.00, 0.00

What level of mono is installed? Is it registering when there's nobody connected via http? Neale
Re: mono keep guest active - ban the blips.
Yep, this is exactly the problem. These processes do not use much CPU, but they blip every 10 ms or so. You need to check the queue from the z/VM side to see if they are in Q3. If they are in Q3, then they are blipping (I think I need to trademark that word). The reason these blips are so unfriendly to virtualization: think about poor old z/VM storage management. We need to steal some pages for some real work going on. Do we steal them from the server doing real transactions, or from the one that is blipping? Oops, we can't tell the difference.
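The blipping can also be looked at from inside the guest, as a complement to (not a replacement for) checking the queue on the z/VM side. As a rough sketch, the Python below samples the `voluntary_ctxt_switches` counters that Linux exposes in `/proc/<pid>/task/<tid>/status` and turns the difference into an approximate wakeup rate. The function names are my own, and a voluntary context switch is only a proxy for "went to sleep and was woken again", so treat the number as an indication.

```python
import os
import re
import time

# Matches the "voluntary_ctxt_switches:" line but not "nonvoluntary_...",
# because the pattern is anchored at the start of a line.
_VOLUNTARY_RE = re.compile(r"^voluntary_ctxt_switches:\s+(\d+)", re.MULTILINE)


def parse_voluntary_switches(status_text):
    """Extract the voluntary context switch count from a /proc status file."""
    match = _VOLUNTARY_RE.search(status_text)
    return int(match.group(1)) if match else 0


def total_voluntary_switches(pid):
    """Sum the counter over all threads of the process (mono is multi-threaded)."""
    total = 0
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/status") as status_file:
                total += parse_voluntary_switches(status_file.read())
        except (FileNotFoundError, ProcessLookupError):
            pass  # thread exited between listdir() and open()/read()
    return total


def wakeups_per_second(pid, interval=5.0):
    """Approximate wakeups per second for `pid` over `interval` seconds."""
    before = total_voluntary_switches(pid)
    time.sleep(interval)
    after = total_voluntary_switches(pid)
    return (after - before) / interval
```

A thread that blips every 10 ms would be expected to show up at something on the order of 100 per second here, while a well-behaved idle process should stay close to zero.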
Re: mono keep guest active
It seems to me that this issue has certain parallels to the current and long-running debate about Linux kernel power management hacks targeting embedded devices (e.g. Android wake locks). Specifically: applications are very frequently crappy, and fixing them all, or even a significant fraction of them, is unlikely. Ergo, what, if anything, could a Linux kernel do to rein in misbehaving apps? Android's answer is to sleep regardless of what the apps say, with a privilege-limited mechanism that blocks sleeping. Privileges are only granted to apps the admin (or Android packager) deems truly critical, like the radio / phone apps. Would some similar sort of mechanism help for virtualization? (Complete, uninformed speculation here.) Perhaps a kernel mechanism to limit wakeups in the case that no CPU is seen to be consumed, or the like? -- Pat
Re: mono keep guest active - ban the blips.
I was referring to his observation that he was seeing 55-65% CPU. As for blipping, that's why I suggested he use strace to see what API is being used, if there is blipping taking place. Unlike with Java, we can't use oprofile to easily identify the method responsible (if it is blipping). I'll try it on my system, but I probably have a different level of mono. I wonder how hard it would be to add oprofile support to mono.
Re: mono keep guest active
Pat - sure, any intelligent code paths will help. Certainly in a virtualized environment, including but not limited to System z, resources are being shared intensely. The Q4 problem (maybe I should trademark it!) -- errant Q3 -- is insidious and damaging. These aren't grandpa's CMS machines with small working set sizes. The machines that are waking up needlessly due to application-layer code typically have very large WSSes. So regardless of path length, you have these virtual beasts competing against each other and against legit work, inducing unneeded paging, storage management, etc. And what is the most expensive resource in dollars? Unless you are getting the deal of the millennium, it's System z memory, not the IFLs.

In general I have found you cannot tune your way out of this with SRM values or other CP settings. Keeping your Linux virtual machine size as small as you can while providing decent service is advisable, but it doesn't keep the machines out of Q4. A large DASD paging farm and appropriate xstore values help contain this, but they are not a fix.

I fail to see why applications are reluctant to determine what environment they are in and make decisions accordingly. I know and understand the rationale behind agnostic code, but the entire System z solution for Linux is being hurt by this. It just seems unreasonable for any IBM product to be insensitive to running in a virtual machine. The kernel certainly knows; hey, it even announces it at boot time! David Kreuter
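As a sketch of what "determine what environment they are in and make decisions accordingly" might look like for an application: on Linux for System z, `/proc/sysinfo` carries a `VM00 Control Program:` line when the system runs as a guest. The Python below checks for that line and picks a longer housekeeping interval when it is found. The helper names and the two interval values are invented for illustration, and the real goal would of course be not to poll at all.

```python
import re

# Assumption: /proc/sysinfo on s390 lists the hypervisor as e.g.
#   "VM00 Control Program: z/VM    5.4.0"
_CONTROL_PROGRAM_RE = re.compile(r"^VM\d\d Control Program:\s*(.+?)\s*$")


def read_sysinfo(path="/proc/sysinfo"):
    """Return the contents of /proc/sysinfo, or '' where it does not exist."""
    try:
        with open(path) as sysinfo_file:
            return sysinfo_file.read()
    except OSError:
        return ""


def control_program(sysinfo_text):
    """Return the first hypervisor name found, or None when not running as a guest."""
    for line in sysinfo_text.splitlines():
        match = _CONTROL_PROGRAM_RE.match(line)
        if match:
            return match.group(1)
    return None


def choose_poll_interval(sysinfo_text, dedicated=0.01, virtual=1.0):
    """Pick a housekeeping interval in seconds (both values are hypothetical)."""
    return virtual if control_program(sysinfo_text) else dedicated
```

An application would call `choose_poll_interval(read_sysinfo())` once at startup. On a platform without `/proc/sysinfo` the text is empty and the dedicated value is used.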
Re: mono keep guest active
Fortunately it's no longer a Linux-on-z-only problem. With more systems being run under VMware, KVM, VirtualBox, etc., the number of people being affected is getting larger and, hopefully, that translates into vendors whose applications misbehave being lobbied to get their act together.
Re: mono keep guest active
Patrick Spinler wrote: It seems to me that this issue has certain parallels to the current and long-running debate about Linux kernel power management hacks targeting embedded devices (e.g. Android wake locks).

Yes and no. The analogy to embedded systems is dead on (especially with respect to efficient use of resources), but the problem then becomes how to avoid using more cycles on a sophisticated guessing mechanism than on just doing the stupid wake/check/go-back-to-sleep model. All the sophisticated guessing models are more expensive than the just-do-it model. The approach that was used in the 100 Hz timer pop elimination code for z is fairly elegant, but it relies on some hardware features of the z that would be hard to retrofit into Intel systems. Unfortunately, I think the problem is that polling and other evil things are easy; anything else requires thought. Entropy and Sturgeon's Law dominate programming as well as the rest of the universe.
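To make the contrast concrete, here is a small Python sketch of the wake/check/go-back-to-sleep model next to a blocking wait on the same event. It is purely illustrative: the function names are invented, and it counts wakeups at the application level only, which says nothing about what a given runtime does underneath.

```python
import threading
import time


def polling_waiter(flag, interval, wakeups):
    """Wake/check/sleep: runs every `interval` seconds even when nothing happened."""
    while not flag.is_set():
        wakeups.append(time.monotonic())
        time.sleep(interval)


def blocking_waiter(flag, wakeups):
    """Event-driven: sleeps until the event is set, then runs once."""
    flag.wait()
    wakeups.append(time.monotonic())


def run_demo(idle_seconds=0.5, interval=0.01):
    """Let each waiter sit idle for `idle_seconds`, then count its wakeups."""
    results = {}
    cases = (
        ("polling", polling_waiter, (interval,)),
        ("blocking", blocking_waiter, ()),
    )
    for name, target, extra_args in cases:
        flag = threading.Event()
        wakeups = []
        worker = threading.Thread(target=target, args=(flag, *extra_args, wakeups))
        worker.start()
        time.sleep(idle_seconds)  # the "no real work" period
        flag.set()                # the one event that actually matters
        worker.join()
        results[name] = len(wakeups)
    return results
```

Calling `run_demo()` returns a dictionary with the two counts. The blocking waiter runs exactly once, while the polling one runs many times during the same idle half second, and from the hypervisor's side each of those is a reason to keep the guest in queue.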