Re: [Qemu-devel] linux guests and ksm performance
On 24.02.2012 08:23, Stefan Hajnoczi wrote:
> On Fri, Feb 24, 2012 at 6:53 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
>> On Fri, Feb 24, 2012 at 6:41 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
>>> On Thu, Feb 23, 2012 at 7:08 PM, peter.lie...@gmail.com p...@dlh.net wrote:
>>>> Stefan Hajnoczi stefa...@gmail.com wrote:
>>>>> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>>>>>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>>>>>> 1) zero pages can easily be merged by ksm or another technique.
>>>>>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
>>>>> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.
>>>> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.
>>> It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.
>> Perhaps the middle path is to zero pages but do it after a grace timeout. I wonder if this helps eliminate the 2-3% slowdown you noticed when compiling.
> Gah, it's too early in the morning. I don't think this timer actually makes sense.

Do you think it would then make sense to put together a patchset/proposal to notify a guest kernel about the presence of ksm on the host and switch to zero after free?

Peter

> Stefan
Re: [Qemu-devel] linux guests and ksm performance
On 28.02.2012 13:05, Stefan Hajnoczi wrote:
> On Tue, Feb 28, 2012 at 11:46 AM, Peter Lieven p...@dlh.net wrote:
>> On 24.02.2012 08:23, Stefan Hajnoczi wrote:
>>> On Fri, Feb 24, 2012 at 6:53 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
>>>> On Fri, Feb 24, 2012 at 6:41 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
>>>>> On Thu, Feb 23, 2012 at 7:08 PM, peter.lie...@gmail.com p...@dlh.net wrote:
>>>>>> Stefan Hajnoczi stefa...@gmail.com wrote:
>>>>>>> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>>>>>>>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>>>>>>>> 1) zero pages can easily be merged by ksm or another technique.
>>>>>>>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
>>>>>>> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.
>>>>>> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.
>>>>> It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.
>>>> Perhaps the middle path is to zero pages but do it after a grace timeout. I wonder if this helps eliminate the 2-3% slowdown you noticed when compiling.
>>> Gah, it's too early in the morning. I don't think this timer actually makes sense.
>> Do you think it would then make sense to put together a patchset/proposal to notify a guest kernel about the presence of ksm on the host and switch to zero after free?
> I think your idea is interesting - whether or not people are happy with it will depend on the performance impact. It seems reasonable to me.

Could you support/help me in implementing and publishing this approach?

Peter
Re: [Qemu-devel] linux guests and ksm performance
On 02/23/2012 06:42 PM, Stefan Hajnoczi wrote:
> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>> 1) zero pages can easily be merged by ksm or another technique.
>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.
>
> I hope someone will follow up saying this has already been done or prototyped :).

It already exists - that's the balloon code. Right now it's host driven, but maybe we can modify it to allow the guest to initiate balloon inflations.

--
error compiling committee.c: too many arguments to function
Re: [Qemu-devel] linux guests and ksm performance
On 02/24/2012 08:41 AM, Stefan Hajnoczi wrote:
>> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.
> It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.

In the long term, we will use them, except if the guest is completely idle.

The scenario in which zeroing is expensive is when the page is refilled through DMA. In that case the zeroing was wasted. This is a pretty common scenario in pagecache intensive workloads.

--
error compiling committee.c: too many arguments to function
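The two objections above (pages that are freed and never reused, and pages that are refilled by DMA) can be compared with a small counting model. This is only a sketch under stated assumptions: the event names `alloc_user`, `alloc_dma` and `free` are hypothetical, memory is assumed to start out zeroed under zero after free, and a DMA-filled allocation is assumed to need no zeroing under zero on allocate. It counts zeroed pages and says nothing about cache effects or real kernel behaviour.

```python
def zeroing_cost(trace):
    """Return (zero_on_allocate, zero_after_free) counts of zeroed pages.

    trace is a list of (event, pages) tuples, where event is one of the
    hypothetical names "alloc_user", "alloc_dma" or "free".
    """
    on_alloc = 0
    after_free = 0
    for event, pages in trace:
        if event == "alloc_user":
            # Zero on allocate: user pages are cleared when handed out.
            on_alloc += pages
        elif event == "alloc_dma":
            # The device overwrites the page, so zero on allocate skips it.
            pass
        elif event == "free":
            # Zero after free: every freed page is cleared, reused or not.
            after_free += pages
        else:
            raise ValueError(f"unknown event: {event}")
    return on_alloc, after_free


# One large process exits, then only a few pages are allocated again.
big_then_small = [("alloc_user", 1000), ("free", 1000), ("alloc_user", 10)]

# Pagecache-style churn: pages are filled by DMA and freed, five times over.
dma_churn = [("alloc_dma", 1000), ("free", 1000)] * 5

print(zeroing_cost(big_then_small))
print(zeroing_cost(dma_churn))
```

In this model the first trace costs about the same under both policies, while the DMA trace is where zero after free does work that zero on allocate avoids entirely.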
Re: [Qemu-devel] linux guests and ksm performance
On 28.02.2012 14:16, Avi Kivity wrote:
> On 02/24/2012 08:41 AM, Stefan Hajnoczi wrote:
>>> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.
>> It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.
> In the long term, we will use them, except if the guest is completely idle.
>
> The scenario in which zeroing is expensive is when the page is refilled through DMA. In that case the zeroing was wasted. This is a pretty common scenario in pagecache intensive workloads.

Avi, what do you think of the proposal to give the guest VM a hint that the host is running ksm? In that case the administrator has already decided that saving physical memory is more important to him than performance.

Peter
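As a rough illustration of the host side of such a hint: on a Linux host, whether ksmd is running can be read from the sysfs file /sys/kernel/mm/ksm/run (the value 1 means running). How that state would reach the guest is exactly what the thread leaves open, so the sketch below assumes the hint has somehow been delivered as a boolean. The function names `ksm_running` and `guest_zero_policy` are hypothetical and not part of any existing interface.

```python
from pathlib import Path

# Host-side sysfs knob for KSM: 0 = stopped, 1 = running, 2 = stop and unmerge.
KSM_RUN = Path("/sys/kernel/mm/ksm/run")


def ksm_running(run_file=KSM_RUN):
    """Return True if the KSM run file exists and contains 1."""
    try:
        return Path(run_file).read_text().strip() == "1"
    except OSError:
        # No KSM support, or the file is not readable: treat as not running.
        return False


def guest_zero_policy(host_ksm_hint):
    """Pick the guest page-zeroing policy from a (hypothetical) host hint."""
    return "zero-after-free" if host_ksm_hint else "zero-on-allocate"


print(guest_zero_policy(ksm_running()))
```

The policy choice itself is a single branch; the open design question is the transport for the hint, for which the thread suggests something like kvmclock or a flag on the balloon device.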
Re: [Qemu-devel] linux guests and ksm performance
On 02/28/2012 03:20 PM, Peter Lieven wrote:
> On 28.02.2012 14:16, Avi Kivity wrote:
>> On 02/24/2012 08:41 AM, Stefan Hajnoczi wrote:
>>>> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.
>>> It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.
>> In the long term, we will use them, except if the guest is completely idle.
>>
>> The scenario in which zeroing is expensive is when the page is refilled through DMA. In that case the zeroing was wasted. This is a pretty common scenario in pagecache intensive workloads.
> Avi, what do you think of the proposal to give the guest VM a hint that the host is running ksm? In that case the administrator has already decided that saving physical memory is more important to him than performance.

It makes some sense. Perhaps through the balloon device, a flag that indicates that voluntary ballooning will be gratefully accepted.

--
error compiling committee.c: too many arguments to function
[Qemu-devel] linux guests and ksm performance
Hi,

I have recently been playing with an old idea (originally in grsecurity, for security reasons) to change the policy from zero on allocate to zero after free in the Linux page allocator. My concern is that Linux leaves a lot of garbage in physical memory, unlike Windows, which by default zeroes pages after they are freed.

I have run some tests and I can confirm some old results that a Linux machine on bare hardware is approximately 2-3% slower with zero after free on big compilation jobs. This might be due either to the fact that pages are only zeroed on allocation if GFP_ZERO is set, or to caching benefits.

However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:

1) zero pages can easily be merged by ksm or another technique.
2) zero (dup) pages are a lot faster to transfer in case of migration.

Therefore I would like to hear your thoughts on whether it would be a good idea to change the strategy in the Linux kernel from zero on allocate to zero after free automatically if the 'hypervisor' cpu feature is set. Or even to have another technique to tell a Linux guest that ksm is running on the host.

If this is not feasible, can someone think of a kernel module / userspace program that zeroes out unused pages periodically?

Peter
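The two benefits listed in this post can be made concrete with a toy model. The sketch below is an illustration only, not how ksm or QEMU migration are implemented: it assumes merging means that identical pages share one copy, and that a page whose bytes are all equal (a dup page, for example an all-zero page) can be sent as a single fill byte. The names `is_dup_page`, `merged_page_count`, `transfer_size` and `stale_page` are hypothetical.

```python
PAGE_SIZE = 4096


def is_dup_page(page):
    """True if every byte of the page has the same value (e.g. all zero)."""
    return page == page[:1] * len(page)


def merged_page_count(pages):
    """Pages left after content-based merging: identical pages share a copy."""
    return len(set(pages))


def transfer_size(pages):
    """Bytes sent if a dup page costs one fill byte and others go in full."""
    return sum(1 if is_dup_page(p) else len(p) for p in pages)


def stale_page(i):
    """Hypothetical leftover content of a page; distinct for every i."""
    return (i + 1).to_bytes(8, "little") * (PAGE_SIZE // 8)


used = [stale_page(i) for i in range(100, 110)]   # 10 pages still in use
freed_stale = [stale_page(i) for i in range(90)]  # 90 freed, never zeroed
freed_zeroed = [bytes(PAGE_SIZE)] * 90            # 90 freed, zeroed on free

for label, freed in (("zero on allocate", freed_stale),
                     ("zero after free", freed_zeroed)):
    memory = used + freed
    print(label, merged_page_count(memory), transfer_size(memory))
```

With stale free pages all 100 pages are distinct and are sent in full; with zeroed free pages the 90 free pages collapse into one shared page and cost one byte each on the wire in this model.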
Re: [Qemu-devel] linux guests and ksm performance
On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
> 1) zero pages can easily be merged by ksm or another technique.
> 2) zero (dup) pages are a lot faster to transfer in case of migration.

The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.

I hope someone will follow up saying this has already been done or prototyped :).

Stefan
Re: [Qemu-devel] linux guests and ksm performance
On Thu, Feb 23, 2012 at 11:42 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.

(disclaimer: I don't know the code, I'm just guessing)

Does KVM emulate the MMU? If so, is there any 'unmap page' primitive?

--
Javier
Re: [Qemu-devel] linux guests and ksm performance
Stefan Hajnoczi stefa...@gmail.com wrote:
> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>> 1) zero pages can easily be merged by ksm or another technique.
>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.

I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.

> I hope someone will follow up saying this has already been done or prototyped :).
>
> Stefan

--
This message was sent from my Android phone with K-9 Mail.
Re: [Qemu-devel] linux guests and ksm performance
On Thu, Feb 23, 2012 at 7:08 PM, peter.lie...@gmail.com p...@dlh.net wrote:
> Stefan Hajnoczi stefa...@gmail.com wrote:
>> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>>> 1) zero pages can easily be merged by ksm or another technique.
>>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
>> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.
> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.

It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.

Stefan
Re: [Qemu-devel] linux guests and ksm performance
On Fri, Feb 24, 2012 at 6:41 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
> On Thu, Feb 23, 2012 at 7:08 PM, peter.lie...@gmail.com p...@dlh.net wrote:
>> Stefan Hajnoczi stefa...@gmail.com wrote:
>>> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>>>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>>>> 1) zero pages can easily be merged by ksm or another technique.
>>>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
>>> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.
>> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.
> It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.

Perhaps the middle path is to zero pages but do it after a grace timeout. I wonder if this helps eliminate the 2-3% slowdown you noticed when compiling.

This requires no special host-guest interfaces for discarding pages.

Stefan
Re: [Qemu-devel] linux guests and ksm performance
On Thu, Feb 23, 2012 at 04:42:54PM +, Stefan Hajnoczi wrote:
> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>> 1) zero pages can easily be merged by ksm or another technique.
>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.
>
> I hope someone will follow up saying this has already been done or prototyped :).

That was attempted. It is called page hinting, but AFAIK the attempt was abandoned due to complex locking issues.

--
Gleb.
Re: [Qemu-devel] linux guests and ksm performance
On Fri, Feb 24, 2012 at 6:53 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
> On Fri, Feb 24, 2012 at 6:41 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
>> On Thu, Feb 23, 2012 at 7:08 PM, peter.lie...@gmail.com p...@dlh.net wrote:
>>> Stefan Hajnoczi stefa...@gmail.com wrote:
>>>> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>>>>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>>>>> 1) zero pages can easily be merged by ksm or another technique.
>>>>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
>>>> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.
>>> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.
>> It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.
> Perhaps the middle path is to zero pages but do it after a grace timeout. I wonder if this helps eliminate the 2-3% slowdown you noticed when compiling.

Gah, it's too early in the morning. I don't think this timer actually makes sense.

Stefan
Re: [Qemu-devel] linux guests and ksm performance
On 24.02.2012 at 08:23, Stefan Hajnoczi wrote:
> On Fri, Feb 24, 2012 at 6:53 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
>> On Fri, Feb 24, 2012 at 6:41 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
>>> On Thu, Feb 23, 2012 at 7:08 PM, peter.lie...@gmail.com p...@dlh.net wrote:
>>>> Stefan Hajnoczi stefa...@gmail.com wrote:
>>>>> On Thu, Feb 23, 2012 at 3:40 PM, Peter Lieven p...@dlh.net wrote:
>>>>>> However, in a virtual machine I have not observed the above slowdown to that extent, while the benefit of zero after free in a virtualisation environment is obvious:
>>>>>> 1) zero pages can easily be merged by ksm or another technique.
>>>>>> 2) zero (dup) pages are a lot faster to transfer in case of migration.
>>>>> The other approach is a memory page discard mechanism - which obviously requires more code changes than zeroing freed pages. The advantage is that we don't take the brute-force and CPU intensive approach of zeroing pages. It would be like a fine-grained ballooning feature.
>>>> I don't think that it is CPU intensive. All user pages are zeroed anyway, just at allocation time, so it shouldn't make a big difference in terms of CPU power.
>>> It's easy to find a scenario where eagerly zeroing pages is wasteful. Imagine a process that uses all of physical memory. Once it terminates the system is going to run processes that only use a small set of pages. It's pointless zeroing all those pages if we're not going to use them anymore.
>> Perhaps the middle path is to zero pages but do it after a grace timeout. I wonder if this helps eliminate the 2-3% slowdown you noticed when compiling.
> Gah, it's too early in the morning. I don't think this timer actually makes sense.

OK, that would be the idea of asynchronous page zeroing in the guest. I also think this is too complicated.

Maybe the other idea is too simple: is it possible to give the guest a hint that ksm is enabled on the host (let's say in a way similar to how it's done with kvmclock)?

If ksm is enabled on the host, the administrator has already made the decision that performance is not so important and he/she is eager to save physical memory. What if, if and only if this flag is set, we switch from zero on allocate to zero after free? I think the whole thing is less than 10-20 lines of code, and it's code that has been proven to work well in grsecurity for ages.

This might introduce a little (2-3%) overhead, but only if a lot of non GFP_FREE memory is allocated, and it's definitely faster than swapping.

Of course, it has to be guaranteed that this code does not slow down normal systems due to additional branches (would it be enough to mark the if statements as unlikely?).

Peter

> Stefan
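The switch described in this message can be sketched as a toy userspace model. It is not kernel code and not the grsecurity patch: `ToyPageAllocator` and its `zero_after_free` flag are hypothetical names, the flag stands in for the proposed host hint, and pages are plain bytearrays. Under zero after free the allocator keeps the invariant that every page on the free list is already zero, so allocation can skip the clearing step.

```python
PAGE_SIZE = 4096


class ToyPageAllocator:
    """Free-list allocator that zeroes pages either on alloc or on free."""

    def __init__(self, num_pages, zero_after_free=False):
        self.zero_after_free = zero_after_free
        # All pages start out zeroed, as they would after boot in this model.
        self.free_pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]

    def alloc(self, want_zero=True):
        page = self.free_pages.pop()
        # Zero on allocate: clear only when asked and only if free() did not.
        if want_zero and not self.zero_after_free:
            page[:] = bytes(PAGE_SIZE)
        return page

    def free(self, page):
        # Zero after free: clear unconditionally before the page is parked.
        if self.zero_after_free:
            page[:] = bytes(PAGE_SIZE)
        self.free_pages.append(page)

    def zero_free_pages(self):
        """Number of free pages that currently contain only zero bytes."""
        return sum(1 for page in self.free_pages if not any(page))


def run(zero_after_free):
    """Dirty and free every page, then count how many free pages are zero."""
    allocator = ToyPageAllocator(8, zero_after_free)
    pages = [allocator.alloc() for _ in range(8)]
    for page in pages:
        page[:] = b"\xaa" * PAGE_SIZE  # simulate the owner writing data
        allocator.free(page)
    return allocator.zero_free_pages()


print(run(False), run(True))
```

The two `if` statements are the extra branches the message worries about; whether marking them as unlikely is enough on a real kernel fast path is a question this model cannot answer.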