Re: KSM For All Via LD_PRELOAD?

2010-06-10 Thread Jes Sorensen
On 06/10/10 11:09, Gordan Bobic wrote:
> On 06/10/2010 08:44 AM, Jes Sorensen wrote:
>> Not sure if it is worth it, but you might want to look at ElectricFence
>> which does malloc wrapping in a somewhat similar way. It might save you
>> some code :)
> 
> I'll look into it, but I don't see this requiring more than maybe 50
> lines of code, including comments, headers and Makefile. I was planning
> to literally just intercept mmap()/brk()/malloc() and mark them with
> madvise() when the underlying call returns.

I suspect it will grow in complexity to catch corner cases, but see
where it takes you.
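
For reference, the mmap() part of the plan quoted above might look roughly
like the sketch below. It is an illustration only, not code from this
thread: the helper name ksm_candidate() is made up, the filter on anonymous
private mappings is an assumption (KSM only merges anonymous pages), and
brk()/malloc() are left out. Errors from madvise() are ignored on purpose,
since a kernel built without KSM rejects MADV_MERGEABLE.

```c
#define _GNU_SOURCE          /* for RTLD_NEXT and MADV_MERGEABLE */
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

typedef void *(*mmap_fn)(void *, size_t, int, int, int, off_t);

/* Hypothetical helper: only anonymous private mappings are worth marking. */
int ksm_candidate(int flags)
{
    return (flags & MAP_ANONYMOUS) != 0 && (flags & MAP_PRIVATE) != 0;
}

/* Interposed mmap(): call the real one, then advise the new region. */
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
{
    static mmap_fn real_mmap;

    /* Look up the next definition of mmap() in load order (normally libc). */
    if (real_mmap == NULL)
        real_mmap = (mmap_fn)dlsym(RTLD_NEXT, "mmap");
    if (real_mmap == NULL) {
        errno = ENOSYS;
        return MAP_FAILED;
    }

    void *p = real_mmap(addr, length, prot, flags, fd, offset);
    if (p != MAP_FAILED && ksm_candidate(flags)) {
        int saved_errno = errno;
        (void)madvise(p, length, MADV_MERGEABLE);   /* best effort */
        errno = saved_errno;                        /* hide madvise errors */
    }
    return p;
}
```

Built as a shared object (something like "gcc -shared -fPIC -o
myksmintercept.so ksm.c -ldl", where ksm.c is a placeholder file name), it
would be loaded through LD_PRELOAD. Calls that glibc makes internally may
not pass through the interposed symbol, which is one of the corner cases
worth checking.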

> Which brings me to another question:
> 
> Would intercepting malloc() be completely redundant if mmap() is
> intercepted? Would I also need to do something with intercepting free()?
> Is there anything else I would need to intercept?

You need to have a look at glibc to see what it is doing. I am not sure
what the current malloc implementation is based on these days.

>> Whether or not you will run into problems if you run it system-wide is
>> really hard to predict. Any other application that might be linked in a
>> special way or use preload itself might bark, but you can try it out and
>> see what explodes.
> 
> Thanks for the heads up. Can you think of any such applications off the
> top of your head?

Sorry, nothing off hand. A normal Linux distro runs so many applications
just at bootup that it's very hard to keep track of what might cause
problems. You might end up having to create a blacklist in your library
to work around it, but you will only find out when you start trying the
boot.
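
Such a blacklist could be as simple as a name check done once when the
library loads. The sketch below is hypothetical: the two program names are
placeholders, ksm_is_blacklisted() and ksm_enabled_for_this_process() are
invented names, and it relies on the glibc-specific
program_invocation_short_name variable.

```c
#define _GNU_SOURCE   /* exposes program_invocation_short_name */
#include <errno.h>    /* declares program_invocation_short_name (glibc) */
#include <string.h>

/* Placeholder entries; a real list would come from testing a full boot. */
static const char *const ksm_blacklist[] = {
    "example-daemon",
    "another-tool",
    NULL
};

/* Return 1 if the given program name is on the list, 0 otherwise. */
int ksm_is_blacklisted(const char *name)
{
    if (name == NULL)
        return 0;
    for (size_t i = 0; ksm_blacklist[i] != NULL; i++) {
        if (strcmp(name, ksm_blacklist[i]) == 0)
            return 1;
    }
    return 0;
}

/* Decide once per process whether the wrappers should call madvise(). */
int ksm_enabled_for_this_process(void)
{
    return !ksm_is_blacklisted(program_invocation_short_name);
}
```

The interposed functions would then fall straight through to the real
calls whenever ksm_enabled_for_this_process() returns 0.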

Cheers,
Jes
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KSM For All Via LD_PRELOAD?

2010-06-10 Thread Gordan Bobic

On 06/10/2010 08:44 AM, Jes Sorensen wrote:
> On 06/08/10 20:43, Gordan Bobic wrote:
>> Is this plausible?
>>
>> I'm trying to work out if it's even worth considering this approach to
>> enable all memory used in a system to be open to KSM page merging,
>> rather than only memory used by specific programs aware of it (e.g.
>> kvm/qemu).
>>
>> Something like this would address the fact that container based
>> virtualization (OpenVZ, VServer, LXC) cannot benefit from KSM.
>>
>> What I'm thinking about is somehow intercepting malloc() and wrapping it
>> so that all malloc()-ed memory gets madvise()-d as well.
>
> Not sure if it is worth it, but you might want to look at ElectricFence
> which does malloc wrapping in a somewhat similar way. It might save you
> some code :)

I'll look into it, but I don't see this requiring more than maybe 50
lines of code, including comments, headers and Makefile. I was planning
to literally just intercept mmap()/brk()/malloc() and mark them with
madvise() when the underlying call returns.

Which brings me to another question:

Would intercepting malloc() be completely redundant if mmap() is
intercepted? Would I also need to do something with intercepting free()?
Is there anything else I would need to intercept?

> Whether or not you will run into problems if you run it system-wide is
> really hard to predict. Any other application that might be linked in a
> special way or use preload itself might bark, but you can try it out and
> see what explodes.

Thanks for the heads up. Can you think of any such applications off the
top of your head?


Gordan


Re: KSM For All Via LD_PRELOAD?

2010-06-10 Thread Gordan Bobic

On 06/10/2010 08:33 AM, Dor Laor wrote:
> On 06/09/2010 01:31 PM, Gordan Bobic wrote:
>> On 06/09/2010 09:56 AM, Paolo Bonzini wrote:
>>>>> Or is this too crazy an idea?
>>>>
>>>> It should work. Note that the malloced memory should be aligned in
>>>> order to get better sharing.
>>>
>>> Within glibc malloc large blocks are mmaped, so they are automatically
>>> aligned. Effective sharing of small blocks would take too much luck or
>>> too much wasted memory, so probably madvising brk memory is not too
>>> useful.
>>>
>>> Of course there are exceptions. Bitmaps are very much sharable, but not
>>> big. And some programs have their own allocator, using mmap in all
>>> likelihood and slicing the resulting block. Typically these will be
>>> virtual machines for garbage collected languages (but also GCC for
>>> example does this). They will store a lot of pointers in there too, so
>>> in this case KSM would likely work a lot for little benefit.
>>>
>>> So if you really want to apply it to _all_ processes, it comes to mind
>>> to wrap both mmap and malloc so that you can set a flag only for
>>> mmap-within-malloc... It will take some experimentation and heuristics
>>> to actually not degrade performance (and of course it will depend on the
>>> workload), but it should work.
>>
>> Arguably, the way QEMU KVM does it for the VM's entire memory block
>> doesn't seem to be distinguishing the types of memory allocation inside
>> the VM, so simply covering all mmap()/brk() calls would probably do no
>> worse in terms of performance. Or am I missing something?
>
> There won't be a drastic effect for qemu-kvm since the non guest ram areas
> are minimal. I thought you were trying to trap mmap/brk/malloc for other
> general applications regardless of virt.

Why does it matter that the non-guest RAM areas are minimal? The way I
envisage using it is by putting:

export LD_PRELOAD=myksmintercept.so

as the first line in rc.sysinit and having _all_ processes in the system
subject to this. So the memory areas not subject to KSM would be as
negligible as in the virt case if not more so. Or am I misunderstanding
what you're saying?
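
One way to check whether a system-wide preload like this actually pays off
is to watch the KSM counters the kernel exposes under /sys/kernel/mm/ksm/
(for example pages_shared and pages_sharing). A small sketch, assuming
those sysfs files are present and that ksmd has been switched on through
/sys/kernel/mm/ksm/run; the function name ksm_read_counter() is made up for
illustration:

```c
#include <stdio.h>

/* Read one integer from a file such as /sys/kernel/mm/ksm/pages_sharing.
 * Returns 0 on success, -1 if the file is missing or cannot be parsed. */
int ksm_read_counter(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Comparing pages_sharing before and after enabling the preload would give a
rough idea of how many pages were merged, although the result will depend
heavily on the workload.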


Gordan


Re: KSM For All Via LD_PRELOAD?

2010-06-10 Thread Jes Sorensen
On 06/08/10 20:43, Gordan Bobic wrote:
> Is this plausible?
> 
> I'm trying to work out if it's even worth considering this approach to
> enable all memory used in a system to be open to KSM page merging,
> rather than only memory used by specific programs aware of it (e.g.
> kvm/qemu).
> 
> Something like this would address the fact that container based
> virtualization (OpenVZ, VServer, LXC) cannot benefit from KSM.
> 
> What I'm thinking about is somehow intercepting malloc() and wrapping it
> so that all malloc()-ed memory gets madvise()-d as well.

Not sure if it is worth it, but you might want to look at ElectricFence
which does malloc wrapping in a somewhat similar way. It might save you
some code :)

Whether or not you will run into problems if you run it system-wide is
really hard to predict. Any other application that might be linked in a
special way or use preload itself might bark, but you can try it out and
see what explodes.

Cheers,
Jes


Re: KSM For All Via LD_PRELOAD?

2010-06-10 Thread Dor Laor

On 06/09/2010 01:31 PM, Gordan Bobic wrote:
> On 06/09/2010 09:56 AM, Paolo Bonzini wrote:
>>>> Or is this too crazy an idea?
>>>
>>> It should work. Note that the malloced memory should be aligned in
>>> order to get better sharing.
>>
>> Within glibc malloc large blocks are mmaped, so they are automatically
>> aligned. Effective sharing of small blocks would take too much luck or
>> too much wasted memory, so probably madvising brk memory is not too
>> useful.
>>
>> Of course there are exceptions. Bitmaps are very much sharable, but not
>> big. And some programs have their own allocator, using mmap in all
>> likelihood and slicing the resulting block. Typically these will be
>> virtual machines for garbage collected languages (but also GCC for
>> example does this). They will store a lot of pointers in there too, so
>> in this case KSM would likely work a lot for little benefit.
>>
>> So if you really want to apply it to _all_ processes, it comes to mind
>> to wrap both mmap and malloc so that you can set a flag only for
>> mmap-within-malloc... It will take some experimentation and heuristics
>> to actually not degrade performance (and of course it will depend on the
>> workload), but it should work.
>
> Arguably, the way QEMU KVM does it for the VM's entire memory block
> doesn't seem to be distinguishing the types of memory allocation inside
> the VM, so simply covering all mmap()/brk() calls would probably do no
> worse in terms of performance. Or am I missing something?

There won't be a drastic effect for qemu-kvm since the non guest ram areas
are minimal. I thought you were trying to trap mmap/brk/malloc for other
general applications regardless of virt.

> Gordan


Re: KSM For All Via LD_PRELOAD?

2010-06-09 Thread Gordan Bobic

On 06/09/2010 09:56 AM, Paolo Bonzini wrote:
>>> Or is this too crazy an idea?
>>
>> It should work. Note that the malloced memory should be aligned in
>> order to get better sharing.
>
> Within glibc malloc large blocks are mmaped, so they are automatically
> aligned. Effective sharing of small blocks would take too much luck or
> too much wasted memory, so probably madvising brk memory is not too useful.
>
> Of course there are exceptions. Bitmaps are very much sharable, but not
> big. And some programs have their own allocator, using mmap in all
> likelihood and slicing the resulting block. Typically these will be
> virtual machines for garbage collected languages (but also GCC for
> example does this). They will store a lot of pointers in there too, so
> in this case KSM would likely work a lot for little benefit.
>
> So if you really want to apply it to _all_ processes, it comes to mind
> to wrap both mmap and malloc so that you can set a flag only for
> mmap-within-malloc... It will take some experimentation and heuristics
> to actually not degrade performance (and of course it will depend on the
> workload), but it should work.

Arguably, the way QEMU KVM does it for the VM's entire memory block
doesn't seem to be distinguishing the types of memory allocation inside
the VM, so simply covering all mmap()/brk() calls would probably do no
worse in terms of performance. Or am I missing something?


Gordan


Re: KSM For All Via LD_PRELOAD?

2010-06-09 Thread Paolo Bonzini

On 06/09/2010 10:33 AM, Dor Laor wrote:
>> What I'm thinking about is somehow intercepting malloc() and wrapping it
>> so that all malloc()-ed memory gets madvise()-d as well.

You can also operate at a lower level and intercept mmap and brk, not
malloc.  (But see below).

>> Or is this too crazy an idea?
>
> It should work. Note that the malloced memory should be aligned in
> order to get better sharing.


Within glibc malloc large blocks are mmaped, so they are automatically 
aligned.  Effective sharing of small blocks would take too much luck or 
too much wasted memory, so probably madvising brk memory is not too useful.
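
The alignment point matters because madvise() only accepts a page-aligned
start address, and the pointer a wrapper gets back from malloc() need not
be page-aligned even when the block behind it was mmaped (glibc normally
keeps a small header in front of it). A wrapper working on malloc()
results would therefore have to round inward to whole pages, roughly as
follows. The function name and signature are invented for this sketch, and
page is assumed to be a power of two, e.g. the value of
sysconf(_SC_PAGESIZE).

```c
#include <stddef.h>
#include <stdint.h>

/* Find the largest run of whole pages lying entirely inside
 * [addr, addr + len). Returns 1 and fills *start / *out_len if at least
 * one whole page fits, 0 otherwise. page must be a power of two. */
int ksm_inner_pages(uintptr_t addr, size_t len, size_t page,
                    uintptr_t *start, size_t *out_len)
{
    uintptr_t mask  = ~(uintptr_t)(page - 1);
    uintptr_t first = (addr + page - 1) & mask;   /* round start up   */
    uintptr_t end   = (addr + len) & mask;        /* round end down   */

    if (end <= first)
        return 0;          /* block is too small to cover a full page */

    *start = first;
    *out_len = (size_t)(end - first);
    return 1;
}
```

Small blocks simply yield nothing to advise, which matches the point about
brk memory above.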


Of course there are exceptions.  Bitmaps are very much sharable, but not 
big.  And some programs have their own allocator, using mmap in all 
likelihood and slicing the resulting block.  Typically these will be 
virtual machines for garbage collected languages (but also GCC for 
example does this).  They will store a lot of pointers in there too, so 
in this case KSM would likely work a lot for little benefit.


So if you really want to apply it to _all_ processes, it comes to mind 
to wrap both mmap and malloc so that you can set a flag only for 
mmap-within-malloc...  It will take some experimentation and heuristics 
to actually not degrade performance (and of course it will depend on the 
workload), but it should work.
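
The flag idea can be sketched with a thread-local depth counter. The names
below are hypothetical, and the sketch deliberately leaves out the
interposed malloc() and mmap() symbols themselves: a malloc() wrapper would
call ksm_enter_malloc()/ksm_leave_malloc() around the real allocator, and
an mmap() wrapper would consult ksm_should_mark() before calling madvise().
Whether glibc's internal mmap calls actually reach an interposed mmap() is
an open assumption here.

```c
#define _GNU_SOURCE      /* for MAP_ANONYMOUS */
#include <sys/mman.h>

/* Greater than zero while the current thread is inside malloc(). */
static __thread int ksm_malloc_depth;

/* To be called by a malloc() wrapper before and after the real call. */
void ksm_enter_malloc(void) { ksm_malloc_depth++; }
void ksm_leave_malloc(void) { ksm_malloc_depth--; }

/* To be called by an mmap() wrapper: mark the mapping only if it was
 * requested from within malloc() and is anonymous and private. */
int ksm_should_mark(int flags)
{
    if (ksm_malloc_depth <= 0)
        return 0;
    return (flags & MAP_ANONYMOUS) != 0 && (flags & MAP_PRIVATE) != 0;
}
```

A counter rather than a boolean keeps the flag correct if the wrapped
allocator is re-entered on the same thread.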


Paolo


Re: KSM For All Via LD_PRELOAD?

2010-06-09 Thread Dor Laor

On 06/08/2010 09:43 PM, Gordan Bobic wrote:
> Is this plausible?
>
> I'm trying to work out if it's even worth considering this approach to
> enable all memory used in a system to be open to KSM page merging,
> rather than only memory used by specific programs aware of it (e.g.
> kvm/qemu).
>
> Something like this would address the fact that container based
> virtualization (OpenVZ, VServer, LXC) cannot benefit from KSM.
>
> What I'm thinking about is somehow intercepting malloc() and wrapping it
> so that all malloc()-ed memory gets madvise()-d as well.
>
> Has this been done?
>
> Or is this too crazy an idea?

It should work. Note that the malloced memory should be aligned in
order to get better sharing.

> Gordan