Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Andrea Arcangeli
On Wed, Mar 17, 2010 at 04:07:09PM +, Paul Brook wrote: > > On Wed, Mar 17, 2010 at 03:52:15PM +, Paul Brook wrote: > > > > > > Size not multiple I think is legitimate, the below-4G chunk isn't > > > > > > required to end 2M aligned, all it matters is that the above-4G > > > > > > then star

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Paul Brook
> On Wed, Mar 17, 2010 at 03:52:15PM +, Paul Brook wrote: > > > > > Size not multiple I think is legitimate, the below-4G chunk isn't > > > > > required to end 2M aligned, all it matters is that the above-4G > > > > > then starts aligned. In short one thing to add in the future as > > > > > par

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Andrea Arcangeli
On Wed, Mar 17, 2010 at 03:52:15PM +, Paul Brook wrote: > > > > Size not multiple I think is legitimate, the below-4G chunk isn't > > > > required to end 2M aligned, all it matters is that the above-4G then > > > > starts aligned. In short one thing to add in the future as parameter > > > > to

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Paul Brook
> > > Size not multiple I think is legitimate, the below-4G chunk isn't > > > required to end 2M aligned, all it matters is that the above-4G then > > > starts aligned. In short one thing to add in the future as parameter > > > to qemu_ram_alloc is the physical address that the host virtual > > > a

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Andrea Arcangeli
On Wed, Mar 17, 2010 at 03:21:26PM +, Paul Brook wrote: > > On Wed, Mar 17, 2010 at 03:05:57PM +, Paul Brook wrote: > > > > + if (size >= PREFERRED_RAM_ALIGN) > > > > + new_block->host = qemu_memalign(PREFERRED_RAM_ALIGN, > > > > size); > > > > > > Is this deliberately b

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Paul Brook
> On Wed, Mar 17, 2010 at 03:05:57PM +, Paul Brook wrote: > > > + if (size >= PREFERRED_RAM_ALIGN) > > > + new_block->host = qemu_memalign(PREFERRED_RAM_ALIGN, > > > size); > > > > Is this deliberately bigger-than rather than multiple-of? > > Having the size not be a multipl

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Andrea Arcangeli
On Wed, Mar 17, 2010 at 03:05:57PM +, Paul Brook wrote: > > + if (size >= PREFERRED_RAM_ALIGN) > > + new_block->host = qemu_memalign(PREFERRED_RAM_ALIGN, size); > > > > Is this deliberately bigger-than rather than multiple-of? > Having the size not be a multiple of alignme

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Paul Brook
> +    if (size >= PREFERRED_RAM_ALIGN)
> +        new_block->host = qemu_memalign(PREFERRED_RAM_ALIGN, size);
>
Is this deliberately bigger-than rather than multiple-of? Having the size not be a multiple of alignment seems somewhat strange, it's always going to be wrong at one end...
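To make the "wrong at one end" remark concrete, here is a minimal sketch in C. It assumes the block starts on an alignment boundary; the helper name hugepage_coverable_bytes is hypothetical and does not come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): number of bytes in a block
 * that whole huge pages can cover, assuming the block starts on an
 * `align` boundary. `align` must be non-zero. */
static size_t hugepage_coverable_bytes(size_t size, size_t align)
{
    assert(align != 0);
    /* Whatever is left after the last full `align`-sized chunk is a
     * tail that can only be backed by normal pages. */
    return size - (size % align);
}
```

With a 2M alignment, a 5M block would leave a 1M tail outside any huge page, while a block smaller than 2M gets no coverage at all.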

[Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #3

2010-03-17 Thread Andrea Arcangeli
From: Andrea Arcangeli This will allow proper alignment so NPT/EPT can take advantage of linux host backing the guest memory with hugepages. It also ensures that when KVM isn't used the first 2M of guest physical memory are backed by a large TLB. To complete it, it will also notify the kernel tha
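The patch body is not shown in this index. As a rough sketch of the idea in the description (align the allocation, then tell the kernel the range should be hugepage-backed), something like the following could be used on a Linux host. The helper name alloc_guest_ram, the 2M value, and the 4096-byte fallback alignment are assumptions for illustration; the patch itself uses qemu_memalign, which is replaced here by posix_memalign.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Assumed value: the thread discusses 2M huge pages on x86-64 hosts. */
#define PREFERRED_RAM_ALIGN ((size_t)2 * 1024 * 1024)

/* Hypothetical helper, not the function from the patch. */
static void *alloc_guest_ram(size_t size)
{
    void *host = NULL;
    /* Only large blocks get the huge-page alignment; small ones fall
     * back to an assumed 4096-byte page alignment. */
    size_t align = (size >= PREFERRED_RAM_ALIGN) ? PREFERRED_RAM_ALIGN : 4096;

    if (posix_memalign(&host, align, size) != 0)
        return NULL;
#ifdef MADV_HUGEPAGE
    /* Advisory only: the result is ignored, since kernels without
     * transparent hugepage support reject the request. */
    if (size >= PREFERRED_RAM_ALIGN)
        (void)madvise(host, size, MADV_HUGEPAGE);
#endif
    return host;
}
```

A caller would release the block with free(), since posix_memalign memory is ordinary heap memory.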

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #2

2010-03-16 Thread Paul Brook
> +#if defined(__linux__) && defined(__x86_64__)
> +#define MAX_TRANSPARENT_HUGEPAGE_SIZE (2*1024*1024)
> +    if (size >= MAX_TRANSPARENT_HUGEPAGE_SIZE)

I'd prefer something like:

#if defined(__linux__) && defined(__x86_64__)
/* [...Allow the host to use huge pages easily...]. */
#define PRE

Re: [Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #2

2010-03-16 Thread Jamie Lokier
Andrea Arcangeli wrote:
> + * take advantage of hugepages with NPT/EPP or to

Spelling: NPT/EPT?

-- Jamie

[Qemu-devel] [PATCH QEMU] Transparent Hugepage Support #2

2010-03-16 Thread Andrea Arcangeli
From: Andrea Arcangeli This will allow proper alignment so NPT/EPT can take advantage of linux host backing the guest memory with hugepages. It also ensures that when KVM isn't used the first 2M of guest physical memory are backed by a large TLB. To complete it, it will also notify the kernel tha

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-13 Thread Andrea Arcangeli
On Sat, Mar 13, 2010 at 10:28:32AM +0200, Avi Kivity wrote: > On 03/11/2010 06:05 PM, Andrea Arcangeli wrote: > > On Thu, Mar 11, 2010 at 05:52:16PM +0200, Avi Kivity wrote: > > > >> That is a little wasteful. How about a hint to mmap() requesting proper > >> alignment (MAP_HPAGE_ALIGN)? > >>

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-13 Thread Avi Kivity
On 03/11/2010 06:05 PM, Andrea Arcangeli wrote: On Thu, Mar 11, 2010 at 05:52:16PM +0200, Avi Kivity wrote: That is a little wasteful. How about a hint to mmap() requesting proper alignment (MAP_HPAGE_ALIGN)? So you suggest adding a new kernel feature to mmap? Not sure if it's worth

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Jamie Lokier
Andrea Arcangeli wrote: > On Fri, Mar 12, 2010 at 06:41:56PM +, Paul Brook wrote: > > Doesn't non-PAE (i.e. most 32-bit x86) use 4M huge pages? There's still a > > good > > number of those knocking about. > > Yep, but 32bit x86 host won't support transparent hugepage (certain > bits overflow

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Andrea Arcangeli
On Fri, Mar 12, 2010 at 06:41:56PM +, Paul Brook wrote: > Doesn't non-PAE (i.e. most 32-bit x86) use 4M huge pages? There's still a > good > number of those knocking about. Yep, but 32bit x86 host won't support transparent hugepage (certain bits overflow in the kernel implementation, there'

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Paul Brook
>My point is that there is no need to show the smaller page sizes to >userland, only the max one is relevant and this isn't going to change >and I'm uncomfortable to add plural stuff to a patch that doesn't >contemplate mixed page sizes and for the time being multiple hpage size >isn't even on the

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Andrea Arcangeli
On Fri, Mar 12, 2010 at 06:17:05PM +, Paul Brook wrote: > > > No particular preference. Or you could have .../page_sizes list all > > > available sizes, and have qemu take the first one (or last depending on > > > sort order). > > > > That would also work. Considering that the current transpar

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Paul Brook
> > No particular preference. Or you could have .../page_sizes list all > > available sizes, and have qemu take the first one (or last depending on > > sort order). > > That would also work. Considering that the current transparent > hugepage support won't support any more than 1 page, I think it'

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Andrea Arcangeli
On Fri, Mar 12, 2010 at 05:10:54PM +, Paul Brook wrote: > > > So shouldn't [the name of] the value the kernel provides for recommended > > > alignment be equally implementation agnostic? > > > > Is sys/kernel/mm/transparent_hugepage directory implementation > > agnostic in the first place? >

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Paul Brook
> > So shouldn't [the name of] the value the kernel provides for recommended > > alignment be equally implementation agnostic? > > Is sys/kernel/mm/transparent_hugepage directory implementation > agnostic in the first place? It's about as agnostic as MADV_HUGEPAGE :-) > If we want to fully take

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Andrea Arcangeli
On Fri, Mar 12, 2010 at 04:24:24PM +, Paul Brook wrote: > > On Fri, Mar 12, 2010 at 04:04:03PM +, Paul Brook wrote: > > > > > > $ cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size > > > > > > 2097152 > > > > > Hmm, ok. I'm guessing linux doesn't support anything other than "huge" > > >

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Paul Brook
> On Fri, Mar 12, 2010 at 04:04:03PM +, Paul Brook wrote: > > > > > $ cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size > > > > > 2097152 > > > Hmm, ok. I'm guessing linux doesn't support anything other than "huge" > > and "normal" page sizes now, so it's a question of whether we want it t

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Andrea Arcangeli
On Fri, Mar 12, 2010 at 04:04:03PM +, Paul Brook wrote: > > > > $ cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size > > > > 2097152 > > > > > > Is "pmd" x86 specific? > > > > It's linux specific, this is common code, nothing x86 specific. In > > fact on x86 it's not called pmd but Page Di
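As a sketch of how a userland program might consume the sysfs value quoted above, the following C helper reads the file and falls back to a fixed size when it is missing or malformed. The helper name read_hpage_size and the 2M fallback are assumptions, not something taken from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed fallback when the sysfs file is absent or unusable. */
#define FALLBACK_HPAGE_SIZE (2UL * 1024 * 1024)

/* Hypothetical helper: read a huge page size in bytes from `path`. */
static unsigned long read_hpage_size(const char *path)
{
    unsigned long size = 0;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return FALLBACK_HPAGE_SIZE;
    /* Accept only a non-zero power of two; anything else is treated
     * as unusable and replaced by the fallback. */
    if (fscanf(f, "%lu", &size) != 1 || size == 0 || (size & (size - 1)) != 0)
        size = FALLBACK_HPAGE_SIZE;
    fclose(f);
    return size;
}
```

It would be called as read_hpage_size("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size"), with the result used as the preferred alignment.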

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Paul Brook
> > allocates it on a 2M boundary. I suspect you actually want (base % 2M) == > > 1M. Aligning on a 1M boundary will only DTRT half the time. > > The 1m-end is a hypothetical worry that came to mind as I was > discussing the issue with you. Basically my point is that if the pc.c > code will chang

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Paul Brook
> > > $ cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size > > > 2097152 > > > > Is "pmd" x86 specific? > > It's linux specific, this is common code, nothing x86 specific. In > fact on x86 it's not called pmd but Page Directory. I've actually no > idea what pmd stands for but it's definitely n

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Andrea Arcangeli
On Fri, Mar 12, 2010 at 11:36:33AM +, Paul Brook wrote: > > On Thu, Mar 11, 2010 at 05:55:10PM +, Paul Brook wrote: > > > sysconf(_SC_HUGEPAGESIZE); would seem to be the obvious answer. > > > > There's not just one hugepage size > > We only have one madvise flag... Transparent hugepage

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-12 Thread Paul Brook
> On Thu, Mar 11, 2010 at 05:55:10PM +, Paul Brook wrote: > > sysconf(_SC_HUGEPAGESIZE); would seem to be the obvious answer. > > There's not just one hugepage size We only have one madvise flag... > and that thing doesn't exist yet > plus it'd require mangling over glibc too. If it existed

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-11 Thread Andrea Arcangeli
On Thu, Mar 11, 2010 at 05:55:10PM +, Paul Brook wrote: > sysconf(_SC_HUGEPAGESIZE); would seem to be the obvious answer. There's not just one hugepage size and that thing doesn't exist yet plus it'd require mangling over glibc too. If it existed I could use it but I think this is better: $ c

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-11 Thread Paul Brook
> On Thu, Mar 11, 2010 at 04:28:04PM +, Paul Brook wrote: > > > > + /* > > > > +* Align on HPAGE_SIZE so "(gfn ^ pfn)& > > > > +* (HPAGE_SIZE-1) == 0" to allow KVM to take advantage > > > > +* of hugepages with NPT/EPT. > > > > +

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-11 Thread Andrea Arcangeli
On Thu, Mar 11, 2010 at 04:28:04PM +, Paul Brook wrote: > > > + /* > > > + * Align on HPAGE_SIZE so "(gfn ^ pfn)& > > > + * (HPAGE_SIZE-1) == 0" to allow KVM to take advantage > > > + * of hugepages with NPT/EPT. > > > + */ > > > + new_block->

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-11 Thread Paul Brook
> > + /* > > +* Align on HPAGE_SIZE so "(gfn ^ pfn)& > > +* (HPAGE_SIZE-1) == 0" to allow KVM to take advantage > > +* of hugepages with NPT/EPT. > > +*/ > > + new_block->host = qemu_memalign(1<< TARGET_HPAGE_BITS, size); This sh
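The condition in the quoted comment can be written as a small predicate. This is a sketch under assumptions: it works on byte addresses rather than frame numbers, takes HPAGE_SIZE to be 2M, and uses the hypothetical name hugepage_mappable. Note the extra parentheses: in C, `==` binds tighter than `&`, so the expression needs them to mean what the comment intends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 2M huge page size, as discussed for x86-64 in the thread. */
#define HPAGE_SIZE ((uint64_t)2 * 1024 * 1024)

/* Hypothetical helper: true when the guest physical address and the
 * host virtual address share the same offset within a huge page, so
 * one huge page can back both sides of the mapping. */
static bool hugepage_mappable(uint64_t guest_addr, uint64_t host_addr)
{
    return ((guest_addr ^ host_addr) & (HPAGE_SIZE - 1)) == 0;
}
```

The check compares offsets rather than requiring both addresses to be 2M-aligned: a guest address at 1M pairs with a host address whose offset within a 2M page is also 1M.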

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-11 Thread Andrea Arcangeli
On Thu, Mar 11, 2010 at 05:52:16PM +0200, Avi Kivity wrote: > That is a little wasteful. How about a hint to mmap() requesting proper > alignment (MAP_HPAGE_ALIGN)? So you suggest adding a new kernel feature to mmap? Not sure if it's worth it, considering it'd also increase the number of vmas be

Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-11 Thread Avi Kivity
On 03/11/2010 05:14 PM, Andrea Arcangeli wrote: From: Andrea Arcangeli This will allow proper alignment so NPT/EPT can take advantage of linux host backing the guest memory with hugepages (only relevant for KVM and not for QEMU that has no NPT/EPT support). To complete it, it will also notify th

[Qemu-devel] [PATCH QEMU] transparent hugepage support

2010-03-11 Thread Andrea Arcangeli
From: Andrea Arcangeli This will allow proper alignment so NPT/EPT can take advantage of linux host backing the guest memory with hugepages (only relevant for KVM and not for QEMU that has no NPT/EPT support). To complete it, it will also notify the kernel that this memory is important to be back