Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Derek Murray
Hi Gerd, Gerd Hoffmann wrote: Want reproduce? Here we go: * grab xenner 0.8 from http://dl.bytesex.org/releases/xenner/ * grab a xenified dom0 kernel without blktap driver (either not compiled or module not loaded). * start xend * start blkbackd from xenner package (you probably wa

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Gerd Hoffmann
Stephen C. Tweedie wrote: > I can't help wondering if this is a hint that now is the time to find a > better API, which doesn't have the requirement (a) that seems to be > causing such trouble? Are other PV guests --- *BSD, Solaris --- going > to have the same problems with their VM layers if they

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Derek Murray
Gerd, Can you try the attached patch against linux-2.6.18-xen.hg? I think the problem was that the gntdev VMA is not marked as being VM_PFNMAP, therefore it tries to get a struct page_struct for each granted page when it is unmapped (and maybe sometimes succeeds (incorrectly), which could be

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Derek Murray
Keir Fraser wrote: Is this patch to go into linux-2.6.18-xen.hg then? Yes, even if it doesn't fix the exact bug we're seeing here, I think it should go in. I've attached a version with my signed-off-by and a better commit comment. Cheers, Derek. # HG changeset patch # User [EMAIL PROTECTED

[PATCH 4/9] remove references to cr8 register

2007-12-05 Thread Glauber de Oliveira Costa
As pointed out by Andi, linux never really uses this register so saving and restoring is not really necessary. This patch removes all references to it. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- arch/x86/kernel/asm-offsets_64.c |1 - arch/x86/kernel/suspend_64.c |

[PATCH 2/9] put together equal pieces of system.h

2007-12-05 Thread Glauber de Oliveira Costa
This patch puts together pieces of system_{32,64}.h that look the same. It's the first step towards integration of this file. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- arch/x86/kernel/process_64.c |2 +- include/asm-x86/system.h | 70

[PATCH 6/9] remove unused macro

2007-12-05 Thread Glauber de Oliveira Costa
Mr. Grep says warn_if_not_ulong() is not used anymore anywhere in the code. So, we remove it. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- include/asm-x86/system_64.h |2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/include/asm-x86/system_64.h b/includ

[PATCH 9/9] unify system.h

2007-12-05 Thread Glauber de Oliveira Costa
This patch finishes the unification of system.h file. i386 needs a constant to be defined, and it is defined inside an ifdef Other than that, pretty much nothing but includes are left in the arch specific headers, and they are deleted. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Derek Murray
Keir Fraser wrote: Yes, this would work okay I suspect. Good enough as a stop-gap measure? Are there any other responsibilities that you acquire if you make use of VM_FOREIGN (in particular, how would this affect get_user_pages)? VM_FOREIGN is already set for the gntdev VMA (mostly because it's

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Derek Murray
Jeremy Fitzhardinge wrote: Could we use one of the software-defined bits in the PTE to indicate that this is a foreign/granted PTE, and have set_pte_at behave differently if you pass it a pte with this bit set? Actually, as Gerd pointed out in his answer to his own question, the use of VM_DONT

[PATCH 5/9] unify paravirt parts of system.h

2007-12-05 Thread Glauber de Oliveira Costa
This patch moves the i386 control registers manipulation functions, wbinvd, and clts functions to system.h. They are essentially the same as in x86_64. With this, system.h paravirt comes for free in x86_64. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- include/asm-x86/system.h

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Derek Murray
Keir Fraser wrote: Actually I'm not so sure now. Presumably you add VM_PFNMAP to make vm_normal_page() return NULL? But actually I would expect pte_pfn() to return max_mapnr because the mapped page is not a local page. And that should cause vm_normal_page() to return NULL always, regardless of w
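
The two routes to a NULL result discussed in this message can be sketched in user space. This is a toy model only, not the kernel's vm_normal_page(): every name below (toy_vma, TOY_VM_PFNMAP, TOY_MAX_MAPNR, toy_vm_normal_page) is a hypothetical stand-in, and the real function takes more arguments and performs more checks. It assumes that a VMA flagged as a raw PFN mapping never yields a page structure, and that a frame number at or beyond the local page array cannot yield one either.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define TOY_VM_PFNMAP 0x1u    /* models the VM_PFNMAP flag on a VMA */
#define TOY_MAX_MAPNR 1024ul  /* models the number of local page frames */

struct toy_page { unsigned long refcount; };
struct toy_vma  { unsigned int flags; };

/* Models the array of per-frame page structures for local memory. */
static struct toy_page toy_mem_map[TOY_MAX_MAPNR];

/*
 * Return the page structure backing pfn, or NULL when the mapping
 * must not be reference counted.
 */
static struct toy_page *toy_vm_normal_page(const struct toy_vma *vma,
                                           unsigned long pfn)
{
    /* Route 1: the VMA is marked as a raw PFN mapping. */
    if (vma->flags & TOY_VM_PFNMAP)
        return NULL;

    /* Route 2: the frame is not a local page, so no structure exists. */
    if (pfn >= TOY_MAX_MAPNR)
        return NULL;

    return &toy_mem_map[pfn];
}
```

In this model either check alone is enough to stop a granted frame from being treated as a normal page, which is the point being debated: if the second check always fires for foreign frames, the flag would be redundant.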

[PATCH 8/9] move switch_to macro to system.h

2007-12-05 Thread Glauber de Oliveira Costa
This patch moves the switch_to() macro to system.h As those macros are fundamentally different between i386 and x86_64, they are enclosed around an ifdef. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- include/asm-x86/system.h| 61 +

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Derek Murray
Keir Fraser wrote: Need to bite the bullet and fix this properly by setting a software flag in ptes that are not subject to reference counting. Could we get away with testing the VM_FOREIGN flag in vm_normal_page()? Although I get the impression that this wouldn't be easily justified if tryin

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Jeremy Fitzhardinge
Derek Murray wrote: > Ultimately, fork calls dup_mm, which calls, dup_mmap, which calls > copy_{page,pud,pmd,pte}_range, which calls copy_one_pte, which calls > set_pte_at, which hypercalls HYPERVISOR_update_va_mapping. > > The hypercall will not succeed and will return an error code > indicating t
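
The call chain quoted above can be sketched as a small user-space model. It is an illustration under stated assumptions, not kernel or Xen code: the toy_ names are hypothetical, the toy "hypercall" simply refuses any entry marked foreign, and the -1 it returns is a placeholder because the actual error code is not given in the quoted text. The real set_pte_at() also does not return a status; the model returns one only to make the failure visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page-table entry. */
struct toy_pte {
    unsigned long frame;  /* frame number the entry points at */
    bool foreign;         /* true if the frame was granted by another domain */
};

/*
 * Stand-in for the hypercall at the end of the chain. It refuses to
 * install an entry for a foreign frame. -1 is a placeholder error value.
 */
static int toy_update_va_mapping(struct toy_pte *slot, struct toy_pte val)
{
    if (val.foreign)
        return -1;
    *slot = val;
    return 0;
}

/* Stand-in for set_pte_at: forwards to the toy hypercall. */
static int toy_set_pte_at(struct toy_pte *slot, struct toy_pte val)
{
    return toy_update_va_mapping(slot, val);
}

/* Stand-in for copy_one_pte: copies one parent entry into the child. */
static int toy_copy_one_pte(struct toy_pte *dst, const struct toy_pte *src)
{
    return toy_set_pte_at(dst, *src);
}

/*
 * Stand-in for the copy_*_range walk that fork performs.
 * Returns how many entries could not be copied into the child.
 */
static size_t toy_copy_pte_range(struct toy_pte *dst,
                                 const struct toy_pte *src, size_t n)
{
    size_t failed = 0;

    for (size_t i = 0; i < n; i++) {
        if (toy_copy_one_pte(&dst[i], &src[i]) != 0)
            failed++;
    }
    return failed;
}
```

Copying a three-entry range in which one entry is foreign leaves that child slot untouched and reports one failure, while the ordinary entries are duplicated as usual.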

[PATCH 0/9 - v2] Integrate system.h

2007-12-05 Thread Glauber de Oliveira Costa
Hi, At Ingo's request, here it goes a new patchset, that actually applies ontop of the x86 tree (mm branch). Besides this issue, I've also included a patch that remove the cr8 references, as Andi suggested.

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Jeremy Fitzhardinge
Derek Murray wrote: > Jeremy Fitzhardinge wrote: >> Could we use one of the software-defined bits in the PTE to indicate >> that this is a foreign/granted PTE, and have set_pte_at behave >> differently if you pass it a pte with this bit set? > > Actually, as Gerd pointed out in his answer to his ow

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Keir Fraser
On 5/12/07 17:17, "Derek Murray" <[EMAIL PROTECTED]> wrote: >> Actually I'm not so sure now. Presumably you add VM_PFNMAP to make >> vm_normal_page() return NULL? But actually I would expect pte_pfn() to >> return max_mapnr because the mapped page is not a local page. And that >> should cause vm_n

[PATCH 3/9] unify load_segment macro

2007-12-05 Thread Glauber de Oliveira Costa
This patch unifies the load_segment() macro, making them equal in both x86_64 and i386 architectures. The common version goes to system.h, and the old are deleted. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- include/asm-x86/system.h| 21 + include/as

Re: [PATCH 0/9 - v2] Integrate system.h

2007-12-05 Thread Ingo Molnar
* Glauber de Oliveira Costa <[EMAIL PROTECTED]> wrote: > At Ingo's request, here it goes a new patchset, that actually applies > ontop of the x86 tree (mm branch). Besides this issue, I've also > included a patch that remove the cr8 references, as Andi suggested. thanks - i've picked them up.

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Keir Fraser
On 5/12/07 20:15, "Jeremy Fitzhardinge" <[EMAIL PROTECTED]> wrote: > In 2.6.18-xen the only two implementations of zap_pte are > blktap_clear_pte and gntdev_clear_pte. Given a ptep with the > grant-mapping bit set, could we determine which of these need calling > and do the appropriate thing? Do

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Keir Fraser
On 5/12/07 17:48, "Derek Murray" <[EMAIL PROTECTED]> wrote: > Keir Fraser wrote: >> Need to bite the bullet and fix this properly by setting a software flag in >> ptes that are not subject to reference counting. > > Could we get away with testing the VM_FOREIGN flag in vm_normal_page()? > Althoug

[PATCH 7/9] unify smp parts of system.h

2007-12-05 Thread Glauber de Oliveira Costa
The memory barrier parts of system.h are not very different between i386 and x86_64, the main difference being the availability of instructions, which we handle with the use of ifdefs. They are consolidated in system.h file, and then removed from the arch-specific headers. Signed-off-by: Glauber

[PATCH 1/9] remove volatile keyword from clflush.

2007-12-05 Thread Glauber de Oliveira Costa
The p parameter is an explicit memory reference, and is enough to prevent gcc from being nasty here. The volatile seems entirely unnecessary. Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> --- include/asm-x86/system_32.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) di

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Gerd Hoffmann
>> Alternatively, could we use the _PAGE_GNTTAB PTE flag that is used for >> debugging? Indeed, if we did this, could be obviate the need for the >> PTE-zapping hook, by instead catching the case where this flag is set, >> and unmapping the grant implicitly? > > Well, in the general case you don't

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Keir Fraser
On 5/12/07 14:30, "Derek Murray" <[EMAIL PROTECTED]> wrote: > Keir Fraser wrote: >> Is this patch to go into linux-2.6.18-xen.hg then? > > Yes, even if it doesn't fix the exact bug we're seeing here, I think it > should go in. I've attached a version with my signed-off-by and a better > commit co

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Geoffrey Lefebvre
> Can we take a different approach from the zap_pte hook? Given that > we're 1) planning on claiming a pte bit for grant mappings, and 2) need > to hook ptep_get_and_clear anyway to solve the mprotect performance > problems, couldn't we just special-case grant mapping pte_clears? > > In 2.6.18-xen
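
The idea quoted above, claiming a PTE bit for grant mappings and special-casing the clear path, might look roughly like the following user-space sketch. All names are hypothetical (toy_pte_t, TOY_PAGE_GRANT, toy_unmap_grant, toy_ptep_get_and_clear), and the choice of bit 9 is an assumption made for illustration: x86 leaves bits 9 to 11 of a page-table entry available to software, but the thread does not say which bit would be used. The unmap step is modelled as a counter rather than a real grant-table operation.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_pte_t;  /* hypothetical PTE representation */

#define TOY_PAGE_PRESENT ((toy_pte_t)1 << 0)
#define TOY_PAGE_GRANT   ((toy_pte_t)1 << 9)  /* assumed software-defined bit */

/* Counts how many grant unmaps the model has performed. */
static unsigned int toy_grant_unmaps;

/* Stand-in for the explicit grant unmap that a real backend would issue. */
static void toy_unmap_grant(toy_pte_t pte)
{
    (void)pte;
    toy_grant_unmaps++;
}

/*
 * Stand-in for ptep_get_and_clear: returns the old entry and clears the
 * slot, taking the grant-specific path when the software bit is set.
 */
static toy_pte_t toy_ptep_get_and_clear(toy_pte_t *ptep)
{
    toy_pte_t old = *ptep;

    if (old & TOY_PAGE_GRANT)
        toy_unmap_grant(old);

    *ptep = 0;
    return old;
}
```

With the bit carried in the entry itself, the clear path needs no per-VMA hook to decide what to do, which is the attraction of the proposal; deciding which driver's cleanup applies is the open question raised in the thread.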

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Derek Murray
Stephen C. Tweedie wrote: So... the interface (a) cannot be used on the Linux VM without at least one invasive VM modification, due to the requirement of ptes being explicitly unmapped via hypercall; Also there is the use of VM_FOREIGN (http://xenbits.xensource.com/linux-2.6.18-xen.hg?file/b27

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Keir Fraser
On 5/12/07 14:12, "Gerd Hoffmann" <[EMAIL PROTECTED]> wrote: >> Thanks for the repro details. I'll have a go at this later. One thing we >> haven't tested AFAIK is mapping grants in the same domain: could you >> check to see if the bug is the same if you attach a block device to a >> domain other

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Gerd Hoffmann
Hi, > Thanks for the repro details. I'll have a go at this later. One thing we > haven't tested AFAIK is mapping grants in the same domain: could you > check to see if the bug is the same if you attach a block device to a > domain other than Dom0? Also, could you send any Xen console output, if

Re: [Xen-devel] Re: Next steps with pv_ops for Xen

2007-12-05 Thread Gerd Hoffmann
Hi, > gntdev doesn't even try to handle forking. I wouldn't be surprised if > that is a great way to kill Domain-0. The xen hypervisor will most > likely not be amused to find a pte referring to a granted (but foreign) > page which wasn't established using the grant table interface. Pinning >