Re: Out of sync shadow core breaks Hurd

2008-11-25 Thread Aurelien Jarno
On Tue, Nov 25, 2008 at 10:57:17AM +0100, Aurelien Jarno wrote:
> On Thu, Nov 20, 2008 at 10:48:21AM +0100, Marcelo Tosatti wrote:
> > Hi Aurelien,
> Hi,
> 
> > On Wed, Nov 12, 2008 at 08:00:37PM +0100, Aurelien Jarno wrote:
> > > Hi,
> > > 
> > > Starting with kvm-76 (and including kvm-79), Hurd does not boot anymore
> > > under KVM. The ext2fs translator issues a strange error message:
> > > 
> > > | Hurd server bootstrap: ext2fs.static[device:hd0s3] execext2fs.static: /build/bui
> > > | ldd/hurd-20080607/build-tree/hurd/ext2fs/dir.c:494: dirscanblock: Assertion `dp-
> > > | >dn->dirents[idx] == -1 || dp->dn->dirents[idx] == nentries' failed.
> > > 
> > > Bisecting the problem, I have found that it comes from this patch:
> > > 
> > > | 641fb03992b20aa640781a245f6b7136f0b845e4 is first bad commit
> > > | commit 641fb03992b20aa640781a245f6b7136f0b845e4
> > > | Author: Marcelo Tosatti <[EMAIL PROTECTED]>
> > > | Date:   Tue Sep 23 13:18:39 2008 -0300
> > > | 
> > > | KVM: MMU: out of sync shadow core v2
> > > | 
> > > | Allow guest pagetables to go out of sync.
> > > | 
> > > | Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>
> > > | Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
> > > 
> > > The problem can be worked around by loading the kvm module with
> > > oos_shadow=0.
> > > 
> > > The easiest way to reproduce the problem is to download a ready to use
> > > Hurd image [1]. The error message from the ext2fs translator is not
> > > exactly the same, but it still fails.
> > 
> > It seems Hurd does not always explicitly flush the TLB via cr0/cr3/cr4
> > writes or invlpg after updating pagetables. Debugging shows that OOS is
> > properly syncing the sptes wrt the guest pagetables, and that all pages
> > are synced before guest re-entry on TLB flush exits.
> 

Looking more closely at the code, Hurd (actually GNU Mach) does flush the
TLB via cr3, but *before* updating the pagetables rather than after. I have
no idea why it is done that way, but it seems to be correct given the way
the Intel MMU works. However, it does not comply with the recommendations
from Intel ("5.2 Recommended Invalidation"), which, if I understand
correctly, were taken as the basis for implementing out of sync shadow.

I have confirmed this by patching the GNU Mach code so that the TLB is
flushed both before and after the pagetables are modified.
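
To make the ordering issue concrete, here is a toy model in C. It is only a
sketch under loud assumptions: the struct, the oos_* names and the "resync
everything on flush" rule are invented for illustration and are not taken
from KVM or GNU Mach. The model assumes that the shadow copy walked by the
hardware is refreshed only when the guest flushes its TLB, which is a
simplification of what out of sync shadow does.

```c
#include <assert.h>
#include <stddef.h>

#define OOS_NPAGES 8

/* Toy model only: layout and names are hypothetical, not KVM's. */
struct oos_mmu {
	unsigned guest_pt[OOS_NPAGES];	/* page table as the guest writes it */
	unsigned shadow_pt[OOS_NPAGES];	/* copy that translations actually use */
};

/* Guest stores a PTE; the shadow copy is deliberately left stale. */
static void oos_write_pte(struct oos_mmu *m, size_t vpn, unsigned pfn)
{
	assert(vpn < OOS_NPAGES);
	m->guest_pt[vpn] = pfn;
}

/* Guest TLB flush (think cr3 reload): the only point where the shadow resyncs. */
static void oos_flush_tlb(struct oos_mmu *m)
{
	for (size_t i = 0; i < OOS_NPAGES; i++)
		m->shadow_pt[i] = m->guest_pt[i];
}

/* Translation as seen by guest memory accesses: reads the shadow copy. */
static unsigned oos_translate(const struct oos_mmu *m, size_t vpn)
{
	assert(vpn < OOS_NPAGES);
	return m->shadow_pt[vpn];
}

/* Ordering described above: flush first, then update the pagetable. */
static unsigned oos_map_flush_before(struct oos_mmu *m, size_t vpn, unsigned pfn)
{
	oos_flush_tlb(m);
	oos_write_pte(m, vpn, pfn);
	return oos_translate(m, vpn);	/* still the old mapping in this model */
}

/* Recommended ordering: update the pagetable, then flush. */
static unsigned oos_map_flush_after(struct oos_mmu *m, size_t vpn, unsigned pfn)
{
	oos_write_pte(m, vpn, pfn);
	oos_flush_tlb(m);
	return oos_translate(m, vpn);	/* the new mapping */
}
```

In this model, oos_map_flush_before() on a zeroed struct returns the old
mapping (0) for the page it just updated, while oos_map_flush_after() returns
the new one. Real hardware with an empty TLB would simply re-walk the guest
pagetable, which is presumably why the flush-first ordering works there.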

-- 
  .''`.  Aurelien Jarno | GPG: 1024D/F1BCDB73
 : :' :  Debian developer   | Electrical Engineer
 `. `'   [EMAIL PROTECTED] | [EMAIL PROTECTED]
   `-people.debian.org/~aurel32 | www.aurel32.net
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Out of sync shadow core breaks Hurd

2008-11-25 Thread Aurelien Jarno
On Thu, Nov 20, 2008 at 10:48:21AM +0100, Marcelo Tosatti wrote:
> Hi Aurelien,
Hi,

> On Wed, Nov 12, 2008 at 08:00:37PM +0100, Aurelien Jarno wrote:
> > Hi,
> > 
> > Starting with kvm-76 (and including kvm-79), Hurd does not boot anymore
> > under KVM. The ext2fs translator issues a strange error message:
> > 
> > | Hurd server bootstrap: ext2fs.static[device:hd0s3] execext2fs.static: /build/bui
> > | ldd/hurd-20080607/build-tree/hurd/ext2fs/dir.c:494: dirscanblock: Assertion `dp-
> > | >dn->dirents[idx] == -1 || dp->dn->dirents[idx] == nentries' failed.
> > 
> > Bisecting the problem, I have found that it comes from this patch:
> > 
> > | 641fb03992b20aa640781a245f6b7136f0b845e4 is first bad commit
> > | commit 641fb03992b20aa640781a245f6b7136f0b845e4
> > | Author: Marcelo Tosatti <[EMAIL PROTECTED]>
> > | Date:   Tue Sep 23 13:18:39 2008 -0300
> > | 
> > | KVM: MMU: out of sync shadow core v2
> > | 
> > | Allow guest pagetables to go out of sync.
> > | 
> > | Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>
> > | Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
> > 
> > The problem can be worked around by loading the kvm module with
> > oos_shadow=0.
> > 
> > The easiest way to reproduce the problem is to download a ready to use
> > Hurd image [1]. The error message from the ext2fs translator is not
> > exactly the same, but it still fails.
> 
> It seems Hurd does not always explicitly flush the TLB via cr0/cr3/cr4
> writes or invlpg after updating pagetables. Debugging shows that OOS is
> properly syncing the sptes wrt the guest pagetables, and that all pages
> are synced before guest re-entry on TLB flush exits.

Thanks for your investigation.

> The Intel TLB doc says (5.1 "Invalidation Instructions"):
> 
> (Other instructions and operations may invalidate entries in the TLBs
> and the paging structure caches, but the instructions identified above
> are recommended.)
> 
> As a test, syncing on every exit makes it happy:
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7a2aeba..47e2550 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3052,6 +3052,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>  
>  	kvm_lapic_sync_from_vapic(vcpu);
>  
> +	kvm_mmu_sync_roots(vcpu);
> +
>  	r = kvm_x86_ops->handle_exit(kvm_run, vcpu);
>  out:
>  	return r;
> 
> It would be necessary to confirm this by hacking Hurd to flush on every
> pagetable update. Perhaps something like
> 
> RCS file: /sources/hurd/gnumach/i386/intel/pmap.c,v
> retrieving revision 1.4.2.22
> diff -u -r1.4.2.22 pmap.c
> --- pmap.c  11 Nov 2008 02:24:18 -  1.4.2.22
> +++ pmap.c  20 Nov 2008 12:47:01 -
> @@ -82,7 +82,7 @@
>  #include 
>  #include 
>  
> -#define WRITE_PTE(pte_p, pte_entry) *(pte_p) = (pte_entry);
> +#define WRITE_PTE(pte_p, pte_entry) *(pte_p) = (pte_entry); flush_tlb();
>  
>  /*
>   * Private data structures.
> 
> 

I have tried this patch, but it doesn't change anything. I'll check
whether there are other places where the PTE is written.

-- 
  .''`.  Aurelien Jarno | GPG: 1024D/F1BCDB73
 : :' :  Debian developer   | Electrical Engineer
 `. `'   [EMAIL PROTECTED] | [EMAIL PROTECTED]
   `-people.debian.org/~aurel32 | www.aurel32.net


Re: Out of sync shadow core breaks Hurd

2008-11-20 Thread Marcelo Tosatti
Hi Aurelien,

On Wed, Nov 12, 2008 at 08:00:37PM +0100, Aurelien Jarno wrote:
> Hi,
> 
> Starting with kvm-76 (and including kvm-79), Hurd does not boot anymore
> under KVM. The ext2fs translator issues a strange error message:
> 
> | Hurd server bootstrap: ext2fs.static[device:hd0s3] execext2fs.static: /build/bui
> | ldd/hurd-20080607/build-tree/hurd/ext2fs/dir.c:494: dirscanblock: Assertion `dp-
> | >dn->dirents[idx] == -1 || dp->dn->dirents[idx] == nentries' failed.
> 
> Bisecting the problem, I have found that it comes from this patch:
> 
> | 641fb03992b20aa640781a245f6b7136f0b845e4 is first bad commit
> | commit 641fb03992b20aa640781a245f6b7136f0b845e4
> | Author: Marcelo Tosatti <[EMAIL PROTECTED]>
> | Date:   Tue Sep 23 13:18:39 2008 -0300
> | 
> | KVM: MMU: out of sync shadow core v2
> | 
> | Allow guest pagetables to go out of sync.
> | 
> | Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>
> | Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
> 
> The problem can be worked around by loading the kvm module with
> oos_shadow=0.
> 
> The easiest way to reproduce the problem is to download a ready to use
> Hurd image [1]. The error message from the ext2fs translator is not
> exactly the same, but it still fails.

It seems Hurd does not always explicitly flush the TLB via cr0/cr3/cr4
writes or invlpg after updating pagetables. Debugging shows that OOS is
properly syncing the sptes wrt the guest pagetables, and that all pages
are synced before guest re-entry on TLB flush exits.

The Intel TLB doc says (5.1 "Invalidation Instructions"):

(Other instructions and operations may invalidate entries in the TLBs
and the paging structure caches, but the instructions identified above
are recommended.)

As a test, syncing on every exit makes it happy:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7a2aeba..47e2550 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3052,6 +3052,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	kvm_lapic_sync_from_vapic(vcpu);
 
+	kvm_mmu_sync_roots(vcpu);
+
 	r = kvm_x86_ops->handle_exit(kvm_run, vcpu);
 out:
 	return r;

It would be necessary to confirm this by hacking Hurd to flush on every
pagetable update. Perhaps something like

RCS file: /sources/hurd/gnumach/i386/intel/pmap.c,v
retrieving revision 1.4.2.22
diff -u -r1.4.2.22 pmap.c
--- pmap.c  11 Nov 2008 02:24:18 -  1.4.2.22
+++ pmap.c  20 Nov 2008 12:47:01 -
@@ -82,7 +82,7 @@
 #include 
 #include 
 
-#define WRITE_PTE(pte_p, pte_entry) *(pte_p) = (pte_entry);
+#define WRITE_PTE(pte_p, pte_entry) *(pte_p) = (pte_entry); flush_tlb();
 
 /*
  * Private data structures.
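
One caveat with a test hack of this shape: a macro that expands to two
statements only has its first statement guarded when it is used as the body
of an unbraced if. The usual C idiom is to wrap the body in do { } while (0).
A minimal sketch follows. Every sketch_* name, the SKETCH_WRITE_PTE macro and
the flush counter are stand-ins invented for illustration, not GNU Mach code,
and nothing here claims that GNU Mach actually uses the macro in such a
position.

```c
#include <assert.h>

typedef unsigned long sketch_pte_t;	/* stand-in, not GNU Mach's PTE type */

static int sketch_flush_count;		/* makes each flush observable */

/* Stub for the real TLB flush: only counts how often it is called. */
static void sketch_flush_tlb(void)
{
	sketch_flush_count++;
}

/*
 * Write the PTE and flush as one statement, so the pair stays together
 * even when the macro is the whole body of an unbraced if or else.
 */
#define SKETCH_WRITE_PTE(pte_p, pte_entry)	\
	do {					\
		*(pte_p) = (pte_entry);		\
		sketch_flush_tlb();		\
	} while (0)

/* Conditional update: neither the store nor the flush runs when cond is 0. */
static void sketch_set_if(int cond, sketch_pte_t *pte_p, sketch_pte_t entry)
{
	if (cond)
		SKETCH_WRITE_PTE(pte_p, entry);
}
```

With this form, sketch_set_if(0, ...) leaves both the PTE and the flush
counter untouched, whereas a two-statement macro would still flush.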



Re: Out of sync shadow core breaks Hurd

2008-11-15 Thread Marcelo Tosatti
On Wed, Nov 12, 2008 at 08:00:37PM +0100, Aurelien Jarno wrote:
> Hi,
> 
> Starting with kvm-76 (and including kvm-79), Hurd does not boot anymore
> under KVM. The ext2fs translator issues a strange error message:
> 
> | Hurd server bootstrap: ext2fs.static[device:hd0s3] execext2fs.static: /build/bui
> | ldd/hurd-20080607/build-tree/hurd/ext2fs/dir.c:494: dirscanblock: Assertion `dp-
> | >dn->dirents[idx] == -1 || dp->dn->dirents[idx] == nentries' failed.
> 
> Bisecting the problem, I have found that it comes from this patch:
> 
> | 641fb03992b20aa640781a245f6b7136f0b845e4 is first bad commit
> | commit 641fb03992b20aa640781a245f6b7136f0b845e4
> | Author: Marcelo Tosatti <[EMAIL PROTECTED]>
> | Date:   Tue Sep 23 13:18:39 2008 -0300
> | 
> | KVM: MMU: out of sync shadow core v2
> | 
> | Allow guest pagetables to go out of sync.
> | 
> | Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>
> | Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
> 
> The problem can be worked around by loading the kvm module with
> oos_shadow=0.
> 
> The easiest way to reproduce the problem is to download a ready to use
> Hurd image [1]. The error message from the ext2fs translator is not
> exactly the same, but it still fails.

Thanks Aurelien, I'll be looking at this next week.




Out of sync shadow core breaks Hurd

2008-11-12 Thread Aurelien Jarno
Hi,

Starting with kvm-76 (and including kvm-79), Hurd does not boot anymore
under KVM. The ext2fs translator issues a strange error message:

| Hurd server bootstrap: ext2fs.static[device:hd0s3] execext2fs.static: /build/bui
| ldd/hurd-20080607/build-tree/hurd/ext2fs/dir.c:494: dirscanblock: Assertion `dp-
| >dn->dirents[idx] == -1 || dp->dn->dirents[idx] == nentries' failed.

Bisecting the problem, I have found that it comes from this patch:

| 641fb03992b20aa640781a245f6b7136f0b845e4 is first bad commit
| commit 641fb03992b20aa640781a245f6b7136f0b845e4
| Author: Marcelo Tosatti <[EMAIL PROTECTED]>
| Date:   Tue Sep 23 13:18:39 2008 -0300
| 
| KVM: MMU: out of sync shadow core v2
| 
| Allow guest pagetables to go out of sync.
| 
| Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>
| Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>

The problem can be worked around by loading the kvm module with
oos_shadow=0.

The easiest way to reproduce the problem is to download a ready to use
Hurd image [1]. The error message from the ext2fs translator is not
exactly the same, but it still fails.

Aurelien

[1] 
http://ftp.debian-ports.org/debian-cd/hurd-i386/current/debian-hurd-k16-qemu.img.tar.gz

-- 
  .''`.  Aurelien Jarno | GPG: 1024D/F1BCDB73
 : :' :  Debian developer   | Electrical Engineer
 `. `'   [EMAIL PROTECTED] | [EMAIL PROTECTED]
   `-people.debian.org/~aurel32 | www.aurel32.net