RE: [PATCH v9 01/13] KVM: PPC: POWERNV: move iommu_add_device earlier

2013-10-29 Thread Bhushan Bharat-R65777
Hi Alex,

It looks like this patch has not been picked up by anyone. Are you going to pick it up?
My vfio/iommu patches depend on this patch (it has already been tested by me).

Thanks
-Bharat

 -Original Message-
 From: Linuxppc-dev [mailto:linuxppc-dev-
 bounces+bharat.bhushan=freescale@lists.ozlabs.org] On Behalf Of Alexey
 Kardashevskiy
 Sent: Wednesday, August 28, 2013 2:08 PM
 To: linuxppc-...@lists.ozlabs.org
 Cc: k...@vger.kernel.org; Gleb Natapov; Alexey Kardashevskiy; Alexander Graf;
 kvm-ppc@vger.kernel.org; linux-ker...@vger.kernel.org; linux...@kvack.org; 
 Paul
 Mackerras; Paolo Bonzini; David Gibson
 Subject: [PATCH v9 01/13] KVM: PPC: POWERNV: move iommu_add_device earlier
 
 The current implementation of IOMMU on sPAPR does not use iommu_ops and
 therefore does not call IOMMU API's bus_set_iommu() which
 1) sets iommu_ops for a bus
 2) registers a bus notifier
 Instead, PCI devices are added to IOMMU groups from
 subsys_initcall_sync(tce_iommu_init) which does basically the same thing 
 without
 using iommu_ops callbacks.
 
 However Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
 implements iommu_ops and when tce_iommu_init is called, every PCI device is
 already added to some group so there is a conflict.
 
 This patch does 2 things:
 1. removes the loop in which PCI devices were added to groups and adds 
 explicit
 iommu_add_device() calls to add devices as soon as they get the iommu_table
 pointer assigned to them.
 2. moves a bus notifier to powernv code in order to avoid conflict with the
 notifier from Freescale driver.
 
 iommu_add_device() and iommu_del_device() are public now.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 Changes:
 v8:
 * added the check for iommu_group!=NULL before removing device from a group as
 suggested by Wei Yang weiy...@linux.vnet.ibm.com
 
 v2:
 * added a helper - set_iommu_table_base_and_group - which does
 set_iommu_table_base() and iommu_add_device()
 ---
  arch/powerpc/include/asm/iommu.h|  9 +++
  arch/powerpc/kernel/iommu.c | 41 
 +++--
  arch/powerpc/platforms/powernv/pci-ioda.c   |  8 +++---
  arch/powerpc/platforms/powernv/pci-p5ioc2.c |  2 +-
  arch/powerpc/platforms/powernv/pci.c| 33 ++-
  arch/powerpc/platforms/pseries/iommu.c  |  8 +++---
  6 files changed, 55 insertions(+), 46 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/iommu.h 
 b/arch/powerpc/include/asm/iommu.h
 index c34656a..19ad77f 100644
 --- a/arch/powerpc/include/asm/iommu.h
 +++ b/arch/powerpc/include/asm/iommu.h
 @@ -103,6 +103,15 @@ extern struct iommu_table *iommu_init_table(struct
 iommu_table * tbl,
   int nid);
  extern void iommu_register_group(struct iommu_table *tbl,
int pci_domain_number, unsigned long pe_num);
 +extern int iommu_add_device(struct device *dev);
 +extern void iommu_del_device(struct device *dev);
 +
 +static inline void set_iommu_table_base_and_group(struct device *dev,
 +   void *base)
 +{
 + set_iommu_table_base(dev, base);
 + iommu_add_device(dev);
 +}
 
  extern int iommu_map_sg(struct device *dev, struct iommu_table *tbl,
 		struct scatterlist *sglist, int nelems,
 
 diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
 index b20ff17..15f8ca8 100644
 --- a/arch/powerpc/kernel/iommu.c
 +++ b/arch/powerpc/kernel/iommu.c
 @@ -1105,7 +1105,7 @@ void iommu_release_ownership(struct iommu_table *tbl)
  }
 EXPORT_SYMBOL_GPL(iommu_release_ownership);
 
 -static int iommu_add_device(struct device *dev)
 +int iommu_add_device(struct device *dev)
  {
   struct iommu_table *tbl;
   int ret = 0;
 @@ -1134,46 +1134,13 @@ static int iommu_add_device(struct device *dev)
 
   return ret;
  }
 +EXPORT_SYMBOL_GPL(iommu_add_device);
 
 -static void iommu_del_device(struct device *dev)
 +void iommu_del_device(struct device *dev)
  {
   iommu_group_remove_device(dev);
  }
 -
 -static int iommu_bus_notifier(struct notifier_block *nb,
 -   unsigned long action, void *data)
 -{
 - struct device *dev = data;
 -
 - switch (action) {
 - case BUS_NOTIFY_ADD_DEVICE:
 - return iommu_add_device(dev);
 - case BUS_NOTIFY_DEL_DEVICE:
 - iommu_del_device(dev);
 - return 0;
 - default:
 - return 0;
 - }
 -}
 -
 -static struct notifier_block tce_iommu_bus_nb = {
 - .notifier_call = iommu_bus_notifier,
 -};
 -
 -static int __init tce_iommu_init(void)
 -{
 - struct pci_dev *pdev = NULL;
 -
 -	BUILD_BUG_ON(PAGE_SIZE < IOMMU_PAGE_SIZE);
 -
 -	for_each_pci_dev(pdev)
 -		iommu_add_device(&pdev->dev);
 -
 -	bus_register_notifier(&pci_bus_type, &tce_iommu_bus_nb);
 - return 0;
 -}
 -
 -subsys_initcall_sync(tce_iommu_init);
 +EXPORT_SYMBOL_GPL(iommu_del_device);
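 
 For illustration, platform code that already knows a device's iommu_table can
 now attach the table and add the device to its IOMMU group in one step. This is
 just a sketch -- the function and variable names below are made up, not taken
 from this patch:
 
 	/* illustrative only: called once tbl is known for pdev */
 	static void example_dma_dev_setup(struct pci_dev *pdev,
 					  struct iommu_table *tbl)
 	{
 		set_iommu_table_base_and_group(&pdev->dev, tbl);
 	}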
 
  

RE: [PATCH 2/2] kvm: ppc: booke: check range page invalidation progress on page setup

2013-10-10 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
 Bonzini
 Sent: Monday, October 07, 2013 5:35 PM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; Paul Mackerras; Wood Scott-B07421; kvm-
 p...@vger.kernel.org; k...@vger.kernel.org mailing list; Bhushan 
 Bharat-R65777;
 Gleb Natapov
 Subject: Re: [PATCH 2/2] kvm: ppc: booke: check range page invalidation 
 progress
 on page setup
 
 On 04/10/2013 15:38, Alexander Graf wrote:
 
  On 07.08.2013, at 12:03, Bharat Bhushan wrote:
 
  When the MM code is invalidating a range of pages, it calls the KVM
  kvm_mmu_notifier_invalidate_range_start() notifier function, which calls
  kvm_unmap_hva_range(), which arranges to flush all the TLBs for guest 
  pages.
  However, the Linux PTEs for the range being flushed are still valid at
  that point.  We are not supposed to establish any new references to pages
  in the range until the ...range_end() notifier gets called.
  The PPC-specific KVM code doesn't get any explicit notification of that;
  instead, we are supposed to use mmu_notifier_retry() to test whether we
  are or have been inside a range flush notifier pair while we have been
  referencing a page.
 
  This patch calls the mmu_notifier_retry() while mapping the guest
  page to ensure we are not referencing a page when in range invalidation.
 
   This call is inside a region locked with kvm->mmu_lock, which is the
   same lock that is taken by the KVM MMU notifier functions, thus
  ensuring that no new notification can proceed while we are in the
  locked region.
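 
   In short, the sequence is as follows (a condensed pseudo-code sketch of the
   pattern; see the diff below for the real change):
 
 	mmu_seq = kvm->mmu_notifier_seq;	/* snapshot first */
 	smp_rmb();
 	/* ... translate gfn and look up the host pfn ... */
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq)) {
 		/* an invalidation ran since the snapshot:
 		 * drop the work and let the guest fault again */
 		ret = -EAGAIN;
 		goto out;
 	}
 	/* ... install the shadow TLB entry ... */
 out:
 	spin_unlock(&kvm->mmu_lock);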
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 
  Acked-by: Alexander Graf ag...@suse.de
 
  Gleb, Paolo, please queue for 3.12 directly.
 
 Here is the backport.  The second hunk has a nontrivial conflict, so
 someone please give their {Tested,Reviewed,Compiled}-by.

{Compiled,Reviewed}-by: Bharat Bhushan bharat.bhus...@freescale.com

Thanks
-Bharat

 
 Paolo
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c 
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..c65593a 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -332,6 +332,13 @@ static inline int kvmppc_e500_shadow_map(struct
 kvmppc_vcpu_e500 *vcpu_e500,
   unsigned long hva;
   int pfnmap = 0;
   int tsize = BOOK3E_PAGESZ_4K;
 + int ret = 0;
 + unsigned long mmu_seq;
  +	struct kvm *kvm = vcpu_e500->vcpu.kvm;
  +
  +	/* used to check for invalidations in progress */
  +	mmu_seq = kvm->mmu_notifier_seq;
 + smp_rmb();
 
   /*
* Translate guest physical to true physical, acquiring
 @@ -449,6 +456,12 @@ static inline int kvmppc_e500_shadow_map(struct
 kvmppc_vcpu_e500 *vcpu_e500,
  	gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1);
   }
 
  +	spin_lock(&kvm->mmu_lock);
 + if (mmu_notifier_retry(kvm, mmu_seq)) {
 + ret = -EAGAIN;
 + goto out;
 + }
 +
   kvmppc_e500_ref_setup(ref, gtlbe, pfn);
 
  	kvmppc_e500_setup_stlbe(&vcpu_e500->vcpu, gtlbe, tsize,
 @@ -457,10 +470,13 @@ static inline int kvmppc_e500_shadow_map(struct
 kvmppc_vcpu_e500 *vcpu_e500,
   /* Clear i-cache for new pages */
   kvmppc_mmu_flush_icache(pfn);
 
 +out:
  +	spin_unlock(&kvm->mmu_lock);
 +
   /* Drop refcount on page, so that mmu notifiers can clear it */
   kvm_release_pfn_clean(pfn);
 
 - return 0;
 + return ret;
  }
 
  /* XXX only map the one-one case, for now use TLB0 */
 
 




RE: [PATCH 3/4] kvm: powerpc: define a linux pte lookup function

2013-10-10 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Paul Mackerras [mailto:pau...@samba.org]
 Sent: Thursday, October 10, 2013 4:06 PM
 To: Wood Scott-B07421
 Cc: Bhushan Bharat-R65777; Wood Scott-B07421; ag...@suse.de; Yoder Stuart-
 B08248; k...@vger.kernel.org; kvm-ppc@vger.kernel.org
 Subject: Re: [PATCH 3/4] kvm: powerpc: define a linux pte lookup function
 
 On Wed, Oct 09, 2013 at 12:47:31PM -0500, Scott Wood wrote:
  On Wed, 2013-10-09 at 03:48 -0500, Bhushan Bharat-R65777 wrote:
  
   What lookup_linux_pte_and_update() does:-
- find_linux_pte_or_hugepte()
- does size and some other trivial checks
- Then atomically update the pte:-
  = while()
  = wait till _PAGE_BUSY is clear
  = atomically update the pte
  = if not updated then go back to while() above else break
  
  
   While what lookup_linux_pte() does:-
- find_linux_pte_or_hugepte()
- does size and some other trivial checks
- wait till _PAGE_BUSY is clear
- return pte
  
   I am finding it difficult to call lookup_linux_pte() from
 lookup_linux_pte_and_update().
 
  You could factor out a common lookup_linux_ptep().
 
 I don't really think it's enough code to be worth wringing out the last drop 
 of
 duplication.  However, if he removed the checks for _PAGE_BUSY and 
 _PAGE_PRESENT
 as I suggested in another mail, and made it return the pte pointer rather than
 the value, it would then essentially be a lookup_linux_ptep() as you suggest.

Do we want to have lookup_linux_pte() or lookup_linux_ptep() or both, where 
lookup_linux_pte() and lookup_linux_pte_and_update() call lookup_linux_ptep()?
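
For example, the common helper could be just the search plus the page-size
computation, returning the pte pointer (only a rough sketch; the name and exact
interface are still open questions in this thread):

	static inline pte_t *lookup_linux_ptep(pgd_t *pgdir, unsigned long hva,
					       unsigned long *pte_sizep)
	{
		pte_t *ptep;
		unsigned int shift;

		ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
		if (!ptep)
			return NULL;
		*pte_sizep = shift ? (1ul << shift) : PAGE_SIZE;
		return ptep;
	}

Both lookup_linux_pte() and lookup_linux_pte_and_update() could then start from
this and add the size check, _PAGE_BUSY handling and pte update they each need.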

-Bharat

 
 Paul.




RE: [PATCH 3/4] kvm: powerpc: define a linux pte lookup function

2013-10-09 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Wednesday, October 09, 2013 3:07 AM
 To: Bhushan Bharat-R65777
 Cc: ag...@suse.de; Yoder Stuart-B08248; k...@vger.kernel.org; kvm-
 p...@vger.kernel.org; pau...@samba.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 3/4] kvm: powerpc: define a linux pte lookup function
 
 On Tue, 2013-10-08 at 11:33 +0530, Bharat Bhushan wrote:
   We need to search the Linux pte to get the pte attributes for setting the
   TLB in KVM.
   This patch defines a lookup_linux_pte() function for the same.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
   arch/powerpc/include/asm/pgtable.h |   35 
  +++
   1 files changed, 35 insertions(+), 0 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/pgtable.h
  b/arch/powerpc/include/asm/pgtable.h
  index 7d6eacf..fd26c04 100644
  --- a/arch/powerpc/include/asm/pgtable.h
  +++ b/arch/powerpc/include/asm/pgtable.h
  @@ -223,6 +223,41 @@ extern int gup_hugepte(pte_t *ptep, unsigned long
  sz, unsigned long addr,  #endif  pte_t
  *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
   unsigned *shift);
  +
  +static inline pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
  +unsigned long *pte_sizep)
  +{
  +   pte_t *ptep;
  +   pte_t pte;
  +   unsigned long ps = *pte_sizep;
  +   unsigned int shift;
  +
   +   ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
  +   if (!ptep)
  +   return __pte(0);
  +   if (shift)
   +   *pte_sizep = 1ul << shift;
  +   else
  +   *pte_sizep = PAGE_SIZE;
  +
   +   if (ps > *pte_sizep)
  +   return __pte(0);
  +
  +   /* wait until _PAGE_BUSY is clear */
  +   while (1) {
  +   pte = pte_val(*ptep);
   +   if (unlikely(pte & _PAGE_BUSY)) {
  +   cpu_relax();
  +   continue;
  +   }
  +   }
  +
  +   /* If pte is not present return None */
  +   if (unlikely(!(pte  _PAGE_PRESENT)))
  +   return __pte(0);
  +
  +   return pte;
  +}
 
 Can lookup_linux_pte_and_update() call lookup_linux_pte()?

What lookup_linux_pte_and_update() does:-
 - find_linux_pte_or_hugepte()
 - does size and some other trivial checks
 - Then atomically update the pte:-
   = while()
   = wait till _PAGE_BUSY is clear
   = atomically update the pte
   = if not updated then go back to while() above else break


While what lookup_linux_pte() does:-
 - find_linux_pte_or_hugepte()
 - does size and some other trivial checks
 - wait till _PAGE_BUSY is clear
 - return pte

I am finding it difficult to call lookup_linux_pte() from 
lookup_linux_pte_and_update().

Thanks
-Bharat

 
 -Scott
 



RE: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to epapr_hypercall()

2013-10-07 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, October 04, 2013 4:46 PM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to
 epapr_hypercall()
 
 
 On 04.10.2013, at 06:26, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Wood Scott-B07421
  Sent: Thursday, October 03, 2013 12:04 AM
  To: Alexander Graf
  Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org;
  k...@vger.kernel.org; Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to
  epapr_hypercall()
 
  On Wed, 2013-10-02 at 19:54 +0200, Alexander Graf wrote:
  On 02.10.2013, at 19:49, Scott Wood wrote:
 
  On Wed, 2013-10-02 at 19:46 +0200, Alexander Graf wrote:
  On 02.10.2013, at 19:42, Scott Wood wrote:
 
  On Wed, 2013-10-02 at 19:17 +0200, Alexander Graf wrote:
  On 02.10.2013, at 19:04, Scott Wood wrote:
 
  On Wed, 2013-10-02 at 18:53 +0200, Alexander Graf wrote:
  On 02.10.2013, at 18:40, Scott Wood wrote:
 
  On Wed, 2013-10-02 at 16:19 +0200, Alexander Graf wrote:
  Won't this break when CONFIG_EPAPR_PARAVIRT=n? We wouldn't
  have
  epapr_hcalls.S compiled into the code base then and the bl above
  would reference an unknown function.
 
  KVM_GUEST selects EPAPR_PARAVIRT.
 
  But you can not select KVM_GUEST and still call these inline
  functions,
  no?
 
  No.
 
  Like kvm_arch_para_features().
 
  Where does that get called without KVM_GUEST?
 
  How would that work currently, with the call to kvm_hypercall()
  in arch/powerpc/kernel/kvm.c (which calls epapr_hypercall, BTW)?
 
  It wouldn't ever get called because kvm_hypercall() ends up
  always
  returning EV_UNIMPLEMENTED when #ifndef CONFIG_KVM_GUEST.
 
  OK, so the objection is to removing that stub?  Where would we
  actually want to call this without knowing that KVM_GUEST or
  EPAPR_PARAVIRT are enabled?
 
  In probing code. I usually prefer
 
  if (kvm_feature_available(X)) {
   ...
  }
 
  over
 
  #ifdef CONFIG_KVM_GUEST
  if (kvm_feature_available(X)) {
   ...
  }
  #endif
 
  at least when I can avoid it. With the current code the compiler
  would be
  smart enough to just optimize out the complete branch.
 
  Sure.  My point is, where would you be calling that where the
  entire file isn't predicated on (or selecting) CONFIG_KVM_GUEST or 
  similar?
 
  We don't do these stubs for every single function in the kernel --
  only ones where the above is a reasonable use case.
 
  Yeah, I'm fine on dropping it, but we need to make that a conscious
  decision
  and verify that no caller relies on it.
 
  kvm_para_has_feature() is called from arch/powerpc/kernel/kvm.c,
  arch/x86/kernel/kvm.c, and arch/x86/kernel/kvmclock.c, all of which
  are enabled by CONFIG_KVM_GUEST.
 
  I did find one example of kvm_para_available() being used in an
  unexpected place
  -- sound/pci/intel8x0.c.  It defines its own non-CONFIG_KVM_GUEST
  stub, even though x86 defines kvm_para_available() using inline CPUID
  stuff which should work without CONFIG_KVM_GUEST.
  I'm not sure why it even needs to do that, though -- shouldn't the
  subsequent PCI subsystem vendor/device check should be sufficient?
  No hypercalls are involved.
 
  That said, the possibility that some random driver might want to make
  use of paravirt features is a decent argument for keeping the stub.
 
 
  I am not sure where we are agreeing on?
  Do we want to remove the stub in arch/powerpc/include/asm/kvm_para.h ? as
 there is no caller without KVM_GUEST and in future caller ensure this to be
 called only from code selected by KVM_GUEST?
 
  Or let this stub stay to avoid any random driver calling this ?
 
 I think the most reasonable way forward is to add a stub for non-CONFIG_EPAPR 
 to
 the epapr code, then replace the kvm bits with generic epapr bits (which your
 patches already do).

Please describe which stub you are talking about.

Thanks
-Bharat

 
 With that we should be 100% equivalent to today's code, just with a lot less
 lines of code :).
 
 
 Alex
 




RE: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to epapr_hypercall()

2013-10-07 Thread Bhushan Bharat-R65777
  at least when I can avoid it. With the current code the compiler
  would be
  smart enough to just optimize out the complete branch.
 
  Sure.  My point is, where would you be calling that where the
  entire file isn't predicated on (or selecting) CONFIG_KVM_GUEST or
 similar?
 
  We don't do these stubs for every single function in the kernel
  -- only ones where the above is a reasonable use case.
 
  Yeah, I'm fine on dropping it, but we need to make that a
  conscious decision
  and verify that no caller relies on it.
 
  kvm_para_has_feature() is called from arch/powerpc/kernel/kvm.c,
  arch/x86/kernel/kvm.c, and arch/x86/kernel/kvmclock.c, all of which
  are enabled by CONFIG_KVM_GUEST.
 
  I did find one example of kvm_para_available() being used in an
  unexpected place
  -- sound/pci/intel8x0.c.  It defines its own non-CONFIG_KVM_GUEST
  stub, even though x86 defines kvm_para_available() using inline
  CPUID stuff which should work without CONFIG_KVM_GUEST.
  I'm not sure why it even needs to do that, though -- shouldn't the
  subsequent PCI subsystem vendor/device check should be sufficient?
  No hypercalls are involved.
 
  That said, the possibility that some random driver might want to
  make use of paravirt features is a decent argument for keeping the stub.
 
 
  I am not sure where we are agreeing on?
  Do we want to remove the stub in arch/powerpc/include/asm/kvm_para.h
  ? as
  there is no caller without KVM_GUEST and in future caller ensure this
  to be called only from code selected by KVM_GUEST?
 
  Or let this stub stay to avoid any random driver calling this ?
 
  I think the most reasonable way forward is to add a stub for
  non-CONFIG_EPAPR to the epapr code, then replace the kvm bits with
  generic epapr bits (which your patches already do).
 
  Please describe which stub you are talking about.
 
 kvm_hypercall is always available, regardless of the config option, which 
 makes
 all its subfunctions always available as well.

This patch renames kvm_hypercall() to epapr_hypercall(), which is always 
available, and the kvm_hypercall() friends now directly call epapr_hypercall().
IIUC, what you are trying to say is to let the kvm_hypercall() friends keep 
calling kvm_hypercall() itself, with a stub something like this:

#ifdef CONFIG_KVM_GUEST

static unsigned long kvm_hypercall(unsigned long *in,
				   unsigned long *out,
				   unsigned long nr)
{
	return epapr_hypercall(in, out, nr);
}

#else

static unsigned long kvm_hypercall(unsigned long *in,
				   unsigned long *out,
				   unsigned long nr)
{
	return EV_UNIMPLEMENTED;
}

#endif

I am still not really convinced why we want to keep this stub, since we know it 
is not called outside KVM_GUEST, and calling it without KVM_GUEST is debatable.

Thanks
-Bharat


 
 
 Alex
 
 ---
 
 #ifdef CONFIG_KVM_GUEST
 
 #include <linux/of.h>
 
 static inline int kvm_para_available(void) {
 struct device_node *hyper_node;
 
 hyper_node = of_find_node_by_path("/hypervisor");
 if (!hyper_node)
 return 0;
 
 if (!of_device_is_compatible(hyper_node, "linux,kvm"))
 return 0;
 
 return 1;
 }
 
 extern unsigned long kvm_hypercall(unsigned long *in,
unsigned long *out,
unsigned long nr);
 
 #else
 
 static inline int kvm_para_available(void) {
 return 0;
 }
 
 static unsigned long kvm_hypercall(unsigned long *in,
unsigned long *out,
unsigned long nr) {
 return EV_UNIMPLEMENTED;
 }
 
 #endif
 




RE: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to epapr_hypercall()

2013-10-07 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Monday, October 07, 2013 9:16 PM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to
 epapr_hypercall()
 
 
 On 07.10.2013, at 17:43, Bhushan Bharat-R65777 r65...@freescale.com wrote:
 
  at least when I can avoid it. With the current code the
  compiler would be
  smart enough to just optimize out the complete branch.
 
  Sure.  My point is, where would you be calling that where the
  entire file isn't predicated on (or selecting) CONFIG_KVM_GUEST
  or
  similar?
 
  We don't do these stubs for every single function in the kernel
  -- only ones where the above is a reasonable use case.
 
  Yeah, I'm fine on dropping it, but we need to make that a
  conscious decision
  and verify that no caller relies on it.
 
  kvm_para_has_feature() is called from arch/powerpc/kernel/kvm.c,
  arch/x86/kernel/kvm.c, and arch/x86/kernel/kvmclock.c, all of
  which are enabled by CONFIG_KVM_GUEST.
 
  I did find one example of kvm_para_available() being used in an
  unexpected place
  -- sound/pci/intel8x0.c.  It defines its own non-CONFIG_KVM_GUEST
  stub, even though x86 defines kvm_para_available() using inline
  CPUID stuff which should work without CONFIG_KVM_GUEST.
  I'm not sure why it even needs to do that, though -- shouldn't
  the subsequent PCI subsystem vendor/device check should be sufficient?
  No hypercalls are involved.
 
  That said, the possibility that some random driver might want to
  make use of paravirt features is a decent argument for keeping the 
  stub.
 
 
  I am not sure where we are agreeing on?
  Do we want to remove the stub in
  arch/powerpc/include/asm/kvm_para.h
  ? as
  there is no caller without KVM_GUEST and in future caller ensure
  this to be called only from code selected by KVM_GUEST?
 
  Or let this stub stay to avoid any random driver calling this ?
 
  I think the most reasonable way forward is to add a stub for
  non-CONFIG_EPAPR to the epapr code, then replace the kvm bits with
  generic epapr bits (which your patches already do).
 
  Please describe which stub you are talking about.
 
  kvm_hypercall is always available, regardless of the config option,
  which makes all its subfunctions always available as well.
 
  This patch renames kvm_hypercall() to epapr_hypercall() and which is always
 available. And the kvm_hypercall() friends now directly calls 
 epapr_hypercall().
  IIUC, So what you are trying to say is let the kvm_hypercall() friends keep 
  on
 calling kvm_hypercall() itself and a sub something like this:
 
 No, what I'm saying is that we either
 
   a) drop the whole #ifndef code path consciously. This would have to be a
 separate patch with a separate discussion. It's orthogonal to combining
 kvm_hypercall() and epapr_hypercall()
 
   b) add the #ifndef path to epapr_hypercall()

Do you mean like this in arch/powerpc/include/asm/epapr_hcalls.h

#ifdef CONFIG_KVM_GUEST
static inline unsigned long epapr_hypercall(unsigned long *in,
   unsigned long *out,
   unsigned long nr)
{
 // code for this function
} 
#else
static inline unsigned long epapr_hypercall(unsigned long *in,
   unsigned long *out,
   unsigned long nr)
{
return EV_UNIMPLEMENTED;
}
#endif

 
 I prefer b, Scott prefers b.
 
 
 Alex
 




RE: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to epapr_hypercall()

2013-10-07 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, October 07, 2013 9:43 PM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to
 epapr_hypercall()
 
 
 On 07.10.2013, at 18:04, Bhushan Bharat-R65777 r65...@freescale.com wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Monday, October 07, 2013 9:16 PM
  To: Bhushan Bharat-R65777
  Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
  Subject: Re: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to
  epapr_hypercall()
 
 
  On 07.10.2013, at 17:43, Bhushan Bharat-R65777 r65...@freescale.com 
  wrote:
 
  at least when I can avoid it. With the current code the
  compiler would be
  smart enough to just optimize out the complete branch.
 
  Sure.  My point is, where would you be calling that where the
  entire file isn't predicated on (or selecting)
  CONFIG_KVM_GUEST or
  similar?
 
  We don't do these stubs for every single function in the
  kernel
  -- only ones where the above is a reasonable use case.
 
  Yeah, I'm fine on dropping it, but we need to make that a
  conscious decision
  and verify that no caller relies on it.
 
  kvm_para_has_feature() is called from
  arch/powerpc/kernel/kvm.c, arch/x86/kernel/kvm.c, and
  arch/x86/kernel/kvmclock.c, all of which are enabled by
 CONFIG_KVM_GUEST.
 
  I did find one example of kvm_para_available() being used in an
  unexpected place
  -- sound/pci/intel8x0.c.  It defines its own
  non-CONFIG_KVM_GUEST stub, even though x86 defines
  kvm_para_available() using inline CPUID stuff which should work 
  without
 CONFIG_KVM_GUEST.
  I'm not sure why it even needs to do that, though -- shouldn't
  the subsequent PCI subsystem vendor/device check should be 
  sufficient?
  No hypercalls are involved.
 
  That said, the possibility that some random driver might want
  to make use of paravirt features is a decent argument for keeping the
 stub.
 
 
  I am not sure where we are agreeing on?
  Do we want to remove the stub in
  arch/powerpc/include/asm/kvm_para.h
  ? as
  there is no caller without KVM_GUEST and in future caller ensure
  this to be called only from code selected by KVM_GUEST?
 
  Or let this stub stay to avoid any random driver calling this ?
 
  I think the most reasonable way forward is to add a stub for
  non-CONFIG_EPAPR to the epapr code, then replace the kvm bits
  with generic epapr bits (which your patches already do).
 
  Please describe which stub you are talking about.
 
  kvm_hypercall is always available, regardless of the config option,
  which makes all its subfunctions always available as well.
 
  This patch renames kvm_hypercall() to epapr_hypercall() and which is
  always
  available. And the kvm_hypercall() friends now directly calls
 epapr_hypercall().
  IIUC, So what you are trying to say is let the kvm_hypercall()
  friends keep on
  calling kvm_hypercall() itself and a sub something like this:
 
  No, what I'm saying is that we either
 
   a) drop the whole #ifndef code path consciously. This would have to
  be a separate patch with a separate discussion. It's orthogonal to
  combining
  kvm_hypercall() and epapr_hypercall()
 
   b) add the #ifndef path to epapr_hypercall()
 
  Do you mean like this in arch/powerpc/include/asm/epapr_hcalls.h
 
  #ifdef CONFIG_KVM_GUEST
 
 CONFIG_EPAPR_PARAVIRT

Yes, I was getting confused about why only KVM_GUEST, as this is not specific to 
KVM guests.
Thank you

 
 Apart from that, yes, I think that's what we want.
 
 
 Alex
 
   static inline unsigned long epapr_hypercall(unsigned long *in,
					       unsigned long *out,
					       unsigned long nr)
   {
   	// code for this function
   }
   #else
   static inline unsigned long epapr_hypercall(unsigned long *in,
					       unsigned long *out,
					       unsigned long nr)
   {
   	return EV_UNIMPLEMENTED;
   }
   #endif
 
 




RE: [PATCH 1/3 v6] kvm: powerpc: keep only pte search logic in lookup_linux_pte

2013-10-06 Thread Bhushan Bharat-R65777
Hi Paul,

 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Paul Mackerras
 Sent: Monday, October 07, 2013 4:39 AM
 To: Bhushan Bharat-R65777
 Cc: ag...@suse.de; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-
 B07421; b...@kernel.crashing.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 1/3 v6] kvm: powerpc: keep only pte search logic in
 lookup_linux_pte
 
 On Fri, Oct 04, 2013 at 08:25:31PM +0530, Bharat Bhushan wrote:
   lookup_linux_pte() was searching for a pte and also setting access flags
   if writable. This function now only searches for the pte, while the access
   flag setting is done explicitly.


 
 So in order to reduce some code duplication, you have added code duplication 
 in
 the existing callers of this function.  I'm not convinced it's an overall win.

lookup_linux_pte(): as per its name, it is supposed to only look up a pte, but 
it is doing more than that (also updating the pte). So I made this function do 
only the lookup (which also checks the size). I am not an MM expert, but I think 
we can make this function better as you suggested, checking pte_present() only 
if _PAGE_BUSY is not set.

 What's left in this function is pretty trivial, just a call to
 find_linux_pte_or_hugepte() and some pagesize computations.  I would prefer 
 you
 found a way to do what you want without adding code duplication at the 
 existing
 call sites.

What about doing it this way:
1) A function which only does the lookup of the Linux pte; maybe call that 
lookup_linux_pte().
2) Lookup + pte update (what the existing lookup_linux_pte() is doing); rename 
this function to lookup_linux_pte_and_update(), which will call the 
lookup_linux_pte() defined above.


Thanks
-Bharat

  Maybe you could have a new find_linux_pte_and_check_pagesize() and
 call that from the existing lookup_linux_pte().
 
 The other thing you've done, without commenting on why you have done it, is to
 add a pte_present check without having looked at _PAGE_BUSY.
 kvmppc_read_update_linux_pte() only checks _PAGE_PRESENT after checking that
 _PAGE_BUSY is clear, so this is a semantic change, which I think is wrong for
 server processors.
 
 So, on the whole, NACK from me for this patch.
 
 Paul.




RE: [PATCH 4/6 v5] kvm: powerpc: keep only pte search logic in lookup_linux_pte

2013-10-04 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, October 04, 2013 6:57 PM
 To: Bhushan Bharat-R65777
 Cc: b...@kernel.crashing.org; pau...@samba.org; k...@vger.kernel.org; kvm-
 p...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Wood Scott-B07421; 
 Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 4/6 v5] kvm: powerpc: keep only pte search logic in
 lookup_linux_pte
 
 
 On 19.09.2013, at 08:02, Bharat Bhushan wrote:
 
   lookup_linux_pte() was searching for a pte and also setting access flags
   if writable. This function now only searches for the pte, while the access
   flag setting is done explicitly.
 
   This pte lookup is not kvm specific, so it is moved to common code
   (asm/pgtable.h). My follow-up patch will use this on booke.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v4-v5
  - No change
 
  arch/powerpc/include/asm/pgtable.h  |   24 +++
  arch/powerpc/kvm/book3s_hv_rm_mmu.c |   36 
  +++---
  2 files changed, 36 insertions(+), 24 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/pgtable.h
  b/arch/powerpc/include/asm/pgtable.h
  index 7d6eacf..3a5de5c 100644
  --- a/arch/powerpc/include/asm/pgtable.h
  +++ b/arch/powerpc/include/asm/pgtable.h
  @@ -223,6 +223,30 @@ extern int gup_hugepte(pte_t *ptep, unsigned long
  sz, unsigned long addr, #endif pte_t *find_linux_pte_or_hugepte(pgd_t
  *pgdir, unsigned long ea,
   unsigned *shift);
  +
  +static inline pte_t *lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
  +unsigned long *pte_sizep)
  +{
  +   pte_t *ptep;
  +   unsigned long ps = *pte_sizep;
  +   unsigned int shift;
  +
   +   ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
  +   if (!ptep)
  +   return __pte(0);
 
 This returns a struct pte_t, but your return value of the function is a struct
 pte_t *. So this code will fail compiling with STRICT_MM_TYPECHECKS set. Any
 reason you don't just return NULL here?

I want to return the ptep (pte pointer), so yes this should be NULL.
Will correct this.

Thanks
-Bharat

 
 That way callers could simply check on if (ptep) ... or you leave the return
 value as struct pte_t.
 
 
 Alex
 
  +   if (shift)
   +   *pte_sizep = 1ul << shift;
  +   else
  +   *pte_sizep = PAGE_SIZE;
  +
   +   if (ps > *pte_sizep)
  +   return __pte(0);
  +
  +   if (!pte_present(*ptep))
  +   return __pte(0);
 
  +
  +   return ptep;
  +}
  #endif /* __ASSEMBLY__ */
 
  #endif /* __KERNEL__ */
  diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
  b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
  index 45e30d6..74fa7f8 100644
  --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
  +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
  @@ -134,25 +134,6 @@ static void remove_revmap_chain(struct kvm *kvm, long
 pte_index,
  unlock_rmap(rmap);
  }
 
  -static pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
  - int writing, unsigned long *pte_sizep)
  -{
  -   pte_t *ptep;
  -   unsigned long ps = *pte_sizep;
  -   unsigned int hugepage_shift;
  -
   -   ptep = find_linux_pte_or_hugepte(pgdir, hva, &hugepage_shift);
  -   if (!ptep)
  -   return __pte(0);
  -   if (hugepage_shift)
   -   *pte_sizep = 1ul << hugepage_shift;
  -   else
  -   *pte_sizep = PAGE_SIZE;
   -   if (ps > *pte_sizep)
  -   return __pte(0);
  -   return kvmppc_read_update_linux_pte(ptep, writing, hugepage_shift);
  -}
  -
  static inline void unlock_hpte(unsigned long *hpte, unsigned long
  hpte_v) {
   	asm volatile(PPC_RELEASE_BARRIER "" : : : "memory");
   @@ -173,6 +154,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
  unsigned long is_io;
  unsigned long *rmap;
  pte_t pte;
  +   pte_t *ptep;
  unsigned int writing;
  unsigned long mmu_seq;
  unsigned long rcbits;
  @@ -231,8 +213,9 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned
  long flags,
 
  /* Look up the Linux PTE for the backing page */
  pte_size = psize;
  -   pte = lookup_linux_pte(pgdir, hva, writing, pte_size);
  -   if (pte_present(pte)) {
  +   ptep = lookup_linux_pte(pgdir, hva, pte_size);
  +   if (pte_present(pte_val(*ptep))) {
  +   pte = kvmppc_read_update_linux_pte(ptep, writing);
   	if (writing && !pte_write(pte))
  /* make the actual HPTE be read-only */
  ptel = hpte_make_readonly(ptel);
  @@ -661,15 +644,20 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned
 long flags,
  struct kvm_memory_slot *memslot;
   	pgd_t *pgdir = vcpu->arch.pgdir;
  pte_t pte;
  +   pte_t *ptep;
 
  psize = hpte_page_size(v, r);
   	gfn = ((r & HPTE_R_RPN) & ~(psize - 1)) >> PAGE_SHIFT;
  memslot

RE: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to epapr_hypercall()

2013-10-03 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Thursday, October 03, 2013 12:04 AM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; 
 Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 1/2] kvm/powerpc: rename kvm_hypercall() to
 epapr_hypercall()
 
 On Wed, 2013-10-02 at 19:54 +0200, Alexander Graf wrote:
  On 02.10.2013, at 19:49, Scott Wood wrote:
 
   On Wed, 2013-10-02 at 19:46 +0200, Alexander Graf wrote:
   On 02.10.2013, at 19:42, Scott Wood wrote:
  
   On Wed, 2013-10-02 at 19:17 +0200, Alexander Graf wrote:
   On 02.10.2013, at 19:04, Scott Wood wrote:
  
   On Wed, 2013-10-02 at 18:53 +0200, Alexander Graf wrote:
   On 02.10.2013, at 18:40, Scott Wood wrote:
  
   On Wed, 2013-10-02 at 16:19 +0200, Alexander Graf wrote:
   Won't this break when CONFIG_EPAPR_PARAVIRT=n? We wouldn't have
 epapr_hcalls.S compiled into the code base then and the bl above would 
 reference
 an unknown function.
  
   KVM_GUEST selects EPAPR_PARAVIRT.
  
   But you can not select KVM_GUEST and still call these inline 
   functions,
 no?
  
   No.
  
   Like kvm_arch_para_features().
  
   Where does that get called without KVM_GUEST?
  
   How would that work currently, with the call to kvm_hypercall()
   in arch/powerpc/kernel/kvm.c (which calls epapr_hypercall, BTW)?
  
   It wouldn't ever get called because kvm_hypercall() ends up always
 returning EV_UNIMPLEMENTED when #ifndef CONFIG_KVM_GUEST.
  
   OK, so the objection is to removing that stub?  Where would we
   actually want to call this without knowing that KVM_GUEST or
   EPAPR_PARAVIRT are enabled?
  
   In probing code. I usually prefer
  
   if (kvm_feature_available(X)) {
 ...
   }
  
   over
  
   #ifdef CONFIG_KVM_GUEST
   if (kvm_feature_available(X)) {
 ...
   }
   #endif
  
   at least when I can avoid it. With the current code the compiler would be
 smart enough to just optimize out the complete branch.
  
   Sure.  My point is, where would you be calling that where the entire
   file isn't predicated on (or selecting) CONFIG_KVM_GUEST or similar?
  
   We don't do these stubs for every single function in the kernel --
   only ones where the above is a reasonable use case.
 
  Yeah, I'm fine on dropping it, but we need to make that a conscious decision
 and verify that no caller relies on it.
 
 kvm_para_has_feature() is called from arch/powerpc/kernel/kvm.c,
 arch/x86/kernel/kvm.c, and arch/x86/kernel/kvmclock.c, all of which are 
 enabled
 by CONFIG_KVM_GUEST.
 
 I did find one example of kvm_para_available() being used in an unexpected 
 place
 -- sound/pci/intel8x0.c.  It defines its own non-CONFIG_KVM_GUEST stub, even
 though x86 defines kvm_para_available() using inline CPUID stuff which should
 work without CONFIG_KVM_GUEST.
 I'm not sure why it even needs to do that, though -- shouldn't the subsequent
 PCI subsystem vendor/device check should be sufficient?  No hypercalls are
 involved.
 
 That said, the possibility that some random driver might want to make use of
 paravirt features is a decent argument for keeping the stub.
 

I am not sure what we are agreeing on.
Do we want to remove the stub in arch/powerpc/include/asm/kvm_para.h, given that 
there is no caller without KVM_GUEST, and in future ensure that callers call this 
only from code selected by KVM_GUEST?

Or let this stub stay to avoid any random driver calling this?

Thanks
-Bharat






RE: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation

2013-09-20 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Friday, September 20, 2013 11:38 PM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de;
 pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-
 d...@lists.ozlabs.org
 Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest
 tlb invalidation
 
 On Fri, 2013-09-20 at 13:04 -0500, Bhushan Bharat-R65777 wrote:
 
   -Original Message-
   From: Wood Scott-B07421
   Sent: Friday, September 20, 2013 9:48 PM
   To: Bhushan Bharat-R65777
   Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de;
   pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org;
   linuxppc- d...@lists.ozlabs.org
   Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference
   flag on guest tlb invalidation
  
   On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote:
We uses these bit flags only for TLB1 and if size of stlbe is 4K
then we set E500_TLB_TLB0  otherwise we set E500_TLB_BITMAP.
Although I think that E500_TLB_BITMAP should be set only if stlbe
size is less than gtlbe size.
  
   Why?  Even if there's only one bit set in the map, we need it to
   keep track of which entry was used.
 
  If there is one entry then will not this be simple/faster to not lookup 
  bitmap
 and guest-host array?
  A flag indicate it is 1:1 map and this is physical address.
 
 The difference would be negligible, and you'd have added overhead (both 
 runtime
 and complexity) of making this a special case.

Maybe you are right, I will see if I can give it a try :)
BTW I have already sent v6 of this patch.

-Bharat

 
 -Scott
 



RE: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation

2013-09-20 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Friday, September 20, 2013 9:48 PM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; b...@kernel.crashing.org; ag...@suse.de;
 pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-
 d...@lists.ozlabs.org
 Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest
 tlb invalidation
 
 On Thu, 2013-09-19 at 23:19 -0500, Bhushan Bharat-R65777 wrote:
 
   -Original Message-
   From: Wood Scott-B07421
   Sent: Friday, September 20, 2013 2:38 AM
   To: Bhushan Bharat-R65777
   Cc: b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org;
   k...@vger.kernel.org; kvm-ppc@vger.kernel.org;
   linuxppc-...@lists.ozlabs.org; Bhushan Bharat-R65777
   Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference
   flag on guest tlb invalidation
  
   This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set.
 
  I do not see any case where we set both E500_TLB_BITMAP and
  E500_TLB_TLB0.
 
 This would happen if you have a guest TLB1 entry that is backed by some 4K 
 pages
 and some larger pages (e.g. if the guest maps CCSR with one big
 TLB1 and there are varying I/O passthrough regions mapped).  It's not common,
 but it's possible.

Agree

 
   Also we have not optimized that yet (keeping track of multiple shadow
  TLB0 entries for one guest TLB1 entry)
 
 This is about correctness, not optimization.
 
  We uses these bit flags only for TLB1 and if size of stlbe is 4K then
  we set E500_TLB_TLB0  otherwise we set E500_TLB_BITMAP. Although I
  think that E500_TLB_BITMAP should be set only if stlbe size is less
  than gtlbe size.
 
 Why?  Even if there's only one bit set in the map, we need it to keep track of
 which entry was used.

If there is only one entry, would it not be simpler/faster to not look up the 
bitmap and the guest->host array?
A flag could indicate that it is a 1:1 map and that this is the physical address.

-Bharat

 
 -Scott
 


RE: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation

2013-09-19 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Friday, September 20, 2013 2:38 AM
 To: Bhushan Bharat-R65777
 Cc: b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org;
 k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-...@lists.ozlabs.org;
 Bhushan Bharat-R65777
 Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest
 tlb invalidation
 
 On Thu, 2013-09-19 at 11:32 +0530, Bharat Bhushan wrote:
   On booke, struct tlbe_ref contains the host tlb mapping information
   (pfn: for guest-pfn to pfn, flags: attributes associated with this
   mapping) for a guest tlb entry. So when a guest creates a TLB entry,
   struct tlbe_ref is set to point to a valid pfn and the attributes are
   set in the flags field of the above said structure. When a guest
   TLB entry is invalidated, the flags field of the corresponding struct
   tlbe_ref is updated to mark it as no longer valid, and we also
   selectively clear some other attribute bits, for example: if
   E500_TLB_BITMAP was set then we clear E500_TLB_BITMAP, and if
   E500_TLB_TLB0 is set then we clear it.
  
   Ideally we should clear the complete flags, as this entry is invalid
   and has nothing to be re-used. The other part of the problem is that
   when we use the same entry again we also do not clear the flags
   (we started doing or-ing etc).
  
   So far this was working because the selective clearing mentioned above
   actually clears the flags that were set during TLB mapping. But the
   problem starts when we add more attributes, because then we would need
   to selectively clear them as well, which should not be needed.
  
   This patch does both:
   - Clear flags when invalidating;
   - Clear flags when reusing the same entry later
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v3- v5
   - New patch (found this issue when doing vfio-pci development)
 
   arch/powerpc/kvm/e500_mmu_host.c |   12 +++-
   1 files changed, 7 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..60f5a3c 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -217,7 +217,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500
 *vcpu_e500, int tlbsel,
  }
  mb();
   	vcpu_e500->g2h_tlb1_map[esel] = 0;
   -	ref->flags &= ~(E500_TLB_BITMAP | E500_TLB_VALID);
   +	/* Clear flags as TLB is not backed by the host anymore */
   +	ref->flags = 0;
  local_irq_restore(flags);
  }
 
 This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set.

I do not see any case where we set both E500_TLB_BITMAP and E500_TLB_TLB0. Also, 
we have not optimized that yet (keeping track of multiple shadow TLB0 entries 
for one guest TLB1 entry).

We use these bit flags only for TLB1, and if the size of the stlbe is 4K then we 
set E500_TLB_TLB0, otherwise we set E500_TLB_BITMAP. Although I think that 
E500_TLB_BITMAP should be set only if the stlbe size is less than the gtlbe size.

 
 Instead, just convert the final E500_TLB_VALID clearing at the end into
 ref->flags = 0, and convert the early return a few lines earlier into
 conditional execution of the tlbil_one().

This looks better, will send the patch shortly.
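
Roughly, I understand the suggestion as something like this (just a sketch, not 
the actual function body; the tlbil helper name here is an assumption):

	if (ref->flags & E500_TLB_VALID)
		kvmppc_e500_tlbil_one(vcpu_e500, gtlbe);

	/* The ref is no longer backed by any host TLB entry */
	ref->flags = 0;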

Thanks
-Bharat

 
 -Scott
 


RE: vfio for platform devices - 9/5/2012 - minutes

2013-09-12 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Wednesday, September 11, 2013 10:45 PM
 To: Yoder Stuart-B08248
 Cc: Wood Scott-B07421; Sethi Varun-B16395; Bhushan Bharat-R65777; 'Peter
 Maydell'; 'Santosh Shukla'; 'Alexander Graf'; 'Antonios Motakis'; 'Christoffer
 Dall'; 'kim.phill...@linaro.org'; kvm...@lists.cs.columbia.edu; kvm-
 p...@vger.kernel.org; qemu-de...@nongnu.org
 Subject: Re: vfio for platform devices - 9/5/2012 - minutes
 
 On Wed, 2013-09-11 at 16:42 +, Yoder Stuart-B08248 wrote:
 
   -Original Message-
   From: Yoder Stuart-B08248
   Sent: Thursday, September 05, 2013 12:51 PM
   To: Wood Scott-B07421; Sethi Varun-B16395; Bhushan Bharat-R65777;
   'Peter Maydell'; 'Santosh Shukla'; 'Alex Williamson'; 'Alexander
   Graf'; 'Antonios Motakis'; 'Christoffer Dall'; 'kim.phill...@linaro.org'
   Cc: kvm...@lists.cs.columbia.edu; 'kvm-ppc@vger.kernel.org'; 'qemu-
   de...@nongnu.org'
   Subject: vfio for platform devices - 9/5/2012 - minutes
  
   We had a call with those interested and/or working on vfio for
   platform devices.
  
   Participants: Scott Wood, Varun Sethi, Bharat Bhushan, Peter Maydell,
 Santosh Shukla, Alex Williamson, Alexander Graf,
 Antonios Motakis, Christoffer Dall, Kim Phillips,
 Stuart Yoder
  
   Several aspects to vfio for platform devices:
  
   1. IOMMU groups
  
       -iommu driver needs to register a bus notifier for the platform bus
        and create groups for relevant platform devices (see the sketch
        after this list)
       -Antonios is looking at this for several ARM IOMMUs
       -PAMU (Freescale) driver already does this
  
   2. unbinding device from host
  
     PCI:
       echo :06:0d.0 > /sys/bus/pci/devices/:06:0d.0/driver/unbind
     Platform:
       echo ffe101300.dma > /sys/bus/platform/devices/ffe101300.dma/driver/unbind
  
-don't believe there are issues or work to do here
  
   3. binding device to vfio-platform driver
  
     PCI:
       echo 1102 0002 > /sys/bus/pci/drivers/vfio-pci/new_id

       -this is probably the least understood issue -- platform drivers
        register themselves with the bus for a specific name string.
        That is matched with device tree compatible strings later to
        bind a device to a driver
       -what we want is to have the vfio-platform driver dynamically bind
        to a variety of platform devices previously unknown to vfio-platform
       -ideally unbinding and binding could be an atomic operation
       -Alex W pointed out that x86 could leverage this work so keep that
        in mind in what we design
       -Kim Phillips (Linaro) will start working on this
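
    Sketch for item 1 of what such a platform-bus notifier could look like
    (illustrative only; the real ARM/PAMU drivers differ in how groups are
    created and looked up, and example_attach_dev_to_group() is a made-up
    placeholder):

	static int example_platform_iommu_notifier(struct notifier_block *nb,
						   unsigned long action,
						   void *data)
	{
		struct device *dev = data;

		if (action == BUS_NOTIFY_ADD_DEVICE)
			return example_attach_dev_to_group(dev);
		if (action == BUS_NOTIFY_DEL_DEVICE)
			iommu_group_remove_device(dev);
		return 0;
	}

	static struct notifier_block example_platform_iommu_nb = {
		.notifier_call = example_platform_iommu_notifier,
	};

	/* bus_register_notifier(&platform_bus_type, &example_platform_iommu_nb); */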
 
  One thing we didn't discuss needs to be considered (probably by Kim
  who is looking at the 'binding device' issue) is around returning a
  passthru device back to the host.
 
  After a platform device has been bound to vfio and is in use by user
  space or a virtual machine, we also need to be able to unwind all that
  and return the device back to the host in a sane state.
 
  What happens when user space exits and the vfio file descriptors are
  closed?
 
 For reference, expectations of how vfio-pci handles these situations:
 
 For vfio-pci, when the reference count on the device fd drops to zero we call a
 device disable function that includes disabling the bus master bit in config
 space to stop ongoing DMA.

There is no bus mastering for platform devices, and device disabling is not 
standardized as it is for PCI. Do you think that we might need to do some 
device-specific handling? For example, for a DMA controller we need to at least 
disable the controller and mask its interrupts (and maybe ensure that there is 
no pending/running DMA transaction, release irqs, etc.). As we do not yet have 
any linkage to the respective kernel driver, I am not sure how we would do that 
device-specific handling.

 
  What if the device is still active and doing bus
  mastering?   (e.g. a VM crashed causing a QEMU
  exit)
 
 If the VM crashes the vfio fds get released resulting in the above opportunity
 for the vfio device driver to quiesce the device.

I think the quiescing of devices will be device specific, so the generic 
vfio-platform driver may not be able to handle that, right?

 
  How can the vfio-platform layer in the host kernel get a specific
  device in a sane state?
 
 It's not easy on pci either.  We save config space prior to exposing the 
 device
 and restore config space later, but it's not complete.  We mostly rely on 
 device
 (function) resets, to put things in a sane state, but those aren't always
 supported.

Also, all platform devices may not have a reset capability (and if they do, 
there is no generic way to reset all devices).

  I just introduced patches for v3.12 that enable a PCI bus reset
 interface, but it's mostly useful for userspace, since on PCI it's often the
 case that a bus contains multiple devices which don't necessarily align to 
 iommu
 group boundaries.
 
  When a platform device is 'unbound

RE: [PATCH 0/2] KVM: PPC: BOOKE: MMU Fixes

2013-08-29 Thread Bhushan Bharat-R65777
Hi Alex,

The second patch (kvm: ppc: booke: check range page invalidation progress on page 
setup) of this patch series fixes a critical issue and we would like it to be 
part of 3.12.

The first patch is not that important, but it is pretty simple.

Thanks
-Bharat

 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Wednesday, August 07, 2013 3:34 PM
 To: pau...@samba.org; Wood Scott-B07421; ag...@suse.de; 
 kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org
 Cc: Bhushan Bharat-R65777
 Subject: [PATCH 0/2] KVM: PPC: BOOKE: MMU Fixes
 
 From: Bharat Bhushan bharat.bhus...@freescale.com
 
 The first patch sets the missing _PAGE_ACCESSED when a guest page is accessed.
 
 The second patch checks for MMU notifier range invalidation progress when setting a
 reference for a guest page. This is based on the
 KVM: PPC: Book3S PR: Use mmu_notifier_retry() in kvmppc_mmu_map_page()
 patch sent by Paul (still in review).
 
 Bharat Bhushan (2):
   kvm: powerpc: mark page accessed when mapping a guest page
   kvm: ppc: booke: check range page invalidation progress on page setup
   
  arch/powerpc/kvm/e500_mmu_host.c |   22 --
  1 files changed, 20 insertions(+), 2 deletions(-)




RE: [PATCH 6/6 v3] kvm: powerpc: use caching attributes as per linux pte

2013-08-12 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Saturday, August 10, 2013 6:35 AM
 To: Bhushan Bharat-R65777
 Cc: b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org;
 k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-...@lists.ozlabs.org;
 Bhushan Bharat-R65777
 Subject: Re: [PATCH 6/6 v3] kvm: powerpc: use caching attributes as per linux
 pte
 
 On Tue, 2013-08-06 at 17:01 +0530, Bharat Bhushan wrote:
  @@ -449,7 +446,16 @@ static inline int kvmppc_e500_shadow_map(struct
 kvmppc_vcpu_e500 *vcpu_e500,
   	gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1);
   	}
  
   -	kvmppc_e500_ref_setup(ref, gtlbe, pfn);
   +	pgdir = vcpu_e500->vcpu.arch.pgdir;
   +	ptep = lookup_linux_pte(pgdir, hva, &tsize_pages);
   +	if (pte_present(*ptep)) {
   +		wimg = (pte_val(*ptep) >> PTE_WIMGE_SHIFT) & MAS2_WIMGE_MASK;
   +	} else {
   +		printk(KERN_ERR "pte not present: gfn %lx, pfn %lx\n",
  +   (long)gfn, pfn);
  +   return -EINVAL;
 
 Don't let the guest spam the host kernel console by repeatedly accessing bad
 mappings (even if it requires host userspace to assist by pointing a memslot 
 at
 a bad hva).  This should at most be printk_ratelimited(), and probably just
 pr_debug().  It should also have __func__ context.

Very good point. I will make this printk_ratelimited() in this patch, and 
convert this and the other error prints to pr_debug() when we send a machine 
check on error in this flow.
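
Something like this, for reference (just a sketch; the final message text may 
differ):

	printk_ratelimited(KERN_ERR "%s: pte not present: gfn %lx, pfn %lx\n",
			   __func__, (long)gfn, pfn);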

 
 Also, I don't see the return value getting checked (the immediate callers check
 it and propagate the error, but kvmppc_mmu_map() doesn't).
 We want to send a machine check to the guest if this happens (or possibly exit
 to userspace since it indicates a bad memslot, not just a guest bug).  We 
 don't
 want to just silently retry over and over.

I completely agree with you, but this was something already missing (the error 
return by this function is nothing new added in this patch), so I would like to 
address that separately.

 
 Otherwise, this series looks good to me.

Thank you. :)
-Bharat

 
 -Scott
 


RE: [PATCH 21/23] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in kvmppc_mmu_map_page()

2013-08-07 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Paul Mackerras [mailto:pau...@samba.org]
 Sent: Wednesday, August 07, 2013 1:58 PM
 To: Bhushan Bharat-R65777
 Cc: Alexander Graf; Benjamin Herrenschmidt; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org
 Subject: Re: [PATCH 21/23] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in
 kvmppc_mmu_map_page()
 
 On Wed, Aug 07, 2013 at 05:17:29AM +, Bhushan Bharat-R65777 wrote:
 
  Pauls, I am trying to understand the flow; does retry mean that we do not
 create the mapping and return to guest, which will fault again and then we 
 will
 retry?
 
 Yes, and you do put_page or kvm_release_pfn_clean for any page that you got.

OK, but what is the value of returning back to the guest when we know it is again 
going to generate a fault?
Can we not retry within KVM?

Thanks
-Bharat

 
 Paul.
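
For readers following the thread, the retry pattern under discussion looks roughly
like this (a simplified sketch of the generic KVM convention, not the exact Book3S
PR code; RESUME_GUEST is the powerpc KVM return code that resumes the guest):

	/*
	 * Sample the notifier sequence before touching the Linux PTE/page.
	 * If an invalidation ran in between, drop what we got and go back
	 * to the guest; it re-faults and we resolve the fault again.
	 */
	mmu_seq = kvm->mmu_notifier_seq;
	smp_rmb();

	pfn = gfn_to_pfn(kvm, gfn);		/* may sleep, takes a reference */

	spin_lock(&kvm->mmu_lock);
	if (mmu_notifier_retry(kvm, mmu_seq)) {
		spin_unlock(&kvm->mmu_lock);
		kvm_release_pfn_clean(pfn);	/* put the page we obtained */
		return RESUME_GUEST;		/* guest re-faults; we retry then */
	}
	/* ... install the host HPTE under kvm->mmu_lock ... */
	spin_unlock(&kvm->mmu_lock);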




RE: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like booke3s

2013-08-06 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Tuesday, August 06, 2013 6:42 AM
 To: Wood Scott-B07421
 Cc: Benjamin Herrenschmidt; ag...@suse.de; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org
 Subject: RE: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like
 booke3s
 
 
 
  -Original Message-
  From: Wood Scott-B07421
  Sent: Tuesday, August 06, 2013 12:49 AM
  To: Bhushan Bharat-R65777
  Cc: Benjamin Herrenschmidt; Wood Scott-B07421; ag...@suse.de; kvm-
  p...@vger.kernel.org; k...@vger.kernel.org;
  linuxppc-...@lists.ozlabs.org
  Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup
  like booke3s
 
  On Mon, 2013-08-05 at 09:27 -0500, Bhushan Bharat-R65777 wrote:
  
-Original Message-
From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
Sent: Saturday, August 03, 2013 9:54 AM
To: Bhushan Bharat-R65777
Cc: Wood Scott-B07421; ag...@suse.de; kvm-ppc@vger.kernel.org;
k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org
Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte
lookup like booke3s
   
On Sat, 2013-08-03 at 02:58 +, Bhushan Bharat-R65777 wrote:
 One of the problem I saw was that if I put this code in
 asm/pgtable-32.h and asm/pgtable-64.h then pte_persent() and
 other friend function (on which this code depends) are defined in
 pgtable.h.
 And pgtable.h includes asm/pgtable-32.h and asm/pgtable-64.h
 before it defines pte_present() and friends functions.

 Ok I move wove this in asm/pgtable*.h, initially I fought with
 myself to take this code in pgtable* but finally end up doing
 here (got biased by book3s :)).
   
Is there a reason why these routines can not be completely generic
in pgtable.h ?
  
   How about the generic function:
  
   diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h
   b/arch/powerpc/include/asm/pgtable-ppc64.h
   index d257d98..21daf28 100644
   --- a/arch/powerpc/include/asm/pgtable-ppc64.h
   +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
   @@ -221,6 +221,27 @@ static inline unsigned long pte_update(struct
   mm_struct
  *mm,
   return old;
}
  
   +static inline unsigned long pte_read(pte_t *p) { #ifdef
   +PTE_ATOMIC_UPDATES
   +   pte_t pte;
   +   pte_t tmp;
   +   __asm__ __volatile__ (
   +   1: ldarx   %0,0,%3\n
   +  andi.   %1,%0,%4\n
   +  bne-1b\n
   +  ori %1,%0,%4\n
   +  stdcx.  %1,0,%3\n
   +  bne-1b
   +   : =r (pte), =r (tmp), =m (*p)
   +   : r (p), i (_PAGE_BUSY)
   +   : cc);
   +
   +   return pte;
   +#else
   +   return pte_val(*p);
   +#endif
   +#endif
   +}
static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
 unsigned long addr,
   pte_t *ptep)
 
  Please leave a blank line between functions.
 
{
   diff --git a/arch/powerpc/include/asm/pgtable.h
   b/arch/powerpc/include/asm/pgtable.h
   index 690c8c2..dad712c 100644
   --- a/arch/powerpc/include/asm/pgtable.h
   +++ b/arch/powerpc/include/asm/pgtable.h
   @@ -254,6 +254,45 @@ static inline pte_t
   *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,  }
   #endif
   /* !CONFIG_HUGETLB_PAGE */
  
   +static inline pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
   +int writing, unsigned long
   +*pte_sizep)
 
  The name implies that it just reads the PTE.  Setting accessed/dirty
  shouldn't be an undocumented side-effect.
 
 Ok, will rename and document.
 
  Why can't the caller do that (or a different function that the caller
  calls afterward if desired)?
 
 The current implementation in book3s is;
  1) find a pte/hugepte
  2) return null if pte not present
  3) take _PAGE_BUSY lock
  4) set accessed/dirty
  5) clear _PAGE_BUSY.
 
 What I tried was
 1) find a pte/hugepte
 2) return null if pte not present
 3) return pte (not take lock by not setting _PAGE_BUSY)
 
 4) then user calls  __ptep_set_access_flags() to atomic update the
 dirty/accessed flags in pte.
 
 - but the benchmark results were not good
 - Also can there be race as we do not take lock in step 3 and update in step 
 4 ?
 
 
  Though even then you have the undocumented side effect of locking the
  PTE on certain targets.
 
   +{
   +   pte_t *ptep;
   +   pte_t pte;
   +   unsigned long ps = *pte_sizep;
   +   unsigned int shift;
   +
   +   ptep = find_linux_pte_or_hugepte(pgdir, hva, shift);
   +   if (!ptep)
   +   return __pte(0);
   +   if (shift)
   +   *pte_sizep = 1ul  shift;
   +   else
   +   *pte_sizep = PAGE_SIZE;
   +
   +   if (ps  *pte_sizep)
   +   return __pte(0);
   +
   +   if (!pte_present(*ptep))
   +   return __pte(0);
   +
   +#ifdef CONFIG_PPC64
   +   /* Lock

RE: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like booke3s

2013-08-06 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, August 06, 2013 12:49 AM
 To: Bhushan Bharat-R65777
 Cc: Benjamin Herrenschmidt; Wood Scott-B07421; ag...@suse.de; kvm-
 p...@vger.kernel.org; k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org
 Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like
 booke3s
 
 On Mon, 2013-08-05 at 09:27 -0500, Bhushan Bharat-R65777 wrote:
 
   -Original Message-
   From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
   Sent: Saturday, August 03, 2013 9:54 AM
   To: Bhushan Bharat-R65777
   Cc: Wood Scott-B07421; ag...@suse.de; kvm-ppc@vger.kernel.org;
   k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org
   Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte
   lookup like booke3s
  
   On Sat, 2013-08-03 at 02:58 +, Bhushan Bharat-R65777 wrote:
One of the problem I saw was that if I put this code in
asm/pgtable-32.h and asm/pgtable-64.h then pte_persent() and other
friend function (on which this code depends) are defined in pgtable.h.
And pgtable.h includes asm/pgtable-32.h and asm/pgtable-64.h
before it defines pte_present() and friends functions.
   
Ok I move wove this in asm/pgtable*.h, initially I fought with
myself to take this code in pgtable* but finally end up doing here
(got biased by book3s :)).
  
   Is there a reason why these routines can not be completely generic
   in pgtable.h ?
 
  How about the generic function:
 
  diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h
  b/arch/powerpc/include/asm/pgtable-ppc64.h
  index d257d98..21daf28 100644
  --- a/arch/powerpc/include/asm/pgtable-ppc64.h
  +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
  @@ -221,6 +221,27 @@ static inline unsigned long pte_update(struct mm_struct
 *mm,
  return old;
   }
 
  +static inline unsigned long pte_read(pte_t *p) { #ifdef
  +PTE_ATOMIC_UPDATES
  +   pte_t pte;
  +   pte_t tmp;
  +   __asm__ __volatile__ (
  +   1: ldarx   %0,0,%3\n
  +  andi.   %1,%0,%4\n
  +  bne-1b\n
  +  ori %1,%0,%4\n
  +  stdcx.  %1,0,%3\n
  +  bne-1b
  +   : =r (pte), =r (tmp), =m (*p)
  +   : r (p), i (_PAGE_BUSY)
  +   : cc);
  +
  +   return pte;
  +#else
  +   return pte_val(*p);
  +#endif
  +#endif
  +}
   static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
unsigned long addr,
  pte_t *ptep)
 
 Please leave a blank line between functions.
 
   {
  diff --git a/arch/powerpc/include/asm/pgtable.h
  b/arch/powerpc/include/asm/pgtable.h
  index 690c8c2..dad712c 100644
  --- a/arch/powerpc/include/asm/pgtable.h
  +++ b/arch/powerpc/include/asm/pgtable.h
  @@ -254,6 +254,45 @@ static inline pte_t
  *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,  }  #endif
  /* !CONFIG_HUGETLB_PAGE */
 
  +static inline pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
  +int writing, unsigned long
  +*pte_sizep)
 
 The name implies that it just reads the PTE.  Setting accessed/dirty shouldn't
 be an undocumented side-effect.  Why can't the caller do that (or a different
 function that the caller calls afterward if desired)?

Scott, I sent the next version of the patch based on the above idea. Now I think we do 
not need to update the pte flags on booke, 
so we do not need to solve the kvmppc_read_update_linux_pte() issue from book3s.

-Bharat

 
 Though even then you have the undocumented side effect of locking the PTE on
 certain targets.
 
  +{
  +   pte_t *ptep;
  +   pte_t pte;
  +   unsigned long ps = *pte_sizep;
  +   unsigned int shift;
  +
  +   ptep = find_linux_pte_or_hugepte(pgdir, hva, shift);
  +   if (!ptep)
  +   return __pte(0);
  +   if (shift)
  +   *pte_sizep = 1ul  shift;
  +   else
  +   *pte_sizep = PAGE_SIZE;
  +
  +   if (ps  *pte_sizep)
  +   return __pte(0);
  +
  +   if (!pte_present(*ptep))
  +   return __pte(0);
  +
  +#ifdef CONFIG_PPC64
  +   /* Lock PTE (set _PAGE_BUSY) and read */
  +   pte = pte_read(ptep);
  +#else
  +   pte = pte_val(*ptep);
  +#endif
 
 What about 32-bit platforms that need atomic PTEs?
 
 -Scott
 


RE: [PATCH 21/23] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in kvmppc_mmu_map_page()

2013-08-06 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of
 Paul Mackerras
 Sent: Tuesday, August 06, 2013 9:58 AM
 To: Alexander Graf; Benjamin Herrenschmidt
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: [PATCH 21/23] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in
 kvmppc_mmu_map_page()
 
 When the MM code is invalidating a range of pages, it calls the KVM
 kvm_mmu_notifier_invalidate_range_start() notifier function, which calls
 kvm_unmap_hva_range(), which arranges to flush all the existing host HPTEs for
 guest pages.  However, the Linux PTEs for the range being flushed are still
 valid at that point.  We are not supposed to establish any new references to
 pages in the range until the ...range_end() notifier gets called.  The PPC-
 specific KVM code doesn't get any explicit notification of that; instead, we 
 are
 supposed to use
 mmu_notifier_retry() to test whether we are or have been inside a range flush
 notifier pair while we have been getting a page and instantiating a host HPTE
 for the page.
 
 This therefore adds a call to mmu_notifier_retry inside kvmppc_mmu_map_page().
 This call is inside a region locked with
 kvm->mmu_lock, which is the same lock that is called by the KVM
 MMU notifier functions, thus ensuring that no new notification can proceed 
 while
 we are in the locked region.  Inside this region we also create the host HPTE
 and link the corresponding hpte_cache structure into the lists used to find it
 later.  We cannot allocate the hpte_cache structure inside this locked region
 because that can lead to deadlock, so we allocate it outside the region and 
 free
 it if we end up not using it.
 
 This also moves the updates of vcpu3s->hpte_cache_count inside the regions
 locked with vcpu3s->mmu_lock, and does the increment in
 kvmppc_mmu_hpte_cache_map() when the pte is added to the cache rather than 
 when
 it is allocated, in order that the hpte_cache_count is accurate.
 
 Signed-off-by: Paul Mackerras pau...@samba.org
 ---
  arch/powerpc/include/asm/kvm_book3s.h |  1 +
 arch/powerpc/kvm/book3s_64_mmu_host.c | 37 ++-
  arch/powerpc/kvm/book3s_mmu_hpte.c| 14 +
  3 files changed, 39 insertions(+), 13 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_book3s.h
 b/arch/powerpc/include/asm/kvm_book3s.h
 index 4fe6864..e711e77 100644
 --- a/arch/powerpc/include/asm/kvm_book3s.h
 +++ b/arch/powerpc/include/asm/kvm_book3s.h
 @@ -143,6 +143,7 @@ extern long kvmppc_hv_find_lock_hpte(struct kvm *kvm, 
 gva_t
 eaddr,
 
  extern void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct 
 hpte_cache
 *pte);  extern struct hpte_cache *kvmppc_mmu_hpte_cache_next(struct kvm_vcpu
 *vcpu);
 +extern void kvmppc_mmu_hpte_cache_free(struct hpte_cache *pte);
  extern void kvmppc_mmu_hpte_destroy(struct kvm_vcpu *vcpu);  extern int
 kvmppc_mmu_hpte_init(struct kvm_vcpu *vcpu);  extern void
 kvmppc_mmu_invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte); 
 diff -
 -git a/arch/powerpc/kvm/book3s_64_mmu_host.c
 b/arch/powerpc/kvm/book3s_64_mmu_host.c
 index 7fcf38f..b7e9504 100644
 --- a/arch/powerpc/kvm/book3s_64_mmu_host.c
 +++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
 @@ -93,6 +93,13 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct
 kvmppc_pte *orig_pte,
   int r = 0;
   int hpsize = MMU_PAGE_4K;
   bool writable;
 + unsigned long mmu_seq;
 + struct kvm *kvm = vcpu->kvm;
 + struct hpte_cache *cpte;
 +
 + /* used to check for invalidations in progress */
 + mmu_seq = kvm->mmu_notifier_seq;
 + smp_rmb();

Should not the smp_rmb() come before reading kvm->mmu_notifier_seq?

-Bharat

 
   /* Get host physical address for gpa */
  hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT, @@ 
 -143,6
 +150,14 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte
 *orig_pte,
 
   hash = hpt_hash(vpn, mmu_psize_defs[hpsize].shift, MMU_SEGSIZE_256M);
 
 + cpte = kvmppc_mmu_hpte_cache_next(vcpu);
 +
 + spin_lock(&kvm->mmu_lock);
 + if (!cpte || mmu_notifier_retry(kvm, mmu_seq)) {
 + r = -EAGAIN;
 + goto out_unlock;
 + }
 +
  map_again:
  hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
 
 @@ -150,7 +165,7 @@ map_again:
  if (attempt > 1)
  if (ppc_md.hpte_remove(hpteg) < 0) {
   r = -1;
 - goto out;
 + goto out_unlock;
   }
 
   ret = ppc_md.hpte_insert(hpteg, vpn, hpaddr, rflags, vflags, @@ -163,8
 +178,6 @@ map_again:
   attempt++;
   goto map_again;
   } else {
 - struct hpte_cache *pte = kvmppc_mmu_hpte_cache_next(vcpu);
 -
   trace_kvm_book3s_64_mmu_map(rflags, hpteg,
   vpn, hpaddr, orig_pte);
 
 @@ -175,15 +188,21 @@ map_again:
   hpteg = ((hash  

RE: [PATCH 21/23] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in kvmppc_mmu_map_page()

2013-08-06 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of
 Paul Mackerras
 Sent: Tuesday, August 06, 2013 9:58 AM
 To: Alexander Graf; Benjamin Herrenschmidt
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: [PATCH 21/23] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in
 kvmppc_mmu_map_page()
 
 When the MM code is invalidating a range of pages, it calls the KVM
 kvm_mmu_notifier_invalidate_range_start() notifier function, which calls
 kvm_unmap_hva_range(), which arranges to flush all the existing host
 HPTEs for guest pages.  However, the Linux PTEs for the range being
 flushed are still valid at that point.  We are not supposed to establish
 any new references to pages in the range until the ...range_end()
 notifier gets called.  The PPC-specific KVM code doesn't get any
 explicit notification of that; instead, we are supposed to use
 mmu_notifier_retry() to test whether we are or have been inside a
 range flush notifier pair while we have been getting a page and
 instantiating a host HPTE for the page.
 
 This therefore adds a call to mmu_notifier_retry inside
 kvmppc_mmu_map_page().  This call is inside a region locked with
  kvm->mmu_lock, which is the same lock that is called by the KVM
 MMU notifier functions, thus ensuring that no new notification can
 proceed while we are in the locked region.  Inside this region we
 also create the host HPTE and link the corresponding hpte_cache
 structure into the lists used to find it later.  We cannot allocate
 the hpte_cache structure inside this locked region because that can
 lead to deadlock, so we allocate it outside the region and free it
 if we end up not using it.
 
  This also moves the updates of vcpu3s->hpte_cache_count inside the
  regions locked with vcpu3s->mmu_lock, and does the increment in
 kvmppc_mmu_hpte_cache_map() when the pte is added to the cache
 rather than when it is allocated, in order that the hpte_cache_count
 is accurate.
 
 Signed-off-by: Paul Mackerras pau...@samba.org
 ---
  arch/powerpc/include/asm/kvm_book3s.h |  1 +
  arch/powerpc/kvm/book3s_64_mmu_host.c | 37 
 ++-
  arch/powerpc/kvm/book3s_mmu_hpte.c| 14 +
  3 files changed, 39 insertions(+), 13 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_book3s.h
 b/arch/powerpc/include/asm/kvm_book3s.h
 index 4fe6864..e711e77 100644
 --- a/arch/powerpc/include/asm/kvm_book3s.h
 +++ b/arch/powerpc/include/asm/kvm_book3s.h
 @@ -143,6 +143,7 @@ extern long kvmppc_hv_find_lock_hpte(struct kvm *kvm, 
 gva_t
 eaddr,
 
  extern void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct 
 hpte_cache
 *pte);
  extern struct hpte_cache *kvmppc_mmu_hpte_cache_next(struct kvm_vcpu *vcpu);
 +extern void kvmppc_mmu_hpte_cache_free(struct hpte_cache *pte);
  extern void kvmppc_mmu_hpte_destroy(struct kvm_vcpu *vcpu);
  extern int kvmppc_mmu_hpte_init(struct kvm_vcpu *vcpu);
  extern void kvmppc_mmu_invalidate_pte(struct kvm_vcpu *vcpu, struct 
 hpte_cache
 *pte);
 diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c
 b/arch/powerpc/kvm/book3s_64_mmu_host.c
 index 7fcf38f..b7e9504 100644
 --- a/arch/powerpc/kvm/book3s_64_mmu_host.c
 +++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
 @@ -93,6 +93,13 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct
 kvmppc_pte *orig_pte,
   int r = 0;
   int hpsize = MMU_PAGE_4K;
   bool writable;
 + unsigned long mmu_seq;
 + struct kvm *kvm = vcpu->kvm;
 + struct hpte_cache *cpte;
 +
 + /* used to check for invalidations in progress */
 + mmu_seq = kvm->mmu_notifier_seq;
 + smp_rmb();
 
   /* Get host physical address for gpa */
  hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT,
 @@ -143,6 +150,14 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct
 kvmppc_pte *orig_pte,
 
   hash = hpt_hash(vpn, mmu_psize_defs[hpsize].shift, MMU_SEGSIZE_256M);
 
 + cpte = kvmppc_mmu_hpte_cache_next(vcpu);
 +
 + spin_lock(&kvm->mmu_lock);
 + if (!cpte || mmu_notifier_retry(kvm, mmu_seq)) {
 + r = -EAGAIN;

Paul, I am trying to understand the flow; does retry mean that we do not 
create the mapping and return to the guest, which will fault again and then we will 
retry? 

Thanks
-Bharat

 + goto out_unlock;
 + }
 +
  map_again:
  hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
 
 @@ -150,7 +165,7 @@ map_again:
  if (attempt > 1)
  if (ppc_md.hpte_remove(hpteg) < 0) {
   r = -1;
 - goto out;
 + goto out_unlock;
   }
 
   ret = ppc_md.hpte_insert(hpteg, vpn, hpaddr, rflags, vflags,
 @@ -163,8 +178,6 @@ map_again:
   attempt++;
   goto map_again;
   } else {
 - struct hpte_cache *pte = kvmppc_mmu_hpte_cache_next(vcpu);
 -
   trace_kvm_book3s_64_mmu_map(rflags, hpteg,
   vpn, 

RE: [PATCH 21/23] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in kvmppc_mmu_map_page()

2013-08-06 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Paul Mackerras [mailto:pau...@samba.org]
 Sent: Wednesday, August 07, 2013 9:59 AM
 To: Bhushan Bharat-R65777
 Cc: Alexander Graf; Benjamin Herrenschmidt; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org
 Subject: Re: [PATCH 21/23] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in
 kvmppc_mmu_map_page()
 
 On Wed, Aug 07, 2013 at 04:13:34AM +, Bhushan Bharat-R65777 wrote:
 
   + /* used to check for invalidations in progress */
   + mmu_seq = kvm->mmu_notifier_seq;
   + smp_rmb();
 
  Should not the smp_rmb() come before reading kvm->mmu_notifier_seq?
 
 No, it should come after, because it is ordering the read of
  kvm->mmu_notifier_seq before the read of the Linux PTE.

Ahh, ok. Thanks

-Bharat

 
 Paul.
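
To illustrate Paul's point about the ordering: the fault path needs the sequence
sample to happen before the PTE read, which is exactly what placing smp_rmb()
after the load of mmu_notifier_seq guarantees (a simplified sketch, not a quote
of the actual code):

	mmu_seq = kvm->mmu_notifier_seq;	/* (1) sample the sequence      */
	smp_rmb();				/* (2) order (1) before (3)     */
	pte = *ptep;				/* (3) read the Linux PTE       */
	...
	spin_lock(&kvm->mmu_lock);
	if (mmu_notifier_retry(kvm, mmu_seq))	/* sequence moved or flush live? */
		goto retry;			/* PTE read in (3) may be stale */

If the barrier sat before (1) instead, nothing would prevent the PTE read from
being reordered ahead of the sequence sample, and a stale PTE could slip past
the retry check.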




RE: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like booke3s

2013-08-05 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
 Sent: Saturday, August 03, 2013 9:54 AM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; ag...@suse.de; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org
 Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like
 booke3s
 
 On Sat, 2013-08-03 at 02:58 +, Bhushan Bharat-R65777 wrote:
  One of the problem I saw was that if I put this code in
  asm/pgtable-32.h and asm/pgtable-64.h then pte_persent() and other
  friend function (on which this code depends) are defined in pgtable.h.
  And pgtable.h includes asm/pgtable-32.h and asm/pgtable-64.h before it
  defines pte_present() and friends functions.
 
  Ok I move wove this in asm/pgtable*.h, initially I fought with myself
  to take this code in pgtable* but finally end up doing here (got
  biased by book3s :)).
 
 Is there a reason why these routines can not be completely generic in 
 pgtable.h
 ?

How about the generic function:

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h 
b/arch/powerpc/include/asm/pgtable-ppc64.h
index d257d98..21daf28 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -221,6 +221,27 @@ static inline unsigned long pte_update(struct mm_struct 
*mm,
return old;
 }

+static inline unsigned long pte_read(pte_t *p)
+{
+#ifdef PTE_ATOMIC_UPDATES
+   pte_t pte;
+   pte_t tmp;
+   __asm__ __volatile__ (
+   "1: ldarx   %0,0,%3\n"
+   "   andi.   %1,%0,%4\n"
+   "   bne-    1b\n"
+   "   ori     %1,%0,%4\n"
+   "   stdcx.  %1,0,%3\n"
+   "   bne-    1b"
+   : "=&r" (pte), "=&r" (tmp), "=m" (*p)
+   : "r" (p), "i" (_PAGE_BUSY)
+   : "cc");
+
+   return pte;
+#else
+   return pte_val(*p);
+#endif
+}
 static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
  unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index 690c8c2..dad712c 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -254,6 +254,45 @@ static inline pte_t *find_linux_pte_or_hugepte(pgd_t 
*pgdir, unsigned long ea,
 }
 #endif /* !CONFIG_HUGETLB_PAGE */

+static inline pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
+int writing, unsigned long *pte_sizep)
+{
+   pte_t *ptep;
+   pte_t pte;
+   unsigned long ps = *pte_sizep;
+   unsigned int shift;
+
+   ptep = find_linux_pte_or_hugepte(pgdir, hva, shift);
+   if (!ptep)
+   return __pte(0);
+   if (shift)
+   *pte_sizep = 1ul << shift;
+   else
+   *pte_sizep = PAGE_SIZE;
+
+   if (ps > *pte_sizep)
+   return __pte(0);
+
+   if (!pte_present(*ptep))
+   return __pte(0);
+
+#ifdef CONFIG_PPC64
+   /* Lock PTE (set _PAGE_BUSY) and read */
+   pte = pte_read(ptep);
+#else
+   pte = pte_val(*ptep);
+#endif
+   if (pte_present(pte)) {
+   pte = pte_mkyoung(pte);
+   if (writing && pte_write(pte))
+   pte = pte_mkdirty(pte);
+   }
+
+   *ptep = __pte(pte); /* 64bit: Also unlock pte (clear _PAGE_BUSY) */
+
+   return pte;
+}
+
 #endif /* __ASSEMBLY__ */

 #endif /* __KERNEL__ */
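
A hypothetical caller, only to show the intended contract of the proposed helper
(pte_sizep carries the required page size in and the actual mapping size out;
PTE_WIMGE_SHIFT/MAS2_WIMGE_MASK follow the other patches in this thread, and none
of this is code from the series itself):

	unsigned long psize = PAGE_SIZE;	/* smallest size we can accept */
	pte_t pte;

	pte = lookup_linux_pte(pgdir, hva, writing, &psize);
	if (!pte_present(pte))
		return -EINVAL;		/* not mapped, or mapping too small */
	/* psize now holds the real mapping size (PAGE_SIZE or a hugepage size) */
	wimg = (pte_val(pte) >> PTE_WIMGE_SHIFT) & MAS2_WIMGE_MASK;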


RE: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like booke3s

2013-08-05 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, August 06, 2013 12:49 AM
 To: Bhushan Bharat-R65777
 Cc: Benjamin Herrenschmidt; Wood Scott-B07421; ag...@suse.de; kvm-
 p...@vger.kernel.org; k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org
 Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like
 booke3s
 
 On Mon, 2013-08-05 at 09:27 -0500, Bhushan Bharat-R65777 wrote:
 
   -Original Message-
   From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
   Sent: Saturday, August 03, 2013 9:54 AM
   To: Bhushan Bharat-R65777
   Cc: Wood Scott-B07421; ag...@suse.de; kvm-ppc@vger.kernel.org;
   k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org
   Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte
   lookup like booke3s
  
   On Sat, 2013-08-03 at 02:58 +, Bhushan Bharat-R65777 wrote:
One of the problem I saw was that if I put this code in
asm/pgtable-32.h and asm/pgtable-64.h then pte_persent() and other
friend function (on which this code depends) are defined in pgtable.h.
And pgtable.h includes asm/pgtable-32.h and asm/pgtable-64.h
before it defines pte_present() and friends functions.
   
Ok I move wove this in asm/pgtable*.h, initially I fought with
myself to take this code in pgtable* but finally end up doing here
(got biased by book3s :)).
  
   Is there a reason why these routines can not be completely generic
   in pgtable.h ?
 
  How about the generic function:
 
  diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h
  b/arch/powerpc/include/asm/pgtable-ppc64.h
  index d257d98..21daf28 100644
  --- a/arch/powerpc/include/asm/pgtable-ppc64.h
  +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
  @@ -221,6 +221,27 @@ static inline unsigned long pte_update(struct mm_struct
 *mm,
  return old;
   }
 
  +static inline unsigned long pte_read(pte_t *p) { #ifdef
  +PTE_ATOMIC_UPDATES
  +   pte_t pte;
  +   pte_t tmp;
  +   __asm__ __volatile__ (
  +   1: ldarx   %0,0,%3\n
  +  andi.   %1,%0,%4\n
  +  bne-1b\n
  +  ori %1,%0,%4\n
  +  stdcx.  %1,0,%3\n
  +  bne-1b
  +   : =r (pte), =r (tmp), =m (*p)
  +   : r (p), i (_PAGE_BUSY)
  +   : cc);
  +
  +   return pte;
  +#else
  +   return pte_val(*p);
  +#endif
  +#endif
  +}
   static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
unsigned long addr,
  pte_t *ptep)
 
 Please leave a blank line between functions.
 
   {
  diff --git a/arch/powerpc/include/asm/pgtable.h
  b/arch/powerpc/include/asm/pgtable.h
  index 690c8c2..dad712c 100644
  --- a/arch/powerpc/include/asm/pgtable.h
  +++ b/arch/powerpc/include/asm/pgtable.h
  @@ -254,6 +254,45 @@ static inline pte_t
  *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,  }  #endif
  /* !CONFIG_HUGETLB_PAGE */
 
  +static inline pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
  +int writing, unsigned long
  +*pte_sizep)
 
 The name implies that it just reads the PTE.  Setting accessed/dirty shouldn't
 be an undocumented side-effect.

Ok, will rename and document.

 Why can't the caller do that (or a different
 function that the caller calls afterward if desired)?

The current implementation in book3s is;
 1) find a pte/hugepte
 2) return null if pte not present
 3) take _PAGE_BUSY lock
 4) set accessed/dirty
 5) clear _PAGE_BUSY.

What I tried was:
1) find a pte/hugepte
2) return null if pte not present
3) return pte (do not take the lock, i.e. do not set _PAGE_BUSY)

4) then the caller calls __ptep_set_access_flags() to atomically update the 
dirty/accessed flags in the pte.

- but the benchmark results were not good
- Also, can there be a race, since we do not take the lock in step 3 and update in step 4?
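
Concretely, the unlocked variant (steps 3 and 4 above) would look roughly like this,
assuming the 3.x-era __ptep_set_access_flags(ptep, entry) prototype (a sketch of the
idea, not the code that was benchmarked):

	/* step 3: plain read, _PAGE_BUSY is not taken */
	pte_t pte = *ptep;
	if (!pte_present(pte))
		return __pte(0);

	/* step 4: fold in accessed/dirty and let the helper do the atomic
	 * (ldarx/stdcx.) update of just those bits */
	pte = pte_mkyoung(pte);
	if (writing && pte_write(pte))
		pte = pte_mkdirty(pte);
	__ptep_set_access_flags(ptep, pte);

Since the helper should only fold the accessed/dirty bits in with an atomic update,
it should not corrupt a concurrent change, but the value read in step 3 can still be
stale by the time it is used, which is exactly the race being asked about here.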
  
 
 Though even then you have the undocumented side effect of locking the PTE on
 certain targets.
 
  +{
  +   pte_t *ptep;
  +   pte_t pte;
  +   unsigned long ps = *pte_sizep;
  +   unsigned int shift;
  +
  +   ptep = find_linux_pte_or_hugepte(pgdir, hva, shift);
  +   if (!ptep)
  +   return __pte(0);
  +   if (shift)
  +   *pte_sizep = 1ul  shift;
  +   else
  +   *pte_sizep = PAGE_SIZE;
  +
  +   if (ps  *pte_sizep)
  +   return __pte(0);
  +
  +   if (!pte_present(*ptep))
  +   return __pte(0);
  +
  +#ifdef CONFIG_PPC64
  +   /* Lock PTE (set _PAGE_BUSY) and read */
  +   pte = pte_read(ptep);
  +#else
  +   pte = pte_val(*ptep);
  +#endif
 
 What about 32-bit platforms that need atomic PTEs?

I called __ptep_set_access_flags() for both 32/64-bit (for 64-bit I was not 
calling pte_read()), which handles atomic updates. Somehow the benchmark results 
were not good; I will try again.

Thanks
-Bharat
 
 -Scott
 



RE: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like booke3s

2013-08-02 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
 Sent: Saturday, August 03, 2013 4:47 AM
 To: Wood Scott-B07421
 Cc: Bhushan Bharat-R65777; ag...@suse.de; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like
 booke3s
 
 On Fri, 2013-08-02 at 17:58 -0500, Scott Wood wrote:
 
  What about 64-bit PTEs on 32-bit kernels?
 
  In any case, this code does not belong in KVM.  It should be in the
  main PPC mm code, even if KVM is the only user.
 
 Also don't we do similar things in BookS KVM? At the very least that stuff
 should become common. And yes, I agree, it should probably also move to 
 pgtable*

One of the problems I saw was that if I put this code in asm/pgtable-32.h and 
asm/pgtable-64.h then pte_present() and other friend functions (on which this 
code depends) are defined in pgtable.h. And pgtable.h includes asm/pgtable-32.h 
and asm/pgtable-64.h before it defines pte_present() and friends.

Ok, I will move this into asm/pgtable*.h; initially I fought with myself over taking 
this code into pgtable*, but finally ended up doing it here (got biased by book3s :)).

Thanks
-Bharat

 
 Cheers,
 Ben.
 
 


RE: [PATCH 6/6 v2] kvm: powerpc: use caching attributes as per linux pte

2013-08-02 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Saturday, August 03, 2013 5:05 AM
 To: Bhushan Bharat-R65777
 Cc: b...@kernel.crashing.org; ag...@suse.de; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 6/6 v2] kvm: powerpc: use caching attributes as per linux
 pte
 
 On Thu, Aug 01, 2013 at 04:42:38PM +0530, Bharat Bhushan wrote:
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  17722d8..eb2 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -697,7 +697,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run,
  struct kvm_vcpu *vcpu)  #endif
 
  kvmppc_fix_ee_before_entry();
  -
  +   vcpu->arch.pgdir = current->mm->pgd;
  ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
 kvmppc_fix_ee_before_entry() is supposed to be the last thing that happens
 before __kvmppc_vcpu_run().
 
  @@ -332,6 +324,8 @@ static inline int kvmppc_e500_shadow_map(struct
 kvmppc_vcpu_e500 *vcpu_e500,
  unsigned long hva;
  int pfnmap = 0;
  int tsize = BOOK3E_PAGESZ_4K;
  +   pte_t pte;
  +   int wimg = 0;
 
  /*
   * Translate guest physical to true physical, acquiring @@ -437,6
  +431,8 @@ static inline int kvmppc_e500_shadow_map(struct
  kvmppc_vcpu_e500 *vcpu_e500,
 
  if (likely(!pfnmap)) {
  unsigned long tsize_pages = 1 << (tsize + 10 - PAGE_SHIFT);
  +   pgd_t *pgdir;
  +
  pfn = gfn_to_pfn_memslot(slot, gfn);
  if (is_error_noslot_pfn(pfn)) {
  printk(KERN_ERR "Couldn't get real page for gfn 
  %lx!\n", @@
 -447,9
  +443,18 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500
 *vcpu_e500,
  /* Align guest and physical address to page map boundaries */
  pfn &= ~(tsize_pages - 1);
  gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1);
  +   pgdir = vcpu_e500->vcpu.arch.pgdir;
  +   pte = lookup_linux_pte(pgdir, hva, 1, &tsize_pages);
  +   if (pte_present(pte)) {
  +   wimg = (pte >> PTE_WIMGE_SHIFT) & MAS2_WIMGE_MASK;
  +   } else {
  +   printk(KERN_ERR "pte not present: gfn %lx, pfn %lx\n",
  +   (long)gfn, pfn);
  +   return -EINVAL;
  +   }
  }
 
 How does wimg get set in the pfnmap case?

A pfnmap is not kernel-managed pages, right? So should we set I+G there?

 
 Could you explain why we need to set dirty/referenced on the PTE, when we 
 didn't
 need to do that before? All we're getting from the PTE is wimg.
 We have MMU notifiers to take care of the page being unmapped, and we've 
 already
 marked the page itself as dirty if the TLB entry is writeable.

I pulled this code from book3s.

Ben, can you describe why we need this on book3s ?

Thanks
-Bharat
 
 -Scott

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-30 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
 Sent: Saturday, July 27, 2013 3:57 AM
 To: Bhushan Bharat-R65777
 Cc: Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
 d...@lists.ozlabs.org; Wood Scott-B07421
 Subject: Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages
 
 On Fri, 2013-07-26 at 15:03 +, Bhushan Bharat-R65777 wrote:
  Will not searching the Linux PTE is a overkill?
 
 That's the best approach. Also we are searching it already to resolve the page
 fault. That does mean we search twice but on the other hand that also means 
 it's
 hot in the cache.


Below is an early git diff (not a proper cleanup patch), to be sure that this is 
what we want on PowerPC and to get early feedback. I also ran some benchmarks to 
understand the overhead, if any. 

Using kvm_is_mmio_pfn(), which is what the current patch does:

Real: 0m46.616s + 0m49.517s + 0m49.510s + 0m46.936s + 0m46.889s + 0m46.684s = Avg: 47.692s
User: 0m31.636s + 0m31.816s + 0m31.456s + 0m31.752s + 0m32.028s + 0m31.848s = Avg: 31.756s
Sys:  0m11.596s + 0m11.868s + 0m12.244s + 0m11.672s + 0m11.356s + 0m11.432s = Avg: 11.695s


Using the kernel page table search (changes below):

Real: 0m46.431s + 0m50.269s + 0m46.724s + 0m46.645s + 0m46.670s + 0m50.259s = Avg: 47.833s
User: 0m31.568s + 0m31.816s + 0m31.444s + 0m31.808s + 0m31.312s + 0m31.740s = Avg: 31.614s
Sys:  0m11.516s + 0m12.060s + 0m11.872s + 0m11.476s + 0m12.000s + 0m12.152s = Avg: 11.846s

--
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 3328353..d6d0dac 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -532,6 +532,7 @@ struct kvm_vcpu_arch {
u32 epr;
u32 crit_save;
struct kvmppc_booke_debug_reg dbg_reg;
+   pgd_t *pgdir;
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..eb2 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -697,7 +697,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 #endif
 
kvmppc_fix_ee_before_entry();
-
+   vcpu->arch.pgdir = current->mm->pgd;
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
/* No need for kvm_guest_exit. It's done in handle_exit.
diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 4fd9650..fc4b2f6 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -31,11 +31,13 @@ enum vcpu_ftr {
 #define E500_TLB_NUM   2
 
 /* entry is mapped somewhere in host TLB */
-#define E500_TLB_VALID (1 << 0)
+#define E500_TLB_VALID (1 << 31)
 /* TLB1 entry is mapped by host TLB1, tracked by bitmaps */
-#define E500_TLB_BITMAP (1 << 1)
+#define E500_TLB_BITMAP (1 << 30)
 /* TLB1 entry is mapped by host TLB0 */
-#define E500_TLB_TLB0  (1 << 2)
+#define E500_TLB_TLB0  (1 << 29)
+/* Lower 5 bits have WIMGE value */
+#define E500_TLB_WIMGE_MASK (0x1f)
 
 struct tlbe_ref {
pfn_t pfn;  /* valid only for TLB0, except briefly */
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 5cbdc8f..a48c13f 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -40,6 +40,84 @@
 
 static struct kvmppc_e500_tlb_params host_tlb_params[E500_TLB_NUM];
 
+/*
+ * find_linux_pte returns the address of a linux pte for a given
+ * effective address and directory.  If not found, it returns zero.
+ */
+static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea)
+{
+pgd_t *pg;
+pud_t *pu;
+pmd_t *pm;
+pte_t *pt = NULL;
+
+pg = pgdir + pgd_index(ea);
+if (!pgd_none(*pg)) {
+pu = pud_offset(pg, ea);
+if (!pud_none(*pu)) {
+pm = pmd_offset(pu, ea);
+if (pmd_present(*pm))
+pt = pte_offset_kernel(pm, ea);
+}
+}
+return pt;
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
+ unsigned *shift);
+#else
+static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
+   unsigned *shift)
+{
+if (shift)
+*shift = 0;
+return find_linux_pte(pgdir, ea);
+}
+#endif /* !CONFIG_HUGETLB_PAGE */
+
+/*
+ * Lock and read a linux PTE.  If it's present and writable, atomically
+ * set dirty and referenced bits and return the PTE, otherwise return 0.
+ */
+static inline pte_t kvmppc_read_update_linux_pte(pte_t *p, int writing)
+{
+   pte_t pte = pte_val(*p);
+
+   if (pte_present(pte)) {
+   pte = pte_mkyoung
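
The e500.h hunk above frees the low 5 bits of ref->flags for the WIMGE value by moving
VALID/BITMAP/TLB0 to the high bits. A hypothetical helper (not part of the diff) makes
the intended packing explicit:

	/* illustrative only: keep the PTE's WIMGE in the low 5 bits of ref->flags */
	static inline void tlbe_ref_set_wimge(struct tlbe_ref *ref, u32 wimge)
	{
		ref->flags = (ref->flags & ~(u32)E500_TLB_WIMGE_MASK) |
			     (wimge & E500_TLB_WIMGE_MASK);
	}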

RE: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-30 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Wednesday, July 31, 2013 12:19 AM
 To: Bhushan Bharat-R65777
 Cc: Benjamin Herrenschmidt; Alexander Graf; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Wood Scott-B07421
 Subject: Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages
 
 On 07/30/2013 11:22:54 AM, Bhushan Bharat-R65777 wrote:
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 5cbdc8f..a48c13f 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -40,6 +40,84 @@
 
   static struct kvmppc_e500_tlb_params host_tlb_params[E500_TLB_NUM];
 
  +/*
  + * find_linux_pte returns the address of a linux pte for a given
  + * effective address and directory.  If not found, it returns zero.
  + */
  +static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea) {
  +pgd_t *pg;
  +pud_t *pu;
  +pmd_t *pm;
  +pte_t *pt = NULL;
  +
  +pg = pgdir + pgd_index(ea);
  +if (!pgd_none(*pg)) {
  +pu = pud_offset(pg, ea);
  +if (!pud_none(*pu)) {
  +pm = pmd_offset(pu, ea);
  +if (pmd_present(*pm))
  +pt = pte_offset_kernel(pm, ea);
  +}
  +}
  +return pt;
  +}
 
 How is this specific to KVM or e500?
 
  +#ifdef CONFIG_HUGETLB_PAGE
  +pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
  + unsigned *shift); #else static
  +inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir,
  unsigned long ea,
  +   unsigned *shift) {
  +if (shift)
  +*shift = 0;
  +return find_linux_pte(pgdir, ea); } #endif /*
  +!CONFIG_HUGETLB_PAGE */
 
 This is already declared in asm/pgtable.h.  If we need a non-hugepage
 alternative, that should also go in asm/pgtable.h.
 
  +/*
  + * Lock and read a linux PTE.  If it's present and writable,
  atomically
  + * set dirty and referenced bits and return the PTE, otherwise
  return 0.
  + */
  +static inline pte_t kvmppc_read_update_linux_pte(pte_t *p, int
  writing)
  +{
  +   pte_t pte = pte_val(*p);
  +
  +   if (pte_present(pte)) {
  +   pte = pte_mkyoung(pte);
  +   if (writing  pte_write(pte))
  +   pte = pte_mkdirty(pte);
  +   }
  +
  +   *p = pte;
  +
  +   return pte;
  +}
  +
  +static pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
  + int writing, unsigned long *pte_sizep) {
  +   pte_t *ptep;
  +   unsigned long ps = *pte_sizep;
  +   unsigned int shift;
  +
  +   ptep = find_linux_pte_or_hugepte(pgdir, hva, shift);
  +   if (!ptep)
  +   return __pte(0);
  +   if (shift)
  +   *pte_sizep = 1ul  shift;
  +   else
  +   *pte_sizep = PAGE_SIZE;
  +
  +   if (ps  *pte_sizep)
  +   return __pte(0);
  +   if (!pte_present(*ptep))
  +   return __pte(0);
  +
  +   return kvmppc_read_update_linux_pte(ptep, writing); }
  +
 
 None of this belongs in this file either.
 
  @@ -326,8 +405,8 @@ static void kvmppc_e500_setup_stlbe(
 
  /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) |
  MAS1_VALID;
  -   stlbe->mas2 = (gvaddr & MAS2_EPN) |
  - e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  +   stlbe->mas2 = (gvaddr & MAS2_EPN) | (ref->flags &
  E500_TLB_WIMGE_MASK);
  +//   e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
 
 MAS2_E and MAS2_G should be safe to come from the guest.

This is handled when setting WIMGE in ref->flags.

 
 How does this work for TLB1?  One ref corresponds to one guest entry, which 
 may
 correspond to multiple host entries, potentially each with different WIM
 settings.

Yes, one ref corresponds to one guest entry. To understand how this works when 
one guest TLB1 entry may map to many host TLB0/1 entries: 
on a guest tlbwe, KVM sets up one guest TLB entry and then pre-maps one host TLB 
entry (out of many), and the ref (ref->pfn etc.) points to this pre-mapped entry for 
that guest entry.
Now a guest TLB miss happens which falls on the same guest TLB entry but 
demands another host TLB entry. In that flow we overwrite the ref (ref->pfn 
etc.) to point to the new host mapping for the same guest mapping.

 
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
  @@ -346,6 +425,8 @@ static inline int kvmppc_e500_shadow_map(struct
  kvmppc_vcpu_e500 *vcpu_e500,
  unsigned long hva;
  int pfnmap = 0;
  int tsize = BOOK3E_PAGESZ_4K;
  +   pte_t pte;
  +   int wimg = 0;
 
  /*
   * Translate guest physical to true physical, acquiring

RE: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org]
 Sent: Friday, July 26, 2013 1:57 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; 
 linuxppc-...@lists.ozlabs.org;
 ag...@suse.de; Wood Scott-B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages
 
 On Fri, 2013-07-26 at 11:16 +0530, Bharat Bhushan wrote:
  If the page is RAM then map this as cacheable and coherent (set M
  bit) otherwise this page is treated as I/O and map this as cache
  inhibited and guarded (set  I + G)
 
  This helps setting proper MMU mapping for direct assigned device.
 
  NOTE: There can be devices that require cacheable mapping, which is not yet
 supported.
 
 Why don't you do like server instead and enforce the use of the same I and M
 bits as the corresponding qemu PTE ?

Ben/Alex, I will look into the code. Can you please describe how this is 
handled on server?

Thanks
-Bharat

 
 Cheers,
 Ben.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
   arch/powerpc/kvm/e500_mmu_host.c |   24 +++-
   1 files changed, 19 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..5cbdc8f 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,27 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
  return mas3;
   }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
   {
  +   u32 mas2_attr;
  +
  +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +   if (kvm_is_mmio_pfn(pfn)) {
  +   /*
  +* If page is not RAM then it is treated as I/O page.
  +* Map it with cache inhibited and guarded (set I + G).
  +*/
  +   mas2_attr |= MAS2_I | MAS2_G;
  +   return mas2_attr;
  +   }
  +
  +   /* Map RAM pages as cacheable (Not setting I in MAS2) */
   #ifdef CONFIG_SMP
  -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
  -#else
  -   return mas2 & MAS2_ATTRIB_MASK;
  +   /* Also map as coherent (set M) in SMP */
  +   mas2_attr |= MAS2_M;
   #endif
  +   return mas2_attr;
   }
 
   /*
  @@ -313,7 +327,7 @@ static void kvmppc_e500_setup_stlbe(
  /* Force IPROT=0 for all guest mappings. */
  stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
  stlbe->mas2 = (gvaddr & MAS2_EPN) |
  - e500_shadow_mas2_attrib(gtlbe->mas2, pr);
  + e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
  stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 
 



RE: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Friday, July 26, 2013 2:20 PM
 To: Benjamin Herrenschmidt
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
 linuxppc-...@lists.ozlabs.org; Wood Scott-B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages
 
 
 On 26.07.2013, at 10:26, Benjamin Herrenschmidt wrote:
 
  On Fri, 2013-07-26 at 11:16 +0530, Bharat Bhushan wrote:
  If the page is RAM then map this as cacheable and coherent (set M
  bit) otherwise this page is treated as I/O and map this as cache
  inhibited and guarded (set  I + G)
 
  This helps setting proper MMU mapping for direct assigned device.
 
  NOTE: There can be devices that require cacheable mapping, which is not yet
 supported.
 
  Why don't you do like server instead and enforce the use of the same I
  and M bits as the corresponding qemu PTE ?
 
 Specifically, Ben is talking about this code:
 
 
 /* Translate to host virtual address */
 hva = __gfn_to_hva_memslot(memslot, gfn);
 
 /* Look up the Linux PTE for the backing page */
  pte_size = psize;
  pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
  if (pte_present(pte)) {
  if (writing && !pte_write(pte))
  /* make the actual HPTE be read-only */
  ptel = hpte_make_readonly(ptel);
  is_io = hpte_cache_bits(pte_val(pte));
  pa = pte_pfn(pte) << PAGE_SHIFT;
 }
 

Ok

Thanks
-Bharat


 
 Alex
 




RE: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Friday, July 26, 2013 2:20 PM
 To: Benjamin Herrenschmidt
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
 linuxppc-...@lists.ozlabs.org; Wood Scott-B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages
 
 
 On 26.07.2013, at 10:26, Benjamin Herrenschmidt wrote:
 
  On Fri, 2013-07-26 at 11:16 +0530, Bharat Bhushan wrote:
  If the page is RAM then map this as cacheable and coherent (set M
  bit) otherwise this page is treated as I/O and map this as cache
  inhibited and guarded (set  I + G)
 
  This helps setting proper MMU mapping for direct assigned device.
 
  NOTE: There can be devices that require cacheable mapping, which is not yet
 supported.
 
  Why don't you do like server instead and enforce the use of the same I
  and M bits as the corresponding qemu PTE ?
 
 Specifically, Ben is talking about this code:
 
 
 /* Translate to host virtual address */
 hva = __gfn_to_hva_memslot(memslot, gfn);
 
 /* Look up the Linux PTE for the backing page */
  pte_size = psize;
  pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
  if (pte_present(pte)) {
  if (writing && !pte_write(pte))
  /* make the actual HPTE be read-only */
  ptel = hpte_make_readonly(ptel);
  is_io = hpte_cache_bits(pte_val(pte));
  pa = pte_pfn(pte) << PAGE_SHIFT;
 }
 

Will searching the Linux PTE not be overkill?

=Bharat





RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Wednesday, July 24, 2013 1:55 PM
 To: “tiejun.chen”
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org list;
 Wood Scott-B07421; Gleb Natapov; Paolo Bonzini
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 24.07.2013, at 04:26, “tiejun.chen” wrote:
 
  On 07/18/2013 06:27 PM, Alexander Graf wrote:
 
  On 18.07.2013, at 12:19, “tiejun.chen” wrote:
 
  On 07/18/2013 06:12 PM, Alexander Graf wrote:
 
  On 18.07.2013, at 12:08, “tiejun.chen” wrote:
 
  On 07/18/2013 05:48 PM, Alexander Graf wrote:
 
  On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Bhushan Bharat-R65777
  Sent: Thursday, July 18, 2013 1:53 PM
  To: ' tiejun.chen '
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  ag...@suse.de; Wood Scott-
  B07421
  Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only
  for kernel managed pages
 
 
 
  -Original Message-
  From:  tiejun.chen  [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 1:52 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
  only for kernel managed pages
 
  On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of  tiejun.chen 
  
  Sent: Thursday, July 18, 2013 1:01 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
  only for kernel managed pages
 
  On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From:  tiejun.chen  [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  ag...@suse.de; Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
  only for kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then
  it's normal DDR and the mapping sets M bit (coherent,
  cacheable) else this is treated as I/O and we set  I +
  G  (cache inhibited,
  guarded)
 
  This helps setting proper TLB mapping for direct assigned
  device
 
  Signed-off-by: Bharat Bhushan
  bharat.bhus...@freescale.com
  ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32
  e500_shadow_mas3_attrib(u32 mas3, int
  usermode)
 return mas3;
 }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
  usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2,
  +pfn_t pfn)
 {
  +  u32 mas2_attr;
  +
  +  mas2_attr = mas2  MAS2_ATTRIB_MASK;
  +
  +  if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct
  me) is that it
  returns false when the page is managed by kernel and is
  not marked as RESERVED (for some reason). For us it does not
  matter whether the page is reserved or not, if it is kernel
  visible page then it
  is DDR.
 
 
  I think you are setting I|G by addressing all mmio pages,
  right? If so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by
  mmio pages rather than
   normal RAM.  In some cases it is not enough to identify
  these mmio
  pages
   by pfn_valid().  This patch adds checking the PageReserved as
 well.
 
  Do you know what are those some cases and how checking
  PageReserved helps in
  those cases?
 
  No, myself didn't see these actual cases in qemu,too. But this
  should be chronically persistent as I understand ;-)
 
  Then I will wait till someone educate me :)
 
  The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do
 not want to call this for all tlbwe operation unless it is necessary.
 
  It certainly does more than we need and potentially slows down the fast
 path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to
 check for pages that are declared reserved on the host. This happens in 2 
 cases:
 
1) Non cache coherent DMA
2) Memory hot remove
 
  The non coherent DMA case would be interesting, as with the mechanism 
  as
 it is in place in Linux today, we could potentially break normal guest 
 operation
 if we don't take it into account. However, it's Kconfig guarded

RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-23 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Thursday, July 18, 2013 10:48 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Bhushan 
 Bharat-
 R65777
 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's normal
  RAM and the mapping is set to M bit (coherent, cacheable) otherwise
  this is treated as I/O and we set  I + G  (cache inhibited, guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v2: some cleanup and added comment
   -
   arch/powerpc/kvm/e500_mmu_host.c |   23 ++-
   1 files changed, 18 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..02eb973 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32
  mas3, int usermode)
  return mas3;
   }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
   {
  +   u32 mas2_attr;
  +
  +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +   /*
  +* RAM is always mappable on e500 systems, so this is identical
  +* to kvm_is_mmio_pfn(), just without its overhead.
  +*/
  +   if (!pfn_valid(pfn)) {
 
 Please use page_is_ram(), which is what gets used when setting the WIMG for 
 the
 host userspace mapping.  We want to make sure the two are consistent.
 
  +   /* Pages not managed by Kernel are treated as I/O, set
  I + G */
  +   mas2_attr |= MAS2_I | MAS2_G;
   #ifdef CONFIG_SMP
  -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
  -#else
  -   return mas2 & MAS2_ATTRIB_MASK;
  +   } else {
  +   /* Kernel managed pages are actually RAM so set  M */
  +   mas2_attr |= MAS2_M;
   #endif
 
 Likewise, we want to make sure this matches the host entry.
 Unfortunately, this is a bit of a mess already.  64-bit booke appears to 
 always
 set MAS2_M for TLB0 mappings.

Scott, can you please point to the code where MAS2_M is always set for TLB0?

-Bharat

  The initial KERNELBASE mapping on boot uses
 M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT.  
 32-
 bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping.
 _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU 
 (the
 latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
 
 As for what we actually want to happen, there are cases when we want M to be 
 set
 for non-SMP.  One such case is AMP, where CPUs may be sharing memory even if 
 the
 Linux instance only runs on one CPU (this is not hypothetical, BTW).  It's 
 also
 possible that we encounter a hardware bug that requires MAS2_M, similar to 
 what
 some of our non-booke chips require.
 
 -Scott
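
For illustration, a sketch of what the attribute helper might look like if it keyed off
page_is_ram() as suggested above, so the shadow WIMG matches the host mapping (this is
not the committed code, and it keeps the existing CONFIG_SMP guard, i.e. it ignores the
AMP concern raised at the end):

	static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
	{
		u32 mas2_attr = mas2 & MAS2_ATTRIB_MASK;

		if (!page_is_ram(pfn)) {
			/* not RAM: treat as I/O, cache-inhibited + guarded */
			return mas2_attr | MAS2_I | MAS2_G;
		}
	#ifdef CONFIG_SMP
		mas2_attr |= MAS2_M;	/* RAM: cacheable and coherent on SMP */
	#endif
		return mas2_attr;
	}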



RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-23 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, July 23, 2013 10:15 PM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org
 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/22/2013 10:39:16 PM, Bhushan Bharat-R65777 wrote:
 
 
   -Original Message-
   From: Wood Scott-B07421
   Sent: Tuesday, July 23, 2013 12:18 AM
   To: Bhushan Bharat-R65777
   Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org;
   k...@vger.kernel.org
   Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only
  for kernel
   managed pages
  
   On 07/21/2013 11:39:45 PM, Bhushan Bharat-R65777 wrote:
   
   
 -Original Message-
 From: Wood Scott-B07421
 Sent: Thursday, July 18, 2013 11:09 PM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org;
k...@vger.kernel.org; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency
  only
for kernel
 managed pages

 On 07/18/2013 12:32:18 PM, Alexander Graf wrote:
 
  On 18.07.2013, at 19:17, Scott Wood wrote:
 
   On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
   Likewise, we want to make sure this matches the host entry.
  Unfortunately, this is a bit of a mess already.  64-bit booke
appears
  to always set MAS2_M for TLB0 mappings.  The initial
  KERNELBASE
  mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC)
  replaces it uses _PAGE_COHERENT.  32-bit always uses
_PAGE_COHERENT,
  except that initial KERNELBASE mapping.  _PAGE_COHERENT
  appears
to be
  set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter
  config
  clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
  
   As for what we actually want to happen, there are cases
  when we
  want M to be set for non-SMP.  One such case is AMP, where
  CPUs
may be
  sharing memory even if the Linux instance only runs on one CPU
(this
  is not hypothetical, BTW).  It's also possible that we
  encounter a
  hardware bug that requires MAS2_M, similar to what some of our
  non-booke chips require.
 
  How about we always set M then for RAM?

 M is like I in that bad things happen if you mix them.
   
I am trying to list the invalid mixing of WIMG:
   
  1) I & M
  2) W & I
  3) W & M (Scott mentioned that he observed issues when mixing these two)
  4) is there any other?
  
   That's not what I was talking about (and I don't think I mentioned
  W at all,
   though it is also potentially problematic).
 
  Here is cut paste of your one response:
  The architecture makes it illegal to mix cacheable and
  cache-inhibited mappings to the same physical page.  Mixing W or M
  bits is generally bad as well.  I've seen it cause machine checks,
  error interrupts, etc.
  -- not just corrupting the page in question.
 
  So I added not mixing W & M. But at that time I did not understand
  why mixing M & I for the same physical address can be an issue :).
 
 W or M, not W and M.  I meant that each one, separately, is in a similar
 situation as the I bit.
 
 None of this is about invalid combinations of attributes on a single TLB entry
 (though there are architectural restrictions there as well).

Ok, I misread again :(. The second part of the comment was (it looks like you
missed it, so I have copy-pasted it below):

 
When we say all RAM (where page_is_ram() is true) will have the M bit, then the
same RAM physical address will not have M mixed with anything else, right?

Similarly, for I/O (which is not RAM) we will set I + G, so I will not be
mixed with M. Isn't that so?


-Bharat
 
 -Scott

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-22 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, July 23, 2013 12:18 AM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org
 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/21/2013 11:39:45 PM, Bhushan Bharat-R65777 wrote:
 
 
   -Original Message-
   From: Wood Scott-B07421
   Sent: Thursday, July 18, 2013 11:09 PM
   To: Alexander Graf
   Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org;
  k...@vger.kernel.org; Bhushan
   Bharat-R65777
   Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only
  for kernel
   managed pages
  
   On 07/18/2013 12:32:18 PM, Alexander Graf wrote:
   
On 18.07.2013, at 19:17, Scott Wood wrote:
   
 On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
 Likewise, we want to make sure this matches the host entry.
Unfortunately, this is a bit of a mess already.  64-bit booke
  appears
to always set MAS2_M for TLB0 mappings.  The initial KERNELBASE
mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC)
replaces it uses _PAGE_COHERENT.  32-bit always uses
  _PAGE_COHERENT,
except that initial KERNELBASE mapping.  _PAGE_COHERENT appears
  to be
set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config
clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).

 As for what we actually want to happen, there are cases when we
want M to be set for non-SMP.  One such case is AMP, where CPUs
  may be
sharing memory even if the Linux instance only runs on one CPU
  (this
is not hypothetical, BTW).  It's also possible that we encounter a
hardware bug that requires MAS2_M, similar to what some of our
non-booke chips require.
   
How about we always set M then for RAM?
  
   M is like I in that bad things happen if you mix them.
 
  I am trying to list the invalid mixing of WIMG:
 
   1) I & M
   2) W & I
   3) W & M (Scott mentioned that he observed issues when mixing these two)
   4) is there any other?
 
 That's not what I was talking about (and I don't think I mentioned W at all,
 though it is also potentially problematic).

Here is cut paste of your one response:
The architecture makes it illegal to mix cacheable and cache-inhibited  
mappings to the same physical page.  Mixing W or M bits is generally  
bad as well.  I've seen it cause machine checks, error interrupts, etc.  
-- not just corrupting the page in question.

So I added not mixing W & M. But at that time I did not understand why mixing
M & I for the same physical address can be an issue :).

  I'm talking about mixing I with
 not-I (on two different virtual addresses pointing to the same physical), M 
 with
 not-M, etc.

When we say all RAM (where page_is_ram() is true) will have the M bit, then a RAM
physical address will not have M mixed with anything else, right?

Similarly, for I/O (which is not RAM) we will set I + G, so I will not be
mixed with M. Isn't that so?

-Bharat

 
So we really want to
   match exactly what the rest of the kernel is doing.
 
  How the rest of the kernel does it is a bit complex. IIUC, if we forget
  about the boot state then this is how the kernel sets the WIMG bits:
   1) For memory, always set M if CONFIG_SMP is set.
  - So KVM can do the same. M will not be mixed with W or I. G and E
  are guest controlled.
 
 I don't think this is accurate for 64-bit.  And what about the AMP case?
 
 -Scott

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-21 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Thursday, July 18, 2013 11:09 PM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; 
 Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 12:32:18 PM, Alexander Graf wrote:
 
  On 18.07.2013, at 19:17, Scott Wood wrote:
 
   On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote:
   Likewise, we want to make sure this matches the host entry.
  Unfortunately, this is a bit of a mess already.  64-bit booke appears
  to always set MAS2_M for TLB0 mappings.  The initial KERNELBASE
  mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC)
  replaces it uses _PAGE_COHERENT.  32-bit always uses _PAGE_COHERENT,
  except that initial KERNELBASE mapping.  _PAGE_COHERENT appears to be
  set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config
  clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case).
  
   As for what we actually want to happen, there are cases when we
  want M to be set for non-SMP.  One such case is AMP, where CPUs may be
  sharing memory even if the Linux instance only runs on one CPU (this
  is not hypothetical, BTW).  It's also possible that we encounter a
  hardware bug that requires MAS2_M, similar to what some of our
  non-booke chips require.
 
  How about we always set M then for RAM?
 
 M is like I in that bad things happen if you mix them.

I am trying to list the invalid mixing of WIMG:

 1) I & M
 2) W & I
 3) W & M (Scott mentioned that he observed issues when mixing these two)
 4) is there any other?

So it means it is safe to let the guest control G and E.

  So we really want to
 match exactly what the rest of the kernel is doing.

How the rest of the kernel does it is a bit complex. IIUC, if we forget about the
boot state then this is how the kernel sets the WIMG bits:
 1) For memory, always set M if CONFIG_SMP is set.
- So KVM can do the same. M will not be mixed with W or I. G and E
are guest controlled.
 2) For I/O, drivers can pass flags to set M or I + G.
- For KVM, if it is not memory then it is I/O. For now we can always set
I + G.
- Later we can design some mechanism in the VFIO interface to let KVM
know whether to set M or I + G.

-Bharat

 
 Plus, the performance penalty on some single-core chips can be pretty bad.
 
 -Scott
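
Putting Bharat's two rules above together, a minimal sketch of the proposed
policy (hypothetical code, not a posted patch; the VFIO mechanism for choosing
M vs. I + G for assigned devices is left open, as in the mail):

static inline u32 e500_shadow_mas2_attrib(u32 gmas2, pfn_t pfn)
{
	/* Bits we let the guest control: G and E */
	u32 mas2 = gmas2 & (MAS2_G | MAS2_E);

	if (page_is_ram(pfn)) {
		/* Rule 1: memory is mapped coherent when the host needs it */
#ifdef CONFIG_SMP
		mas2 |= MAS2_M;
#endif
	} else {
		/* Rule 2: everything else is I/O: cache-inhibited + guarded */
		mas2 |= MAS2_I | MAS2_G;
	}
	return mas2;
}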

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's normal
  DDR and the mapping sets the M bit (coherent, cacheable); else this is
  treated as I/O and we set I + G (cache inhibited, guarded).
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
arch/powerpc/kvm/e500_mmu_host.c |   17 -
1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
  return mas3;
}
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
{
  +   u32 mas2_attr;
  +
   +   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
  +
  +   if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?

What I understand from this function (someone can correct me) is that it 
returns false when the page is managed by kernel and is not marked as 
RESERVED (for some reason). For us it does not matter whether the page is 
reserved or not, if it is kernel visible page then it is DDR.

-Bharat

 
 Tiejun
 
  +   mas2_attr |= MAS2_I | MAS2_G;
  +   } else {
#ifdef CONFIG_SMP
   -   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
   -#else
   -   return mas2 & MAS2_ATTRIB_MASK;
  +   mas2_attr |= MAS2_M;
#endif
  +   }
  +   return mas2_attr;
}
 
/*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
  /* Force IPROT=0 for all guest mappings. */
   stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
   stlbe->mas2 = (gvaddr & MAS2_EPN) |
   - e500_shadow_mas2_attrib(gtlbe->mas2, pr);
   + e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
   stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
   e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
 
 


RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of “tiejun.chen”
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's normal
  DDR and the mapping sets M bit (coherent, cacheable) else this is
  treated as I/O and we set  I + G  (cache inhibited, guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32
  mas3, int
  usermode)
return mas3;
 }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
  + u32 mas2_attr;
  +
  + mas2_attr = mas2  MAS2_ATTRIB_MASK;
  +
  + if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is that it
 returns false when the page is managed by kernel and is not marked as 
 RESERVED
 (for some reason). For us it does not matter whether the page is reserved or
 not, if it is kernel visible page then it is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages, right? If so,
 
  KVM: direct mmio pfn check
 
  Userspace may specify memory slots that are backed by mmio pages rather
 than
  normal RAM.  In some cases it is not enough to identify these mmio pages
  by pfn_valid().  This patch adds checking the PageReserved as well.

Do you know what those cases are and how checking PageReserved helps in
those cases?

-Bharat

 
 Tiejun
 
  -Bharat
 
 
  Tiejun
 
  + mas2_attr |= MAS2_I | MAS2_G;
  + } else {
 #ifdef CONFIG_SMP
  - return (mas2  MAS2_ATTRIB_MASK) | MAS2_M;
  -#else
  - return mas2  MAS2_ATTRIB_MASK;
  + mas2_attr |= MAS2_M;
 #endif
  + }
  + return mas2_attr;
 }
 
 /*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | 
  MAS1_VALID;
stlbe-mas2 = (gvaddr  MAS2_EPN) |
  -   e500_shadow_mas2_attrib(gtlbe-mas2, pr);
  +   e500_shadow_mas2_attrib(gtlbe-mas2, pfn);
stlbe-mas7_3 = ((u64)pfn  PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe-mas7_3, pr);
 
 
 
 
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body
 of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
  Sent: Thursday, July 18, 2013 1:01 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's
  normal DDR and the mapping sets M bit (coherent, cacheable) else
  this is treated as I/O and we set  I + G  (cache inhibited,
  guarded)
 
  This helps setting proper TLB mapping for direct assigned device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/e500_mmu_host.c |   17 -
  1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32
  mas3, int
  usermode)
  return mas3;
  }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
  {
  +   u32 mas2_attr;
  +
  +   mas2_attr = mas2  MAS2_ATTRIB_MASK;
  +
  +   if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is
  that it
  returns false when the page is managed by kernel and is not marked
  as RESERVED (for some reason). For us it does not matter whether the
  page is reserved or not, if it is kernel visible page then it is DDR.
 
 
  I think you are setting I|G by addressing all mmio pages, right? If
  so,
 
KVM: direct mmio pfn check
 
Userspace may specify memory slots that are backed by mmio
  pages rather than
normal RAM.  In some cases it is not enough to identify these mmio
 pages
by pfn_valid().  This patch adds checking the PageReserved as well.
 
  Do you know what are those some cases and how checking PageReserved helps 
  in
 those cases?
 
  No, I didn't see these actual cases in QEMU myself either. But as I
  understand it, the check is there to stay ;-)

Then I will wait till someone educates me :)

-Bharat

 
 Tiejun
 
 
  -Bharat
 
 
  Tiejun
 
  -Bharat
 
 
  Tiejun
 
  +   mas2_attr |= MAS2_I | MAS2_G;
  +   } else {
  #ifdef CONFIG_SMP
  -   return (mas2  MAS2_ATTRIB_MASK) | MAS2_M;
  -#else
  -   return mas2  MAS2_ATTRIB_MASK;
  +   mas2_attr |= MAS2_M;
  #endif
  +   }
  +   return mas2_attr;
  }
 
  /*
  @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
  /* Force IPROT=0 for all guest mappings. */
  stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | 
  MAS1_VALID;
  stlbe-mas2 = (gvaddr  MAS2_EPN) |
  - e500_shadow_mas2_attrib(gtlbe-mas2, pr);
  + e500_shadow_mas2_attrib(gtlbe-mas2, pfn);
  stlbe-mas7_3 = ((u64)pfn  PAGE_SHIFT) |
  e500_shadow_mas3_attrib(gtlbe-mas7_3, pr);
 
 
 
 
 
  --
  To unsubscribe from this list: send the line unsubscribe kvm-ppc in
  the body of a message to majord...@vger.kernel.org More majordomo
  info at http://vger.kernel.org/majordomo-info.html
 
 



RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: '“tiejun.chen”'
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 1:52 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
  
  
   -Original Message-
   From: kvm-ppc-ow...@vger.kernel.org
   [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
   Sent: Thursday, July 18, 2013 1:01 PM
   To: Bhushan Bharat-R65777
   Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
   Wood
   Scott-
   B07421
   Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
   kernel managed pages
  
   On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
  
  
   -Original Message-
   From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
   Sent: Thursday, July 18, 2013 11:56 AM
   To: Bhushan Bharat-R65777
   Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
   Wood
   Scott- B07421; Bhushan Bharat-R65777
   Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
   for kernel managed pages
  
   On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
   If there is a struct page for the requested mapping then it's
   normal DDR and the mapping sets M bit (coherent, cacheable)
   else this is treated as I/O and we set  I + G  (cache
   inhibited,
   guarded)
  
   This helps setting proper TLB mapping for direct assigned device
  
   Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
   ---
   arch/powerpc/kvm/e500_mmu_host.c |   17 -
   1 files changed, 12 insertions(+), 5 deletions(-)
  
   diff --git a/arch/powerpc/kvm/e500_mmu_host.c
   b/arch/powerpc/kvm/e500_mmu_host.c
   index 1c6a9d7..089c227 100644
   --- a/arch/powerpc/kvm/e500_mmu_host.c
   +++ b/arch/powerpc/kvm/e500_mmu_host.c
   @@ -64,13 +64,20 @@ static inline u32
   e500_shadow_mas3_attrib(u32 mas3, int
   usermode)
 return mas3;
   }
  
   -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
   usermode)
   +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
   {
   + u32 mas2_attr;
   +
   + mas2_attr = mas2  MAS2_ATTRIB_MASK;
   +
   + if (!pfn_valid(pfn)) {
  
   Why not directly use kvm_is_mmio_pfn()?
  
   What I understand from this function (someone can correct me) is
   that it
   returns false when the page is managed by kernel and is not
   marked as RESERVED (for some reason). For us it does not matter
   whether the page is reserved or not, if it is kernel visible page then it
 is DDR.
  
  
   I think you are setting I|G by addressing all mmio pages, right? If
   so,
  
 KVM: direct mmio pfn check
  
 Userspace may specify memory slots that are backed by mmio
   pages rather than
 normal RAM.  In some cases it is not enough to identify these
   mmio
  pages
 by pfn_valid().  This patch adds checking the PageReserved as well.
  
   Do you know what are those some cases and how checking
   PageReserved helps in
  those cases?
 
  No, myself didn't see these actual cases in qemu,too. But this should
  be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)

The reason is that kvm_is_mmio_pfn() looks pretty heavy and I do not want
to call it for every tlbwe operation unless it is necessary.

-Bharat

   + mas2_attr |= MAS2_I | MAS2_G;
   + } else {
   #ifdef CONFIG_SMP
   - return (mas2  MAS2_ATTRIB_MASK) | MAS2_M;
   -#else
   - return mas2  MAS2_ATTRIB_MASK;
   + mas2_attr |= MAS2_M;
   #endif
   + }
   + return mas2_attr;
   }
  
   /*
   @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
 /* Force IPROT=0 for all guest mappings. */
 stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | 
   MAS1_VALID;
 stlbe-mas2 = (gvaddr  MAS2_EPN) |
   -   e500_shadow_mas2_attrib(gtlbe-mas2, pr);
   +   e500_shadow_mas2_attrib(gtlbe-mas2, pfn);
 stlbe-mas7_3 = ((u64)pfn  PAGE_SHIFT) |
 e500_shadow_mas3_attrib(gtlbe-mas7_3, pr);
  
  
  
  
  
   --
   To unsubscribe from this list: send the line unsubscribe kvm-ppc
   in the body of a message to majord...@vger.kernel.org More
   majordomo info at http://vger.kernel.org/majordomo-info.html
  
 


RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, July 18, 2013 3:19 PM
 To: Bhushan Bharat-R65777
 Cc: “tiejun.chen”; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood 
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Bhushan Bharat-R65777
  Sent: Thursday, July 18, 2013 1:53 PM
  To: '“tiejun.chen”'
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood
  Scott-
  B07421
  Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 1:52 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
  kernel managed pages
 
  On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen”
  Sent: Thursday, July 18, 2013 1:01 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
  for kernel managed pages
 
  On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: “tiejun.chen” [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
  for kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then it's
  normal DDR and the mapping sets M bit (coherent, cacheable)
  else this is treated as I/O and we set  I + G  (cache
  inhibited,
  guarded)
 
  This helps setting proper TLB mapping for direct assigned
  device
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32
  e500_shadow_mas3_attrib(u32 mas3, int
  usermode)
   return mas3;
 }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
  usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
  +u32 mas2_attr;
  +
  +mas2_attr = mas2  MAS2_ATTRIB_MASK;
  +
  +if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct me) is
  that it
  returns false when the page is managed by kernel and is not
  marked as RESERVED (for some reason). For us it does not matter
  whether the page is reserved or not, if it is kernel visible page
  then it
  is DDR.
 
 
  I think you are setting I|G by addressing all mmio pages, right?
  If so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by mmio
  pages rather than
   normal RAM.  In some cases it is not enough to identify these
  mmio
  pages
   by pfn_valid().  This patch adds checking the PageReserved as well.
 
  Do you know what are those some cases and how checking
  PageReserved helps in
  those cases?
 
  No, myself didn't see these actual cases in qemu,too. But this
  should be chronically persistent as I understand ;-)
 
  Then I will wait till someone educate me :)
 
  The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not
 want to call this for all tlbwe operation unless it is necessary.
 
 It certainly does more than we need and potentially slows down the fast path
 (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check
 for pages that are declared reserved on the host. This happens in 2 cases:
 
   1) Non cache coherent DMA
   2) Memory hot remove
 
 The non coherent DMA case would be interesting, as with the mechanism as it is
 in place in Linux today, we could potentially break normal guest operation if 
 we
 don't take it into account. However, it's Kconfig guarded by:
 
 depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 default n if PPC_47x
 default y
 
 so we never hit it with any core we care about ;).
 
 Memory hot remove does not exist on e500 FWIW, so we don't have to worry about
 that one either.
 
 Which means I think it's fine
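
Concretely, the check being discussed amounts to something like the sketch
below. The function name is made up and this is not the actual
kvm_is_mmio_pfn() implementation, just the pfn_valid() + PageReserved() logic
Alex describes:

static bool pfn_is_regular_ram(pfn_t pfn)
{
	/* No struct page at all: must be MMIO */
	if (!pfn_valid(pfn))
		return false;

	/*
	 * Reserved pages (e.g. non-coherent DMA pools, memory being
	 * hot-removed) are not treated as normal RAM either.
	 */
	return !PageReserved(pfn_to_page(pfn));
}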

RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2

2013-07-18 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, July 18, 2013 8:50 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
 
 
 On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Thursday, July 18, 2013 8:18 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute
  in mas2
 
 
  This needs a description. Why shouldn't we ignore E?
 
   What I understood is that there is no reason to stop the guest from setting E,
  so allow it.
 
 Please add that to the patch description. Also explain what the bit means.

Ok :)

-Bharat
 
 
 Alex
 


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3/5] booke: define reset and shutdown hcalls

2013-07-17 Thread Bhushan Bharat-R65777
  On 17.07.2013, at 13:00, Gleb Natapov wrote:
 
  On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
  On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
  On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
  On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
  There is no much sense to share hypercalls between 
  architectures.
  There
  is zero probability x86 will implement those for instance
 
  This is similar to the question of whether to keep device
  API enumerations per-architecture...  It costs very little
  to keep it in a common place, and it's hard to go back in
  the other direction if we later realize there are things
  that should be
  shared.
 
  This is different from device API since with device API all
  arches have to create/destroy devices, so it make sense to
  put device lifecycle management into the common code, and
  device API has single entry point to the code - device fd
  ioctl - where it makes sense to handle common tasks, if any,
  and despatch others to specific device implementation.
 
  This is totally unlike hypercalls which are, by definition,
  very architecture specific (the way they are triggered, the
  way parameter are passed from guest to host, what hypercalls
  arch
  needs...).
 
  The ABI is architecture specific.  The API doesn't need to
  be, any more than it does with syscalls (I consider the
  architecture-specific definition of syscall numbers and
  similar constants in Linux to be unfortunate, especially for
  tools such as strace or QEMU's linux-user emulation).
 
  Unlike syscalls different arches have very different ideas
  what hypercalls they need to implement, so while with unified
  syscall space I can see how it may benefit (very) small number
  of tools, I do not see what advantage it will give us. The
  disadvantage is one more global name space to manage.
 
  Keeping it in a common place also makes it more visible to
  people looking to add new hcalls, which could cut down on
  reinventing the wheel.
  I do not want other arches to start using hypercalls in the
  way powerpc started to use them: separate device io space,
  so it is better to hide this as far away from common code as
  possible :) But on a more serious note hypercalls should be
  a last resort and added only when no other possibility
  exists, so people should not look what hcalls others
  implemented, so they can add them to their favorite arch,
  but they should have a problem at hand that they cannot
  solve without hcall, but at this point they will have pretty good
 idea what this hcall should do.
 
  Why are hcalls such a bad thing?
 
  Because they often used to do non architectural things making
  OSes behave different from how they runs on real HW and real
  HW is what OSes are designed and tested for. Example: there
  once was a KVM (XEN have/had similar one) hypercall to
  accelerate MMU
  operation.
  One thing it allowed is to to flush tlb without doing IPI if
  vcpu is not running. Later optimization was added to Linux MMU
  code that _relies_ on those IPIs for synchronisation. Good
  that at that point those hypercalls were already deprecated on
  KVM (IIRC XEN was broke for some time in that regard). Which
  brings me to another point: they often get obsoleted by code
  improvement and HW advancement (happened to aforementioned MMU
  hypercalls), but they hard to deprecate if hypervisor supports
  live migration, without live migration it is less of a problem.
  Next point is that people often try to use them instead of
  emulate PV or real device just because they think it is
  easier, but it
  is often not so. Example:
  pvpanic device was initially proposed as hypercall, so lets
  say we would implement it as such. It would have been KVM
  specific, implementation would touch core guest KVM code and
  would have been Linux guest specific. Instead it was
  implemented as platform device with very small platform driver
  confined in drivers/ directory, immediately usable by XEN and
  QEMU tcg in addition
 
  This is actually a very good point. How do we support reboot
  and shutdown for TCG guests? We surely don't want to expose TCG
  as KVM
  hypervisor.
 
  Hmm...so are you proposing that we abandon the current approach,
  and switch to a device-based mechanism for reboot/shutdown?
 
  Reading Gleb's email it sounds like the more future proof
  approach, yes. I'm not quite sure yet where we should plug this though.
 
  What do you mean...where the paravirt device would go in the
  physical address map??
 
  Right. Either we
 
  - let the guest decide (PCI)
  - let QEMU decide, but potentially break the SoC layout (SysBus)
  - let QEMU decide, but only for the virt machine so that we don't
  break anyone
  (PlatBus)
 
  Can you please elaborate above two points ?
 
  If we emulate an MPC8544DS, we need to emulate an MPC8544DS. Any time
  we diverge from the layout of the original chip, things can break.
 
  However, for our PV machine (-M ppce500 / 

RE: [PATCH 1/5] powerpc: define ePAPR hcall exit interface

2013-07-15 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, July 15, 2013 4:51 PM
 To: Bhushan Bharat-R65777
 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
 Stuart-B08248; Bhushan Bharat-R65777
 Subject: Re: [PATCH 1/5] powerpc: define ePAPR hcall exit interface
 
 
 On 15.07.2013, at 13:11, Bharat Bhushan wrote:
 
  This patch defines the ePAPR hcall exit interface to guest user space.
 
 The subject line is misleading. This is a kvm patch. Same applies for most 
 other
 patches.

Ok, will make this "kvm: powerpc: define ePAPR hcall exit interface".

 
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  Documentation/virtual/kvm/api.txt |   20 
  include/uapi/linux/kvm.h  |7 +++
  2 files changed, 27 insertions(+), 0 deletions(-)
 
  diff --git a/Documentation/virtual/kvm/api.txt
  b/Documentation/virtual/kvm/api.txt
  index 66dd2aa..054f2f4 100644
  --- a/Documentation/virtual/kvm/api.txt
  +++ b/Documentation/virtual/kvm/api.txt
  @@ -2597,6 +2597,26 @@ The possible hypercalls are defined in the
  Power Architecture Platform Requirements (PAPR) document available
  from www.power.org (free developer registration required to access it).
 
  +   /* KVM_EXIT_EPAPR_HCALL */
  +   struct {
  +   __u64 nr;
  +   __u64 ret;
  +   __u64 args[8];
  +   } epapr_hcall;
  +
  +This is used on PowerPC platforms that support ePAPR hcalls.
  +It occurs when a guest does a hypercall (as defined in the ePAPR 1.1)
  +and the hcall is not handled by the kernel.
  +
  +The 'nr' field contains the hypercall number (from the guest R11),
  +and 'args' contains the arguments (from the guest R3 - R10).
  +Userspace should put the return code in 'ret' and any extra returned
  +values in args[].  If the VM is not in 64-bit mode KVM zeros the
  +upper half of each field in the struct.
  +
  +As per the ePAPR hcall ABI, the return value is returned to the guest
  +in R3 and output return values in R4 - R10.
  +
  /* KVM_EXIT_S390_TSCH */
  struct {
  __u16 subchannel_id;
  diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index
  acccd08..01ee50e 100644
  --- a/include/uapi/linux/kvm.h
  +++ b/include/uapi/linux/kvm.h
  @@ -171,6 +171,7 @@ struct kvm_pit_config {
  #define KVM_EXIT_WATCHDOG 21
  #define KVM_EXIT_S390_TSCH22
  #define KVM_EXIT_EPR  23
  +#define KVM_EXIT_EPAPR_HCALL  24
 
  /* For KVM_EXIT_INTERNAL_ERROR */
  /* Emulate instruction failed. */
  @@ -288,6 +289,12 @@ struct kvm_run {
  __u64 ret;
  __u64 args[9];
  } papr_hcall;
  +   /* KVM_EXIT_EPAPR_HCALL */
  +   struct {
  +   __u64 nr;
  +   __u64 ret;
  +   __u64 args[8];
  +   } epapr_hcall;
 
 This should be at the end of the union.

Ok.

-Bharat

 
 
 Alex
 
  /* KVM_EXIT_S390_TSCH */
  struct {
  __u16 subchannel_id;
  --
  1.7.0.4
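
For completeness, this is roughly how a user-space VMM would consume the exit
documented above. The struct fields come from the patch; the hcall number
MY_HCALL_TOKEN and the handler itself are made up for illustration, and
EV_UNIMPLEMENTED is the ePAPR error code:

static void handle_epapr_hcall(struct kvm_run *run)
{
	switch (run->epapr_hcall.nr) {		/* hcall number, from guest r11 */
	case MY_HCALL_TOKEN:			/* hypothetical hcall */
		run->epapr_hcall.ret = 0;	/* returned to the guest in r3 */
		run->epapr_hcall.args[0] = 42;	/* extra return value, guest r4 */
		break;
	default:
		run->epapr_hcall.ret = EV_UNIMPLEMENTED;
	}
	/* the next KVM_RUN delivers ret/args back to the guest */
}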
 
 
 


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

2013-07-15 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, July 15, 2013 5:02 PM
 To: Bhushan Bharat-R65777
 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
 Stuart-B08248; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented 
 hcalls
 in kvm
 
 
 On 15.07.2013, at 13:11, Bharat Bhushan wrote:
 
  Exit to guest user space if kvm does not implement the hcall.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/booke.c   |   47 
  +--
  arch/powerpc/kvm/powerpc.c |1 +
  include/uapi/linux/kvm.h   |1 +
  3 files changed, 42 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  17722d8..c8b41b4 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
 kvm_vcpu *vcpu,
  break;
 
  #ifdef CONFIG_KVM_BOOKE_HV
  -   case BOOKE_INTERRUPT_HV_SYSCALL:
  +   case BOOKE_INTERRUPT_HV_SYSCALL: {
 
 This is getting large. Please extract hcall handling into its own function.
 Maybe you can merge the HV and non-HV case then too.
 
  +   int i;
  if (!(vcpu->arch.shared->msr & MSR_PR)) {
  -   kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
  +   r = kvmppc_kvm_pv(vcpu);
  +   if (r != EV_UNIMPLEMENTED) {
  +   /* except unimplemented return to guest */
  +   kvmppc_set_gpr(vcpu, 3, r);
  +   kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  +   r = RESUME_GUEST;
  +   break;
  +   }
  +   /* Exit to userspace for unimplemented hcalls in kvm */
  +   run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
  +   run->epapr_hcall.ret = 0;
  +   for (i = 0; i < 8; i++)
  +   run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
  +   vcpu->arch.hcall_needed = 1;
  +   kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  +   r = RESUME_HOST;
  } else {
  /*
   * hcall from guest userspace -- send privileged @@ 
  -1016,22
  +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
  *vcpu,
  kvmppc_core_queue_program(vcpu, ESR_PPR);
  }
 
  -   r = RESUME_GUEST;
  +   run->exit_reason = KVM_EXIT_EPAPR_HCALL;


Oops, what have I done; I wanted this to be kvmppc_account_exit(vcpu,
SYSCALL_EXITS);

s/run->exit_reason = KVM_EXIT_EPAPR_HCALL;/kvmppc_account_exit(vcpu, SYSCALL_EXITS);/

-Bharat

 
 This looks odd. Your exit reason only changes when you do the hcall exiting,
 right?
 
 You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise
 older user space will break, as it doesn't know about the exit type yet.

So user space should make an enable_cap call as well?

-Bharat

 
 
 Alex
 
  break;
  +   }
  #else
  -   case BOOKE_INTERRUPT_SYSCALL:
  +   case BOOKE_INTERRUPT_SYSCALL: {
  +   int i;
  +   r = RESUME_GUEST;
  if (!(vcpu->arch.shared->msr & MSR_PR) &&
  (((u32)kvmppc_get_gpr(vcpu, 0)) == KVM_SC_MAGIC_R0)) {
  /* KVM PV hypercalls */
  -   kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
  -   r = RESUME_GUEST;
  +   r = kvmppc_kvm_pv(vcpu);
  +   if (r != EV_UNIMPLEMENTED) {
  +   /* except unimplemented return to guest */
  +   kvmppc_set_gpr(vcpu, 3, r);
  +   kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  +   r = RESUME_GUEST;
  +   break;
  +   }
  +   /* Exit to userspace for unimplemented hcalls in kvm */
 +   run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
 +   run->epapr_hcall.ret = 0;
 +   for (i = 0; i < 8; i++)
 +   run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
 +   vcpu->arch.hcall_needed = 1;
 +   run->exit_reason = KVM_EXIT_EPAPR_HCALL;
  +   r = RESUME_HOST;
  } else {
  /* Guest syscalls */
  kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SYSCALL);
  }
  kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  -   r = RESUME_GUEST;
  break;
  +   }
  #endif
 
  case BOOKE_INTERRUPT_DTLB_MISS: {
  diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
  index 4e05f8c..6c6199d 100644
  --- a/arch/powerpc/kvm/powerpc.c
  +++ b/arch
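
A sketch of the refactoring Alex asks for above: pull the hcall handling out
into one helper shared by the HV and non-HV syscall cases. This is
illustrative only (the helper name is made up); the body just mirrors the
logic of the posted patch:

static int kvmppc_handle_hcall_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
{
	int i, r;

	r = kvmppc_kvm_pv(vcpu);
	if (r != EV_UNIMPLEMENTED) {
		/* handled in the kernel: return the result to the guest */
		kvmppc_set_gpr(vcpu, 3, r);
		return RESUME_GUEST;
	}

	/* not handled in the kernel: hand the hcall to user space */
	run->exit_reason = KVM_EXIT_EPAPR_HCALL;
	run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
	run->epapr_hcall.ret = 0;
	for (i = 0; i < 8; i++)
		run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
	vcpu->arch.hcall_needed = 1;
	return RESUME_HOST;
}

The callers would then keep the MSR_PR check and the kvmppc_account_exit()
bookkeeping and simply return the helper's RESUME_* value.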

RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

2013-07-15 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, July 15, 2013 5:16 PM
 To: Bhushan Bharat-R65777
 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
 Stuart-B08248
 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented 
 hcalls
 in kvm
 
 
 On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Monday, July 15, 2013 5:02 PM
  To: Bhushan Bharat-R65777
  Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
  Yoder Stuart-B08248; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
  unimplemented hcalls in kvm
 
 
  On 15.07.2013, at 13:11, Bharat Bhushan wrote:
 
  Exit to guest user space if kvm does not implement the hcall.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/booke.c   |   47 
  +-
 -
  arch/powerpc/kvm/powerpc.c |1 +
  include/uapi/linux/kvm.h   |1 +
  3 files changed, 42 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index
  17722d8..c8b41b4 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run,
  struct
  kvm_vcpu *vcpu,
break;
 
  #ifdef CONFIG_KVM_BOOKE_HV
  - case BOOKE_INTERRUPT_HV_SYSCALL:
  + case BOOKE_INTERRUPT_HV_SYSCALL: {
 
  This is getting large. Please extract hcall handling into its own function.
  Maybe you can merge the HV and non-HV case then too.
 
  + int i;
if (!(vcpu-arch.shared-msr  MSR_PR)) {
  - kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
  + r = kvmppc_kvm_pv(vcpu);
  + if (r != EV_UNIMPLEMENTED) {
  + /* except unimplemented return to guest */
  + kvmppc_set_gpr(vcpu, 3, r);
  + kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  + r = RESUME_GUEST;
  + break;
  + }
  + /* Exit to userspace for unimplemented hcalls in kvm */
  + run-epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
  + run-epapr_hcall.ret = 0;
  + for (i = 0; i  8; i++)
  + run-epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 
  3 +
  i);
  + vcpu-arch.hcall_needed = 1;
  + kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  + r = RESUME_HOST;
} else {
/*
 * hcall from guest userspace -- send privileged @@ 
  -1016,22
  +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
  +kvm_vcpu *vcpu,
kvmppc_core_queue_program(vcpu, ESR_PPR);
}
 
  - r = RESUME_GUEST;
  + run-exit_reason = KVM_EXIT_EPAPR_HCALL;
 
 
  Oops, what I have done, I wanted this to be kvmppc_account_exit(vcpu,
  SYSCALL_EXITS);
 
  s/ run-exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu,
  SYSCALL_EXITS);
 
  -Bharat
 
 
  This looks odd. Your exit reason only changes when you do the hcall
  exiting, right?
 
  You also need to guard user space hcall exits with an ENABLE_CAP.
  Otherwise older user space will break, as it doesn't know about the exit 
  type
 yet.
 
  So the user space so make enable_cap also?
 
 User space needs to call enable_cap on this cap, yes. Otherwise a guest can
 confuse user space with an hcall exit it can't handle.

We do not have enable_cap for book3s; any specific reason why?

-Bharat

 
 
 Alex
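
What the opt-in Alex describes looks like from the user-space side, roughly.
KVM_ENABLE_CAP and struct kvm_enable_cap are the existing KVM interfaces;
KVM_CAP_EPAPR_HCALL_EXIT is a made-up capability name for this sketch:

#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>

static int enable_epapr_hcall_exits(int vcpu_fd)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_EPAPR_HCALL_EXIT,	/* hypothetical */
	};

	/*
	 * An old VMM never calls this, so it keeps the old behaviour;
	 * a new VMM opts in and must then handle the new exit type.
	 */
	if (ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap) < 0) {
		perror("KVM_ENABLE_CAP");
		return -1;
	}
	return 0;
}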
 


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset

2013-07-15 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, July 15, 2013 5:20 PM
 To: Bhushan Bharat-R65777
 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
 Stuart-B08248; Bhushan Bharat-R65777
 Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
 
 
 On 15.07.2013, at 13:11, Bharat Bhushan wrote:
 
  Detect the availability of the reset hcalls by looking at
  kvm,has-reset property on the /hypervisor node in the device tree
  passed to the VM and patches the reset mechanism to use reset hcall.
 
  This patch uses the reser hcall when kvm,has-reset is there in
 
 Your patch description is pretty broken :).
 
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kernel/epapr_paravirt.c |   12 
  1 files changed, 12 insertions(+), 0 deletions(-)
 
  diff --git a/arch/powerpc/kernel/epapr_paravirt.c
  b/arch/powerpc/kernel/epapr_paravirt.c
  index d44a571..651d701 100644
  --- a/arch/powerpc/kernel/epapr_paravirt.c
  +++ b/arch/powerpc/kernel/epapr_paravirt.c
  @@ -22,6 +22,8 @@
  #include <asm/cacheflush.h>
  #include <asm/code-patching.h>
  #include <asm/machdep.h>
  +#include <asm/kvm_para.h>
  +#include <asm/kvm_host.h>
 
 Why would we need kvm_host.h? This is guest code.
 
 
  #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64) extern
  void epapr_ev_idle(void); @@ -30,6 +32,14 @@ extern u32
  epapr_ev_idle_start[];
 
  bool epapr_paravirt_enabled;
 
  +void epapr_hypercall_reset(char *cmd) {
  +   long ret;
  +   ret = kvm_hypercall0(KVM_HC_VM_RESET);
 
 Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply returns
 unimplemented for everything when that config option is not set.

We are here because we patched the ppc_md.restart to point to new handler.
So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is true. 


 
  +   printk("error: system reset returned with error %ld\n", ret);
 
 So we should fall back to the normal reset handler here.

Do you mean return normally from here, no BUG() etc? 

-Bharat

 
 
 Alex
 
  +   BUG();
  +}
  +
  static int __init epapr_paravirt_init(void) {
  struct device_node *hyper_node;
  @@ -58,6 +68,8 @@ static int __init epapr_paravirt_init(void)
  if (of_get_property(hyper_node, "has-idle", NULL))
  ppc_md.power_save = epapr_ev_idle;
  #endif
  +   if (of_get_property(hyper_node, "kvm,has-reset", NULL))
  +   ppc_md.restart = epapr_hypercall_reset;
 
  epapr_paravirt_enabled = true;
 
  --
  1.7.0.4
 
 
  --
  To unsubscribe from this list: send the line unsubscribe kvm-ppc in
  the body of a message to majord...@vger.kernel.org More majordomo info
  at  http://vger.kernel.org/majordomo-info.html
 


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

2013-07-15 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, July 15, 2013 8:27 PM
 To: Bhushan Bharat-R65777
 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
 Stuart-B08248
 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented 
 hcalls
 in kvm
 
 
 On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Monday, July 15, 2013 5:16 PM
  To: Bhushan Bharat-R65777
  Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
  Yoder
  Stuart-B08248
  Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
  unimplemented hcalls in kvm
 
 
  On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Monday, July 15, 2013 5:02 PM
  To: Bhushan Bharat-R65777
  Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
  Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
  unimplemented hcalls in kvm
 
 
  On 15.07.2013, at 13:11, Bharat Bhushan wrote:
 
  Exit to guest user space if kvm does not implement the hcall.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/booke.c   |   47 
  +---
 --
  -
  arch/powerpc/kvm/powerpc.c |1 +
  include/uapi/linux/kvm.h   |1 +
  3 files changed, 42 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index
  17722d8..c8b41b4 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run,
  struct
  kvm_vcpu *vcpu,
  break;
 
  #ifdef CONFIG_KVM_BOOKE_HV
  -   case BOOKE_INTERRUPT_HV_SYSCALL:
  +   case BOOKE_INTERRUPT_HV_SYSCALL: {
 
  This is getting large. Please extract hcall handling into its own 
  function.
  Maybe you can merge the HV and non-HV case then too.
 
  +   int i;
  if (!(vcpu-arch.shared-msr  MSR_PR)) {
  -   kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
  +   r = kvmppc_kvm_pv(vcpu);
  +   if (r != EV_UNIMPLEMENTED) {
  +   /* except unimplemented return to guest 
  */
  +   kvmppc_set_gpr(vcpu, 3, r);
  +   kvmppc_account_exit(vcpu, 
  SYSCALL_EXITS);
  +   r = RESUME_GUEST;
  +   break;
  +   }
  +   /* Exit to userspace for unimplemented hcalls 
  in kvm
 */
  +   run-epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
  +   run-epapr_hcall.ret = 0;
  +   for (i = 0; i  8; i++)
  +   run-epapr_hcall.args[i] = 
  kvmppc_get_gpr(vcpu,
 3 +
  i);
  +   vcpu-arch.hcall_needed = 1;
  +   kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  +   r = RESUME_HOST;
  } else {
  /*
   * hcall from guest userspace -- send 
  privileged @@ -1016,22
  +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
  +kvm_vcpu *vcpu,
  kvmppc_core_queue_program(vcpu, ESR_PPR);
  }
 
  -   r = RESUME_GUEST;
  +   run-exit_reason = KVM_EXIT_EPAPR_HCALL;
 
 
  Oops, what I have done, I wanted this to be
  kvmppc_account_exit(vcpu, SYSCALL_EXITS);
 
  s/ run-exit_reason = KVM_EXIT_EPAPR_HCALL;/
  kvmppc_account_exit(vcpu, SYSCALL_EXITS);
 
  -Bharat
 
 
  This looks odd. Your exit reason only changes when you do the hcall
  exiting, right?
 
  You also need to guard user space hcall exits with an ENABLE_CAP.
  Otherwise older user space will break, as it doesn't know about the
  exit type
  yet.
 
  So the user space so make enable_cap also?
 
  User space needs to call enable_cap on this cap, yes. Otherwise a
  guest can confuse user space with an hcall exit it can't handle.
 
  We do not have enable_cap for book3s, any specific reason why ?
 
 We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI
 hcalls.

Oh, We check this on book3s_PR and book3s_HV.

 KVM hcalls on book3s don't return to user space.

It exits, doesn't it? arch/powerpc/kvm/book3s_pr.c exits with
KVM_EXIT_PAPR_HCALL, and the same holds for book3s_hv.

Btw, adding this on booke is not in question; I am just trying to understand book3s.

-Bharat
 

 Which is something we
 probably want to change along with this patch set.
 
 
 Alex
 


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset

2013-07-15 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, July 15, 2013 8:40 PM
 To: Bhushan Bharat-R65777
 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
 Stuart-B08248
 Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
 
 
 On 15.07.2013, at 17:05, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Monday, July 15, 2013 5:20 PM
  To: Bhushan Bharat-R65777
  Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
  Yoder Stuart-B08248; Bhushan Bharat-R65777
  Subject: Re: [PATCH 5/5] powerpc: using reset hcall when
  kvm,has-reset
 
 
  On 15.07.2013, at 13:11, Bharat Bhushan wrote:
 
  Detect the availability of the reset hcalls by looking at
  kvm,has-reset property on the /hypervisor node in the device tree
  passed to the VM and patches the reset mechanism to use reset hcall.
 
  This patch uses the reser hcall when kvm,has-reset is there in
 
  Your patch description is pretty broken :).
 
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kernel/epapr_paravirt.c |   12 
  1 files changed, 12 insertions(+), 0 deletions(-)
 
  diff --git a/arch/powerpc/kernel/epapr_paravirt.c
  b/arch/powerpc/kernel/epapr_paravirt.c
  index d44a571..651d701 100644
  --- a/arch/powerpc/kernel/epapr_paravirt.c
  +++ b/arch/powerpc/kernel/epapr_paravirt.c
  @@ -22,6 +22,8 @@
  #include asm/cacheflush.h
  #include asm/code-patching.h
  #include asm/machdep.h
  +#include asm/kvm_para.h
  +#include asm/kvm_host.h
 
  Why would we need kvm_host.h? This is guest code.
 
 
  #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64) extern
  void epapr_ev_idle(void); @@ -30,6 +32,14 @@ extern u32
  epapr_ev_idle_start[];
 
  bool epapr_paravirt_enabled;
 
  +void epapr_hypercall_reset(char *cmd) {
  + long ret;
  + ret = kvm_hypercall0(KVM_HC_VM_RESET);
 
  Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply
  returns unimplemented for everything when that config option is not set.
 
  We are here because we patched the ppc_md.restart to point to new handler.
  So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is
 true.
 
 We should only patch it if kvm_para_available(). That should guard us against
 everything.
 
 
 
 
  + printk(error: system reset returned with error %ld\n, ret);
 
  So we should fall back to the normal reset handler here.
 
  Do you mean return normally from here, no BUG() etc?
 
 If we guard the patching against everything, we can treat a broken hcall as 
 BUG.
 However, if we don't we want to fall back to the normal guts based reset.

I will let Scott comment on this.

But ppc_md.restart can point to only one handler, and during paravirt patching
we changed it to the new handler. So we cannot jump back to the GUTS-based handler.

-Bharat
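
For reference, a minimal sketch of the guarded patching suggested above.
kvm_para_available(), kvm_hypercall0() and KVM_HC_VM_RESET are the names
already used in this thread; the function names and their placement below
are assumptions, not the actual patch:

/* Sketch only: take over ppc_md.restart only when KVM paravirt is present. */
#include <linux/kvm_para.h>
#include <linux/printk.h>
#include <asm/machdep.h>

static void epapr_hypercall_restart(char *cmd)
{
	long ret;

	ret = kvm_hypercall0(KVM_HC_VM_RESET);
	/* If the hcall returns at all, the reset did not happen. */
	pr_err("system reset hcall returned with error %ld\n", ret);
}

static void __init epapr_paravirt_reset_init(void)
{
	/*
	 * Only patch ppc_md.restart when a KVM host is actually there to
	 * service the hcall; otherwise keep the guts-based reset handler.
	 */
	if (kvm_para_available())
		ppc_md.restart = epapr_hypercall_restart;
}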




RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

2013-07-15 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, July 15, 2013 8:59 PM
 To: Bhushan Bharat-R65777
 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
 Stuart-B08248
 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented 
 hcalls
 in kvm
 
 
 On 15.07.2013, at 17:13, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Monday, July 15, 2013 8:27 PM
  To: Bhushan Bharat-R65777
  Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
  Yoder
  Stuart-B08248
  Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
  unimplemented hcalls in kvm
 
 
  On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Monday, July 15, 2013 5:16 PM
  To: Bhushan Bharat-R65777
  Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
  Scott-B07421; Yoder
  Stuart-B08248
  Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
  unimplemented hcalls in kvm
 
 
  On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Monday, July 15, 2013 5:02 PM
  To: Bhushan Bharat-R65777
  Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
  Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
  unimplemented hcalls in kvm
 
 
  On 15.07.2013, at 13:11, Bharat Bhushan wrote:
 
  Exit to guest user space if kvm does not implement the hcall.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/booke.c   |   47 
  +-
 --
  --
  -
  arch/powerpc/kvm/powerpc.c |1 +
  include/uapi/linux/kvm.h   |1 +
  3 files changed, 42 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index
  17722d8..c8b41b4 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run
  *run, struct
  kvm_vcpu *vcpu,
break;
 
  #ifdef CONFIG_KVM_BOOKE_HV
  - case BOOKE_INTERRUPT_HV_SYSCALL:
  + case BOOKE_INTERRUPT_HV_SYSCALL: {
 
  This is getting large. Please extract hcall handling into its own
 function.
  Maybe you can merge the HV and non-HV case then too.
 
  + int i;
if (!(vcpu-arch.shared-msr  MSR_PR)) {
  - kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
  + r = kvmppc_kvm_pv(vcpu);
  + if (r != EV_UNIMPLEMENTED) {
  + /* except unimplemented return to guest 
  */
  + kvmppc_set_gpr(vcpu, 3, r);
  + kvmppc_account_exit(vcpu, 
  SYSCALL_EXITS);
  + r = RESUME_GUEST;
  + break;
  + }
  + /* Exit to userspace for unimplemented hcalls 
  in kvm
  */
  + run-epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
  + run-epapr_hcall.ret = 0;
  + for (i = 0; i  8; i++)
  + run-epapr_hcall.args[i] = 
  kvmppc_get_gpr(vcpu,
  3 +
  i);
  + vcpu-arch.hcall_needed = 1;
  + kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  + r = RESUME_HOST;
} else {
/*
 * hcall from guest userspace -- send 
  privileged @@ -
 1016,22
  +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
  +kvm_vcpu *vcpu,
kvmppc_core_queue_program(vcpu, ESR_PPR);
}
 
  - r = RESUME_GUEST;
  + run-exit_reason = KVM_EXIT_EPAPR_HCALL;
 
 
  Oops, what I have done, I wanted this to be
  kvmppc_account_exit(vcpu, SYSCALL_EXITS);
 
  s/ run-exit_reason = KVM_EXIT_EPAPR_HCALL;/
  kvmppc_account_exit(vcpu, SYSCALL_EXITS);
 
  -Bharat
 
 
  This looks odd. Your exit reason only changes when you do the
  hcall exiting, right?
 
  You also need to guard user space hcall exits with an ENABLE_CAP.
  Otherwise older user space will break, as it doesn't know about
  the exit type
  yet.
 
  So the user space so make enable_cap also?
 
  User space needs to call enable_cap on this cap, yes. Otherwise a
  guest can confuse user space with an hcall exit it can't handle.
 
  We do not have enable_cap for book3s, any specific reason why ?
 
  We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI,
  you get OSI hcalls.
 
  Oh, We check this on book3s_PR and book3s_HV.
 
  KVM hcalls on book3s don't return to user space.
 
  It exits, is not it? arch/powerpc/kvm/book3s_pr.c exits with
 KVM_EXIT_PAPR_HCALL. And same in book3s_pv.
 
 It doesn't even start
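
Earlier in this thread Alex asked for the hcall handling to be pulled into its
own function and for the new user-space exit to be guarded by an ENABLE_CAP.
A minimal sketch of how that could look follows; EV_UNIMPLEMENTED,
kvmppc_kvm_pv() and the epapr_hcall exit come from this patch, while the
capability itself and the vcpu->arch.epapr_hcall_enabled flag are assumed
names:

/* Sketch: hcall handling extracted from kvmppc_handle_exit(). */
static int kvmppc_booke_handle_hcall(struct kvm_run *run, struct kvm_vcpu *vcpu)
{
	int r = kvmppc_kvm_pv(vcpu);

	if (r != EV_UNIMPLEMENTED) {
		/* Handled in the kernel, return the result to the guest. */
		kvmppc_set_gpr(vcpu, 3, r);
		return RESUME_GUEST;
	}

	/*
	 * Only forward unimplemented hcalls to user space when it has opted
	 * in via ENABLE_CAP (epapr_hcall_enabled is an assumed flag), so
	 * that old user space never sees an exit type it cannot handle.
	 */
	if (!vcpu->arch.epapr_hcall_enabled) {
		kvmppc_set_gpr(vcpu, 3, EV_UNIMPLEMENTED);
		return RESUME_GUEST;
	}

	run->exit_reason = KVM_EXIT_EPAPR_HCALL;
	vcpu->arch.hcall_needed = 1;
	return RESUME_HOST;
}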

RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm

2013-07-15 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Monday, July 15, 2013 11:38 PM
 To: Bhushan Bharat-R65777
 Cc: k...@vger.kernel.org; kvm-ppc@vger.kernel.org; ag...@suse.de; Yoder 
 Stuart-
 B08248; Bhushan Bharat-R65777; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented 
 hcalls
 in kvm
 
 On 07/15/2013 06:11:16 AM, Bharat Bhushan wrote:
  Exit to guest user space if kvm does not implement the hcall.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
   arch/powerpc/kvm/booke.c   |   47
  +--
   arch/powerpc/kvm/powerpc.c |1 +
   include/uapi/linux/kvm.h   |1 +
   3 files changed, 42 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  17722d8..c8b41b4 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run,
  struct kvm_vcpu *vcpu,
  break;
 
   #ifdef CONFIG_KVM_BOOKE_HV
  -   case BOOKE_INTERRUPT_HV_SYSCALL:
  +   case BOOKE_INTERRUPT_HV_SYSCALL: {
  +   int i;
   	if (!(vcpu->arch.shared->msr & MSR_PR)) {
  -		kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
  +		r = kvmppc_kvm_pv(vcpu);
  +		if (r != EV_UNIMPLEMENTED) {
  +			/* except unimplemented return to guest */
  +			kvmppc_set_gpr(vcpu, 3, r);
  +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
  +			r = RESUME_GUEST;
  +			break;
  +		}
  +		/* Exit to userspace for unimplemented hcalls in kvm */
  +		run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
  +		run->epapr_hcall.ret = 0;
  +		for (i = 0; i < 8; i++)
  +			run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
 
 You need to clear the upper half of each register if CONFIG_PPC64=y and MSR_CM
 is not set.
 
  +		vcpu->arch.hcall_needed = 1;
 
 The existing code for hcall_needed restores 9 return arguments, rather than 
 the
 8 that are defined for this interface.  Thus, you'll be restoring one word of
 padding into the guest -- which could be arbitrary userspace data that 
 shouldn't
 be leaked.  r12 is volatile in the ePAPR hcall ABI so simply clobbering it 
 isn't
 a problem, though.

Oops; not just that: currently this uses the papr_hcall struct, while on
booke we should use epapr_hcall. I will add a function, defined separately in
book3s.c and booke.c, to set up the hcall return registers accordingly (a rough
sketch follows below).

-Bharat
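
A rough sketch of that booke-side helper, folding in Scott's points about the
eight ePAPR argument registers and about masking the upper word when the guest
is not in 64-bit (MSR.CM) mode. The function name is an assumption, and the
epapr_hcall exit layout is the one proposed in this series, not an existing
ABI:

/* Sketch: fill the epapr hcall exit state on booke. */
static void kvmppc_booke_setup_epapr_hcall_exit(struct kvm_run *run,
						struct kvm_vcpu *vcpu)
{
	int i;

	run->exit_reason = KVM_EXIT_EPAPR_HCALL;
	run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
	run->epapr_hcall.ret = 0;

	/* ePAPR defines exactly 8 argument registers, r3..r10. */
	for (i = 0; i < 8; i++) {
		ulong arg = kvmppc_get_gpr(vcpu, 3 + i);

#ifdef CONFIG_PPC64
		/*
		 * In 32-bit computation mode the upper word of the GPRs is
		 * not defined for the guest, so do not leak it to user space.
		 */
		if (!(vcpu->arch.shared->msr & MSR_CM))
			arg &= 0xffffffffUL;
#endif
		run->epapr_hcall.args[i] = arg;
	}

	vcpu->arch.hcall_needed = 1;
}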


 
 -Scott



RE: [PATCH 3/6 v5] powerpc: export debug register save function for KVM

2013-06-24 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, June 24, 2013 3:03 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
 tiejun.c...@windriver.com; Bhushan Bharat-R65777
 Subject: Re: [PATCH 3/6 v5] powerpc: export debug register save function for 
 KVM
 
 
 On 24.06.2013, at 11:08, Bharat Bhushan wrote:
 
  KVM need this function when switching from vcpu to user-space thread.
  My subsequent patch will use this function.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/switch_to.h |4 
  arch/powerpc/kernel/process.c|3 ++-
  2 files changed, 6 insertions(+), 1 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/switch_to.h
  b/arch/powerpc/include/asm/switch_to.h
  index 200d763..50b357f 100644
  --- a/arch/powerpc/include/asm/switch_to.h
  +++ b/arch/powerpc/include/asm/switch_to.h
  @@ -30,6 +30,10 @@ extern void enable_kernel_spe(void); extern void
  giveup_spe(struct task_struct *); extern void load_up_spe(struct
  task_struct *);
 
  +#ifdef CONFIG_PPC_ADV_DEBUG_REGS
  +extern void switch_booke_debug_regs(struct thread_struct
  +*new_thread); #endif
  +
  #ifndef CONFIG_SMP
  extern void discard_lazy_cpu_state(void); #else diff --git
  a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index
  01ff496..3375cb7 100644
  --- a/arch/powerpc/kernel/process.c
  +++ b/arch/powerpc/kernel/process.c
  @@ -362,12 +362,13 @@ static void prime_debug_regs(struct
  thread_struct *thread)
   * debug registers, set the debug registers from the values
   * stored in the new thread.
   */
  -static void switch_booke_debug_regs(struct thread_struct *new_thread)
  +void switch_booke_debug_regs(struct thread_struct *new_thread)
  {
  if ((current-thread.debug.dbcr0  DBCR0_IDM)
  || (new_thread-debug.dbcr0  DBCR0_IDM))
  prime_debug_regs(new_thread);
  }
  +EXPORT_SYMBOL(switch_booke_debug_regs);
 
 EXPORT_SYMBOL_GPL?

Oops, I missed this comment. Will correct in next version. 

-Bharat

 
 
 Alex
 
  #else   /* !CONFIG_PPC_ADV_DEBUG_REGS */
  #ifndef CONFIG_HAVE_HW_BREAKPOINT
  static void set_debug_reg_defaults(struct thread_struct *thread)
  --
  1.7.0.4
 
 
 




RE: [PATCH 6/6 v5] KVM: PPC: Add userspace debug stub support

2013-06-24 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, June 24, 2013 4:13 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
 tiejun.c...@windriver.com; Bhushan Bharat-R65777
 Subject: Re: [PATCH 6/6 v5] KVM: PPC: Add userspace debug stub support
 
 
 On 24.06.2013, at 11:08, Bharat Bhushan wrote:
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  This is how we save/restore debug register context when switching
  between guest, userspace and kernel user-process:
 
  When QEMU is running
  - thread-debug_reg == QEMU debug register context.
  - Kernel will handle switching the debug register on context switch.
  - no vcpu_load() called
 
  QEMU makes ioctls (except RUN)
  - This will call vcpu_load()
  - should not change context.
  - Some ioctls can change vcpu debug register, context saved in
  - vcpu-debug_regs
 
  QEMU Makes RUN ioctl
  - Save thread-debug_reg on STACK
  - Store thread-debug_reg == vcpu-debug_reg load thread-debug_reg
  - RUN VCPU ( So thread points to vcpu context )
 
  Context switch happens When VCPU running
  - makes vcpu_load() should not load any context kernel loads the vcpu
  - context as thread-debug_regs points to vcpu context.
 
  On heavyweight_exit
  - Load the context saved on stack in thread-debug_reg
 
  Currently we do not support debug resource emulation to guest, On
  debug exception, always exit to user space irrespective of user space
  is expecting the debug exception or not. If this is unexpected
  exception (breakpoint/watchpoint event not set by
  userspace) then let us leave the action on user space. This is similar
  to what it was before, only thing is that now we have proper exit
  state available to user space.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |3 +
  arch/powerpc/include/uapi/asm/kvm.h |1 +
  arch/powerpc/kvm/booke.c|  233 
  ---
  arch/powerpc/kvm/booke.h|5 +
  4 files changed, 224 insertions(+), 18 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index 838a577..aeb490d 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -524,7 +524,10 @@ struct kvm_vcpu_arch {
  u32 eptcfg;
  u32 epr;
  u32 crit_save;
  +   /* guest debug registers*/
  struct debug_reg dbg_reg;
  +   /* hardware visible debug registers when in guest state */
  +   struct debug_reg shadow_dbg_reg;
  #endif
  gpa_t paddr_accessed;
  gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index ded0607..f5077c2 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -27,6 +27,7 @@
  #define __KVM_HAVE_PPC_SMT
  #define __KVM_HAVE_IRQCHIP
  #define __KVM_HAVE_IRQ_LINE
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
  __u64 pc;
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  3e9fc1d..8be3502 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu
  *vcpu) #endif }
 
  +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu) {
  +   /* Synchronize guest's desire to get debug interrupts into shadow
  +MSR */ #ifndef CONFIG_KVM_BOOKE_HV
  +   vcpu-arch.shadow_msr = ~MSR_DE;
  +   vcpu-arch.shadow_msr |= vcpu-arch.shared-msr  MSR_DE; #endif
  +
  +   /* Force enable debug interrupts when user space wants to debug */
  +   if (vcpu-guest_debug) {
  +#ifdef CONFIG_KVM_BOOKE_HV
  +   /*
  +* Since there is no shadow MSR, sync MSR_DE into the guest
  +* visible MSR.
  +*/
  +   vcpu-arch.shared-msr |= MSR_DE;
  +#else
  +   vcpu-arch.shadow_msr |= MSR_DE;
  +   vcpu-arch.shared-msr = ~MSR_DE;
  +#endif
  +   }
  +}
  +
  /*
   * Helper function for full MSR writes.  No need to call this if
  only
   * EE/CE/ME/DE/RI are changing.
  @@ -150,6 +173,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
  kvmppc_mmu_msr_notify(vcpu, old_msr);
  kvmppc_vcpu_sync_spe(vcpu);
  kvmppc_vcpu_sync_fpu(vcpu);
  +   kvmppc_vcpu_sync_debug(vcpu);
  }
 
  static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu, @@
  -655,6 +679,7 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
  int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) {
  int ret, s;
  +   struct thread_struct thread;
  #ifdef CONFIG_PPC_FPU
  unsigned int fpscr;
  int fpexc_mode;
  @@ -698,12 +723,21 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run,
  struct kvm_vcpu *vcpu)
 
  kvmppc_load_guest_fp(vcpu);
  #endif
  +   /* Switch

RE: [PATCH] KVM: PPC: Add userspace debug stub support

2013-05-11 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, May 10, 2013 11:14 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
 tiejun.c...@windriver.com
 Subject: Re: [PATCH] KVM: PPC: Add userspace debug stub support
 
 
 On 10.05.2013, at 19:31, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, May 10, 2013 3:48 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
  tiejun.c...@windriver.com; Bhushan Bharat-R65777
  Subject: Re: [PATCH] KVM: PPC: Add userspace debug stub support
 
 
  On 07.05.2013, at 11:40, Bharat Bhushan wrote:
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  This is how we save/restore debug register context when switching
  between guest, userspace and kernel user-process:
 
  When QEMU is running
  - thread-debug_reg == QEMU debug register context.
  - Kernel will handle switching the debug register on context switch.
  - no vcpu_load() called
 
  QEMU makes ioctls (except RUN)
  - This will call vcpu_load()
  - should not change context.
  - Some ioctls can change vcpu debug register, context saved in
  - vcpu-debug_regs
 
  QEMU Makes RUN ioctl
  - Save thread-debug_reg on STACK
  - Store thread-debug_reg == vcpu-debug_reg load thread-debug_reg
  - RUN VCPU ( So thread points to vcpu context )
 
  Context switch happens When VCPU running
  - makes vcpu_load() should not load any context kernel loads the
  - vcpu context as thread-debug_regs points to vcpu context.
 
  On heavyweight_exit
  - Load the context saved on stack in thread-debug_reg
 
  Currently we do not support debug resource emulation to guest, On
  debug exception, always exit to user space irrespective of user
  space is expecting the debug exception or not. If this is unexpected
  exception (breakpoint/watchpoint event not set by
  userspace) then let us leave the action on user space. This is
  similar to what it was before, only thing is that now we have proper
  exit state available to user space.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |3 +
  arch/powerpc/include/uapi/asm/kvm.h |1 +
  arch/powerpc/kvm/booke.c|  242 
  -
 --
  arch/powerpc/kvm/booke.h|5 +
  4 files changed, 233 insertions(+), 18 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index 838a577..1b29945 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -524,7 +524,10 @@ struct kvm_vcpu_arch {
u32 eptcfg;
u32 epr;
u32 crit_save;
  + /* guest debug registers*/
struct debug_reg dbg_reg;
  + /* shadow debug registers */
 
  Please be more verbose here. What exactly does this contain? Why do
  we need shadow and non-shadow registers? The comment as it is reads
  like
 
   /* Add one plus one */
   x = 1 + 1;
 
 
   /*
    * Shadow debug registers hold the debug register contents that are
    * actually written to the h/w debug registers, i.e. either the
    * guest-written or the user-space-written values.
    */
 
 /* hardware visible debug registers when in guest state */
 
 
 
 
  + struct debug_reg shadow_dbg_reg;
  #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index ded0607..f5077c2 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -27,6 +27,7 @@
  #define __KVM_HAVE_PPC_SMT
  #define __KVM_HAVE_IRQCHIP
  #define __KVM_HAVE_IRQ_LINE
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
__u64 pc;
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index
  ef99536..6a44ad4 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct
  kvm_vcpu
  *vcpu) #endif }
 
  +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu) {
  + /* Synchronize guest's desire to get debug interrupts into shadow
  +MSR */ #ifndef CONFIG_KVM_BOOKE_HV
  + vcpu-arch.shadow_msr = ~MSR_DE;
  + vcpu-arch.shadow_msr |= vcpu-arch.shared-msr  MSR_DE; #endif
  +
  + /* Force enable debug interrupts when user space wants to debug */
  + if (vcpu-guest_debug) {
  +#ifdef CONFIG_KVM_BOOKE_HV
  + /*
  +  * Since there is no shadow MSR, sync MSR_DE into the guest
  +  * visible MSR.
  +  */
  + vcpu-arch.shared-msr |= MSR_DE; #else
  + vcpu-arch.shadow_msr |= MSR_DE;
  + vcpu-arch.shared-msr = ~MSR_DE; #endif
  + }
  +}
  +
  /*
  * Helper function for full MSR writes.  No need to call this if
  only
  * EE/CE/ME/DE/RI are changing

RE: [PATCH] KVM: PPC: Add userspace debug stub support

2013-05-10 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, May 10, 2013 3:48 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
 tiejun.c...@windriver.com; Bhushan Bharat-R65777
 Subject: Re: [PATCH] KVM: PPC: Add userspace debug stub support
 
 
 On 07.05.2013, at 11:40, Bharat Bhushan wrote:
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  This is how we save/restore debug register context when switching
  between guest, userspace and kernel user-process:
 
  When QEMU is running
  - thread-debug_reg == QEMU debug register context.
  - Kernel will handle switching the debug register on context switch.
  - no vcpu_load() called
 
  QEMU makes ioctls (except RUN)
  - This will call vcpu_load()
  - should not change context.
  - Some ioctls can change vcpu debug register, context saved in
  - vcpu-debug_regs
 
  QEMU Makes RUN ioctl
  - Save thread-debug_reg on STACK
  - Store thread-debug_reg == vcpu-debug_reg load thread-debug_reg
  - RUN VCPU ( So thread points to vcpu context )
 
  Context switch happens When VCPU running
  - makes vcpu_load() should not load any context kernel loads the vcpu
  - context as thread-debug_regs points to vcpu context.
 
  On heavyweight_exit
  - Load the context saved on stack in thread-debug_reg
 
  Currently we do not support debug resource emulation to guest, On
  debug exception, always exit to user space irrespective of user space
  is expecting the debug exception or not. If this is unexpected
  exception (breakpoint/watchpoint event not set by
  userspace) then let us leave the action on user space. This is similar
  to what it was before, only thing is that now we have proper exit
  state available to user space.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |3 +
  arch/powerpc/include/uapi/asm/kvm.h |1 +
  arch/powerpc/kvm/booke.c|  242 
  ---
  arch/powerpc/kvm/booke.h|5 +
  4 files changed, 233 insertions(+), 18 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index 838a577..1b29945 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -524,7 +524,10 @@ struct kvm_vcpu_arch {
  u32 eptcfg;
  u32 epr;
  u32 crit_save;
  +   /* guest debug registers*/
  struct debug_reg dbg_reg;
  +   /* shadow debug registers */
 
 Please be more verbose here. What exactly does this contain? Why do we need
 shadow and non-shadow registers? The comment as it is reads like
 
   /* Add one plus one */
   x = 1 + 1;


/*
 * Shadow debug registers hold the debug register contents that are
 * actually written to the h/w debug registers, i.e. either the
 * guest-written or the user-space-written values.
 */


 
  +   struct debug_reg shadow_dbg_reg;
  #endif
  gpa_t paddr_accessed;
  gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index ded0607..f5077c2 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -27,6 +27,7 @@
  #define __KVM_HAVE_PPC_SMT
  #define __KVM_HAVE_IRQCHIP
  #define __KVM_HAVE_IRQ_LINE
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
  __u64 pc;
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  ef99536..6a44ad4 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu
  *vcpu) #endif }
 
  +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu) {
  +   /* Synchronize guest's desire to get debug interrupts into shadow
  +MSR */ #ifndef CONFIG_KVM_BOOKE_HV
  +   vcpu-arch.shadow_msr = ~MSR_DE;
  +   vcpu-arch.shadow_msr |= vcpu-arch.shared-msr  MSR_DE; #endif
  +
  +   /* Force enable debug interrupts when user space wants to debug */
  +   if (vcpu-guest_debug) {
  +#ifdef CONFIG_KVM_BOOKE_HV
  +   /*
  +* Since there is no shadow MSR, sync MSR_DE into the guest
  +* visible MSR.
  +*/
  +   vcpu-arch.shared-msr |= MSR_DE;
  +#else
  +   vcpu-arch.shadow_msr |= MSR_DE;
  +   vcpu-arch.shared-msr = ~MSR_DE;
  +#endif
  +   }
  +}
  +
  /*
   * Helper function for full MSR writes.  No need to call this if
  only
   * EE/CE/ME/DE/RI are changing.
  @@ -150,6 +173,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
  kvmppc_mmu_msr_notify(vcpu, old_msr);
  kvmppc_vcpu_sync_spe(vcpu);
  kvmppc_vcpu_sync_fpu(vcpu);
  +   kvmppc_vcpu_sync_debug(vcpu);
  }
 
  static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu, @@
  -655,6 +679,7 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
  int kvmppc_vcpu_run(struct

RE: [PATCH v2 3/4] kvm/ppc: Call trace_hardirqs_on before entry

2013-05-09 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of
 Scott Wood
 Sent: Friday, May 10, 2013 8:40 AM
 To: Alexander Graf; Benjamin Herrenschmidt
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; 
 linuxppc-...@lists.ozlabs.org;
 Wood Scott-B07421
 Subject: [PATCH v2 3/4] kvm/ppc: Call trace_hardirqs_on before entry
 
 Currently this is only being done on 64-bit.  Rather than just move it
 out of the 64-bit ifdef, move it to kvm_lazy_ee_enable() so that it is
 consistent with lazy ee state, and so that we don't track more host
 code as interrupts-enabled than necessary.
 
 Rename kvm_lazy_ee_enable() to kvm_fix_ee_before_entry() to reflect
 that this function now has a role on 32-bit as well.
 
 Signed-off-by: Scott Wood scottw...@freescale.com
 ---
  arch/powerpc/include/asm/kvm_ppc.h |   11 ---
  arch/powerpc/kvm/book3s_pr.c   |4 ++--
  arch/powerpc/kvm/booke.c   |4 ++--
  arch/powerpc/kvm/powerpc.c |2 --
  4 files changed, 12 insertions(+), 9 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_ppc.h
 b/arch/powerpc/include/asm/kvm_ppc.h
 index a5287fe..6885846 100644
 --- a/arch/powerpc/include/asm/kvm_ppc.h
 +++ b/arch/powerpc/include/asm/kvm_ppc.h
 @@ -394,10 +394,15 @@ static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
   }
  }
 
 -/* Please call after prepare_to_enter. This function puts the lazy ee state
 -   back to normal mode, without actually enabling interrupts. */
 -static inline void kvmppc_lazy_ee_enable(void)
 +/*
 + * Please call after prepare_to_enter. This function puts the lazy ee and irq
 + * disabled tracking state back to normal mode, without actually enabling
 + * interrupts.
 + */
 +static inline void kvmppc_fix_ee_before_entry(void)
  {
 + trace_hardirqs_on();
 +
  #ifdef CONFIG_PPC64
   /* Only need to enable IRQs by hard enabling them after this */
   local_paca-irq_happened = 0;
 diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
 index bdc40b8..0b97ce4 100644
 --- a/arch/powerpc/kvm/book3s_pr.c
 +++ b/arch/powerpc/kvm/book3s_pr.c
 @@ -890,7 +890,7 @@ program_interrupt:
   local_irq_enable();
   r = s;
   } else {
 - kvmppc_lazy_ee_enable();
 + kvmppc_fix_ee_before_entry();
   }
   }
 
 @@ -1161,7 +1161,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct
 kvm_vcpu *vcpu)
   if (vcpu-arch.shared-msr  MSR_FP)
   kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
 
 - kvmppc_lazy_ee_enable();
 + kvmppc_fix_ee_before_entry();
 
   ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
 diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
 index 705fc5c..eb89b83 100644
 --- a/arch/powerpc/kvm/booke.c
 +++ b/arch/powerpc/kvm/booke.c
 @@ -673,7 +673,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
 kvm_vcpu
 *vcpu)
   ret = s;
   goto out;
   }
 - kvmppc_lazy_ee_enable();
 + kvmppc_fix_ee_before_entry();

local_irq_disable() is called before kvmppc_prepare_to_enter().
Now we put irq_happened and soft_enabled back to their previous state without
checking whether any interrupt happened in between. If an interrupt did happen
in between, won't it be lost?

-Bharat

 
   kvm_guest_enter();
 
 @@ -1154,7 +1154,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
 kvm_vcpu *vcpu,
   local_irq_enable();
   r = (s  2) | RESUME_HOST | (r  RESUME_FLAG_NV);
   } else {
 - kvmppc_lazy_ee_enable();
 + kvmppc_fix_ee_before_entry();
   }
   }
 
 diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
 index 6316ee3..4e05f8c 100644
 --- a/arch/powerpc/kvm/powerpc.c
 +++ b/arch/powerpc/kvm/powerpc.c
 @@ -117,8 +117,6 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
   kvm_guest_exit();
   continue;
   }
 -
 - trace_hardirqs_on();
  #endif
 
   kvm_guest_enter();
 --
 1.7.10.4
 
 




RE: [PATCH v2 4/4] kvm/ppc: IRQ disabling cleanup

2013-05-09 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Scott Wood
 Sent: Friday, May 10, 2013 8:40 AM
 To: Alexander Graf; Benjamin Herrenschmidt
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; 
 linuxppc-...@lists.ozlabs.org;
 Wood Scott-B07421
 Subject: [PATCH v2 4/4] kvm/ppc: IRQ disabling cleanup
 
 Simplify the handling of lazy EE by going directly from fully-enabled
 to hard-disabled.  This replaces the lazy_irq_pending() check
 (including its misplaced kvm_guest_exit() call).
 
 As suggested by Tiejun Chen, move the interrupt disabling into
 kvmppc_prepare_to_enter() rather than have each caller do it.  Also
 move the IRQ enabling on heavyweight exit into
 kvmppc_prepare_to_enter().
 
 Don't move kvmppc_fix_ee_before_entry() into kvmppc_prepare_to_enter(),
 so that the caller can avoid marking interrupts enabled earlier than
 necessary (e.g. book3s_pr waits until after FP save/restore is done).
 
 Signed-off-by: Scott Wood scottw...@freescale.com
 ---
  arch/powerpc/include/asm/kvm_ppc.h |6 ++
  arch/powerpc/kvm/book3s_pr.c   |   12 +++-
  arch/powerpc/kvm/booke.c   |9 ++---
  arch/powerpc/kvm/powerpc.c |   21 -
  4 files changed, 19 insertions(+), 29 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_ppc.h
 b/arch/powerpc/include/asm/kvm_ppc.h
 index 6885846..e4474f8 100644
 --- a/arch/powerpc/include/asm/kvm_ppc.h
 +++ b/arch/powerpc/include/asm/kvm_ppc.h
 @@ -404,6 +404,12 @@ static inline void kvmppc_fix_ee_before_entry(void)
   trace_hardirqs_on();
 
  #ifdef CONFIG_PPC64
 + /*
 +  * To avoid races, the caller must have gone directly from having
 +  * interrupts fully-enabled to hard-disabled.
 +  */
 + WARN_ON(local_paca-irq_happened != PACA_IRQ_HARD_DIS);
 +
   /* Only need to enable IRQs by hard enabling them after this */
   local_paca-irq_happened = 0;
   local_paca-soft_enabled = 1;
 diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
 index 0b97ce4..e61e39e 100644
 --- a/arch/powerpc/kvm/book3s_pr.c
 +++ b/arch/powerpc/kvm/book3s_pr.c
 @@ -884,14 +884,11 @@ program_interrupt:
* and if we really did time things so badly, then we just exit
* again due to a host external interrupt.
*/
 - local_irq_disable();
   s = kvmppc_prepare_to_enter(vcpu);
 - if (s = 0) {
 - local_irq_enable();
 + if (s = 0)
   r = s;
 - } else {
 + else
   kvmppc_fix_ee_before_entry();
 - }
   }
 
   trace_kvm_book3s_reenter(r, vcpu);
 @@ -1121,12 +1118,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct
 kvm_vcpu *vcpu)
* really did time things so badly, then we just exit again due to
* a host external interrupt.
*/
 - local_irq_disable();
   ret = kvmppc_prepare_to_enter(vcpu);
 - if (ret = 0) {
 - local_irq_enable();
 + if (ret = 0)
   goto out;
 - }
 
   /* Save FPU state in stack */
   if (current-thread.regs-msr  MSR_FP)
 diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
 index eb89b83..f7c0111 100644
 --- a/arch/powerpc/kvm/booke.c
 +++ b/arch/powerpc/kvm/booke.c
 @@ -666,10 +666,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct
 kvm_vcpu *vcpu)
   return -EINVAL;
   }
 
 - local_irq_disable();
   s = kvmppc_prepare_to_enter(vcpu);
   if (s = 0) {
 - local_irq_enable();
   ret = s;
   goto out;
   }
 @@ -1148,14 +1146,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
 kvm_vcpu *vcpu,
* aren't already exiting to userspace for some other reason.
*/
   if (!(r  RESUME_HOST)) {
 - local_irq_disable();

Ok, now we do not soft-disable before kvmppc_prepare_to_enter().

   s = kvmppc_prepare_to_enter(vcpu);
 - if (s = 0) {
 - local_irq_enable();
 + if (s = 0)
   r = (s  2) | RESUME_HOST | (r  RESUME_FLAG_NV);
 - } else {
 + else
   kvmppc_fix_ee_before_entry();
 - }
   }
 
   return r;
 diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
 index 4e05f8c..f8659aa 100644
 --- a/arch/powerpc/kvm/powerpc.c
 +++ b/arch/powerpc/kvm/powerpc.c
 @@ -64,12 +64,14 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
  {
   int r = 1;
 
 - WARN_ON_ONCE(!irqs_disabled());
 + WARN_ON(irqs_disabled());
 + hard_irq_disable();

Here we hard-disable in kvmppc_prepare_to_enter(), so my comment on the other
patch about interrupt loss is no longer valid.

So at this point:
  MSR.EE = 0
  local_paca->soft_enabled = 0
  local_paca->irq_happened |= PACA_IRQ_HARD_DIS
(see the sketch below)

 +
   while (true) 
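
To keep the resulting interrupt state in one place, a simplified sketch of the
entry path implied by these two patches; the wrapper function and its name are
assumptions, only the two helpers and the paca state notes come from the
patches themselves:

/* Sketch: interrupt state around guest entry after these patches. */
static int kvmppc_enter_guest_sketch(struct kvm_run *kvm_run,
				     struct kvm_vcpu *vcpu)
{
	int s;

	/*
	 * Interrupts are fully enabled here. kvmppc_prepare_to_enter() now
	 * does hard_irq_disable() itself: MSR.EE = 0, soft_enabled = 0,
	 * irq_happened |= PACA_IRQ_HARD_DIS.
	 */
	s = kvmppc_prepare_to_enter(vcpu);
	if (s <= 0)
		return s;	/* heavyweight exit path */

	/*
	 * Put the lazy-EE bookkeeping back to "enabled" without setting
	 * MSR.EE: warn if we are not exactly hard-disabled, then
	 * irq_happened = 0, soft_enabled = 1, trace_hardirqs_on().
	 */
	kvmppc_fix_ee_before_entry();

	/* The guest entry (rfi) is what finally sets MSR.EE again. */
	return __kvmppc_vcpu_run(kvm_run, vcpu);
}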

RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

2013-05-03 Thread Bhushan Bharat-R65777
  +static void kvmppc_booke_vcpu_load_debug_regs(struct kvm_vcpu
  +*vcpu) {
  + if (!vcpu-arch.debug_active)
  + return;
  +
   + /* Disable all debug events and clear pending debug events */
  + mtspr(SPRN_DBCR0, 0x0);
  + kvmppc_clear_dbsr();
  +
  + /*
  +  * Check whether guest still need debug resource, if not then there
  +  * is no need to restore guest context.
  +  */
  + if (!vcpu-arch.shadow_dbg_reg.dbcr0)
  + return;
  +
  + /* Load Guest Context */
  + mtspr(SPRN_DBCR1, vcpu-arch.shadow_dbg_reg.dbcr1);
  + mtspr(SPRN_DBCR2, vcpu-arch.shadow_dbg_reg.dbcr2); #ifdef
  +CONFIG_KVM_E500MC
  + mtspr(SPRN_DBCR4, vcpu-arch.shadow_dbg_reg.dbcr4);
 
  You need to make sure DBCR4 is 0 when you leave things back to normal
  user space. Otherwise guest debug can interfere with host debug.
 
 
  ok
 
 
  +#endif
  + mtspr(SPRN_IAC1, vcpu-arch.shadow_dbg_reg.iac[0]);
  + mtspr(SPRN_IAC2, vcpu-arch.shadow_dbg_reg.iac[1]);
  +#if CONFIG_PPC_ADV_DEBUG_IACS  2
  + mtspr(SPRN_IAC3, vcpu-arch.shadow_dbg_reg.iac[2]);
  + mtspr(SPRN_IAC4, vcpu-arch.shadow_dbg_reg.iac[3]);
  +#endif
  + mtspr(SPRN_DAC1, vcpu-arch.shadow_dbg_reg.dac[0]);
  + mtspr(SPRN_DAC2, vcpu-arch.shadow_dbg_reg.dac[1]);
  +
  + /* Enable debug events after other debug registers restored */
  + mtspr(SPRN_DBCR0, vcpu-arch.shadow_dbg_reg.dbcr0); }
 
  All of the code above looks suspiciously similar to
  prime_debug_regs();. Can't we somehow reuse that?
 
  I think we can if
  - Save thread-debug_regs in local data structure
 
 Yes, it can even be on the stack.
 
  - Load vcpu-arch-debug_regs in thread-debug_regs
  - Call prime_debug_regs();
  - Restore thread-debug_regs from local save values in first step
 
 On heavyweight exit, based on the values on stack, yes.

This is how I think we can save/restore the debug context (a rough code sketch
follows after the steps). Please correct me if I am missing something.

1) When QEMU is running

- thread->debug_reg == QEMU debug register context.
- Kernel will handle switching the debug register on context switch.
- no vcpu_load() called

2) QEMU makes ioctls (except RUN)
 - This will call vcpu_load()
 - should not change context.
 - Some ioctls can change vcpu debug registers; that context is saved in
vcpu->debug_regs

3) QEMU makes the RUN ioctl
 - Save thread->debug_reg on STACK
 - Store thread->debug_reg == vcpu->debug_reg
 - load thread->debug_reg
 - RUN VCPU ( so thread points to the vcpu context )

4) Context switch happens while the VCPU is running
 - vcpu_load() should not load any context
 - the kernel loads the vcpu context, as thread->debug_regs points to the vcpu context.

5) On heavyweight_exit
 - Load the context saved on the stack back into thread->debug_reg

Thanks
-Bharat
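
A minimal sketch of steps 3) and 5) above. kvmppc_load_debug() is a
hypothetical stand-in for whatever helper ends up calling
prime_debug_regs()/switch_booke_debug_regs(), and the thread/vcpu field names
follow this discussion rather than any final code:

/* Sketch: swap the debug register context around the RUN ioctl.
 * kvmppc_load_debug() is hypothetical; it stands for the helper that
 * ends up calling prime_debug_regs() on current->thread. */
static int kvmppc_vcpu_run_debug_sketch(struct kvm_run *run,
					struct kvm_vcpu *vcpu)
{
	struct kvmppc_booke_debug_reg host_dbg;
	int ret;

	/* 3) Save the QEMU/host debug context on the stack ... */
	host_dbg = current->thread.debug_reg;
	/*    ... make the thread point at the guest context and load it. */
	current->thread.debug_reg = vcpu->arch.shadow_dbg_reg;
	kvmppc_load_debug(&current->thread);

	/* 4) While the vcpu runs, a context switch re-primes the debug
	 *    registers from current->thread, i.e. from the vcpu context. */
	ret = __kvmppc_vcpu_run(run, vcpu);

	/* 5) Heavyweight exit: restore the host context saved on the stack. */
	current->thread.debug_reg = host_dbg;
	kvmppc_load_debug(&current->thread);

	return ret;
}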





RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

2013-05-03 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, May 03, 2013 6:00 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
 
 
 On 03.05.2013, at 13:08, Alexander Graf wrote:
 
 
 
  Am 03.05.2013 um 12:48 schrieb Bhushan Bharat-R65777 r65...@freescale.com:
 
  +static void kvmppc_booke_vcpu_load_debug_regs(struct kvm_vcpu
  +*vcpu) {
  +if (!vcpu-arch.debug_active)
  +return;
  +
   +/* Disable all debug events and clear pending debug events */
  +mtspr(SPRN_DBCR0, 0x0);
  +kvmppc_clear_dbsr();
  +
  +/*
  + * Check whether guest still need debug resource, if not then 
  there
  + * is no need to restore guest context.
  + */
  +if (!vcpu-arch.shadow_dbg_reg.dbcr0)
  +return;
  +
  +/* Load Guest Context */
  +mtspr(SPRN_DBCR1, vcpu-arch.shadow_dbg_reg.dbcr1);
  +mtspr(SPRN_DBCR2, vcpu-arch.shadow_dbg_reg.dbcr2); #ifdef
  +CONFIG_KVM_E500MC
  +mtspr(SPRN_DBCR4, vcpu-arch.shadow_dbg_reg.dbcr4);
 
  You need to make sure DBCR4 is 0 when you leave things back to
  normal user space. Otherwise guest debug can interfere with host debug.
 
 
  ok
 
 
  +#endif
  +mtspr(SPRN_IAC1, vcpu-arch.shadow_dbg_reg.iac[0]);
  +mtspr(SPRN_IAC2, vcpu-arch.shadow_dbg_reg.iac[1]);
  +#if CONFIG_PPC_ADV_DEBUG_IACS  2
  +mtspr(SPRN_IAC3, vcpu-arch.shadow_dbg_reg.iac[2]);
  +mtspr(SPRN_IAC4, vcpu-arch.shadow_dbg_reg.iac[3]);
  +#endif
  +mtspr(SPRN_DAC1, vcpu-arch.shadow_dbg_reg.dac[0]);
  +mtspr(SPRN_DAC2, vcpu-arch.shadow_dbg_reg.dac[1]);
  +
  +/* Enable debug events after other debug registers restored */
  +mtspr(SPRN_DBCR0, vcpu-arch.shadow_dbg_reg.dbcr0); }
 
  All of the code above looks suspiciously similar to
  prime_debug_regs();. Can't we somehow reuse that?
 
  I think we can if
  - Save thread-debug_regs in local data structure
 
  Yes, it can even be on the stack.
 
  - Load vcpu-arch-debug_regs in thread-debug_regs
  - Call prime_debug_regs();
  - Restore thread-debug_regs from local save values in first step
 
  On heavyweight exit, based on the values on stack, yes.
 
  This is how I think we can save/restore debug context. Please correct if I 
  am
 missing something.
 
  Sounds about right :)
 
 Actually, what happens if a guest breakpoint is set to a kernel address that
 happens to be within the scope of kvm code?

Do you mean the address of KVM code in the guest or in the host?

If the host, we already mentioned that we do not support that, right?

-Bharat

 We do accept debug events between
 vcpu_run and the assembly code, right?
 
 
 Alex
 




RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

2013-05-02 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, April 26, 2013 4:46 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
 
 
 On 08.04.2013, at 12:32, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Debug registers are saved/restored on vcpu_put()/vcpu_get().
  Also the debug registers are saved restored only if guest
  is using debug resources.
 
  Currently we do not support debug resource emulation to guest,
  so always exit to user space irrespective of user space is expecting
  the debug exception or not. This is unexpected event and let us
  leave the action on user space. This is similar to what it was before,
  only thing is that now we have proper exit state available to user space.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |8 +
  arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
  arch/powerpc/kvm/booke.c|  242 
  ---
  arch/powerpc/kvm/booke.h|5 +
  4 files changed, 255 insertions(+), 22 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
 b/arch/powerpc/include/asm/kvm_host.h
  index e34f8fe..b9ad20f 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -505,7 +505,15 @@ struct kvm_vcpu_arch {
  u32 mmucfg;
  u32 epr;
  u32 crit_save;
  +
  +   /* Flag indicating that debug registers are used by guest */
  +   bool debug_active;
  +   /* for save/restore thread-dbcr0 on vcpu run/heavyweight_exit */
  +   u32 saved_dbcr0;
  +   /* guest debug registers*/
  struct kvmppc_booke_debug_reg dbg_reg;
  +   /* shadow debug registers */
  +   struct kvmppc_booke_debug_reg shadow_dbg_reg;
  #endif
  gpa_t paddr_accessed;
  gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
 b/arch/powerpc/include/uapi/asm/kvm.h
  index c0c38ed..d7ce449 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
  /* Select powerpc specific features in linux/kvm.h */
  #define __KVM_HAVE_SPAPR_TCE
  #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
  __u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
  __u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both) and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
  +#define KVMPPC_DEBUG_NONE  0x0
  +#define KVMPPC_DEBUG_BREAKPOINT(1UL  1)
  +#define KVMPPC_DEBUG_WATCH_WRITE   (1UL  2)
  +#define KVMPPC_DEBUG_WATCH_READ(1UL  3)
  struct kvm_debug_exit_arch {
  +   __u64 address;
  +   /*
  +* exiting to userspace because of h/w breakpoint, watchpoint
  +* (read, write or both) and software breakpoint.
  +*/
  +   __u32 status;
  +   __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
   * Type denotes h/w breakpoint, read watchpoint, write
   * watchpoint or watchpoint (both read and write).
   */
  -#define KVMPPC_DEBUG_NONE  0x0
  -#define KVMPPC_DEBUG_BREAKPOINT(1UL  1)
  -#define KVMPPC_DEBUG_WATCH_WRITE   (1UL  2)
  -#define KVMPPC_DEBUG_WATCH_READ(1UL  3)
  __u32 type;
  __u32 reserved;
  } bp[16];
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index 97ae158..0e93416 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
  #endif
  }
 
  +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
  +{
  +   /* Synchronize guest's desire to get debug interrupts into shadow MSR */
  +#ifndef CONFIG_KVM_BOOKE_HV
  +   vcpu-arch.shadow_msr = ~MSR_DE;
  +   vcpu-arch.shadow_msr |= vcpu-arch.shared-msr  MSR_DE;
  +#endif
  +
  +   /* Force enable debug interrupts when user space wants to debug */
  +   if (vcpu-guest_debug) {
  +#ifdef CONFIG_KVM_BOOKE_HV
  +   /*
  +* Since there is no shadow MSR, sync MSR_DE into the guest
  +* visible MSR.
  +*/
  +   vcpu-arch.shared-msr |= MSR_DE;
  +#else
  +   vcpu-arch.shadow_msr |= MSR_DE;
  +   vcpu-arch.shared-msr = ~MSR_DE;
  +#endif
  +   }
  +}
  +
  /*
   * Helper function for full MSR writes.  No need to call this if only
   * EE/CE/ME/DE/RI are changing.
  @@ -150,6 +173,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr

RE: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

2013-05-02 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, May 02, 2013 4:35 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support
 
 
 On 02.05.2013, at 11:46, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, April 26, 2013 4:46 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 7/7 v3] KVM: PPC: Add userspace debug stub
  support
 
 
  On 08.04.2013, at 12:32, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Debug registers are saved/restored on vcpu_put()/vcpu_get().
  Also the debug registers are saved restored only if guest is using
  debug resources.
 
  Currently we do not support debug resource emulation to guest, so
  always exit to user space irrespective of user space is expecting
  the debug exception or not. This is unexpected event and let us
  leave the action on user space. This is similar to what it was
  before, only thing is that now we have proper exit state available to user
 space.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |8 +
  arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
  arch/powerpc/kvm/booke.c|  242 
  -
 --
  arch/powerpc/kvm/booke.h|5 +
  4 files changed, 255 insertions(+), 22 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index e34f8fe..b9ad20f 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -505,7 +505,15 @@ struct kvm_vcpu_arch {
u32 mmucfg;
u32 epr;
u32 crit_save;
  +
  + /* Flag indicating that debug registers are used by guest */
  + bool debug_active;
  + /* for save/restore thread-dbcr0 on vcpu run/heavyweight_exit */
  + u32 saved_dbcr0;
  + /* guest debug registers*/
struct kvmppc_booke_debug_reg dbg_reg;
  + /* shadow debug registers */
  + struct kvmppc_booke_debug_reg shadow_dbg_reg;
  #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index c0c38ed..d7ce449 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
  /* Select powerpc specific features in linux/kvm.h */ #define
  __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
__u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
__u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both) and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
  +#define KVMPPC_DEBUG_NONE0x0
  +#define KVMPPC_DEBUG_BREAKPOINT  (1UL  1)
  +#define KVMPPC_DEBUG_WATCH_WRITE (1UL  2)
  +#define KVMPPC_DEBUG_WATCH_READ  (1UL  3)
  struct kvm_debug_exit_arch {
  + __u64 address;
  + /*
  +  * exiting to userspace because of h/w breakpoint, watchpoint
  +  * (read, write or both) and software breakpoint.
  +  */
  + __u32 status;
  + __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 * Type denotes h/w breakpoint, read watchpoint, write
 * watchpoint or watchpoint (both read and write).
 */
  -#define KVMPPC_DEBUG_NONE0x0
  -#define KVMPPC_DEBUG_BREAKPOINT  (1UL  1)
  -#define KVMPPC_DEBUG_WATCH_WRITE (1UL  2)
  -#define KVMPPC_DEBUG_WATCH_READ  (1UL  3)
__u32 type;
__u32 reserved;
} bp[16];
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index 97ae158..0e93416 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct
  kvm_vcpu *vcpu) #endif }
 
  +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu) {
  + /* Synchronize guest's desire to get debug interrupts into shadow
  +MSR */ #ifndef CONFIG_KVM_BOOKE_HV
  + vcpu-arch.shadow_msr = ~MSR_DE;
  + vcpu-arch.shadow_msr |= vcpu-arch.shared-msr  MSR_DE; #endif
  +
  + /* Force enable debug interrupts when user space wants to debug */
  + if (vcpu-guest_debug) {
  +#ifdef CONFIG_KVM_BOOKE_HV
  + /*
  +  * Since there is no shadow MSR, sync MSR_DE into the guest
  +  * visible MSR.
  +  */
  + vcpu-arch.shared-msr |= MSR_DE; #else
  + vcpu-arch.shadow_msr |= MSR_DE

RE: [PATCH] ppc: initialize GPRs as per epapr

2013-04-26 Thread Bhushan Bharat-R65777
This was supposed to go to qemu-devel.

Please ignore this patch here.

Thanks
-Bharat

 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Friday, April 26, 2013 11:44 AM
 To: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott-
 B07421
 Cc: Bhushan Bharat-R65777; Bhushan Bharat-R65777; Yoder Stuart-B08248
 Subject: [PATCH] ppc: initialize GPRs as per epapr
 
 ePAPR defines the initial values of cpu registers. This patch initialize the
 GPRs as per ePAPR specification.
 
 This resolves the issue of guest reboot/reset (guest hang on reboot).
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 Signed-off-by: Stuart Yoder stuart.yo...@freescale.com
 ---
  hw/ppc/e500.c |7 +++
  1 files changed, 7 insertions(+), 0 deletions(-)
 
 diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c index c1bdb6b..a47f976 100644
 --- a/hw/ppc/e500.c
 +++ b/hw/ppc/e500.c
 @@ -37,6 +37,7 @@
  #include qemu/host-utils.h
  #include hw/pci-host/ppce500.h
 
 +#define EPAPR_MAGIC (0x45504150)
  #define BINARY_DEVICE_TREE_FILE "mpc8544ds.dtb"
  #define UIMAGE_LOAD_BASE   0
  #define DTC_LOAD_PAD   0x180
 @@ -444,6 +445,12 @@ static void ppce500_cpu_reset(void *opaque)
      cs->halted = 0;
      env->gpr[1] = (16 << 20) - 8;
      env->gpr[3] = bi->dt_base;
 +    env->gpr[4] = 0;
 +    env->gpr[5] = 0;
 +    env->gpr[6] = EPAPR_MAGIC;
 +    env->gpr[7] = (64 * 1024 * 1024);
 +    env->gpr[8] = 0;
 +    env->gpr[9] = 0;
      env->nip = bi->entry;
  mmubooke_create_initial_mapping(env);
  }
 --
 1.7.0.4




RE: [PATCH] KVM : PPC : cache flush for kernel managed pages

2013-04-25 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, April 25, 2013 8:36 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH] KVM : PPC : cache flush for kernel managed pages
 
 
 On 23.04.2013, at 08:39, Bharat Bhushan wrote:
 
  Kernel should only try flushing pages which are managed by kernel.
  pfn_to_page will returns junk struct page for pages not managed by
  kernel, so if kernel will try to flush direct mapped memory or direct
  assigned device mapping then it will work on junk struct page.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/e500_mmu_host.c |3 ++-
  1 files changed, 2 insertions(+), 1 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..e07da21 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -455,7 +455,8 @@ static inline int kvmppc_e500_shadow_map(struct
 kvmppc_vcpu_e500 *vcpu_e500,
  ref, gvaddr, stlbe);
 
  /* Clear i-cache for new pages */
  -   kvmppc_mmu_flush_icache(pfn);
  +   if (pfn_valid(pfn))
  +   kvmppc_mmu_flush_icache(pfn);
 
 Could you please move the check into kvmppc_mmu_flush_icache()? That way we're
 guaranteed we can't screw up cache flushes ever :).
 
 Also, please add a comment saying why we need this.

Ok

-Bharat
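
For reference, a sketch of what moving the check into
kvmppc_mmu_flush_icache() could look like; the body of the existing helper is
paraphrased here and may not match the tree exactly:

/* Sketch: do the pfn_valid() check inside the flush helper itself. */
static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
{
	struct page *page;

	/*
	 * pfn_to_page() only returns a usable struct page for memory the
	 * kernel manages; for direct-mapped memory or directly assigned
	 * device ranges it would hand back junk, so skip the flush there.
	 */
	if (!pfn_valid(pfn))
		return;

	/* Clear i-cache for new pages */
	page = pfn_to_page(pfn);
	if (!test_bit(PG_arch_1, &page->flags)) {
		flush_dcache_icache_page(page);
		set_bit(PG_arch_1, &page->flags);
	}
}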

 
 
 Alex
 
 
  /* Drop refcount on page, so that mmu notifiers can clear it */
  kvm_release_pfn_clean(pfn);
  --
  1.7.0.4
 
 
 




RE: [PATCH] KVM/PPC: emulate ehpriv

2013-04-19 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, April 19, 2013 5:44 PM
 To: Tiejun Chen
 Cc: k...@vger.kernel.org mailing list; kvm-ppc@vger.kernel.org; Bhushan 
 Bharat-
 R65777
 Subject: Re: [PATCH] KVM/PPC: emulate ehpriv
 
 
 On 19.04.2013, at 04:44, Tiejun Chen wrote:
 
  We can provide this emulation to simplify more extension later.
 
 Works for me, but this should really be part of a series that makes use of
 ehpriv.

Alex, this is already planned to be part of my debug patches. I know you are
busy, and I am just waiting for the other patches to be reviewed :)

-Bharat
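
To show the kind of extension meant here, a hedged sketch of how the ehpriv
emulation might be used by the debug series, treating one opcode value as a
software breakpoint that exits to user space. EHPRIV_OC_DEBUG, the extra
*advance parameter and the debug exit fields are assumptions tied to the debug
patches under review, not part of the posted patch:

/* Sketch: use one (assumed) ehpriv opcode as a software breakpoint. */
#define EHPRIV_OC_DEBUG		1	/* assumed opcode value */

static int kvmppc_e500_emul_ehpriv(struct kvm_run *run, struct kvm_vcpu *vcpu,
				   unsigned int inst, int *advance)
{
	int emulated = EMULATE_DONE;

	switch (get_oc(inst)) {
	case EHPRIV_OC_DEBUG:
		/* Report the trap to user space as a debug exit and do not
		 * advance the PC, so user space sees the breakpoint address. */
		run->exit_reason = KVM_EXIT_DEBUG;
		run->debug.arch.address = vcpu->arch.pc;
		run->debug.arch.status = 0;
		emulated = EMULATE_EXIT_USER;
		*advance = 0;
		break;
	default:
		emulated = EMULATE_FAIL;
	}

	return emulated;
}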

 
 
 Alex
 
 
  Signed-off-by: Tiejun Chen tiejun.c...@windriver.com
  ---
  arch/powerpc/include/asm/disassemble.h |4 
  arch/powerpc/kvm/e500_emulate.c|   17 +
  2 files changed, 21 insertions(+)
 
  diff --git a/arch/powerpc/include/asm/disassemble.h
  b/arch/powerpc/include/asm/disassemble.h
  index 9b198d1..856f8de 100644
  --- a/arch/powerpc/include/asm/disassemble.h
  +++ b/arch/powerpc/include/asm/disassemble.h
  @@ -77,4 +77,8 @@ static inline unsigned int get_d(u32 inst)
   	return inst & 0xffff;
   }
  
   +static inline unsigned int get_oc(u32 inst)
   +{
   +	return (inst >> 11) & 0x7fff;
  +}
  #endif /* __ASM_PPC_DISASSEMBLE_H__ */ diff --git
  a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
  index e78f353..36492cf 100644
  --- a/arch/powerpc/kvm/e500_emulate.c
  +++ b/arch/powerpc/kvm/e500_emulate.c
  @@ -26,6 +26,7 @@
  #define XOP_TLBRE   946
  #define XOP_TLBWE   978
  #define XOP_TLBILX  18
  +#define XOP_EHPRIV  270
 
  #ifdef CONFIG_KVM_E500MC
  static int dbell2prio(ulong param)
  @@ -80,6 +81,18 @@ static int kvmppc_e500_emul_msgsnd(struct kvm_vcpu
  *vcpu, int rb)
 
  return EMULATE_DONE;
  }
  +
  +static int kvmppc_e500_emul_ehpriv(struct kvm_run *run, struct kvm_vcpu
 *vcpu,
  +   unsigned int inst)
  +{
  +   int emulated = EMULATE_DONE;
  +
  +   switch (get_oc(inst)) {
  +   default:
  +   emulated = EMULATE_FAIL;
  +   }
  +   return emulated;
  +}
  #endif
 
  int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
  @@ -130,6 +143,10 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct
 kvm_vcpu *vcpu,
  emulated = kvmppc_e500_emul_tlbivax(vcpu, ea);
  break;
 
  +   case XOP_EHPRIV:
  +   emulated = kvmppc_e500_emul_ehpriv(run, vcpu, inst);
  +   break;
  +
  default:
  emulated = EMULATE_FAIL;
  }
  --
  1.7.9.5
 
 




RE: [PATCH] bookehv: Handle debug exception on guest exit

2013-04-05 Thread Bhushan Bharat-R65777
Hi Kumar/Benh,

After looking further into the code, I think that if we correct the vector range
check below in the DebugDebug handler then we do not need the change I provided
in this patch.

Here is the snapshot for 32-bit (head_booke.h; the same will be true for 64-bit):

#define DEBUG_DEBUG_EXCEPTION \
START_EXCEPTION(DebugDebug);  \
DEBUG_EXCEPTION_PROLOG;   \
  \
/*\
 * If there is a single step or branch-taken exception in an  \
 * exception entry sequence, it was probably meant to apply to\
 * the code where the exception occurred (since exception entry   \
 * doesn't turn off DE automatically).  We simulate the effect\
 * of turning off DE on entry to an exception handler by turning  \
 * off DE in the DSRR1 value and clearing the debug status.   \
 */   \
mfspr   r10,SPRN_DBSR;  /* check single-step/branch taken */  \
andis.  r10,r10,(DBSR_IC|DBSR_BT)@h;  \
beq+2f;   \
  \
lis r10,KERNELBASE@h;   /* check if exception in vectors */   \
ori r10,r10,KERNELBASE@l; \
cmplw   r12,r10;  \
blt+2f; /* addr below exception vectors */\
  \
lis r10,DebugDebug@h;\
ori r10,r10,DebugDebug@l;   
 \


Here we assume that all exception vectors end at "DebugDebug", which is not
correct. We should probably derive the proper end by using some start_vector
and end_vector labels, or at least use "Ehvpriv" (which is the last vector
defined in head_fsl_booke.S for PowerPC) as the end. Is that correct?


cmplw   r12,r10;  \
bgt+2f; /* addr above exception vectors */\

Thanks
-Bharat


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Bhushan Bharat-R65777
 Sent: Thursday, April 04, 2013 8:29 PM
 To: Alexander Graf
 Cc: linuxppc-...@lists.ozlabs.org; k...@vger.kernel.org; 
 kvm-ppc@vger.kernel.org;
 Wood Scott-B07421
 Subject: RE: [PATCH] bookehv: Handle debug exception on guest exit
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, April 04, 2013 6:55 PM
  To: Bhushan Bharat-R65777
  Cc: linuxppc-...@lists.ozlabs.org; k...@vger.kernel.org;
  kvm-ppc@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH] bookehv: Handle debug exception on guest exit
 
 
  On 20.03.2013, at 18:45, Bharat Bhushan wrote:
 
   EPCR.DUVD controls whether the debug events can come in hypervisor
   mode or not. When KVM guest is using the debug resource then we do
   not want debug events to be captured in guest entry/exit path. So we
   set EPCR.DUVD when entering and clears EPCR.DUVD when exiting from guest.
  
   Debug instruction complete is a post-completion debug exception but
   debug event gets posted on the basis of MSR before the instruction
   is executed. Now if the instruction switches the context from guest
   mode (MSR.GS = 1) to hypervisor mode (MSR.GS = 0) then the xSRR0
   points to first instruction of KVM handler and xSRR1 points that
   MSR.GS is clear (hypervisor context). Now as xSRR1.GS is used to
   decide whether KVM handler will be invoked to handle the exception
    or the host kernel debug handler will be invoked to handle the exception.
    This leads to the host kernel debug handler handling an exception which
    should instead be handled by KVM.
  
   This is tested on e500mc in 32 bit mode
  
   Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
   ---
   v0:
   - Do not apply this change for debug_crit as we do not know those
   chips have
  issue or not.
   - corrected 64bit case branching
  
   arch/powerpc/kernel/exceptions-64e.S |   29 -
   arch/powerpc/kernel/head_booke.h |   26 ++
   2 files changed, 54 insertions(+), 1 deletions(-)
  
   diff --git a/arch/powerpc/kernel/exceptions-64e.S
   b/arch/powerpc/kernel/exceptions-64e.S
   index 4684e33..8b26294 100644
   --- a/arch/powerpc/kernel/exceptions-64e.S
   +++ b/arch/powerpc/kernel

RE: [PATCH] bookehv: Handle debug exception on guest exit

2013-04-04 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, April 04, 2013 6:55 PM
 To: Bhushan Bharat-R65777
 Cc: linuxppc-...@lists.ozlabs.org; k...@vger.kernel.org; 
 kvm-ppc@vger.kernel.org;
 Wood Scott-B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH] bookehv: Handle debug exception on guest exit
 
 
 On 20.03.2013, at 18:45, Bharat Bhushan wrote:
 
  EPCR.DUVD controls whether the debug events can come in hypervisor
  mode or not. When KVM guest is using the debug resource then we do not
  want debug events to be captured in guest entry/exit path. So we set
  EPCR.DUVD when entering and clears EPCR.DUVD when exiting from guest.
 
  Debug instruction complete is a post-completion debug exception but
  debug event gets posted on the basis of MSR before the instruction is
  executed. Now if the instruction switches the context from guest mode
  (MSR.GS = 1) to hypervisor mode (MSR.GS = 0) then the xSRR0 points to
  first instruction of KVM handler and xSRR1 points that MSR.GS is clear
  (hypervisor context). Now as xSRR1.GS is used to decide whether KVM
   handler will be invoked to handle the exception or the host kernel
   debug handler will be invoked to handle the exception.
   This leads to the host kernel debug handler handling an exception which
   should instead be handled by KVM.
 
  This is tested on e500mc in 32 bit mode
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v0:
  - Do not apply this change for debug_crit as we do not know those chips have
 issue or not.
  - corrected 64bit case branching
 
  arch/powerpc/kernel/exceptions-64e.S |   29 -
  arch/powerpc/kernel/head_booke.h |   26 ++
  2 files changed, 54 insertions(+), 1 deletions(-)
 
  diff --git a/arch/powerpc/kernel/exceptions-64e.S
  b/arch/powerpc/kernel/exceptions-64e.S
  index 4684e33..8b26294 100644
  --- a/arch/powerpc/kernel/exceptions-64e.S
  +++ b/arch/powerpc/kernel/exceptions-64e.S
  @@ -516,6 +516,33 @@ kernel_dbg_exc:
  andis.  r15,r14,DBSR_IC@h
  beq+1f
 
  +#ifdef CONFIG_KVM_BOOKE_HV
  +   /*
  +* EPCR.DUVD controls whether the debug events can come in
  +* hypervisor mode or not. When KVM guest is using the debug
  +* resource then we do not want debug events to be captured
  +* in guest entry/exit path. So we set EPCR.DUVD when entering
  +* and clears EPCR.DUVD when exiting from guest.
  +* Debug instruction complete is a post-completion debug
  +* exception but debug event gets posted on the basis of MSR
  +* before the instruction is executed. Now if the instruction
  +* switches the context from guest mode (MSR.GS = 1) to hypervisor
  +* mode (MSR.GS = 0) then the xSRR0 points to first instruction of
 
 Can't we just execute that code path with MSR.DE=0?

Single stepping uses DBCR0.IC (instruction complete).
Can you describe how MSR.DE = 0 will work?
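
(For illustration only, not the patch itself: single stepping on booke is armed
through DBCR0[IC] together with DBCR0[IDM], using the reg_booke.h bit definitions,
rather than through MSR[DE] alone.)

	/* illustrative sketch: arm instruction-complete (single step) events */
	static void kvmppc_arm_single_step(struct kvmppc_booke_debug_reg *dbg)
	{
		dbg->dbcr0 |= DBCR0_IDM | DBCR0_IC;	/* internal debug mode + instr. complete */
	}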

 
 
 Alex
 
  +* KVM handler and xSRR1 points that MSR.GS is clear
  +* (hypervisor context). Now as xSRR1.GS is used to decide whether
   +* KVM handler will be invoked to handle the exception or the host
   +* kernel debug handler will be invoked to handle the exception.
   +* This leads to the host kernel debug handler handling an exception
   +* which should instead be handled by KVM.
  +*/
  +   mfspr   r10, SPRN_EPCR
  +   andis.  r10,r10,SPRN_EPCR_DUVD@h
  +   beq+2f
  +
  +   andis.  r10,r9,MSR_GS@h
  +   beq+3f
  +2:
  +#endif
  LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
  LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
  cmpld   cr0,r10,r14
  @@ -523,7 +550,7 @@ kernel_dbg_exc:
  blt+cr0,1f
  bge+cr1,1f
 
  -   /* here it looks like we got an inappropriate debug exception. */
  +3: /* here it looks like we got an inappropriate debug exception. */
  lis r14,DBSR_IC@h   /* clear the IC event */
  rlwinm  r11,r11,0,~MSR_DE   /* clear DE in the DSRR1 value */
  mtspr   SPRN_DBSR,r14
  diff --git a/arch/powerpc/kernel/head_booke.h
  b/arch/powerpc/kernel/head_booke.h
  index 5f051ee..edc6a3b 100644
  --- a/arch/powerpc/kernel/head_booke.h
  +++ b/arch/powerpc/kernel/head_booke.h
  @@ -285,7 +285,33 @@ label:
  mfspr   r10,SPRN_DBSR;  /* check single-step/branch taken */  \
  andis.  r10,r10,(DBSR_IC|DBSR_BT)@h;  \
  beq+2f;   \
  +#ifdef CONFIG_KVM_BOOKE_HV   \
  +   /*\
  +* EPCR.DUVD controls whether the debug events can come in\
  +* hypervisor mode or not. When KVM guest is using the debug  \
  +* resource then we do not want debug events to be captured   \
  +* in guest entry/exit path. So we set EPCR.DUVD when entering\
  +* and clears

RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-03 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, April 02, 2013 11:30 PM
 To: Bhushan Bharat-R65777
 Cc: Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-
 B07421
 Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
 
 On 04/02/2013 09:09:34 AM, Bhushan Bharat-R65777 wrote:
 
 
   -Original Message-
   From: Alexander Graf [mailto:ag...@suse.de]
   Sent: Tuesday, April 02, 2013 1:57 PM
   To: Bhushan Bharat-R65777
   Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
   Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
  
  
   On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:
  
   
   
-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 28, 2013 10:06 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood
  Scott-B07421;
Bhushan
Bharat-R65777
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
support
   
   
How does the normal debug register switching code work in Linux?
Can't we just reuse that? Or rely on it to restore working state
  when
another process gets scheduled in?
   
Good point, I can see debug registers loading in function
  __switch_to()-
   switch_booke_debug_regs() in file arch/powerpc/kernel/process.c.
So as long as assume that host will not use debug resources we
  can rely on
   this restore. But I am not sure that this is a fare assumption. As
  Scott earlier
   mentioned someone can use debug resource for kernel debugging also.
  
   Someone in the kernel can also use floating point registers. But
  then it's his
   responsibility to clean up the mess he leaves behind.
 
  I am neither convinced by what you said and nor even have much reason
  to oppose :)
 
  Scott,
  I remember you mentioned that host can use debug resources, you
  comment on this ?
 
 I thought the conclusion we reached was that it was OK as long as KVM waits
 until it actually needs the debug resources to mess with the registers.

Right. Are we also agreeing that KVM will not save/restore the host debug 
context on vcpu_load()/vcpu_put()? KVM will load its context in vcpu_load() if 
needed, and on vcpu_put() it will clear DBCR0 and DBSR.
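
(A minimal sketch of that agreement, for illustration only; the function name is
hypothetical and debug_save_restore is the flag proposed earlier in this series.)

	static void kvmppc_booke_put_debug(struct kvm_vcpu *vcpu)
	{
		if (!vcpu->arch.debug_save_restore)	/* guest never used debug */
			return;

		mtspr(SPRN_DBCR0, 0);			/* disarm all debug events */
		mtspr(SPRN_DBSR, mfspr(SPRN_DBSR));	/* DBSR is write-1-to-clear */
	}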

Thanks
-Bharat






RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-03 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Wednesday, April 03, 2013 7:39 PM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
 
 
 On 03.04.2013, at 15:50, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Wednesday, April 03, 2013 3:58 PM
  To: Bhushan Bharat-R65777
  Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
 
 
  Am 03.04.2013 um 12:03 schrieb Bhushan Bharat-R65777 
  r65...@freescale.com:
 
 
 
  -Original Message-
  From: Wood Scott-B07421
  Sent: Tuesday, April 02, 2013 11:30 PM
  To: Bhushan Bharat-R65777
  Cc: Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  Wood Scott-
  B07421
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
  On 04/02/2013 09:09:34 AM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Tuesday, April 02, 2013 1:57 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood
  Scott-B07421
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
 
  On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, March 28, 2013 10:06 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood
  Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
 
  How does the normal debug register switching code work in Linux?
  Can't we just reuse that? Or rely on it to restore working
  state
  when
  another process gets scheduled in?
 
  Good point, I can see debug registers loading in function
  __switch_to()-
  switch_booke_debug_regs() in file arch/powerpc/kernel/process.c.
  So as long as assume that host will not use debug resources we
  can rely on
  this restore. But I am not sure that this is a fare assumption.
  As
  Scott earlier
  mentioned someone can use debug resource for kernel debugging also.
 
  Someone in the kernel can also use floating point registers. But
  then it's his
  responsibility to clean up the mess he leaves behind.
 
  I am neither convinced by what you said and nor even have much
  reason to oppose :)
 
  Scott,
I remember you mentioned that host can use debug resources, you
  comment on this ?
 
  I thought the conclusion we reached was that it was OK as long as
  KVM waits until it actually needs the debug resources to mess with the
 registers.
 
  Right,  Are we also agreeing on that KVM will not save/restore host
  debug
  context on vcpu_load/vcpu_put()? KVM will load its context in
  vcpu_load() if needed and on vcpu_put() it will clear DBCR0 and DBSR.
 
  That depends on whether the kernel restores the debug registers.
  Please double- check that.
 
  Currently the kernel code restores the debug state of the newly scheduled process
 in context_switch().
 
  switch_booke_debug_regs() is called from __switch_to() and is defined as:
  /*
  * Unless neither the old or new thread are making use of the
  * debug registers, set the debug registers from the values
  * stored in the new thread.
  */
  static void switch_booke_debug_regs(struct thread_struct *new_thread)
  {
 	if ((current->thread.dbcr0 & DBCR0_IDM)
 		|| (new_thread->dbcr0 & DBCR0_IDM))
 		prime_debug_regs(new_thread);
  }
 
  static void prime_debug_regs(struct thread_struct *thread)
  {
 	mtspr(SPRN_IAC1, thread->iac1);
 	mtspr(SPRN_IAC2, thread->iac2);
  #if CONFIG_PPC_ADV_DEBUG_IACS > 2
 	mtspr(SPRN_IAC3, thread->iac3);
 	mtspr(SPRN_IAC4, thread->iac4);
  #endif
 	mtspr(SPRN_DAC1, thread->dac1);
 	mtspr(SPRN_DAC2, thread->dac2);
  #if CONFIG_PPC_ADV_DEBUG_DVCS > 0
 	mtspr(SPRN_DVC1, thread->dvc1);
 	mtspr(SPRN_DVC2, thread->dvc2);
  #endif
 	mtspr(SPRN_DBCR0, thread->dbcr0);
 	mtspr(SPRN_DBCR1, thread->dbcr1);
  #ifdef CONFIG_BOOKE
 	mtspr(SPRN_DBCR2, thread->dbcr2);
  #endif
  }

  This is analogous to moving from guest to/from QEMU. So we can make
 prime_debug_regs() available to kvm code for heavyweight_exit. And vcpu_load()
 will load guest state and save host state (update thread->debug_registers).
 
 I don't think we need to do anything on vcpu_load if we just swap the
 thread->debug_registers. Just make sure to restore them before you return from a
 heavyweight exit.

My understanding is :

1)
When the VCPU is running  - h/w debug registers have vcpu->arch.debug_registers
Goes for heavyweight_exit - h/w debug registers are loaded with 
thread->debug_registers
Return from

RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-03 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Tuesday, April 02, 2013 9:11 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
 
 On 04/02/2013 04:09 PM, Bhushan Bharat-R65777 wrote:
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Tuesday, April 02, 2013 1:57 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
 
  On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, March 28, 2013 10:06 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood
  Scott-B07421; Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
 
  On 21.03.2013, at 07:25, Bharat Bhushan wrote:
 
  From: Bharat Bhushanbharat.bhus...@freescale.com
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Debug registers are saved/restored on vcpu_put()/vcpu_get().
  Also the debug registers are saved restored only if guest is using
  debug resources.
 
  Signed-off-by: Bharat Bhushanbharat.bhus...@freescale.com
  ---
  v2:
  - save/restore in vcpu_get()/vcpu_put()
  - some more minor cleanup based on review comments.
 
  arch/powerpc/include/asm/kvm_host.h |   10 ++
  arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
  arch/powerpc/kvm/booke.c|  252
 -
  --
  arch/powerpc/kvm/e500_emulate.c |   10 ++
  4 files changed, 272 insertions(+), 22 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index f4ba881..8571952 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
  u32 mmucfg;
  u32 epr;
  u32 crit_save;
  +   /* guest debug registers*/
  struct kvmppc_booke_debug_reg dbg_reg;
  +   /* shadow debug registers */
  +   struct kvmppc_booke_debug_reg shadow_dbg_reg;
  +   /* host debug registers*/
  +   struct kvmppc_booke_debug_reg host_dbg_reg;
  +   /*
  +* Flag indicating that debug registers are used by guest
  +* and requires save restore.
  +   */
  +   bool debug_save_restore;
  #endif
  gpa_t paddr_accessed;
  gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index 15f9a00..d7ce449 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
  /* Select powerpc specific features inlinux/kvm.h  */ #define
  __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
  __u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
  __u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both)
  +and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
  +#define KVMPPC_DEBUG_NONE  0x0
   +#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   +#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   +#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  struct kvm_debug_exit_arch {
  +   __u64 address;
  +   /*
  +* exiting to userspace because of h/w breakpoint, watchpoint
  +* (read, write or both) and software breakpoint.
  +*/
  +   __u32 status;
  +   __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
   * Type denotes h/w breakpoint, read watchpoint, write
   * watchpoint or watchpoint (both read and write).
   */
   -#define KVMPPC_DEBUG_NOTYPE	0x0
   -#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   -#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   -#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  __u32 type;
  __u32 reserved;
  } bp[16];
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index
  1de93a8..bf20056 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct
  kvm_vcpu
  *vcpu) #endif }
 
   +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
   +{
   +	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
   +#ifndef CONFIG_KVM_BOOKE_HV
   +	vcpu->arch.shadow_msr &= ~MSR_DE;
   +	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;

RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-03 Thread Bhushan Bharat-R65777
   +	dbg_reg = &(vcpu->arch.shadow_dbg_reg);
   +
   +	/*
   +	 * On BOOKE (e500v2); Set DBCR1 and DBCR2 to allow debug events
   +	 * to occur when MSR.PR is set.
   +	 * On BOOKE-HV (e500mc+); MSR.PR = 0 when guest is running. So we
   +	 * should clear DBCR1 and DBCR2.
   +	 */
   +#ifdef CONFIG_KVM_BOOKE_HV
   +	dbg_reg->dbcr1 = 0;
   +	dbg_reg->dbcr2 = 0;
  Does that mean we can't debug guest user space?
  Yes
  This is wrong.
   Really? So far I was assuming the qemu debug stub is not meant for
   debugging guest applications.
 
  Ok, let me rephrase: This is confusing. You do trap in PR mode on
  e500v2. IIRC
  x86 also traps in kernel and user space. I don't see why e500 hv
  should be different.
 
   I am sorry, I think I did not read the document correctly.
 
  DBCR1 = 0 ; means the 00 IAC1 debug conditions unaffected by 
  MSR[PR],MSR[GS].
 
  Similarly for dbcr2.
 
  So yes the guest user space can be debugged.
 
 So why is this conditional on BOOKE_HV then? Wouldn't it make things easier to
 treat HV and PR identical?
 

On BOOKE-HV we have to keep these 0 so that both the guest and guest applications 
can be debugged. Also, on HV we have EPCR.DUVD to ensure that debug events do not 
come in hypervisor mode (GS = 0).

On BOOKE, the guest and guest applications both run with PR = 1 and the hypervisor 
with PR = 0. So with dbcr1/dbcr2 on booke we keep debug exceptions from coming in 
hypervisor mode while still allowing the guest and its applications to be debugged.
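
(For illustration, a sketch of how the shadow debug state would be filled in under
that scheme; it assumes the DBCR1_IACxUS/DBCR2_DACxUS definitions from reg_booke.h
and is only a sketch of the idea, not the patch itself.)

	struct kvmppc_booke_debug_reg *dbg_reg = &(vcpu->arch.shadow_dbg_reg);

	#ifdef CONFIG_KVM_BOOKE_HV
	/*
	 * BOOKE-HV (e500mc+): the guest runs with GS=1 and EPCR.DUVD keeps
	 * debug events out of hypervisor state, so IAC/DAC events may be
	 * taken regardless of MSR.PR.
	 */
	dbg_reg->dbcr1 = 0;
	dbg_reg->dbcr2 = 0;
	#else
	/*
	 * BOOKE (e500v2): guest kernel and guest user space both run with
	 * MSR.PR=1, so restrict IAC/DAC events to user mode to keep them out
	 * of the hypervisor (MSR.PR=0).
	 */
	dbg_reg->dbcr1 = DBCR1_IAC1US | DBCR1_IAC2US |
			 DBCR1_IAC3US | DBCR1_IAC4US;
	dbg_reg->dbcr2 = DBCR2_DAC1US | DBCR2_DAC2US;
	#endif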

Thanks
-Bharat







RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-02 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Tuesday, April 02, 2013 1:57 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
 
 
 On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, March 28, 2013 10:06 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
 
  On 21.03.2013, at 07:25, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Debug registers are saved/restored on vcpu_put()/vcpu_get().
  Also the debug registers are saved restored only if guest is using
  debug resources.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v2:
  - save/restore in vcpu_get()/vcpu_put()
  - some more minor cleanup based on review comments.
 
  arch/powerpc/include/asm/kvm_host.h |   10 ++
  arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
  arch/powerpc/kvm/booke.c|  252 
  -
 --
  arch/powerpc/kvm/e500_emulate.c |   10 ++
  4 files changed, 272 insertions(+), 22 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index f4ba881..8571952 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
u32 mmucfg;
u32 epr;
u32 crit_save;
  + /* guest debug registers*/
struct kvmppc_booke_debug_reg dbg_reg;
  + /* shadow debug registers */
  + struct kvmppc_booke_debug_reg shadow_dbg_reg;
  + /* host debug registers*/
  + struct kvmppc_booke_debug_reg host_dbg_reg;
  + /*
  +  * Flag indicating that debug registers are used by guest
  +  * and requires save restore.
  + */
  + bool debug_save_restore;
  #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index 15f9a00..d7ce449 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
  /* Select powerpc specific features in linux/kvm.h */ #define
  __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
__u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
__u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both) and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
   +#define KVMPPC_DEBUG_NONE	0x0
   +#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   +#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   +#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  struct kvm_debug_exit_arch {
  + __u64 address;
  + /*
  +  * exiting to userspace because of h/w breakpoint, watchpoint
  +  * (read, write or both) and software breakpoint.
  +  */
  + __u32 status;
  + __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 * Type denotes h/w breakpoint, read watchpoint, write
 * watchpoint or watchpoint (both read and write).
 */
  -#define KVMPPC_DEBUG_NOTYPE  0x0
   -#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   -#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   -#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
__u32 type;
__u32 reserved;
} bp[16];
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index
  1de93a8..bf20056 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
   @@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
    #endif
    }
  
   +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
   +{
   +	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
   +#ifndef CONFIG_KVM_BOOKE_HV
   +	vcpu->arch.shadow_msr &= ~MSR_DE;
   +	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
   +#endif
   +
   +	/* Force enable debug interrupts when user space wants to debug */
   +	if (vcpu->guest_debug) {
   +#ifdef CONFIG_KVM_BOOKE_HV
   +		/*
   +		 * Since there is no shadow MSR, sync MSR_DE into the guest
   +		 * visible MSR. Do not allow guest to change MSR[DE].
   +		 */
   +		vcpu->arch.shared->msr |= MSR_DE;
   +		mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);
 
   This mtspr should really just be a bit OR into shadow_msrp when
   guest_debug gets enabled. It should automatically get synchronized as
   soon as the next

RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-03-29 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, March 28, 2013 10:06 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
 
 
 On 21.03.2013, at 07:25, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Debug registers are saved/restored on vcpu_put()/vcpu_get().
  Also the debug registers are saved restored only if guest is using
  debug resources.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v2:
  - save/restore in vcpu_get()/vcpu_put()
  - some more minor cleanup based on review comments.
 
  arch/powerpc/include/asm/kvm_host.h |   10 ++
  arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
  arch/powerpc/kvm/booke.c|  252 
  ---
  arch/powerpc/kvm/e500_emulate.c |   10 ++
  4 files changed, 272 insertions(+), 22 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index f4ba881..8571952 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
  u32 mmucfg;
  u32 epr;
  u32 crit_save;
  +   /* guest debug registers*/
  struct kvmppc_booke_debug_reg dbg_reg;
  +   /* shadow debug registers */
  +   struct kvmppc_booke_debug_reg shadow_dbg_reg;
  +   /* host debug registers*/
  +   struct kvmppc_booke_debug_reg host_dbg_reg;
  +   /*
  +* Flag indicating that debug registers are used by guest
  +* and requires save restore.
  +   */
  +   bool debug_save_restore;
  #endif
  gpa_t paddr_accessed;
  gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index 15f9a00..d7ce449 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
  /* Select powerpc specific features in linux/kvm.h */ #define
  __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
  __u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
  __u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both) and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
  +#define KVMPPC_DEBUG_NONE  0x0
   +#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   +#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   +#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  struct kvm_debug_exit_arch {
  +   __u64 address;
  +   /*
  +* exiting to userspace because of h/w breakpoint, watchpoint
  +* (read, write or both) and software breakpoint.
  +*/
  +   __u32 status;
  +   __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
   * Type denotes h/w breakpoint, read watchpoint, write
   * watchpoint or watchpoint (both read and write).
   */
  -#define KVMPPC_DEBUG_NOTYPE0x0
   -#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   -#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   -#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  __u32 type;
  __u32 reserved;
  } bp[16];
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  1de93a8..bf20056 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
   @@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
    #endif
    }
  
   +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
   +{
   +	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
   +#ifndef CONFIG_KVM_BOOKE_HV
   +	vcpu->arch.shadow_msr &= ~MSR_DE;
   +	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
   +#endif
   +
   +	/* Force enable debug interrupts when user space wants to debug */
   +	if (vcpu->guest_debug) {
   +#ifdef CONFIG_KVM_BOOKE_HV
   +		/*
   +		 * Since there is no shadow MSR, sync MSR_DE into the guest
   +		 * visible MSR. Do not allow guest to change MSR[DE].
   +		 */
   +		vcpu->arch.shared->msr |= MSR_DE;
   +		mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);
 
 This mtspr should really just be a bit OR into shadow_msrp when guest_debug gets
 enabled. It should automatically get synchronized as soon as the next
 vcpu_load() happens.

I think this is not required here as shadow_msrp already has MSRP_DEP set.

I will set up shadow_msrp when guest_debug is set and clear shadow_msrp when 
guest_debug is cleared.
But that will also not be sufficient, as it is not sure when vcpu_load

RE: [PATCH 2/4 v2] KVM: PPC: debug stub interface parameter defined

2013-03-28 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, March 29, 2013 7:26 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 2/4 v2] KVM: PPC: debug stub interface parameter defined
 
 
 On 21.03.2013, at 07:24, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
  ioctl support. Follow up patches will use this for setting up hardware
  breakpoints, watchpoints and software breakpoints.
 
  Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
  This is because I am not sure what is required for book3s. So this
  ioctl behaviour will not change for book3s.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v2:
  - No Change
 
  arch/powerpc/include/uapi/asm/kvm.h |   23 +++
  arch/powerpc/kvm/book3s.c   |6 ++
  arch/powerpc/kvm/booke.c|6 ++
  arch/powerpc/kvm/powerpc.c  |6 --
  4 files changed, 35 insertions(+), 6 deletions(-)
 
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index c2ff99c..15f9a00 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -272,8 +272,31 @@ struct kvm_debug_exit_arch {
 
  /* for KVM_SET_GUEST_DEBUG */
  struct kvm_guest_debug_arch {
  +   struct {
  +   /* H/W breakpoint/watchpoint address */
  +   __u64 addr;
  +   /*
  +* Type denotes h/w breakpoint, read watchpoint, write
  +* watchpoint or watchpoint (both read and write).
  +*/
  +#define KVMPPC_DEBUG_NOTYPE0x0
   +#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   +#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   +#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
 
 Are you sure you want to introduce these here, just to remove them again in a
 later patch?

Up to this patch the scope was limited to this structure, so for clarity I 
defined them here; later the scope expands, so they are moved out of this structure. 
I do not think this really matters; let me know how you want to see it.

-Bharat

 
 
 Alex
 




RE: [PATCH 6/7] Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER

2013-03-14 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, March 07, 2013 4:17 PM
 To: Wood Scott-B07421
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; 
 Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 6/7] Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER
 
 
 On 28.02.2013, at 17:53, Scott Wood wrote:
 
  On 02/28/2013 10:51:10 AM, Alexander Graf wrote:
  On 28.02.2013, at 17:31, Scott Wood wrote:
   On 02/27/2013 10:13:15 PM, Bharat Bhushan wrote:
   Instruction emulation return EMULATE_DO_PAPR when it requires exit
   to userspace on book3s. Similar return is required for booke.
   EMULATE_DO_PAPR reads out to be confusing so it is renamed to
   EMULATE_EXIT_USER.
   Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
   ---
   arch/powerpc/include/asm/kvm_ppc.h |2 +-
   arch/powerpc/kvm/book3s_emulate.c  |2 +-
   arch/powerpc/kvm/book3s_pr.c   |2 +-
   3 files changed, 3 insertions(+), 3 deletions(-) diff --git
   a/arch/powerpc/include/asm/kvm_ppc.h
   b/arch/powerpc/include/asm/kvm_ppc.h
   index 44a657a..8b81468 100644
   --- a/arch/powerpc/include/asm/kvm_ppc.h
   +++ b/arch/powerpc/include/asm/kvm_ppc.h
   @@ -44,7 +44,7 @@ enum emulation_result {
   EMULATE_DO_DCR,   /* kvm_run filled with DCR request */
   EMULATE_FAIL, /* can't emulate this instruction */
   EMULATE_AGAIN,/* something went wrong. go again */
   -   EMULATE_DO_PAPR,  /* kvm_run filled with PAPR request */
   +   EMULATE_EXIT_USER,/* emulation requires exit to user-space 
   */
   };
   extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct
   kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/book3s_emulate.c
   b/arch/powerpc/kvm/book3s_emulate.c
   index 836c569..cdd19d6 100644
   --- a/arch/powerpc/kvm/book3s_emulate.c
   +++ b/arch/powerpc/kvm/book3s_emulate.c
   @@ -194,7 +194,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, 
   struct
 kvm_vcpu *vcpu,
    run->papr_hcall.args[i] = gpr;
   }
   -   emulated = EMULATE_DO_PAPR;
   +   emulated = EMULATE_EXIT_USER;
   break;
   }
   #endif
   diff --git a/arch/powerpc/kvm/book3s_pr.c
   b/arch/powerpc/kvm/book3s_pr.c index 73ed11c..8df2d2d 100644
   --- a/arch/powerpc/kvm/book3s_pr.c
   +++ b/arch/powerpc/kvm/book3s_pr.c
   @@ -760,7 +760,7 @@ program_interrupt:
    run->exit_reason = KVM_EXIT_MMIO;
   r = RESUME_HOST_NV;
   break;
   -   case EMULATE_DO_PAPR:
   +   case EMULATE_EXIT_USER:
    run->exit_reason = KVM_EXIT_PAPR_HCALL;
    vcpu->arch.hcall_needed = 1;
   r = RESUME_HOST_NV;
  
   I don't think it makes sense to genericize this.
   It makes sense if the run->exit_reason = ... and hcall_needed = ... lines get
  get
 pulled into the emulator.
 
  That would be fine.
 
 Bharat, did I miss a new patch version with that mess up there fixed?

Do you mean moving run->exit_reason = ... and vcpu->arch.hcall_needed = ... 
into arch/powerpc/kvm/book3s_emulate.c? If yes, then no, you did not miss it :) as 
I have not sent it yet.
I will send the new patch with the other patches in the patch-set.

-Bharat

 
 
 Alex
 


RE: [PATCH 7/7] KVM: PPC: Add userspace debug stub support

2013-03-14 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, March 14, 2013 5:20 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 7/7] KVM: PPC: Add userspace debug stub support
 
 
 On 14.03.2013, at 06:18, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, March 07, 2013 7:09 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 7/7] KVM: PPC: Add userspace debug stub support
 
 
  On 28.02.2013, at 05:13, Bharat Bhushan wrote:
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/uapi/asm/kvm.h |   22 +-
  arch/powerpc/kvm/booke.c|  143 
  +++--
 -
  arch/powerpc/kvm/e500_emulate.c |6 ++
  arch/powerpc/kvm/e500mc.c   |3 +-
  4 files changed, 155 insertions(+), 19 deletions(-)
 
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index 15f9a00..d7ce449 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
  /* Select powerpc specific features in linux/kvm.h */ #define
  __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
__u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
__u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both) and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
  +#define KVMPPC_DEBUG_NONE0x0
   +#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   +#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   +#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  struct kvm_debug_exit_arch {
  + __u64 address;
  + /*
  +  * exiting to userspace because of h/w breakpoint, watchpoint
  +  * (read, write or both) and software breakpoint.
  +  */
  + __u32 status;
  + __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 * Type denotes h/w breakpoint, read watchpoint, write
 * watchpoint or watchpoint (both read and write).
 */
  -#define KVMPPC_DEBUG_NOTYPE  0x0
   -#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   -#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   -#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
__u32 type;
__u32 reserved;
} bp[16];
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index
  1de93a8..21b0313 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
   @@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
    #endif
    }
  
   +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
   +{
   +	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
   +#ifndef CONFIG_KVM_BOOKE_HV
   +	vcpu->arch.shadow_msr &= ~MSR_DE;
   +	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
   +#endif
   +
   +	/* Force enable debug interrupts when user space wants to debug */
   +	if (vcpu->guest_debug) {
   +#ifdef CONFIG_KVM_BOOKE_HV
   +		/*
   +		 * Since there is no shadow MSR, sync MSR_DE into the guest
   +		 * visible MSR. Do not allow guest to change MSR[DE].
   +		 */
   +		vcpu->arch.shared->msr |= MSR_DE;
   +		mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);
   +#else
   +		vcpu->arch.shadow_msr |= MSR_DE;
   +		vcpu->arch.shared->msr &= ~MSR_DE;
   +#endif
   +	}
   +}
  +
  /*
  * Helper function for full MSR writes.  No need to call this if
  only
  * EE/CE/ME/DE/RI are changing.
  @@ -150,6 +174,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 
  new_msr)
kvmppc_mmu_msr_notify(vcpu, old_msr);
kvmppc_vcpu_sync_spe(vcpu);
kvmppc_vcpu_sync_fpu(vcpu);
  + kvmppc_vcpu_sync_debug(vcpu);
  }
 
   static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,
   @@ -736,6 +761,13 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
   		run->exit_reason = KVM_EXIT_DCR;
   		return RESUME_HOST;
  
   +	case EMULATE_EXIT_USER:
   +		run->exit_reason = KVM_EXIT_DEBUG;
   +		run->debug.arch.address = vcpu->arch.pc;
   +		run->debug.arch.status = 0;
   +		kvmppc_account_exit(vcpu, DEBUG_EXITS);
 
  As mentioned previously, this is wrong and needs to go into the
  instruction emulation code for that opcode.
 
  ok
 
 
  + return RESUME_HOST;
  +
case EMULATE_FAIL:
 		printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",

RE: [PATCH 4/7] booke: Save and restore debug registers on guest entry and exit

2013-03-14 Thread Bhushan Bharat-R65777
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, March 07, 2013 6:56 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 4/7] booke: Save and restore debug registers on
  guest entry and exit
 
 
  On 28.02.2013, at 05:13, Bharat Bhushan wrote:
 
  On Guest entry: if guest is wants to use the debug register then
  save h/w debug register in host_dbg_reg and load the debug registers
  with shadow_dbg_reg. Otherwise leave h/w debug registers as is.
 
  Why can't we switch the majority of registers on vcpu_put/get and
  only enable or disable debugging on guest entry/exit?
 
 
  One of the reasons for not doing this is that KVM is a host kernel module and
  we let it be debugged by the host (I do not know how useful this is :)). So I am
 not able to recall the specific reason; maybe we just coded it like this and tried
 to keep the overhead as low as possible by switching registers only when they are
 used.
 
 My point is that the overhead is _higher_ this way, because we need to do 
 checks
 and switches on every guest entry/exit, which happens a _lot_ more often than 
 a
 host context switch.
 
  As we discussed before, we can keep this option open for future.
 
 What future? Just ignore debug events in the entry/exit code path and 
 suddenly a
 lot of the code becomes a lot easier.

Just to summarize what we agreed upon:

- Save/restore will happen on vcpu_get()/vcpu_put(). This will happen only if the 
guest is using debug registers, probably using a flag to indicate the guest is 
using the debug APU.
- On debug register access from QEMU, always set the value in the h/w debug register.
- On guest access of a debug register, also save the xxx h/w register in 
vcpu->host_debug_reg.xxx and load the guest-provided value into the h/w debug 
register; ensure this happens on first access only, probably for all debug 
registers once debug events are enabled in DBCR0 (a rough sketch of this 
first-access switch is below). Direct access from the guest was not part of this 
patchset and support for it will be done separately.
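
(A rough sketch of the first-access switch, for illustration only; debug_active is
a hypothetical flag meaning "guest values are live in the h/w registers", and only
DBCR0/DBCR1/IAC1/DAC1 are shown.)

	static void kvmppc_switch_to_guest_debug(struct kvm_vcpu *vcpu)
	{
		if (vcpu->arch.debug_active)		/* already switched */
			return;

		/* save whatever the host currently has in the debug registers */
		vcpu->arch.host_dbg_reg.dbcr0 = mfspr(SPRN_DBCR0);
		vcpu->arch.host_dbg_reg.dbcr1 = mfspr(SPRN_DBCR1);
		vcpu->arch.host_dbg_reg.iac[0] = mfspr(SPRN_IAC1);
		vcpu->arch.host_dbg_reg.dac[0] = mfspr(SPRN_DAC1);

		/* load the guest-provided values instead */
		mtspr(SPRN_IAC1, vcpu->arch.shadow_dbg_reg.iac[0]);
		mtspr(SPRN_DAC1, vcpu->arch.shadow_dbg_reg.dac[0]);
		mtspr(SPRN_DBCR1, vcpu->arch.shadow_dbg_reg.dbcr1);
		mtspr(SPRN_DBCR0, vcpu->arch.shadow_dbg_reg.dbcr0);

		vcpu->arch.debug_active = true;
	}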

Thanks
-Bharat

 
 
 Alex
 




RE: [PATCH 7/7] KVM: PPC: Add userspace debug stub support

2013-03-14 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Thursday, March 14, 2013 9:36 PM
 To: Bhushan Bharat-R65777
 Cc: Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-
 B07421
 Subject: Re: [PATCH 7/7] KVM: PPC: Add userspace debug stub support
 
 On 03/14/2013 08:57:53 AM, Bhushan Bharat-R65777 wrote:
diff --git a/arch/powerpc/kvm/e500mc.c
  b/arch/powerpc/kvm/e500mc.c
index 1f89d26..f5fc6f5 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -182,8 +182,7 @@ int kvmppc_core_vcpu_setup(struct kvm_vcpu
*vcpu) {
  struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
   
   -	vcpu->arch.shadow_epcr = SPRN_EPCR_DSIGS | SPRN_EPCR_DGTMI | \
   -				 SPRN_EPCR_DUVD;
   +	vcpu->arch.shadow_epcr = SPRN_EPCR_DSIGS | SPRN_EPCR_DGTMI;
   
Doesn't this route all debug events through the host?
   
No; This means that debug events can occur in hypervisor state or
  not.
   
EPCR.DUVD = 0 ; Debug events can occur in the hypervisor state.
   
EPCR.DUVD = 1 ; Debug events cannot occur in the hypervisor state.
   
So we allow debug events to occur in hypervisor state.
  
   Why do we care about debug events in our entry/exit code and didn't
  care about
   them before?
 
  We care about single stepping in the guest so that it does not step into KVM code.
 
   If anything, this is a completely separate patch, orthogonal to this
   patch series, and requires a good bit of explanation.
 
  Not sure why you think this should be a separate patch; this patch adds support
  for single stepping and also takes care that debug events do not come in the
  host when doing single stepping.
 
 How does *removing* DUVD ensure that?

By default we clear DUVD, so debug events can come in hypervisor state. But on a 
lightweight exit, when restoring the guest debug context, we set DUVD so that debug 
interrupts do not come in hypervisor state while the debug resources hold guest 
state.

On guest exit, when restoring the host context, we clear DUVD, so the debug 
resources again carry host context.

With the proposed change of saving and restoring on vcpu_get()/vcpu_put(), this 
switching will be done in vcpu_get()/vcpu_put().
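
(Sketch only, to illustrate the DUVD switching described above when it is moved to
vcpu_load()/vcpu_put(); shadow_epcr, SPRN_EPCR and SPRN_EPCR_DUVD are the fields
and definitions already used in this thread.)

	/* while the h/w debug registers carry guest state: no events in GS=0 */
	static void kvmppc_debug_guest_context(struct kvm_vcpu *vcpu)
	{
		vcpu->arch.shadow_epcr |= SPRN_EPCR_DUVD;
		mtspr(SPRN_EPCR, vcpu->arch.shadow_epcr);
	}

	/* back to host context: the host may take debug events again */
	static void kvmppc_debug_host_context(struct kvm_vcpu *vcpu)
	{
		vcpu->arch.shadow_epcr &= ~SPRN_EPCR_DUVD;
		mtspr(SPRN_EPCR, vcpu->arch.shadow_epcr);
	}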

Thanks
-Bharat

 
 -Scott



RE: [PATCH 2/7] Added ONE_REG interface for debug instruction

2013-03-13 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, March 07, 2013 6:38 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 2/7] Added ONE_REG interface for debug instruction
 
 
 On 28.02.2013, at 05:13, Bharat Bhushan wrote:
 
  This patch adds the one_reg interface to get the special instruction
  to be used for setting software breakpoint from userspace.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  Documentation/virtual/kvm/api.txt |1 +
  arch/powerpc/include/asm/kvm_book3s.h |1 +
  arch/powerpc/include/asm/kvm_booke.h  |2 ++
  arch/powerpc/include/uapi/asm/kvm.h   |4 
  arch/powerpc/kvm/book3s.c |6 ++
  arch/powerpc/kvm/booke.c  |6 ++
  6 files changed, 20 insertions(+), 0 deletions(-)
 
  diff --git a/Documentation/virtual/kvm/api.txt
  b/Documentation/virtual/kvm/api.txt
  index cce500a..dbfcc04 100644
  --- a/Documentation/virtual/kvm/api.txt
  +++ b/Documentation/virtual/kvm/api.txt
  @@ -1766,6 +1766,7 @@ registers, find a list below:
PPC   | KVM_REG_PPC_TSR   | 32
PPC   | KVM_REG_PPC_OR_TSR| 32
PPC   | KVM_REG_PPC_CLEAR_TSR | 32
  +  PPC   | KVM_REG_PPC_DEBUG_INST| 32
 
  4.69 KVM_GET_ONE_REG
 
  diff --git a/arch/powerpc/include/asm/kvm_book3s.h
  b/arch/powerpc/include/asm/kvm_book3s.h
  index 5a56e1c..36164cc 100644
  --- a/arch/powerpc/include/asm/kvm_book3s.h
  +++ b/arch/powerpc/include/asm/kvm_book3s.h
  @@ -458,6 +458,7 @@ static inline bool kvmppc_critical_section(struct 
  kvm_vcpu
 *vcpu)
  #define OSI_SC_MAGIC_R4 0x77810F9B
 
   #define INS_DCBZ			0x7c0007ec
   +#define INS_TW				0x7c000008
 
 This one should be trap, so TO needs to be 31. The instruction as it's here 
 is
 a nop if I read the spec correctly.

Yes, I missed this.
BTW, rather than setting TO = 31, what if we set TO = 2, as RA and RB are the same 
here?

-Bharat

 
 Alex
 
 
  /* LPIDs we support with this build -- runtime limit may be lower */
  #define KVMPPC_NR_LPIDS (LPID_RSVD + 1)
  diff --git a/arch/powerpc/include/asm/kvm_booke.h
  b/arch/powerpc/include/asm/kvm_booke.h
  index b7cd335..d3c1eb3 100644
  --- a/arch/powerpc/include/asm/kvm_booke.h
  +++ b/arch/powerpc/include/asm/kvm_booke.h
  @@ -26,6 +26,8 @@
  /* LPIDs we support with this build -- runtime limit may be lower */
  #define KVMPPC_NR_LPIDS64
 
  +#define KVMPPC_INST_EHPRIV 0x7c00021c
  +
  static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num,
  ulong val) {
  vcpu-arch.gpr[num] = val;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index ef072b1..c2ff99c 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -422,4 +422,8 @@ struct kvm_get_htab_header {
  #define KVM_REG_PPC_CLEAR_TSR   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x88)
  #define KVM_REG_PPC_TCR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x89)
  #define KVM_REG_PPC_TSR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8a)
  +
  +/* Debugging: Special instruction for software breakpoint */
  +#define KVM_REG_PPC_DEBUG_INST (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8b)
  +
  #endif /* __LINUX_KVM_POWERPC_H */
  diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
  index a4b6452..975a401 100644
  --- a/arch/powerpc/kvm/book3s.c
  +++ b/arch/powerpc/kvm/book3s.c
  @@ -530,6 +530,12 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu,
 struct kvm_one_reg *reg)
  val = get_reg_val(reg-id, vcpu-arch.vscr.u[3]);
  break;
  #endif /* CONFIG_ALTIVEC */
  +   case KVM_REG_PPC_DEBUG_INST: {
  +   u32 opcode = INS_TW;
   +   r = copy_to_user((u32 __user *)(long)reg->addr,
   +			 &opcode, sizeof(u32));
  +   break;
  +   }
  default:
  r = -EINVAL;
  break;
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  8b553c0..a41cd6d 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1448,6 +1448,12 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu,
 struct kvm_one_reg *reg)
  case KVM_REG_PPC_TSR:
  r = put_user(vcpu-arch.tsr, (u32 __user *)(long)reg-addr);
  break;
  +   case KVM_REG_PPC_DEBUG_INST: {
  +   u32 opcode = KVMPPC_INST_EHPRIV;
   +   r = copy_to_user((u32 __user *)(long)reg->addr,
   +			 &opcode, sizeof(u32));
  +   break;
  +   }
  default:
  break;
  }
  --
  1.7.0.4
 
 

RE: [PATCH 3/7] KVM: PPC: debug stub interface parameter defined

2013-03-13 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, March 07, 2013 6:51 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 3/7] KVM: PPC: debug stub interface parameter defined
 
 
 On 28.02.2013, at 05:13, Bharat Bhushan wrote:
 
  This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
  ioctl support. Follow up patches will use this for setting up hardware
  breakpoints, watchpoints and software breakpoints.
 
  Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
  This is because I am not sure what is required for book3s. So this
  ioctl behaviour will not change for book3s.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/uapi/asm/kvm.h |   23 +++
  arch/powerpc/kvm/book3s.c   |6 ++
  arch/powerpc/kvm/booke.c|6 ++
  arch/powerpc/kvm/powerpc.c  |6 --
  4 files changed, 35 insertions(+), 6 deletions(-)
 
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index c2ff99c..15f9a00 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -272,8 +272,31 @@ struct kvm_debug_exit_arch {
 
  /* for KVM_SET_GUEST_DEBUG */
  struct kvm_guest_debug_arch {
  +   struct {
  +   /* H/W breakpoint/watchpoint address */
  +   __u64 addr;
  +   /*
  +* Type denotes h/w breakpoint, read watchpoint, write
  +* watchpoint or watchpoint (both read and write).
  +*/
  +#define KVMPPC_DEBUG_NOTYPE0x0
   +#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   +#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   +#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  +   __u32 type;
  +   __u32 reserved;
  +   } bp[16];
  };
 
  +/* Debug related defines */
  +/*
   + * kvm_guest_debug->control is a 32 bit field. The lower 16 bits are generic
   + * and the upper 16 bits are architecture specific. The architecture specific
   + * bits define whether the ioctl is for setting a hardware breakpoint or a
   + * software breakpoint.
   + */
   +#define KVM_GUESTDBG_USE_SW_BP	0x00010000
   +#define KVM_GUESTDBG_USE_HW_BP	0x00020000
 
 You only need
 
 #define KVM_GUESTDBG_HW_BP 0x0001
 
 In absence of the flag, it's a SW breakpoint.

We kept this for 2 reasons: 1) the same logic is applied for i386, so we are 
trying to keep it consistent; 2) better clarity.

If you want, then I can code this as you described.
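
(For reference, a rough userspace sketch of how a debug stub would use these bits
together with the generic KVM_GUESTDBG_ENABLE flag; error handling is omitted and
the arch-specific flag values are the ones proposed in this patch.)

	#include <linux/kvm.h>
	#include <string.h>
	#include <sys/ioctl.h>

	/* arm one hardware breakpoint on a vcpu fd */
	static int set_hw_breakpoint(int vcpu_fd, __u64 addr)
	{
		struct kvm_guest_debug dbg;

		memset(&dbg, 0, sizeof(dbg));
		dbg.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
		dbg.arch.bp[0].addr = addr;
		dbg.arch.bp[0].type = KVMPPC_DEBUG_BREAKPOINT;

		return ioctl(vcpu_fd, KVM_SET_GUEST_DEBUG, &dbg);
	}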

-Bharat

 
 
 Alex
 
  +
  /* definition of registers in kvm_run */ struct kvm_sync_regs { };
  diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
  index 975a401..cb85d73 100644
  --- a/arch/powerpc/kvm/book3s.c
  +++ b/arch/powerpc/kvm/book3s.c
  @@ -613,6 +613,12 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu 
  *vcpu,
  return 0;
  }
 
  +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  +   struct kvm_guest_debug *dbg)
  +{
  +   return -EINVAL;
  +}
  +
  void kvmppc_decrementer_func(unsigned long data) {
  struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; diff --git
  a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  a41cd6d..1de93a8 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1527,6 +1527,12 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu,
 struct kvm_one_reg *reg)
  return r;
  }
 
  +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  +struct kvm_guest_debug *dbg)
  +{
  +   return -EINVAL;
  +}
  +
  int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu
  *fpu) {
  return -ENOTSUPP;
  diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
  index 934413c..4c94ca9 100644
  --- a/arch/powerpc/kvm/powerpc.c
  +++ b/arch/powerpc/kvm/powerpc.c
  @@ -532,12 +532,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
  #endif }
 
  -int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  -struct kvm_guest_debug *dbg)
  -{
  -   return -EINVAL;
  -}
  -
  static void kvmppc_complete_dcr_load(struct kvm_vcpu *vcpu,
   struct kvm_run *run) {
  --
  1.7.0.4
 
 
 




RE: [PATCH 4/7] booke: Save and restore debug registers on guest entry and exit

2013-03-13 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, March 07, 2013 6:56 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 4/7] booke: Save and restore debug registers on guest 
 entry
 and exit
 
 
 On 28.02.2013, at 05:13, Bharat Bhushan wrote:
 
  On Guest entry: if guest is wants to use the debug register then save
  h/w debug register in host_dbg_reg and load the debug registers with
  shadow_dbg_reg. Otherwise leave h/w debug registers as is.
 
 Why can't we switch the majority of registers on vcpu_put/get and only enable 
 or
 disable debugging on guest entry/exit?


One of the reasons for not doing this is that KVM is a host kernel module and we 
let it be debugged by the host (I do not know how useful this is :)).
So I am not able to recall the specific reason; maybe we just coded it like this 
and tried to keep the overhead as low as possible by switching registers only when 
they are used.

As we discussed before, we can keep this option open for the future.

-Bharat

 
 
 Alex
 
 
  On guest exit: If guest/user-space is using the debug resource then
  restore the h/w debug register with host_dbg_reg. No need to save
  guest debug register as shadow_dbg_reg is having required values. If
  guest is not using the debug resources then no need to restore h/w 
  registers.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |5 ++
  arch/powerpc/kernel/asm-offsets.c   |   26 
  arch/powerpc/kvm/booke_interrupts.S |  114
  +++
  3 files changed, 145 insertions(+), 0 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index f4ba881..a9feeb0 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -504,7 +504,12 @@ struct kvm_vcpu_arch {
  u32 mmucfg;
  u32 epr;
  u32 crit_save;
  +   /* guest debug registers*/
  struct kvmppc_booke_debug_reg dbg_reg;
  +   /* shadow debug registers */
  +   struct kvmppc_booke_debug_reg shadow_dbg_reg;
  +   /* host debug registers*/
  +   struct kvmppc_booke_debug_reg host_dbg_reg;
  #endif
  gpa_t paddr_accessed;
  gva_t vaddr_accessed;
  diff --git a/arch/powerpc/kernel/asm-offsets.c
  b/arch/powerpc/kernel/asm-offsets.c
  index 02048f3..22deda7 100644
  --- a/arch/powerpc/kernel/asm-offsets.c
  +++ b/arch/powerpc/kernel/asm-offsets.c
  @@ -563,6 +563,32 @@ int main(void)
  DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
  DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
  DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save));
  +   DEFINE(VCPU_DBSR, offsetof(struct kvm_vcpu, arch.dbsr));
  +   DEFINE(VCPU_SHADOW_DBG, offsetof(struct kvm_vcpu, arch.shadow_dbg_reg));
  +   DEFINE(VCPU_HOST_DBG, offsetof(struct kvm_vcpu, arch.host_dbg_reg));
  +   DEFINE(KVMPPC_DBG_DBCR0, offsetof(struct kvmppc_booke_debug_reg,
  + dbcr0));
  +   DEFINE(KVMPPC_DBG_DBCR1, offsetof(struct kvmppc_booke_debug_reg,
  + dbcr1));
  +   DEFINE(KVMPPC_DBG_DBCR2, offsetof(struct kvmppc_booke_debug_reg,
  + dbcr2));
  +#ifdef CONFIG_KVM_E500MC
  +   DEFINE(KVMPPC_DBG_DBCR4, offsetof(struct kvmppc_booke_debug_reg,
  + dbcr4));
  +#endif
  +   DEFINE(KVMPPC_DBG_IAC1, offsetof(struct kvmppc_booke_debug_reg,
  +iac[0]));
  +   DEFINE(KVMPPC_DBG_IAC2, offsetof(struct kvmppc_booke_debug_reg,
  +iac[1]));
  +   DEFINE(KVMPPC_DBG_IAC3, offsetof(struct kvmppc_booke_debug_reg,
  +iac[2]));
  +   DEFINE(KVMPPC_DBG_IAC4, offsetof(struct kvmppc_booke_debug_reg,
  +iac[3]));
  +   DEFINE(KVMPPC_DBG_DAC1, offsetof(struct kvmppc_booke_debug_reg,
  +dac[0]));
  +   DEFINE(KVMPPC_DBG_DAC2, offsetof(struct kvmppc_booke_debug_reg,
  +dac[1]));
  +   DEFINE(VCPU_GUEST_DEBUG, offsetof(struct kvm_vcpu, guest_debug));
  #endif /* CONFIG_PPC_BOOK3S */
  #endif /* CONFIG_KVM */
 
  diff --git a/arch/powerpc/kvm/booke_interrupts.S
  b/arch/powerpc/kvm/booke_interrupts.S
  index 2c6deb5..6d78e01 100644
  --- a/arch/powerpc/kvm/booke_interrupts.S
  +++ b/arch/powerpc/kvm/booke_interrupts.S
  @@ -39,6 +39,8 @@
   #define HOST_MIN_STACK_SIZE (HOST_NV_GPR(R31) + 4)
   #define HOST_STACK_SIZE (((HOST_MIN_STACK_SIZE + 15) / 16) * 16) /* Align. */
  #define HOST_STACK_LR   (HOST_STACK_SIZE + 4) /* In caller stack frame. */
  +#define DBCR0_AC_BITS  (DBCR0_IAC1 | DBCR0_IAC2 | DBCR0_IAC3 | 
  DBCR0_IAC4 | \
  +DBCR0_DAC1R | DBCR0_DAC1W

RE: [PATCH 7/7] KVM: PPC: Add userspace debug stub support

2013-03-13 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, March 07, 2013 7:09 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 7/7] KVM: PPC: Add userspace debug stub support
 
 
 On 28.02.2013, at 05:13, Bharat Bhushan wrote:
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/uapi/asm/kvm.h |   22 +-
  arch/powerpc/kvm/booke.c|  143 
  +++---
  arch/powerpc/kvm/e500_emulate.c |6 ++
  arch/powerpc/kvm/e500mc.c   |3 +-
  4 files changed, 155 insertions(+), 19 deletions(-)
 
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index 15f9a00..d7ce449 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
   /* Select powerpc specific features in linux/kvm.h */
   #define __KVM_HAVE_SPAPR_TCE
   #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
  __u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
  __u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both) and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
   +#define KVMPPC_DEBUG_NONE		0x0
   +#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   +#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   +#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  struct kvm_debug_exit_arch {
  +   __u64 address;
  +   /*
  +* exiting to userspace because of h/w breakpoint, watchpoint
  +* (read, write or both) and software breakpoint.
  +*/
  +   __u32 status;
  +   __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
   * Type denotes h/w breakpoint, read watchpoint, write
   * watchpoint or watchpoint (both read and write).
   */
   -#define KVMPPC_DEBUG_NOTYPE		0x0
   -#define KVMPPC_DEBUG_BREAKPOINT	(1UL << 1)
   -#define KVMPPC_DEBUG_WATCH_WRITE	(1UL << 2)
   -#define KVMPPC_DEBUG_WATCH_READ	(1UL << 3)
  __u32 type;
  __u32 reserved;
  } bp[16];
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  1de93a8..21b0313 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu
  *vcpu) #endif }
 
   +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
   +{
   +	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
   +#ifndef CONFIG_KVM_BOOKE_HV
   +	vcpu->arch.shadow_msr &= ~MSR_DE;
   +	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
   +#endif
   +
   +	/* Force enable debug interrupts when user space wants to debug */
   +	if (vcpu->guest_debug) {
   +#ifdef CONFIG_KVM_BOOKE_HV
   +		/*
   +		 * Since there is no shadow MSR, sync MSR_DE into the guest
   +		 * visible MSR. Do not allow guest to change MSR[DE].
   +		 */
   +		vcpu->arch.shared->msr |= MSR_DE;
   +		mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);
   +#else
   +		vcpu->arch.shadow_msr |= MSR_DE;
   +		vcpu->arch.shared->msr &= ~MSR_DE;
   +#endif
   +	}
  +}
  +
  /*
   * Helper function for full MSR writes.  No need to call this if
  only
   * EE/CE/ME/DE/RI are changing.
  @@ -150,6 +174,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
  kvmppc_mmu_msr_notify(vcpu, old_msr);
  kvmppc_vcpu_sync_spe(vcpu);
  kvmppc_vcpu_sync_fpu(vcpu);
  +   kvmppc_vcpu_sync_debug(vcpu);
  }
 
  static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu, @@
  -736,6 +761,13 @@ static int emulation_exit(struct kvm_run *run, struct
 kvm_vcpu *vcpu)
   		run->exit_reason = KVM_EXIT_DCR;
   		return RESUME_HOST;
  
   +	case EMULATE_EXIT_USER:
   +		run->exit_reason = KVM_EXIT_DEBUG;
   +		run->debug.arch.address = vcpu->arch.pc;
   +		run->debug.arch.status = 0;
  +   kvmppc_account_exit(vcpu, DEBUG_EXITS);
 
 As mentioned previously, this is wrong and needs to go into the instruction
 emulation code for that opcode.

ok

 
  +   return RESUME_HOST;
  +
  case EMULATE_FAIL:
   	printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
  	       __func__, vcpu->arch.pc, vcpu->arch.last_inst); @@ -751,6
  +783,28 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu
 *vcpu)
  }
  }
 
  +static int kvmppc_handle_debug(struct kvm_run *run, struct kvm_vcpu
  +*vcpu) {
   +	u32 dbsr = vcpu->arch.dbsr;
   +	run->debug.arch.status = 0;
  +   run

RE: [PATCH 2/7] Added ONE_REG interface for debug instruction

2013-02-28 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-
 ow...@vger.kernel.org] On Behalf Of Alexander Graf
 Sent: Thursday, February 28, 2013 10:22 PM
 To: Wood Scott-B07421
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
 Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/7] Added ONE_REG interface for debug instruction
 
 
 On 28.02.2013, at 17:23, Scott Wood wrote:
 
  On 02/27/2013 10:13:11 PM, Bharat Bhushan wrote:
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index 8b553c0..a41cd6d 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1448,6 +1448,12 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu
 *vcpu, struct kvm_one_reg *reg)
 case KVM_REG_PPC_TSR:
  		r = put_user(vcpu->arch.tsr, (u32 __user *)(long)reg->addr);
  		break;
   +	case KVM_REG_PPC_DEBUG_INST: {
   +		u32 opcode = KVMPPC_INST_EHPRIV;
   +		r = copy_to_user((u32 __user *)(long)reg->addr,
   +				 &opcode, sizeof(u32));
  +  break;
  +  }
 
  We're using ehpriv even for PR-mode KVM (e.g. e500v2)?
 
 If it's a reserved instruction, that should work. Since we need to use a
 single instruction to replace the debugged one with, any reserved opcode
 should be as good as any other, right?

Right, that has been the idea here.

Thanks
-Bharat
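
(For reference, a hedged sketch of how user space fetches this opcode through
the ONE_REG interface; it assumes a vcpu fd from KVM_CREATE_VCPU and headers
that already carry KVM_REG_PPC_DEBUG_INST from this series.)

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Returns 0 on success; *opcode then holds the instruction the debug stub
 * should patch over the breakpointed instruction. */
static int get_debug_inst(int vcpu_fd, uint32_t *opcode)
{
	struct kvm_one_reg reg = {
		.id   = KVM_REG_PPC_DEBUG_INST,
		.addr = (uintptr_t)opcode,
	};

	return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
}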

 
 
 Alex
 




RE: [PATCH 3/8] KVM: PPC: booke: Added debug handler

2013-02-07 Thread Bhushan Bharat-R65777
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, January 25, 2013 5:13 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
  On 16.01.2013, at 09:24, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  Installed debug handler will be used for guest debug support
  and debug facility emulation features (patches for these
  features will follow this patch).
 
  Signed-off-by: Liu Yu yu@freescale.com
  [bharat.bhus...@freescale.com: Substantial changes]
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |1 +
  arch/powerpc/kernel/asm-offsets.c   |1 +
  arch/powerpc/kvm/booke_interrupts.S |   49
  ++-
  --
  --
  3 files changed, 44 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index 8a72d59..f4ba881 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -503,6 +503,7 @@ struct kvm_vcpu_arch {
  u32 tlbcfg[4];
  u32 mmucfg;
  u32 epr;
  +   u32 crit_save;
  struct kvmppc_booke_debug_reg dbg_reg; #endif
  gpa_t paddr_accessed;
  diff --git a/arch/powerpc/kernel/asm-offsets.c
  b/arch/powerpc/kernel/asm-offsets.c
  index 46f6afd..02048f3 100644
  --- a/arch/powerpc/kernel/asm-offsets.c
  +++ b/arch/powerpc/kernel/asm-offsets.c
  @@ -562,6 +562,7 @@ int main(void)
  DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, 
  arch.last_inst));
  DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu,
  arch.fault_dear));
  DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu,
  arch.fault_esr));
  +   DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu,
  +arch.crit_save));
  #endif /* CONFIG_PPC_BOOK3S */ #endif /* CONFIG_KVM */
 
  diff --git a/arch/powerpc/kvm/booke_interrupts.S
  b/arch/powerpc/kvm/booke_interrupts.S
  index eae8483..dd9c5d4 100644
  --- a/arch/powerpc/kvm/booke_interrupts.S
  +++ b/arch/powerpc/kvm/booke_interrupts.S
  @@ -52,12 +52,7 @@
  	(1<<BOOKE_INTERRUPT_PROGRAM) | \
  	(1<<BOOKE_INTERRUPT_DTLB_MISS))
 
  -.macro KVM_HANDLER ivor_nr scratch srr0
  -_GLOBAL(kvmppc_handler_\ivor_nr)
  -   /* Get pointer to vcpu and record exit number. */
  -   mtspr   \scratch , r4
  -   mfspr   r4, SPRN_SPRG_THREAD
  -   lwz r4, THREAD_KVM_VCPU(r4)
  +.macro __KVM_HANDLER ivor_nr scratch srr0
  stw r3, VCPU_GPR(R3)(r4)
  stw r5, VCPU_GPR(R5)(r4)
  stw r6, VCPU_GPR(R6)(r4)
  @@ -74,6 +69,46 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
  bctr
  .endm
 
  +.macro KVM_HANDLER ivor_nr scratch srr0
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  +   /* Get pointer to vcpu and record exit number. */
  +   mtspr   \scratch , r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
   +	__KVM_HANDLER \ivor_nr \scratch \srr0
   +.endm
  +
  +.macro KVM_DBG_HANDLER ivor_nr scratch srr0
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  +   mtspr   \scratch, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
  +   stw r3, VCPU_CRIT_SAVE(r4)
   +	mfcr	r3
  +   mfspr   r4, SPRN_CSRR1
  +   andi.   r4, r4, MSR_PR
  +   bne 1f
 
 
  +   /* debug interrupt happened in enter/exit path */
  +   mfspr   r4, SPRN_CSRR1
  +   rlwinm  r4, r4, 0, ~MSR_DE
  +   mtspr   SPRN_CSRR1, r4
  +   lis r4, 0x
  +   ori r4, r4, 0x
  +   mtspr   SPRN_DBSR, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
   +	mtcr	r3
  +   lwz r3, VCPU_CRIT_SAVE(r4)
  +   mfspr   r4, \scratch
  +   rfci
 
  What is this part doing? Try to ignore the debug exit?
 
   As BOOKE doesn't have hardware support for virtualization, the hardware
   never knows whether the current pc is in the guest or in the host.
   So when hardware single step is enabled for the guest, it cannot be
   disabled at the time of guest exit. Thus we'll see a single step
   interrupt happen at the beginning of the guest exit path.
 
   With the above code we recognize this kind of single step interrupt,
   disable single step, and rfci.
 
  Why would we have MSR_DE
  enabled in the first place when we can't handle it?
 
  When QEMU is using hardware debug resource then we always set
  MSR_DE during
  guest is running.
 
  Right, but why is MSR_DE enabled during the exit path? If MSR_DE
  wasn't set, you wouldn't get a single step exit.
 
  We always set MSR_DE in hw MSR when qemu using the debug resource.
 
  In the _guest_ MSR, yes. But once we exit the guest, it shouldn't
  be set anymore, because we're in an interrupt handler, no? Or is
  MSR_DE kept alive on interrupts?
 
 
  During the exit code path, you could then swap DBSR back to what
  the host expects (which means no single step). Only after that
  enable MSR_DE again.
 
  We do not support deferred debug

RE: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to guest

2013-02-07 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, February 07, 2013 8:29 PM
 To: Wood Scott-B07421
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to 
 guest
 
 
 On 01.02.2013, at 23:38, Scott Wood wrote:
 
  On 01/31/2013 06:11:32 PM, Alexander Graf wrote:
  On 31.01.2013, at 23:40, Scott Wood wrote:
   On 01/31/2013 01:20:39 PM, Alexander Graf wrote:
   On 31.01.2013, at 20:05, Alexander Graf wrote:
   
On 31.01.2013, at 19:54, Scott Wood wrote:
   
On 01/31/2013 12:52:41 PM, Alexander Graf wrote:
On 31.01.2013, at 19:43, Scott Wood wrote:
On 01/31/2013 12:21:07 PM, Alexander Graf wrote:
How about something like this? Then both targets at least suck as
 much :).
   
I'm not sure that should be the goal...
   
Thanks to e500mc's awful hardware design, we don't know who sets 
the
 MSR_DE bit. Once we forced it onto the guest, we have no change to know 
 whether
 the guest also set it or not. We could only guess.
   
MSRP[DEP] can prevent the guest from modifying MSR[DE] -- but we
 still need to set it in the first place.
   
According to ISA V2.06B, the hypervisor should set DBCR0[EDM] to 
let
 the guest know that the debug resources are not available, and that the value
 of MSR[DE] is not specified and not modifiable.
So what would the guest do then to tell the hypervisor that it
 actually wants to know about debug events?
   
The guest is out of luck, just as if a JTAG were in use.
   
Hrm.
   
Can we somehow generalize this out of luck behavior?
   
Every time we would set or clear an MSR bit in shadow_msr on e500v2, 
we
 would instead set or clear it in the real MSR. That way only e500mc is out of
 luck, but the code would still be shared.
  
   I don't follow.  e500v2 is just as out-of-luck.  The mechanism simply 
   does
 not support sharing debug resources.
  For e500v2 we have 2 fields
   * MSR as the guest sees it
   * MSR as we execute when the guest runs Since we know the MSR when
  the guest sees it, we can decide what to do when we get an unhandled debug
 interrupt.
 
  That's not the same thing as making the real MSR[DE] show up in the guest
 MSR[DE].
 
  There are other problems with sharing -- what happens when both host and 
  guest
 try to write to a particular IAC or DAC?
 
  Also, performance would be pretty awful if the guest has e.g. single 
  stepping
 in DBCR0 enabled but MSR[DE]=0, and the host doesn't care about single 
 stepping
 (but does want debugging enabled in general).
 
   What do you mean by the real MSR?  The real MSR is shadow_msr, and 
   MSR_DE
 must always be set there if the host is debugging the guest.  As for 
 reflecting
 it into the guest MSR, we could, but I don't really see the point.  We're 
 never
 going to actually send a debug exception to the guest when the host owns the
 debug resources.
  Why not? That's the whole point of jumping through user space.
 
  That's still needed for software breakpoints, which don't rely on the debug
 resources.
 
   1) guest exits with debug interrupt
   2) QEMU gets a debug exit
   3) QEMU checks in its list whether it belongs to its own debug
  points
   4) if not, it reinjects the interrupt into the guest Step 4 is
  pretty difficult to do when we don't know whether the guest is actually
 capable of handling debug interrupts at that moment.
 
  Software breakpoints take a Program interrupt rather than a Debug interrupt,
 unless MSR[DE]=1 and DBCR0[TRAP]=1.  If the guest does not own debug resources
 we should always send it to the Program interrupt, so MSR[DE] doesn't matter.
 
   The = ~MSR_DE line is pointless on bookehv, and makes it harder to 
   read.
 I had to stare at it a while before noticing that you initially set is_debug
 from the guest MSR and that you'd never really clear MSR_DE here on bookehv.
  Well, I'm mostly bouncing ideas here to find a way to express what we're
 trying to say in a way that someone who hasn't read this email thread would
 still understand what's going on :).
 
  I think it's already straightforward enough if you accept that shared debug
 resources aren't supported, and that we are either in a mode where the real
 MSR[DE] reflects the guest MSR[DE], or a mode where the real MSR[DE] is always
 on in guest mode and the guest MSR[DE] is irrelevant.
 
 I think I'm starting to grasp what you're suggesting:
 
 On e500mc, have 2 modes
 
   1) guest owns debug
 
   This is the normal operation. Here the guest defines the value of MSR_DE. 
 The
 guest gets debug interrupts directly.
 
   2) host owns debug
 
   In this case, take away any debug capabilities from the guest. Everything
 debug related goes straight to QEMU.
 
 
 On e500v2, have 2 modes
 
   1) guest owns debug
 
   This is the normal operation. Here the guest

RE: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to guest

2013-02-03 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Wood Scott-B07421
 Sent: Saturday, February 02, 2013 4:09 AM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to 
 guest
 
 On 01/31/2013 06:11:32 PM, Alexander Graf wrote:
 
  On 31.01.2013, at 23:40, Scott Wood wrote:
 
   On 01/31/2013 01:20:39 PM, Alexander Graf wrote:
   On 31.01.2013, at 20:05, Alexander Graf wrote:
   
On 31.01.2013, at 19:54, Scott Wood wrote:
   
On 01/31/2013 12:52:41 PM, Alexander Graf wrote:
On 31.01.2013, at 19:43, Scott Wood wrote:
On 01/31/2013 12:21:07 PM, Alexander Graf wrote:
How about something like this? Then both targets at least
  suck as much :).
   
I'm not sure that should be the goal...
   
Thanks to e500mc's awful hardware design, we don't know who
  sets the MSR_DE bit. Once we forced it onto the guest, we have no
  change to know whether the guest also set it or not. We could only
  guess.
   
MSRP[DEP] can prevent the guest from modifying MSR[DE] -- but
  we still need to set it in the first place.
   
According to ISA V2.06B, the hypervisor should set DBCR0[EDM]
  to let the guest know that the debug resources are not available, and
  that the value of MSR[DE] is not specified and not modifiable.
So what would the guest do then to tell the hypervisor that it
  actually wants to know about debug events?
   
The guest is out of luck, just as if a JTAG were in use.
   
Hrm.
   
Can we somehow generalize this out of luck behavior?
   
Every time we would set or clear an MSR bit in shadow_msr on
  e500v2, we would instead set or clear it in the real MSR. That way
  only e500mc is out of luck, but the code would still be shared.
  
   I don't follow.  e500v2 is just as out-of-luck.  The mechanism
  simply does not support sharing debug resources.
 
  For e500v2 we have 2 fields
 
* MSR as the guest sees it
* MSR as we execute when the guest runs
 
  Since we know the MSR when the guest sees it, we can decide what to do
  when we get an unhandled debug interrupt.
 
 That's not the same thing as making the real MSR[DE] show up in the guest
 MSR[DE].
 
 There are other problems with sharing -- what happens when both host and guest
 try to write to a particular IAC or DAC?
 
 Also, performance would be pretty awful if the guest has e.g. single stepping 
 in
 DBCR0 enabled but MSR[DE]=0, and the host doesn't care about single stepping
 (but does want debugging enabled in general).
 
   What do you mean by the real MSR?  The real MSR is shadow_msr,
  and MSR_DE must always be set there if the host is debugging the
  guest.  As for reflecting it into the guest MSR, we could, but I don't
  really see the point.  We're never going to actually send a debug
  exception to the guest when the host owns the debug resources.
 
  Why not? That's the whole point of jumping through user space.
 
 That's still needed for software breakpoints, which don't rely on the debug
 resources.
 
1) guest exits with debug interrupt
2) QEMU gets a debug exit
3) QEMU checks in its list whether it belongs to its own debug
  points
4) if not, it reinjects the interrupt into the guest
 
  Step 4 is pretty difficult to do when we don't know whether the guest
  is actually capable of handling debug interrupts at that moment.
 
 Software breakpoints take a Program interrupt rather than a Debug interrupt,
 unless MSR[DE]=1 and DBCR0[TRAP]=1.  If the guest does not own debug resources
 we should always send it to the Program interrupt, so MSR[DE] doesn't matter.
 
   The = ~MSR_DE line is pointless on bookehv, and makes it harder
  to read.  I had to stare at it a while before noticing that you
  initially set is_debug from the guest MSR and that you'd never really
  clear MSR_DE here on bookehv.
 
  Well, I'm mostly bouncing ideas here to find a way to express what
  we're trying to say in a way that someone who hasn't read this email
  thread would still understand what's going on :).
 
 I think it's already straightforward enough if you accept that shared debug
 resources aren't supported, and that we are either in a mode where the real
 MSR[DE] reflects the guest MSR[DE], or a mode where the real MSR[DE] is always
 on in guest mode and the guest MSR[DE] is irrelevant.
 
  How about this version?
 
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  38a62ef..9929c41 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,28 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu
  *vcpu)
   #endif
   }
 
   +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
   +{
   +#ifndef CONFIG_KVM_BOOKE_HV
   +	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
   +	vcpu->arch.shadow_msr &= ~MSR_DE;
   +	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
   +#endif
  +
  +   /* Force enable debug

One reg interface for Timer register

2013-02-03 Thread Bhushan Bharat-R65777
Hi Alex/Scott,

Below is my understanding about the ONE_REG interface requirement for timer 
registers.

Define the below two ONE_REG interfaces for TSR access:
KVM_REG_SET_TSR,   // Set the specified bits in TSR
KVM_REG_CLEAR_TSR, // Clear the specified bits in TSR

QEMU will use the above ioctls to selectively set/clear bits of TSR.
We do not need a similar interface for TCR as there is no race issue with 
TCR, so for TCR QEMU will keep on using the SREGS interface.

Thanks
-Bharat
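
(A rough sketch of how QEMU could drive the proposed accessors. KVM_REG_SET_TSR
and KVM_REG_CLEAR_TSR are only the names proposed above, not existing register
IDs, so the sketch takes the ID as a parameter.)

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int tsr_modify_bits(int vcpu_fd, uint64_t reg_id, uint32_t bits)
{
	struct kvm_one_reg reg = {
		.id   = reg_id,			/* proposed set/clear TSR register ID */
		.addr = (uintptr_t)&bits,	/* only these bits are set or cleared */
	};

	/* A full read-modify-write of TSR from user space would race with the
	 * kernel's timer emulation; set/clear of individual bits does not,
	 * which is the point of the proposal above. */
	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}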




RE: [PATCH 3/8] KVM: PPC: booke: Added debug handler

2013-02-01 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, February 01, 2013 1:36 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
 On 01.02.2013, at 06:04, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Thursday, January 31, 2013 10:38 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
  Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
  On 31.01.2013, at 17:58, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, January 31, 2013 5:47 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
  Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
  On 30.01.2013, at 12:30, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, January 25, 2013 5:13 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
  On 16.01.2013, at 09:24, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  Installed debug handler will be used for guest debug support and
  debug facility emulation features (patches for these features
  will follow this patch).
 
  Signed-off-by: Liu Yu yu@freescale.com
  [bharat.bhus...@freescale.com: Substantial changes]
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |1 +
  arch/powerpc/kernel/asm-offsets.c   |1 +
  arch/powerpc/kvm/booke_interrupts.S |   49
 ++-
  --
  --
  3 files changed, 44 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index 8a72d59..f4ba881 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -503,6 +503,7 @@ struct kvm_vcpu_arch {
u32 tlbcfg[4];
u32 mmucfg;
u32 epr;
  + u32 crit_save;
struct kvmppc_booke_debug_reg dbg_reg; #endif
gpa_t paddr_accessed;
  diff --git a/arch/powerpc/kernel/asm-offsets.c
  b/arch/powerpc/kernel/asm-offsets.c
  index 46f6afd..02048f3 100644
  --- a/arch/powerpc/kernel/asm-offsets.c
  +++ b/arch/powerpc/kernel/asm-offsets.c
  @@ -562,6 +562,7 @@ int main(void)
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, 
  arch.last_inst));
DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu,
 arch.fault_dear));
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu,
  arch.fault_esr));
  + DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu,
  +arch.crit_save));
  #endif /* CONFIG_PPC_BOOK3S */
  #endif /* CONFIG_KVM */
 
  diff --git a/arch/powerpc/kvm/booke_interrupts.S
  b/arch/powerpc/kvm/booke_interrupts.S
  index eae8483..dd9c5d4 100644
  --- a/arch/powerpc/kvm/booke_interrupts.S
  +++ b/arch/powerpc/kvm/booke_interrupts.S
  @@ -52,12 +52,7 @@
  	(1<<BOOKE_INTERRUPT_PROGRAM) | \
  	(1<<BOOKE_INTERRUPT_DTLB_MISS))
 
  -.macro KVM_HANDLER ivor_nr scratch srr0
  -_GLOBAL(kvmppc_handler_\ivor_nr)
  - /* Get pointer to vcpu and record exit number. */
  - mtspr   \scratch , r4
  - mfspr   r4, SPRN_SPRG_THREAD
  - lwz r4, THREAD_KVM_VCPU(r4)
  +.macro __KVM_HANDLER ivor_nr scratch srr0
stw r3, VCPU_GPR(R3)(r4)
stw r5, VCPU_GPR(R5)(r4)
stw r6, VCPU_GPR(R6)(r4)
  @@ -74,6 +69,46 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
bctr
  .endm
 
  +.macro KVM_HANDLER ivor_nr scratch srr0
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  + /* Get pointer to vcpu and record exit number. */
  + mtspr   \scratch , r4
  + mfspr   r4, SPRN_SPRG_THREAD
  + lwz r4, THREAD_KVM_VCPU(r4)
   +	__KVM_HANDLER \ivor_nr \scratch \srr0
   +.endm
  +
  +.macro KVM_DBG_HANDLER ivor_nr scratch srr0
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  + mtspr   \scratch, r4
  + mfspr   r4, SPRN_SPRG_THREAD
  + lwz r4, THREAD_KVM_VCPU(r4)
  + stw r3, VCPU_CRIT_SAVE(r4)
   +	mfcr	r3
  + mfspr   r4, SPRN_CSRR1
  + andi.   r4, r4, MSR_PR
  + bne 1f
 
 
  + /* debug interrupt happened in enter/exit path */
  + mfspr   r4, SPRN_CSRR1
  + rlwinm  r4, r4, 0, ~MSR_DE
  + mtspr   SPRN_CSRR1, r4
  + lis r4, 0x
  + ori r4, r4, 0x
  + mtspr   SPRN_DBSR, r4
  + mfspr   r4, SPRN_SPRG_THREAD
  + lwz r4, THREAD_KVM_VCPU(r4)
   +	mtcr	r3
  + lwz r3, VCPU_CRIT_SAVE(r4)
  + mfspr   r4, \scratch
  + rfci
 
  What is this part doing? Try to ignore the debug exit?
 
  As BOOKE doesn't have hardware

RE: [PATCH 0/8] KVM: BOOKE/BOOKEHV : Added debug stub support

2013-02-01 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, February 01, 2013 1:34 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 0/8] KVM: BOOKE/BOOKEHV : Added debug stub support
 
 
 On 01.02.2013, at 04:49, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Friday, January 25, 2013 6:08 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 0/8] KVM: BOOKE/BOOKEHV : Added debug stub
  support
 
 
  On 16.01.2013, at 09:20, Bharat Bhushan wrote:
 
  This patchset adds the QEMU debug stub support for powerpc 
  (booke/bookehv).
  [1/8] KVM: PPC: booke: use vcpu reference from thread_struct
- This is a cleanup patch to use vcpu reference from thread struct
  [2/8] KVM: PPC: booke: Allow multiple exception types [3/8] KVM: PPC:
  booke: Added debug handler
- These two patches install the KVM debug handler.
  [4/8] Added ONE_REG interface for debug instruction
- Add the ioctl interface to get the debug instruction for
  setting software breakpoint from QEMU debug stub.
  [5/8] KVM: PPC: debug stub interface parameter defined [6/8] booke:
  Added DBCR4 SPR number [7/8] KVM: booke/bookehv: Add debug stub
  support
- Add the debug stub interface on booke/bookehv [8/8] KVM:PPC:booke:
  Allow debug interrupt injection to guest
-- with this qemu can inject debug interrupt to guest
 
  Thanks, applied 1/8, 2/8, 6/8.
 
 
  Alex I cannot see these 3 patches on kvm-ppc-next branch. Are those applied 
  on
 some other branch ?
 
 Yes, my staging tree is now kvm-ppc-queue, as I'm not allowed to rebase 
 kvm-ppc-
 next...

On which branch should we send our patches: kvm-ppc-queue or kvm-ppc-next?

Thanks
-Bharat



RE: [PATCH 5/8] KVM: PPC: debug stub interface parameter defined

2013-01-31 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, January 31, 2013 6:31 PM
 To: Bhushan Bharat-R65777
 Cc: Paul Mackerras; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 5/8] KVM: PPC: debug stub interface parameter defined
 
 
 On 30.01.2013, at 15:15, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, January 25, 2013 5:24 PM
  To: Bhushan Bharat-R65777
  Cc: Paul Mackerras; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
  Subject: Re: [PATCH 5/8] KVM: PPC: debug stub interface parameter
  defined
 
 
  On 17.01.2013, at 12:11, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Paul Mackerras [mailto:pau...@samba.org]
  Sent: Thursday, January 17, 2013 12:53 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Bhushan Bharat-
  R65777
  Subject: Re: [PATCH 5/8] KVM: PPC: debug stub interface parameter
  defined
 
  On Wed, Jan 16, 2013 at 01:54:42PM +0530, Bharat Bhushan wrote:
  This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
  ioctl support. Follow up patches will use this for setting up
  hardware breakpoints, watchpoints and software breakpoints.
 
  [snip]
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index 453a10f..7d5a51c 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1483,6 +1483,12 @@ int kvm_vcpu_ioctl_set_one_reg(struct
  kvm_vcpu *vcpu,
  struct kvm_one_reg *reg)
  return r;
  }
 
  +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  +struct kvm_guest_debug *dbg) {
  +   return -EINVAL;
  +}
  +
  int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct
  kvm_fpu
  *fpu)  {
  return -ENOTSUPP;
  diff --git a/arch/powerpc/kvm/powerpc.c
  b/arch/powerpc/kvm/powerpc.c index 934413c..4c94ca9 100644
  --- a/arch/powerpc/kvm/powerpc.c
  +++ b/arch/powerpc/kvm/powerpc.c
  @@ -532,12 +532,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
  #endif  }
 
  -int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  -struct kvm_guest_debug *dbg)
  -{
  -   return -EINVAL;
  -}
  -
 
  This will break the build for non-book E machines, since
  kvm_arch_vcpu_ioctl_set_guest_debug() is referenced from generic code.
  You need to add it to arch/powerpc/kvm/book3s.c as well.
 
  right,  I will correct this.
 
  Would the implementation actually be different on booke vs book3s? My
  feeling is that powerpc.c is actually the right place for this.
 
 
  I am not sure there will be anything common between book3s and booke. Should
 we define the cpu specific function something like
 kvm_ppc_vcpu_ioctl_set_guest_debug() for booke and book3s and call this new
 defined function from kvm_arch_vcpu_ioctl_set_guest_debug() in powerpc.c ?
 
 No, just put it into the subarch directories then :). No need to overengineer
 anything for now.

What do you mean by subarch? Above you mentioned that powerpc.c is the right 
place. Isn't this patch already doing that partially? :)

Thanks
-Bharat
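
(For context, the KVM_SET_GUEST_DEBUG ioctl being wired up here is driven from
user space roughly as below: a hedged sketch, where the KVM_GUESTDBG_USE_HW_BP
flag and the arch.bp[] field names follow the definitions proposed in this
series rather than a released header.)

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Enable one hardware instruction breakpoint for the guest. */
static int set_hw_breakpoint(int vcpu_fd, uint64_t guest_addr)
{
	struct kvm_guest_debug dbg;

	memset(&dbg, 0, sizeof(dbg));
	dbg.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
	dbg.arch.bp[0].addr = guest_addr;		/* IAC-style breakpoint address */
	dbg.arch.bp[0].type = KVMPPC_DEBUG_BREAKPOINT;

	return ioctl(vcpu_fd, KVM_SET_GUEST_DEBUG, &dbg);
}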




RE: [PATCH 5/8] KVM: PPC: debug stub interface parameter defined

2013-01-31 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of
 Alexander Graf
 Sent: Thursday, January 31, 2013 7:58 PM
 To: Bhushan Bharat-R65777
 Cc: Paul Mackerras; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 5/8] KVM: PPC: debug stub interface parameter defined
 
 
 On 31.01.2013, at 15:05, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Thursday, January 31, 2013 6:31 PM
  To: Bhushan Bharat-R65777
  Cc: Paul Mackerras; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
  Subject: Re: [PATCH 5/8] KVM: PPC: debug stub interface parameter
  defined
 
 
  On 30.01.2013, at 15:15, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, January 25, 2013 5:24 PM
  To: Bhushan Bharat-R65777
  Cc: Paul Mackerras; kvm-ppc@vger.kernel.org; k...@vger.kernel.org
  Subject: Re: [PATCH 5/8] KVM: PPC: debug stub interface parameter
  defined
 
 
  On 17.01.2013, at 12:11, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Paul Mackerras [mailto:pau...@samba.org]
  Sent: Thursday, January 17, 2013 12:53 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de;
  Bhushan Bharat-
  R65777
  Subject: Re: [PATCH 5/8] KVM: PPC: debug stub interface parameter
  defined
 
  On Wed, Jan 16, 2013 at 01:54:42PM +0530, Bharat Bhushan wrote:
  This patch defines the interface parameter for
  KVM_SET_GUEST_DEBUG ioctl support. Follow up patches will use
  this for setting up hardware breakpoints, watchpoints and software
 breakpoints.
 
  [snip]
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index 453a10f..7d5a51c 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1483,6 +1483,12 @@ int kvm_vcpu_ioctl_set_one_reg(struct
  kvm_vcpu *vcpu,
  struct kvm_one_reg *reg)
return r;
  }
 
  +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  +  struct kvm_guest_debug *dbg) {
  + return -EINVAL;
  +}
  +
  int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct
  kvm_fpu
  *fpu)  {
return -ENOTSUPP;
  diff --git a/arch/powerpc/kvm/powerpc.c
  b/arch/powerpc/kvm/powerpc.c index 934413c..4c94ca9 100644
  --- a/arch/powerpc/kvm/powerpc.c
  +++ b/arch/powerpc/kvm/powerpc.c
  @@ -532,12 +532,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu
  *vcpu) #endif  }
 
  -int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  -struct kvm_guest_debug *dbg)
  -{
  - return -EINVAL;
  -}
  -
 
  This will break the build for non-book E machines, since
  kvm_arch_vcpu_ioctl_set_guest_debug() is referenced from generic code.
  You need to add it to arch/powerpc/kvm/book3s.c as well.
 
  right,  I will correct this.
 
  Would the implementation actually be different on booke vs book3s?
  My feeling is that powerpc.c is actually the right place for this.
 
 
  I am not sure there will be anything common between book3s and
  booke. Should
  we define the cpu specific function something like
  kvm_ppc_vcpu_ioctl_set_guest_debug() for booke and book3s and call
  this new defined function from kvm_arch_vcpu_ioctl_set_guest_debug() in
 powerpc.c ?
 
  No, just put it into the subarch directories then :). No need to
  overengineer anything for now.
 
   What do you mean by subarch? Above you mentioned that powerpc.c is the right
   place. Isn't this patch already doing that partially? :)
 
 If the code in powerpc.c only says
 
  void kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  					 struct kvm_guest_debug *dbg)
  {
  	kvmppc_core_set_guest_debug(vcpu, dbg);
  }
 
 then doing it in powerpc.c is obviously moot. Since there is no other debug
 implementation, it's ok if we try and find (and create) commonalities later.
 So
 yes, it's ok if you put it into booke.c or even e500.c. Just make sure to not
 break any other archs (440, book3s_pr, book3s_hv).

Right, I will make sure it compiles for all archs.

Thanks.
-Bharat

 
 
 Alex
 




RE: [PATCH 4/8] Added ONE_REG interface for debug instruction

2013-01-31 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, January 31, 2013 11:23 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 4/8] Added ONE_REG interface for debug instruction
 
 
 On 31.01.2013, at 18:44, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, January 25, 2013 5:18 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 4/8] Added ONE_REG interface for debug
  instruction
 
 
  On 16.01.2013, at 09:24, Bharat Bhushan wrote:
 
  This patch adds the one_reg interface to get the special instruction
  to be used for setting software breakpoint from userspace.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  Documentation/virtual/kvm/api.txt   |1 +
  arch/powerpc/include/asm/kvm_ppc.h  |1 +
  arch/powerpc/include/uapi/asm/kvm.h |3 +++
  arch/powerpc/kvm/44x.c  |5 +
  arch/powerpc/kvm/booke.c|   10 ++
  arch/powerpc/kvm/e500.c |5 +
  arch/powerpc/kvm/e500.h |9 +
  arch/powerpc/kvm/e500mc.c   |5 +
  8 files changed, 39 insertions(+), 0 deletions(-)
 
  diff --git a/Documentation/virtual/kvm/api.txt
  b/Documentation/virtual/kvm/api.txt
  index 09905cb..7e8be9e 100644
  --- a/Documentation/virtual/kvm/api.txt
  +++ b/Documentation/virtual/kvm/api.txt
  @@ -1775,6 +1775,7 @@ registers, find a list below:
   PPC   | KVM_REG_PPC_VPA_DTL   | 128
   PPC   | KVM_REG_PPC_EPCR | 32
   PPC   | KVM_REG_PPC_EPR  | 32
  +  PPC   | KVM_REG_PPC_DEBUG_INST| 32
 
  4.69 KVM_GET_ONE_REG
 
  diff --git a/arch/powerpc/include/asm/kvm_ppc.h
  b/arch/powerpc/include/asm/kvm_ppc.h
  index 44a657a..b3c481e 100644
  --- a/arch/powerpc/include/asm/kvm_ppc.h
  +++ b/arch/powerpc/include/asm/kvm_ppc.h
  @@ -235,6 +235,7 @@ union kvmppc_one_reg {
 
  void kvmppc_core_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs
  *sregs); int kvmppc_core_set_sregs(struct kvm_vcpu *vcpu, struct
  kvm_sregs *sregs);
  +u32 kvmppc_core_debug_inst_op(void);
 
  void kvmppc_get_sregs_ivor(struct kvm_vcpu *vcpu, struct kvm_sregs
  *sregs); int kvmppc_set_sregs_ivor(struct kvm_vcpu *vcpu, struct
  kvm_sregs *sregs); diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index 16064d0..e81ae5b 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -417,4 +417,7 @@ struct kvm_get_htab_header {
  #define KVM_REG_PPC_EPCR  (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x85)
  #define KVM_REG_PPC_EPR   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x86)
 
  +/* Debugging: Special instruction for software breakpoint */
  +#define KVM_REG_PPC_DEBUG_INST (KVM_REG_PPC | KVM_REG_SIZE_U32 |
  +0x87)
  +
  #endif /* __LINUX_KVM_POWERPC_H */
  diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c index
  3d7fd21..41501be 100644
  --- a/arch/powerpc/kvm/44x.c
  +++ b/arch/powerpc/kvm/44x.c
  @@ -114,6 +114,11 @@ int kvmppc_core_vcpu_translate(struct kvm_vcpu *vcpu,
return 0;
  }
 
   +u32 kvmppc_core_debug_inst_op(void)
   +{
   +	return -1;
 
 The way you handle it here this needs to be an  int
 kvmppc_core_debug_inst_op(u32 *inst) so you can return an error for 440. I 
 don't
 think it's worth to worry about a case where we don't know about the inst
 though. Just return the same as what we use on e500v2 here.
 
  +}
  +
  void kvmppc_core_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs
  *sregs) {
kvmppc_get_sregs_ivor(vcpu, sregs); diff --git
  a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  d2f502d..453a10f 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
 
  Please provide the DEBUG_INST on a more global level - across all ppc
 subarchs.
 
  Do you mean defining in powerpc.c ?
 
  We are using one_reg for DEBUG_INST and one_reg_ioctl and defined in
 respective subarchs (booke and books have their separate handler). So how you
 want this to be defined in more common way for all subarchs?
 
 Just add it to all subarch's one_reg handlers.

And what should book3s etc. return?

-1?

Thanks
-Bharat



RE: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to guest

2013-01-31 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of
 Alexander Graf
 Sent: Thursday, January 31, 2013 5:34 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to 
 guest
 
 
 On 30.01.2013, at 12:12, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Friday, January 25, 2013 5:44 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt
  injection to guest
 
 
  On 16.01.2013, at 09:24, Bharat Bhushan wrote:
 
  Allow userspace to inject debug interrupt to guest. QEMU can
 
  s/QEMU/user space.
 
  inject the debug interrupt to guest if it is not able to handle the
  debug interrupt.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/booke.c  |   32 +++-
  arch/powerpc/kvm/e500mc.c |   10 +-
  2 files changed, 40 insertions(+), 2 deletions(-)
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index faa0a0b..547797f 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,13 @@ static void kvmppc_vcpu_sync_fpu(struct
  kvm_vcpu
  *vcpu) #endif }
 
  +#ifdef CONFIG_KVM_BOOKE_HV
   +static int kvmppc_core_pending_debug(struct kvm_vcpu *vcpu)
   +{
   +	return test_bit(BOOKE_IRQPRIO_DEBUG,
   +			&vcpu->arch.pending_exceptions);
   +}
   +#endif
  +
  /*
  * Helper function for full MSR writes.  No need to call this if
  only
  * EE/CE/ME/DE/RI are changing.
  @@ -144,7 +151,11 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32
  new_msr) #ifdef CONFIG_KVM_BOOKE_HV
new_msr |= MSR_GS;
 
   -	if (vcpu->guest_debug)
  + /*
  +  * Set MSR_DE if the hardware debug resources are owned by user-space
  +  * and there is no debug interrupt pending for guest to handle.
 
  Why?
 
  QEMU is using the IAC/DAC registers to set hardware breakpoint/watchpoints 
  via
 debug ioctls. As debug events are enabled/gated by MSR_DE so somehow we need 
 to
 set MSR_DE on hardware MSR when guest is running in this case.
 
 Reading this 5 times I still have no idea what you're really checking for 
 here.
 Maybe the naming for kvmppc_core_pending_debug is just unnatural? What does 
 that
 function do really?
 
 
  On bookehv this is how I am controlling the MSR_DE in hardware MSR.
 
  And why is this whole thing only executed on HV?
 
   On e500v2 we always enable MSR_DE using vcpu->arch.shadow_msr in e500.c:
   #ifndef CONFIG_KVM_BOOKE_HV
   -	vcpu->arch.shadow_msr = MSR_USER | MSR_IS | MSR_DS;
   +	vcpu->arch.shadow_msr = MSR_USER | MSR_DE | MSR_IS | MSR_DS;


diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index b340a62..1e2d663 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -151,10 +151,14 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)

/*
 * Set MSR_DE if the hardware debug resources are owned by user-space
-* and there is no debug interrupt pending for guest to handle.
 */
-	if (vcpu->guest_debug && !kvmppc_core_pending_debug(vcpu))
+	if (vcpu->guest_debug)
 		new_msr |= MSR_DE;
+#else
+	if (vcpu->guest_debug)
+		vcpu->arch.shadow_msr |= MSR_DE;
#endif

But I do not know when I should clear it.

 
 Why? How is e500v2 any different wrt debug? And why wouldn't that work for
 e500mc?
 
 
 Alex
 




RE: [PATCH 0/8] KVM: BOOKE/BOOKEHV : Added debug stub support

2013-01-31 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Friday, January 25, 2013 6:08 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 0/8] KVM: BOOKE/BOOKEHV : Added debug stub support
 
 
 On 16.01.2013, at 09:20, Bharat Bhushan wrote:
 
  This patchset adds the QEMU debug stub support for powerpc (booke/bookehv).
  [1/8] KVM: PPC: booke: use vcpu reference from thread_struct
  - This is a cleanup patch to use vcpu reference from thread struct
  [2/8] KVM: PPC: booke: Allow multiple exception types [3/8] KVM: PPC:
  booke: Added debug handler
  - These two patches install the KVM debug handler.
  [4/8] Added ONE_REG interface for debug instruction
  - Add the ioctl interface to get the debug instruction for
setting software breakpoint from QEMU debug stub.
  [5/8] KVM: PPC: debug stub interface parameter defined [6/8] booke:
  Added DBCR4 SPR number [7/8] KVM: booke/bookehv: Add debug stub
  support
  - Add the debug stub interface on booke/bookehv [8/8] KVM:PPC:booke:
  Allow debug interrupt injection to guest
  -- with this qemu can inject debug interrupt to guest
 
 Thanks, applied 1/8, 2/8, 6/8.


Alex I cannot see these 3 patches on kvm-ppc-next branch. Are those applied on 
some other branch ?

Thanks
-Bharat



RE: [PATCH 3/8] KVM: PPC: booke: Added debug handler

2013-01-31 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Thursday, January 31, 2013 10:38 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
 Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
 On 31.01.2013, at 17:58, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, January 31, 2013 5:47 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org
  Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
  On 30.01.2013, at 12:30, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Friday, January 25, 2013 5:13 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
  On 16.01.2013, at 09:24, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  Installed debug handler will be used for guest debug support and
  debug facility emulation features (patches for these features will
  follow this patch).
 
  Signed-off-by: Liu Yu yu@freescale.com
  [bharat.bhus...@freescale.com: Substantial changes]
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |1 +
  arch/powerpc/kernel/asm-offsets.c   |1 +
  arch/powerpc/kvm/booke_interrupts.S |   49 
  ++-
 --
  --
  3 files changed, 44 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index 8a72d59..f4ba881 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -503,6 +503,7 @@ struct kvm_vcpu_arch {
  u32 tlbcfg[4];
  u32 mmucfg;
  u32 epr;
  +   u32 crit_save;
  struct kvmppc_booke_debug_reg dbg_reg; #endif
  gpa_t paddr_accessed;
  diff --git a/arch/powerpc/kernel/asm-offsets.c
  b/arch/powerpc/kernel/asm-offsets.c
  index 46f6afd..02048f3 100644
  --- a/arch/powerpc/kernel/asm-offsets.c
  +++ b/arch/powerpc/kernel/asm-offsets.c
  @@ -562,6 +562,7 @@ int main(void)
  DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, 
  arch.last_inst));
  DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, 
  arch.fault_dear));
  DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu,
  arch.fault_esr));
  +   DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu,
  +arch.crit_save));
  #endif /* CONFIG_PPC_BOOK3S */
  #endif /* CONFIG_KVM */
 
  diff --git a/arch/powerpc/kvm/booke_interrupts.S
  b/arch/powerpc/kvm/booke_interrupts.S
  index eae8483..dd9c5d4 100644
  --- a/arch/powerpc/kvm/booke_interrupts.S
  +++ b/arch/powerpc/kvm/booke_interrupts.S
  @@ -52,12 +52,7 @@
   	(1<<BOOKE_INTERRUPT_PROGRAM) | \
   	(1<<BOOKE_INTERRUPT_DTLB_MISS))
 
  -.macro KVM_HANDLER ivor_nr scratch srr0
  -_GLOBAL(kvmppc_handler_\ivor_nr)
  -   /* Get pointer to vcpu and record exit number. */
  -   mtspr   \scratch , r4
  -   mfspr   r4, SPRN_SPRG_THREAD
  -   lwz r4, THREAD_KVM_VCPU(r4)
  +.macro __KVM_HANDLER ivor_nr scratch srr0
  stw r3, VCPU_GPR(R3)(r4)
  stw r5, VCPU_GPR(R5)(r4)
  stw r6, VCPU_GPR(R6)(r4)
  @@ -74,6 +69,46 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
  bctr
  .endm
 
  +.macro KVM_HANDLER ivor_nr scratch srr0
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  +   /* Get pointer to vcpu and record exit number. */
  +   mtspr   \scratch , r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
   +	__KVM_HANDLER \ivor_nr \scratch \srr0
   +.endm
  +
  +.macro KVM_DBG_HANDLER ivor_nr scratch srr0
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  +   mtspr   \scratch, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
  +   stw r3, VCPU_CRIT_SAVE(r4)
   +	mfcr	r3
  +   mfspr   r4, SPRN_CSRR1
  +   andi.   r4, r4, MSR_PR
  +   bne 1f
 
 
  +   /* debug interrupt happened in enter/exit path */
  +   mfspr   r4, SPRN_CSRR1
  +   rlwinm  r4, r4, 0, ~MSR_DE
  +   mtspr   SPRN_CSRR1, r4
  +   lis r4, 0x
  +   ori r4, r4, 0x
  +   mtspr   SPRN_DBSR, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
   +	mtcr	r3
  +   lwz r3, VCPU_CRIT_SAVE(r4)
  +   mfspr   r4, \scratch
  +   rfci
 
  What is this part doing? Try to ignore the debug exit?
 
  As BOOKE doesn't have hardware support for virtualization, hardware
  never know
  current pc is in guest or in host.
  So when enable hardware single step for guest, it cannot be disabled
  at the
  time guest exit. Thus, we'll see that an single step interrupt

RE: [PATCH 7/8] KVM: PPC: booke/bookehv: Add debug stub support

2013-01-31 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, January 25, 2013 5:37 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 7/8] KVM: PPC: booke/bookehv: Add debug stub support
 
 
 On 16.01.2013, at 09:24, Bharat Bhushan wrote:
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h   |5 +
  arch/powerpc/include/asm/kvm_ppc.h|2 +
  arch/powerpc/include/uapi/asm/kvm.h   |   22 -
  arch/powerpc/kernel/asm-offsets.c |   26 ++
  arch/powerpc/kvm/booke.c  |  124 +
  arch/powerpc/kvm/booke_interrupts.S   |  114 ++
  arch/powerpc/kvm/bookehv_interrupts.S |  145 
  -
  arch/powerpc/kvm/e500_emulate.c   |6 ++
  arch/powerpc/kvm/e500mc.c |3 +-
  9 files changed, 422 insertions(+), 25 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index f4ba881..a9feeb0 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -504,7 +504,12 @@ struct kvm_vcpu_arch {
  u32 mmucfg;
  u32 epr;
  u32 crit_save;
  +   /* guest debug registers*/
  struct kvmppc_booke_debug_reg dbg_reg;
  +   /* shadow debug registers */
  +   struct kvmppc_booke_debug_reg shadow_dbg_reg;
  +   /* host debug registers*/
  +   struct kvmppc_booke_debug_reg host_dbg_reg;
  #endif
  gpa_t paddr_accessed;
  gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/asm/kvm_ppc.h
  b/arch/powerpc/include/asm/kvm_ppc.h
  index b3c481e..e4b3398 100644
  --- a/arch/powerpc/include/asm/kvm_ppc.h
  +++ b/arch/powerpc/include/asm/kvm_ppc.h
  @@ -45,6 +45,8 @@ enum emulation_result {
  EMULATE_FAIL, /* can't emulate this instruction */
  EMULATE_AGAIN,/* something went wrong. go again */
  EMULATE_DO_PAPR,  /* kvm_run filled with PAPR request */
  +   EMULATE_DEBUG_INST,   /* debug instruction for software
  +breakpoint, exit to userspace */
 
 Does this do something different from DO_PAPR? Maybe it makes sense to have an
 exit code EMULATE_EXIT_USER?

I think EMULATE_DO_PAPR does something similar, but the name is confusing. Maybe
we can rename EMULATE_DO_PAPR to EMULATE_EXIT_USER.

Thanks
-Bharat
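
For illustration, a minimal sketch (not the final upstream naming; members not
quoted in the hunk above are omitted) of what the rename discussed here could
look like:

/* Sketch only: EMULATE_DO_PAPR generalized into EMULATE_EXIT_USER. */
enum emulation_result {
	EMULATE_FAIL,       /* can't emulate this instruction */
	EMULATE_AGAIN,      /* something went wrong, try again */
	EMULATE_EXIT_USER,  /* emulation requires an exit to user space
	                     * (PAPR hcall, software breakpoint, ...) */
};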
 
  };
 
  extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu
  *vcpu); diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index e8842ed..a81ab29 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
   /* Select powerpc specific features in linux/kvm.h */
   #define __KVM_HAVE_SPAPR_TCE
   #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
  __u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
  __u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both) and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
  +#define KVMPPC_DEBUG_NONE  0x0
   +#define KVMPPC_DEBUG_BREAKPOINT   (1UL << 1)
   +#define KVMPPC_DEBUG_WATCH_WRITE  (1UL << 2)
   +#define KVMPPC_DEBUG_WATCH_READ   (1UL << 3)
  struct kvm_debug_exit_arch {
  +   __u64 address;
  +   /*
  +* exiting to userspace because of h/w breakpoint, watchpoint
  +* (read, write or both) and software breakpoint.
  +*/
  +   __u32 status;
  +   __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
   * Type denotes h/w breakpoint, read watchpoint, write
   * watchpoint or watchpoint (both read and write).
   */
   -#define KVMPPC_DEBUG_NOTYPE   0x0
   -#define KVMPPC_DEBUG_BREAKPOINT   (1UL << 1)
   -#define KVMPPC_DEBUG_WATCH_WRITE  (1UL << 2)
   -#define KVMPPC_DEBUG_WATCH_READ   (1UL << 3)
  __u32 type;
  __u32 reserved;
  } bp[16];
  diff --git a/arch/powerpc/kernel/asm-offsets.c
  b/arch/powerpc/kernel/asm-offsets.c
  index 02048f3..22deda7 100644
  --- a/arch/powerpc/kernel/asm-offsets.c
  +++ b/arch/powerpc/kernel/asm-offsets.c
  @@ -563,6 +563,32 @@ int main(void)
  DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
  DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
  DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save));
  +   DEFINE(VCPU_DBSR, offsetof(struct kvm_vcpu, arch.dbsr));
  +   DEFINE(VCPU_SHADOW_DBG, offsetof(struct kvm_vcpu, arch.shadow_dbg_reg));
  +   DEFINE(VCPU_HOST_DBG
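
For illustration, a hypothetical userspace sketch of how the KVMPPC_DEBUG_*
types and kvm_debug_exit_arch fields added by this patch would be consumed
through the generic KVM_SET_GUEST_DEBUG path. The bp[].addr field and the exact
flag combination are assumptions based on the hunks quoted above, not part of
this mail.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Arm one hardware instruction breakpoint on a vcpu fd. */
static int arm_hw_breakpoint(int vcpu_fd, __u64 addr)
{
	struct kvm_guest_debug dbg;

	memset(&dbg, 0, sizeof(dbg));
	dbg.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
	dbg.arch.bp[0].addr = addr;                    /* assumed field */
	dbg.arch.bp[0].type = KVMPPC_DEBUG_BREAKPOINT; /* h/w instruction bp */

	/* On KVM_EXIT_DEBUG, kvm_run->debug.arch.status would then report
	 * which KVMPPC_DEBUG_* event fired and .address where it hit. */
	return ioctl(vcpu_fd, KVM_SET_GUEST_DEBUG, &dbg);
}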

RE: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to guest

2013-01-30 Thread Bhushan Bharat-R65777


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On
 Behalf Of Alexander Graf
 Sent: Friday, January 25, 2013 5:44 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to 
 guest
 
 
 On 16.01.2013, at 09:24, Bharat Bhushan wrote:
 
  Allow userspace to inject debug interrupt to guest. QEMU can
 
 s/QEMU/user space.
 
  inject the debug interrupt to guest if it is not able to handle the
  debug interrupt.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kvm/booke.c  |   32 +++-
  arch/powerpc/kvm/e500mc.c |   10 +-
  2 files changed, 40 insertions(+), 2 deletions(-)
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  faa0a0b..547797f 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,13 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu
  *vcpu) #endif }
 
  +#ifdef CONFIG_KVM_BOOKE_HV
   +static int kvmppc_core_pending_debug(struct kvm_vcpu *vcpu)
   +{
   +   return test_bit(BOOKE_IRQPRIO_DEBUG, &vcpu->arch.pending_exceptions);
   +}
   +#endif
  +
  /*
   * Helper function for full MSR writes.  No need to call this if only
   * EE/CE/ME/DE/RI are changing.
  @@ -144,7 +151,11 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
  #ifdef CONFIG_KVM_BOOKE_HV
  new_msr |= MSR_GS;
 
   -   if (vcpu->guest_debug)
  +   /*
  +* Set MSR_DE if the hardware debug resources are owned by user-space
  +* and there is no debug interrupt pending for guest to handle.
 
 Why?

QEMU uses the IAC/DAC registers to set hardware breakpoints/watchpoints via the
debug ioctls. As debug events are enabled/gated by MSR_DE, we need to set MSR_DE
in the hardware MSR while the guest is running in this case.

On bookehv this is how I control MSR_DE in the hardware MSR.

 And why is this whole thing only executed on HV?

On e500v2 we always enable MSR_DE using vcpu->arch.shadow_msr in e500.c:
#ifndef CONFIG_KVM_BOOKE_HV
-   vcpu->arch.shadow_msr = MSR_USER | MSR_IS | MSR_DS;
+   vcpu->arch.shadow_msr = MSR_USER | MSR_DE | MSR_IS | MSR_DS;
vcpu->arch.shadow_pid = 1;
vcpu->arch.shared->msr = 0;
#endif

Thanks
-Bharat

 
 
 Alex
 
  +*/
   +   if (vcpu->guest_debug && !kvmppc_core_pending_debug(vcpu))
  new_msr |= MSR_DE;
  #endif
 
  @@ -234,6 +245,16 @@ static void kvmppc_core_dequeue_watchdog(struct 
  kvm_vcpu
 *vcpu)
   clear_bit(BOOKE_IRQPRIO_WATCHDOG, &vcpu->arch.pending_exceptions);
  }
 
  +static void kvmppc_core_queue_debug(struct kvm_vcpu *vcpu)
  +{
  +   kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DEBUG);
  +}
  +
  +static void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu)
  +{
   +   clear_bit(BOOKE_IRQPRIO_DEBUG, &vcpu->arch.pending_exceptions);
  +}
  +
  static void set_guest_srr(struct kvm_vcpu *vcpu, unsigned long srr0, u32 
  srr1)
  {
  #ifdef CONFIG_KVM_BOOKE_HV
  @@ -1278,6 +1299,7 @@ static void get_sregs_base(struct kvm_vcpu *vcpu,
   sregs->u.e.dec = kvmppc_get_dec(vcpu, tb);
   sregs->u.e.tb = tb;
   sregs->u.e.vrsave = vcpu->arch.vrsave;
   +   sregs->u.e.dbsr = vcpu->arch.dbsr;
  }
 
  static int set_sregs_base(struct kvm_vcpu *vcpu,
  @@ -1310,6 +1332,14 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
  update_timer_ints(vcpu);
  }
 
   +   if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_DBSR) {
   +   vcpu->arch.dbsr = sregs->u.e.dbsr;
   +   if (vcpu->arch.dbsr)
  +   kvmppc_core_queue_debug(vcpu);
  +   else
  +   kvmppc_core_dequeue_debug(vcpu);
  +   }
  +
  return 0;
  }
 
  diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
  index 81abe92..7d90622 100644
  --- a/arch/powerpc/kvm/e500mc.c
  +++ b/arch/powerpc/kvm/e500mc.c
  @@ -208,7 +208,7 @@ void kvmppc_core_get_sregs(struct kvm_vcpu *vcpu, struct
 kvm_sregs *sregs)
  struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
 
   sregs->u.e.features |= KVM_SREGS_E_ARCH206_MMU | KVM_SREGS_E_PM |
  -  KVM_SREGS_E_PC;
  +  KVM_SREGS_E_PC | KVM_SREGS_E_ED;
   sregs->u.e.impl_id = KVM_SREGS_E_IMPL_FSL;
 
   sregs->u.e.impl.fsl.features = 0;
  @@ -216,6 +216,9 @@ void kvmppc_core_get_sregs(struct kvm_vcpu *vcpu, struct
 kvm_sregs *sregs)
   sregs->u.e.impl.fsl.hid0 = vcpu_e500->hid0;
   sregs->u.e.impl.fsl.mcar = vcpu_e500->mcar;
 
   +   sregs->u.e.dsrr0 = vcpu->arch.dsrr0;
   +   sregs->u.e.dsrr1 = vcpu->arch.dsrr1;
  +
  kvmppc_get_sregs_e500_tlb(vcpu, sregs);
 
   sregs->u.e.ivor_high[3] =
  @@ -256,6 +259,11 @@ int kvmppc_core_set_sregs(struct kvm_vcpu *vcpu, struct
 kvm_sregs *sregs)
   sregs->u.e.ivor_high[5];
  }
 
   +   if (sregs->u.e.features & KVM_SREGS_E_ED) {
   +   vcpu->arch.dsrr0 = sregs->u.e.dsrr0
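
For illustration, a hypothetical userspace sketch of injecting a debug interrupt
through the KVM_SREGS_E_UPDATE_DBSR path added above; the get/set flow around it
is an assumption, only the dbsr/update_special usage comes from the hunks.

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Write DBSR into the guest; a non-zero value queues BOOKE_IRQPRIO_DEBUG
 * in set_sregs_base(), zero dequeues it. */
static int inject_guest_debug(int vcpu_fd, __u64 dbsr_bits)
{
	struct kvm_sregs sregs;

	if (ioctl(vcpu_fd, KVM_GET_SREGS, &sregs) < 0)
		return -1;

	sregs.u.e.dbsr = dbsr_bits;
	sregs.u.e.update_special |= KVM_SREGS_E_UPDATE_DBSR;

	return ioctl(vcpu_fd, KVM_SET_SREGS, &sregs);
}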

RE: [PATCH 3/8] KVM: PPC: booke: Added debug handler

2013-01-30 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Friday, January 25, 2013 5:13 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 3/8] KVM: PPC: booke: Added debug handler
 
 
 On 16.01.2013, at 09:24, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  Installed debug handler will be used for guest debug support and debug
  facility emulation features (patches for these features will follow
  this patch).
 
  Signed-off-by: Liu Yu yu@freescale.com
  [bharat.bhus...@freescale.com: Substantial changes]
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/include/asm/kvm_host.h |1 +
  arch/powerpc/kernel/asm-offsets.c   |1 +
  arch/powerpc/kvm/booke_interrupts.S |   49 
  ++-
  3 files changed, 44 insertions(+), 7 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index 8a72d59..f4ba881 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -503,6 +503,7 @@ struct kvm_vcpu_arch {
  u32 tlbcfg[4];
  u32 mmucfg;
  u32 epr;
  +   u32 crit_save;
   struct kvmppc_booke_debug_reg dbg_reg;
   #endif
  gpa_t paddr_accessed;
  diff --git a/arch/powerpc/kernel/asm-offsets.c
  b/arch/powerpc/kernel/asm-offsets.c
  index 46f6afd..02048f3 100644
  --- a/arch/powerpc/kernel/asm-offsets.c
  +++ b/arch/powerpc/kernel/asm-offsets.c
  @@ -562,6 +562,7 @@ int main(void)
  DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
  DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
  DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
  +   DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save));
  #endif /* CONFIG_PPC_BOOK3S */
  #endif /* CONFIG_KVM */
 
  diff --git a/arch/powerpc/kvm/booke_interrupts.S
  b/arch/powerpc/kvm/booke_interrupts.S
  index eae8483..dd9c5d4 100644
  --- a/arch/powerpc/kvm/booke_interrupts.S
  +++ b/arch/powerpc/kvm/booke_interrupts.S
  @@ -52,12 +52,7 @@
  (1<<BOOKE_INTERRUPT_PROGRAM) | \
  (1<<BOOKE_INTERRUPT_DTLB_MISS))
 
  -.macro KVM_HANDLER ivor_nr scratch srr0
  -_GLOBAL(kvmppc_handler_\ivor_nr)
  -   /* Get pointer to vcpu and record exit number. */
  -   mtspr   \scratch , r4
  -   mfspr   r4, SPRN_SPRG_THREAD
  -   lwz r4, THREAD_KVM_VCPU(r4)
  +.macro __KVM_HANDLER ivor_nr scratch srr0
  stw r3, VCPU_GPR(R3)(r4)
  stw r5, VCPU_GPR(R5)(r4)
  stw r6, VCPU_GPR(R6)(r4)
  @@ -74,6 +69,46 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
  bctr
  .endm
 
  +.macro KVM_HANDLER ivor_nr scratch srr0
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  +   /* Get pointer to vcpu and record exit number. */
  +   mtspr   \scratch , r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
   +   __KVM_HANDLER \ivor_nr \scratch \srr0
   +.endm
  +
  +.macro KVM_DBG_HANDLER ivor_nr scratch srr0
  +_GLOBAL(kvmppc_handler_\ivor_nr)
  +   mtspr   \scratch, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
  +   stw r3, VCPU_CRIT_SAVE(r4)
   +   mfcr    r3
  +   mfspr   r4, SPRN_CSRR1
  +   andi.   r4, r4, MSR_PR
  +   bne 1f
 
 
  +   /* debug interrupt happened in enter/exit path */
  +   mfspr   r4, SPRN_CSRR1
  +   rlwinm  r4, r4, 0, ~MSR_DE
  +   mtspr   SPRN_CSRR1, r4
  +   lis r4, 0x
  +   ori r4, r4, 0x
  +   mtspr   SPRN_DBSR, r4
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
   +   mtcr    r3
  +   lwz r3, VCPU_CRIT_SAVE(r4)
  +   mfspr   r4, \scratch
  +   rfci
 
 What is this part doing? Try to ignore the debug exit?

As BookE doesn't have hardware support for virtualization, the hardware never
knows whether the current PC is in the guest or in the host. So when hardware
single step is enabled for the guest, it cannot be disabled at the time the
guest exits. Thus, we'll see a single step interrupt happen at the beginning of
the guest exit path.

With the above code we recognize this kind of single step interrupt, disable
single step, and rfci.
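
For illustration only, a C rendering of the KVM_DBG_HANDLER control flow quoted
above; the SPR accessors and helpers are placeholders, the real code is the
assembly in booke_interrupts.S.

#define MSR_PR 0x4000UL  /* interrupt taken from user/guest state */
#define MSR_DE 0x0200UL  /* debug interrupt enable */

extern unsigned long mfspr_csrr1(void);
extern void mtspr_csrr1(unsigned long val);
extern void mtspr_dbsr(unsigned long val);
extern void kvm_guest_exit_path(void);   /* the normal __KVM_HANDLER body */
extern void rfci(void);

void kvm_dbg_handler(void)
{
	if (mfspr_csrr1() & MSR_PR) {
		/* Debug interrupt hit while the guest was running:
		 * take the normal KVM exit path. */
		kvm_guest_exit_path();
		return;
	}

	/* Debug interrupt hit in KVM's own enter/exit path (a single step
	 * that could not be disabled before leaving the guest): suppress
	 * it and return to where we were. */
	mtspr_csrr1(mfspr_csrr1() & ~MSR_DE); /* don't single step the host */
	mtspr_dbsr(~0UL);                     /* DBSR is write-one-to-clear */
	rfci();
}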

 Why would we have MSR_DE
 enabled in the first place when we can't handle it?

When QEMU is using the hardware debug resources, we always set MSR_DE while the
guest is running.

 
  +1: /* debug interrupt happened in guest */
   +   mtcr    r3
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
  +   lwz r3, VCPU_CRIT_SAVE(r4)
  +   __KVM_HANDLER \ivor_nr \scratch \srr0
 
 I don't think you need the __KVM_HANDLER split. This should be quite easily
 refactorable into a simple DBG prolog.

Can you please elaborate how you are envisioning this?

Thanks
-Bharat

 
 
 Alex
 
  +.endm
  +
  .macro KVM_HANDLER_ADDR ivor_nr
  .long   kvmppc_handler_\ivor_nr
  .endm
  @@ -98,7 +133,7 @@ KVM_HANDLER BOOKE_INTERRUPT_FIT SPRN_SPRG_RSCRATCH0
  SPRN_SRR0

RE: [PATCH 5/8] KVM: PPC: debug stub interface parameter defined

2013-01-17 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Paul Mackerras [mailto:pau...@samba.org]
 Sent: Thursday, January 17, 2013 12:53 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Bhushan 
 Bharat-
 R65777
 Subject: Re: [PATCH 5/8] KVM: PPC: debug stub interface parameter defined
 
 On Wed, Jan 16, 2013 at 01:54:42PM +0530, Bharat Bhushan wrote:
  This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
  ioctl support. Follow up patches will use this for setting up hardware
  breakpoints, watchpoints and software breakpoints.
 
 [snip]
 
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
  453a10f..7d5a51c 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -1483,6 +1483,12 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu,
 struct kvm_one_reg *reg)
  return r;
   }
 
  +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  +struct kvm_guest_debug *dbg)
  +{
  +   return -EINVAL;
  +}
  +
   int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu
  *fpu)  {
  return -ENOTSUPP;
  diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
  index 934413c..4c94ca9 100644
  --- a/arch/powerpc/kvm/powerpc.c
  +++ b/arch/powerpc/kvm/powerpc.c
  @@ -532,12 +532,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
  #endif  }
 
  -int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
  -struct kvm_guest_debug *dbg)
  -{
  -   return -EINVAL;
  -}
  -
 
 This will break the build for non-book E machines, since
 kvm_arch_vcpu_ioctl_set_guest_debug() is referenced from generic code.
 You need to add it to arch/powerpc/kvm/book3s.c as well.

Right, I will correct this.

Thanks
-Bharat
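
For illustration, a minimal sketch of the fix Paul asks for: keep a stub in
arch/powerpc/kvm/book3s.c so the reference from generic code still links once
the common version is removed from powerpc.c. This mirrors the booke stub
quoted above and is not the actual upstream hunk.

#include <linux/kvm_host.h>

int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
					struct kvm_guest_debug *dbg)
{
	return -EINVAL;
}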




RE: [PATCH 1/6] KVM: PPC: booke: use vcpu reference from thread_struct

2012-10-04 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Monday, September 24, 2012 9:58 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 1/6] KVM: PPC: booke: use vcpu reference from 
 thread_struct
 
 
 On 21.08.2012, at 15:51, Bharat Bhushan wrote:
 
  Like other places, use thread_struct to get vcpu reference.
 
 Please remove the definition of SPRN_SPRG_R/WVCPU as well.

Ok

Thanks
-Bharat

 
 
 Alex
 
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  arch/powerpc/kernel/asm-offsets.c   |2 +-
  arch/powerpc/kvm/booke_interrupts.S |6 ++
  2 files changed, 3 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kernel/asm-offsets.c
  b/arch/powerpc/kernel/asm-offsets.c
  index 85b05c4..fbb999c 100644
  --- a/arch/powerpc/kernel/asm-offsets.c
  +++ b/arch/powerpc/kernel/asm-offsets.c
  @@ -116,7 +116,7 @@ int main(void)
  #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
   DEFINE(THREAD_KVM_SVCPU, offsetof(struct thread_struct, kvm_shadow_vcpu));
   #endif
   -#ifdef CONFIG_KVM_BOOKE_HV
   +#if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
  DEFINE(THREAD_KVM_VCPU, offsetof(struct thread_struct, kvm_vcpu));
  #endif
 
  diff --git a/arch/powerpc/kvm/booke_interrupts.S
  b/arch/powerpc/kvm/booke_interrupts.S
  index bb46b32..ca16d57 100644
  --- a/arch/powerpc/kvm/booke_interrupts.S
  +++ b/arch/powerpc/kvm/booke_interrupts.S
  @@ -56,7 +56,8 @@
  _GLOBAL(kvmppc_handler_\ivor_nr)
  /* Get pointer to vcpu and record exit number. */
  mtspr   \scratch , r4
  -   mfspr   r4, SPRN_SPRG_RVCPU
  +   mfspr   r4, SPRN_SPRG_THREAD
  +   lwz r4, THREAD_KVM_VCPU(r4)
  stw r3, VCPU_GPR(R3)(r4)
  stw r5, VCPU_GPR(R5)(r4)
  stw r6, VCPU_GPR(R6)(r4)
  @@ -402,9 +403,6 @@ lightweight_exit:
  lwz r8, kvmppc_booke_handlers@l(r8)
  mtspr   SPRN_IVPR, r8
 
  -   /* Save vcpu pointer for the exception handlers. */
  -   mtspr   SPRN_SPRG_WVCPU, r4
  -
  lwz r5, VCPU_SHARED(r4)
 
  /* Can't switch the stack pointer until after IVPR is switched,
  --
  1.7.0.4
 
 
 



