RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Wednesday, July 24, 2013 1:55 PM
 To: “tiejun.chen”
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org list;
 Wood Scott-B07421; Gleb Natapov; Paolo Bonzini
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 24.07.2013, at 04:26, “tiejun.chen” wrote:
 
  On 07/18/2013 06:27 PM, Alexander Graf wrote:
 
  On 18.07.2013, at 12:19, “tiejun.chen” wrote:
 
  On 07/18/2013 06:12 PM, Alexander Graf wrote:
 
  On 18.07.2013, at 12:08, “tiejun.chen” wrote:
 
  On 07/18/2013 05:48 PM, Alexander Graf wrote:
 
  On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Bhushan Bharat-R65777
  Sent: Thursday, July 18, 2013 1:53 PM
  To: ' tiejun.chen '
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  ag...@suse.de; Wood Scott-
  B07421
  Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only
  for kernel managed pages
 
 
 
  -Original Message-
  From:  tiejun.chen  [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 1:52 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
  only for kernel managed pages
 
  On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of  tiejun.chen 
  
  Sent: Thursday, July 18, 2013 1:01 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  ag...@suse.de; Wood
  Scott-
  B07421
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
  only for kernel managed pages
 
  On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
  -Original Message-
  From:  tiejun.chen  [mailto:tiejun.c...@windriver.com]
  Sent: Thursday, July 18, 2013 11:56 AM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
  ag...@suse.de; Wood
  Scott- B07421; Bhushan Bharat-R65777
  Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
  only for kernel managed pages
 
  On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
  If there is a struct page for the requested mapping then
  it's normal DDR and the mapping sets M bit (coherent,
  cacheable) else this is treated as I/O and we set  I +
  G  (cache inhibited,
  guarded)
 
  This helps setting proper TLB mapping for direct assigned
  device
 
  Signed-off-by: Bharat Bhushan
  bharat.bhus...@freescale.com
  ---
 arch/powerpc/kvm/e500_mmu_host.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)
 
  diff --git a/arch/powerpc/kvm/e500_mmu_host.c
  b/arch/powerpc/kvm/e500_mmu_host.c
  index 1c6a9d7..089c227 100644
  --- a/arch/powerpc/kvm/e500_mmu_host.c
  +++ b/arch/powerpc/kvm/e500_mmu_host.c
  @@ -64,13 +64,20 @@ static inline u32
  e500_shadow_mas3_attrib(u32 mas3, int
  usermode)
 return mas3;
 }
 
  -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
  usermode)
  +static inline u32 e500_shadow_mas2_attrib(u32 mas2,
  +pfn_t pfn)
 {
  +  u32 mas2_attr;
  +
  +  mas2_attr = mas2  MAS2_ATTRIB_MASK;
  +
  +  if (!pfn_valid(pfn)) {
 
  Why not directly use kvm_is_mmio_pfn()?
 
  What I understand from this function (someone can correct
  me) is that it
  returns false when the page is managed by kernel and is
  not marked as RESERVED (for some reason). For us it does not
  matter whether the page is reserved or not, if it is kernel
  visible page then it
  is DDR.
 
 
  I think you are setting I|G by addressing all mmio pages,
  right? If so,
 
   KVM: direct mmio pfn check
 
   Userspace may specify memory slots that are backed by
  mmio pages rather than
   normal RAM.  In some cases it is not enough to identify
  these mmio
  pages
   by pfn_valid().  This patch adds checking the PageReserved as
 well.
 
  Do you know what are those some cases and how checking
  PageReserved helps in
  those cases?
 
  No, myself didn't see these actual cases in qemu,too. But this
  should be chronically persistent as I understand ;-)
 
  Then I will wait till someone educate me :)
 
  The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do
 not want to call this for all tlbwe operation unless it is necessary.
 
  It certainly does more than we need and potentially slows down the fast
 path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to
 check for pages that are declared reserved on the host. This happens in 2 
 cases:
 
1) Non cache coherent DMA
2) Memory hot remove
 
  The non coherent DMA case would be interesting, as with the mechanism 
  as
 it is in place in Linux today, we could potentially break normal guest 
 operation
 if we don't take it into account. However, it's Kconfig guarded by:
 
  

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Alexander Graf

On 24.07.2013, at 11:11, Bhushan Bharat-R65777 wrote:

 
 
 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Wednesday, July 24, 2013 1:55 PM
 To: “tiejun.chen”
 Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org 
 list;
 Wood Scott-B07421; Gleb Natapov; Paolo Bonzini
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
 managed pages
 
 
 On 24.07.2013, at 04:26, “tiejun.chen” wrote:
 
 On 07/18/2013 06:27 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 12:19, “tiejun.chen” wrote:
 
 On 07/18/2013 06:12 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 12:08, “tiejun.chen” wrote:
 
 On 07/18/2013 05:48 PM, Alexander Graf wrote:
 
 On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
 
 
 
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, July 18, 2013 1:53 PM
 To: ' tiejun.chen '
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
 ag...@suse.de; Wood Scott-
 B07421
 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only
 for kernel managed pages
 
 
 
 -Original Message-
 From:  tiejun.chen  [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 1:52 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
 ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
 only for kernel managed pages
 
 On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of  tiejun.chen 
 
 Sent: Thursday, July 18, 2013 1:01 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
 ag...@suse.de; Wood
 Scott-
 B07421
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
 only for kernel managed pages
 
 On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
 
 
 -Original Message-
 From:  tiejun.chen  [mailto:tiejun.c...@windriver.com]
 Sent: Thursday, July 18, 2013 11:56 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org;
 ag...@suse.de; Wood
 Scott- B07421; Bhushan Bharat-R65777
 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency
 only for kernel managed pages
 
 On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
 If there is a struct page for the requested mapping then
 it's normal DDR and the mapping sets M bit (coherent,
 cacheable) else this is treated as I/O and we set  I +
 G  (cache inhibited,
 guarded)
 
 This helps setting proper TLB mapping for direct assigned
 device
 
 Signed-off-by: Bharat Bhushan
 bharat.bhus...@freescale.com
 ---
   arch/powerpc/kvm/e500_mmu_host.c |   17 -
   1 files changed, 12 insertions(+), 5 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500_mmu_host.c
 b/arch/powerpc/kvm/e500_mmu_host.c
 index 1c6a9d7..089c227 100644
 --- a/arch/powerpc/kvm/e500_mmu_host.c
 +++ b/arch/powerpc/kvm/e500_mmu_host.c
 @@ -64,13 +64,20 @@ static inline u32
 e500_shadow_mas3_attrib(u32 mas3, int
 usermode)
return mas3;
   }
 
 -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
 usermode)
 +static inline u32 e500_shadow_mas2_attrib(u32 mas2,
 +pfn_t pfn)
   {
 +  u32 mas2_attr;
 +
 +  mas2_attr = mas2  MAS2_ATTRIB_MASK;
 +
 +  if (!pfn_valid(pfn)) {
 
 Why not directly use kvm_is_mmio_pfn()?
 
 What I understand from this function (someone can correct
 me) is that it
 returns false when the page is managed by kernel and is
 not marked as RESERVED (for some reason). For us it does not
 matter whether the page is reserved or not, if it is kernel
 visible page then it
 is DDR.
 
 
 I think you are setting I|G by addressing all mmio pages,
 right? If so,
 
 KVM: direct mmio pfn check
 
 Userspace may specify memory slots that are backed by
 mmio pages rather than
 normal RAM.  In some cases it is not enough to identify
 these mmio
 pages
 by pfn_valid().  This patch adds checking the PageReserved as
 well.
 
 Do you know what are those some cases and how checking
 PageReserved helps in
 those cases?
 
 No, myself didn't see these actual cases in qemu,too. But this
 should be chronically persistent as I understand ;-)
 
 Then I will wait till someone educate me :)
 
 The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do
 not want to call this for all tlbwe operation unless it is necessary.
 
 It certainly does more than we need and potentially slows down the fast
 path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to
 check for pages that are declared reserved on the host. This happens in 2 
 cases:
 
  1) Non cache coherent DMA
  2) Memory hot remove
 
 The non coherent DMA case would be interesting, as with the mechanism 
 as
 it is in place in Linux today, we could potentially break normal guest 
 operation
 if we don't take it into account. However, it's Kconfig guarded by:
 
depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON

Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Alexander Graf

On 24.07.2013, at 11:35, Gleb Natapov wrote:

 On Wed, Jul 24, 2013 at 11:21:11AM +0200, Alexander Graf wrote:
 Are not we going to use page_is_ram() from  e500_shadow_mas2_attrib() as 
 Scott commented?
 
 rWhy aren't we using page_is_ram() in kvm_is_mmio_pfn()?
 
 
 Because it is much slower and, IIRC, actually used to build pfn map that allow
 us to check quickly for valid pfn.

Then why should we use page_is_ram()? :)

I really don't want the e500 code to diverge too much from what the rest of the 
kvm code is doing.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Gleb Natapov
Copying Andrea for him to verify that I am not talking nonsense :)

On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote:
  diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
  index 1580dd4..5e8635b 100644
  --- a/virt/kvm/kvm_main.c
  +++ b/virt/kvm/kvm_main.c
  @@ -102,6 +102,10 @@ static bool largepages_enabled = true;
  
  bool kvm_is_mmio_pfn(pfn_t pfn)
  {
  +#ifdef CONFIG_MEMORY_HOTPLUG
 
 I'd feel safer if we narrow this down to e500.
 
  +   /*
  +* Currently only in memory hot remove case we may still need this.
  +*/
 if (pfn_valid(pfn)) {
 
 We still have to check for pfn_valid, no? So the #ifdef should be down here.
 
 int reserved;
 struct page *tail = pfn_to_page(pfn);
  @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn)
 }
 return PageReserved(tail);
 }
  +#endif
  
 return true;
  }
  
  Before apply this change:
  
  real(1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 
  1m21.376s
  user(0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 
  0m23.433s
  sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 0m50.349s
  
  After apply this change:
  
  real(1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 
  1m20.667s
  user(0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 
  0m22.615s
  sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 0m49.496s
  
  So,
  
  real(1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6%
  user(0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5%
  sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7%
 
 Very nice, so there is a real world performance benefit to doing this. Then 
 yes, I think it would make sense to change the global helper function to be 
 fast on e500 and use that one from e500_shadow_mas2_attrib() instead.
 
 Gleb, Paolo, any hard feelings?
 
I do not see how can we break the function in such a way and get
away with it. Not all valid pfns point to memory. Physical address can
be sparse (due to PCI hole, framebuffer or just because).

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Alexander Graf

On 24.07.2013, at 12:01, Gleb Natapov wrote:

 Copying Andrea for him to verify that I am not talking nonsense :)
 
 On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote:
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index 1580dd4..5e8635b 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -102,6 +102,10 @@ static bool largepages_enabled = true;
 
 bool kvm_is_mmio_pfn(pfn_t pfn)
 {
 +#ifdef CONFIG_MEMORY_HOTPLUG
 
 I'd feel safer if we narrow this down to e500.
 
 +   /*
 +* Currently only in memory hot remove case we may still need this.
 +*/
   if (pfn_valid(pfn)) {
 
 We still have to check for pfn_valid, no? So the #ifdef should be down here.
 
   int reserved;
   struct page *tail = pfn_to_page(pfn);
 @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn)
   }
   return PageReserved(tail);
   }
 +#endif
 
   return true;
 }
 
 Before apply this change:
 
 real(1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 
 1m21.376s
 user(0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 
 0m23.433s
 sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 0m50.349s
 
 After apply this change:
 
 real(1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 
 1m20.667s
 user(0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 
 0m22.615s
 sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 0m49.496s
 
 So,
 
 real(1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6%
 user(0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5%
 sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7%
 
 Very nice, so there is a real world performance benefit to doing this. Then 
 yes, I think it would make sense to change the global helper function to be 
 fast on e500 and use that one from e500_shadow_mas2_attrib() instead.
 
 Gleb, Paolo, any hard feelings?
 
 I do not see how can we break the function in such a way and get
 away with it. Not all valid pfns point to memory. Physical address can
 be sparse (due to PCI hole, framebuffer or just because).

But we don't check for sparseness today in here either. We merely check for 
incomplete huge pages.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Gleb Natapov
On Wed, Jul 24, 2013 at 12:09:42PM +0200, Alexander Graf wrote:
 
 On 24.07.2013, at 12:01, Gleb Natapov wrote:
 
  Copying Andrea for him to verify that I am not talking nonsense :)
  
  On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote:
  diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
  index 1580dd4..5e8635b 100644
  --- a/virt/kvm/kvm_main.c
  +++ b/virt/kvm/kvm_main.c
  @@ -102,6 +102,10 @@ static bool largepages_enabled = true;
  
  bool kvm_is_mmio_pfn(pfn_t pfn)
  {
  +#ifdef CONFIG_MEMORY_HOTPLUG
  
  I'd feel safer if we narrow this down to e500.
  
  +   /*
  +* Currently only in memory hot remove case we may still need 
  this.
  +*/
if (pfn_valid(pfn)) {
  
  We still have to check for pfn_valid, no? So the #ifdef should be down 
  here.
  
int reserved;
struct page *tail = pfn_to_page(pfn);
  @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn)
}
return PageReserved(tail);
}
  +#endif
  
return true;
  }
  
  Before apply this change:
  
  real(1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 
  1m21.376s
  user(0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 
  0m23.433s
  sys   (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 
  0m50.349s
  
  After apply this change:
  
  real(1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 
  1m20.667s
  user(0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 
  0m22.615s
  sys   (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 
  0m49.496s
  
  So,
  
  real(1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6%
  user(0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5%
  sys   (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7%
  
  Very nice, so there is a real world performance benefit to doing this. 
  Then yes, I think it would make sense to change the global helper function 
  to be fast on e500 and use that one from e500_shadow_mas2_attrib() instead.
  
  Gleb, Paolo, any hard feelings?
  
  I do not see how can we break the function in such a way and get
  away with it. Not all valid pfns point to memory. Physical address can
  be sparse (due to PCI hole, framebuffer or just because).
 
 But we don't check for sparseness today in here either. We merely check for 
 incomplete huge pages.
 
That's not how I read the code. The code checks for reserved flag set.
It should be set on pfns that point to memory holes. As far as I
understand huge page tricks they are there to guaranty that THP does not
change flags under us, Andrea?

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Alexander Graf

On 24.07.2013, at 12:19, Gleb Natapov wrote:

 On Wed, Jul 24, 2013 at 12:09:42PM +0200, Alexander Graf wrote:
 
 On 24.07.2013, at 12:01, Gleb Natapov wrote:
 
 Copying Andrea for him to verify that I am not talking nonsense :)
 
 On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote:
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index 1580dd4..5e8635b 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -102,6 +102,10 @@ static bool largepages_enabled = true;
 
 bool kvm_is_mmio_pfn(pfn_t pfn)
 {
 +#ifdef CONFIG_MEMORY_HOTPLUG
 
 I'd feel safer if we narrow this down to e500.
 
 +   /*
 +* Currently only in memory hot remove case we may still need 
 this.
 +*/
  if (pfn_valid(pfn)) {
 
 We still have to check for pfn_valid, no? So the #ifdef should be down 
 here.
 
  int reserved;
  struct page *tail = pfn_to_page(pfn);
 @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn)
  }
  return PageReserved(tail);
  }
 +#endif
 
  return true;
 }
 
 Before apply this change:
 
 real(1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 
 1m21.376s
 user(0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 
 0m23.433s
 sys   (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 
 0m50.349s
 
 After apply this change:
 
 real(1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 
 1m20.667s
 user(0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 
 0m22.615s
 sys   (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 
 0m49.496s
 
 So,
 
 real(1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6%
 user(0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5%
 sys   (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7%
 
 Very nice, so there is a real world performance benefit to doing this. 
 Then yes, I think it would make sense to change the global helper function 
 to be fast on e500 and use that one from e500_shadow_mas2_attrib() instead.
 
 Gleb, Paolo, any hard feelings?
 
 I do not see how can we break the function in such a way and get
 away with it. Not all valid pfns point to memory. Physical address can
 be sparse (due to PCI hole, framebuffer or just because).
 
 But we don't check for sparseness today in here either. We merely check for 
 incomplete huge pages.
 
 That's not how I read the code. The code checks for reserved flag set.
 It should be set on pfns that point to memory holes. As far as I

I couldn't find any traces of code that sets the reserved bits on e500 chips 
though. I've only seen it getting set for memory hotplug and memory incoherent 
DMA code which doesn't get used on e500.

But I'd be more than happy to get proven wrong :).


Alex

 understand huge page tricks they are there to guaranty that THP does not
 change flags under us, Andrea?
 
 --
   Gleb.
 --
 To unsubscribe from this list: send the line unsubscribe kvm-ppc in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Gleb Natapov
On Wed, Jul 24, 2013 at 12:25:18PM +0200, Alexander Graf wrote:
 
 On 24.07.2013, at 12:19, Gleb Natapov wrote:
 
  On Wed, Jul 24, 2013 at 12:09:42PM +0200, Alexander Graf wrote:
  
  On 24.07.2013, at 12:01, Gleb Natapov wrote:
  
  Copying Andrea for him to verify that I am not talking nonsense :)
  
  On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote:
  diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
  index 1580dd4..5e8635b 100644
  --- a/virt/kvm/kvm_main.c
  +++ b/virt/kvm/kvm_main.c
  @@ -102,6 +102,10 @@ static bool largepages_enabled = true;
  
  bool kvm_is_mmio_pfn(pfn_t pfn)
  {
  +#ifdef CONFIG_MEMORY_HOTPLUG
  
  I'd feel safer if we narrow this down to e500.
  
  +   /*
  +* Currently only in memory hot remove case we may still need 
  this.
  +*/
   if (pfn_valid(pfn)) {
  
  We still have to check for pfn_valid, no? So the #ifdef should be down 
  here.
  
   int reserved;
   struct page *tail = pfn_to_page(pfn);
  @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn)
   }
   return PageReserved(tail);
   }
  +#endif
  
   return true;
  }
  
  Before apply this change:
  
  real(1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 
  1m21.376s
  user(0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 
  0m23.433s
  sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 
  0m50.349s
  
  After apply this change:
  
  real(1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 
  1m20.667s
  user(0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 
  0m22.615s
  sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 
  0m49.496s
  
  So,
  
  real(1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6%
  user(0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5%
  sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7%
  
  Very nice, so there is a real world performance benefit to doing this. 
  Then yes, I think it would make sense to change the global helper 
  function to be fast on e500 and use that one from 
  e500_shadow_mas2_attrib() instead.
  
  Gleb, Paolo, any hard feelings?
  
  I do not see how can we break the function in such a way and get
  away with it. Not all valid pfns point to memory. Physical address can
  be sparse (due to PCI hole, framebuffer or just because).
  
  But we don't check for sparseness today in here either. We merely check 
  for incomplete huge pages.
  
  That's not how I read the code. The code checks for reserved flag set.
  It should be set on pfns that point to memory holes. As far as I
 
 I couldn't find any traces of code that sets the reserved bits on e500 chips 
 though. I've only seen it getting set for memory hotplug and memory 
 incoherent DMA code which doesn't get used on e500.
 
 But I'd be more than happy to get proven wrong :).
 
Can you write a module that scans all page structures? AFAIK all pages
are marked as reserved and then those that become regular memory are
marked as unreserved. Hope Andrea will chime in here :)

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-24 Thread Scott Wood

On 07/24/2013 04:39:59 AM, Alexander Graf wrote:


On 24.07.2013, at 11:35, Gleb Natapov wrote:

 On Wed, Jul 24, 2013 at 11:21:11AM +0200, Alexander Graf wrote:
 Are not we going to use page_is_ram() from   
e500_shadow_mas2_attrib() as Scott commented?


 rWhy aren't we using page_is_ram() in kvm_is_mmio_pfn()?


 Because it is much slower and, IIRC, actually used to build pfn map  
that allow

 us to check quickly for valid pfn.

Then why should we use page_is_ram()? :)

I really don't want the e500 code to diverge too much from what the  
rest of the kvm code is doing.


I don't understand actually used to build pfn map  What code is  
this?  I don't see any calls to page_is_ram() in the KVM code, or in  
generic mm code.  Is this a statement about what x86 does?


On PPC page_is_ram() is only called (AFAICT) for determining what  
attributes to set on mmaps.  We want to be sure that KVM always makes  
the same decision.  While pfn_valid() seems like it should be  
equivalent, it's not obvious from the PPC code that it is.


If pfn_valid() is better, why is that not used for mmap?  Why are there  
two different names for the same thing?


-Scott
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/10] powerpc: Prepare to support kernel handling of IOMMU map/unmap

2013-07-24 Thread Andrew Morton
On Tue, 23 Jul 2013 12:22:59 +1000 Alexey Kardashevskiy a...@ozlabs.ru wrote:

 Ping, anyone, please?

ew, you top-posted.

 Ben needs ack from any of MM people before proceeding with this patch. Thanks!

For what?  The three lines of comment in page-flags.h?   ack :)

Manipulating page-_count directly is considered poor form.  Don't
blame us if we break your code ;)

Actually, the manipulation in realmode_get_page() duplicates the
existing get_page_unless_zero() and the one in realmode_put_page()
could perhaps be placed in mm.h with a suitable name and some
documentation.  That would improve your form and might protect the code
from getting broken later on.

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/10] powerpc: Prepare to support kernel handling of IOMMU map/unmap

2013-07-24 Thread Benjamin Herrenschmidt
On Wed, 2013-07-24 at 15:43 -0700, Andrew Morton wrote:
 For what?  The three lines of comment in page-flags.h?   ack :)
 
 Manipulating page-_count directly is considered poor form.  Don't
 blame us if we break your code ;)
 
 Actually, the manipulation in realmode_get_page() duplicates the
 existing get_page_unless_zero() and the one in realmode_put_page()
 could perhaps be placed in mm.h with a suitable name and some
 documentation.  That would improve your form and might protect the code
 from getting broken later on.

Yes, this stuff makes me really nervous :-) If it didn't provide an order
of magnitude performance improvement in KVM I would avoid it but heh...

Alexey, I like having that stuff in generic code.

However the meaning of the words real mode can be ambiguous accross
architectures, it might be best to then name it mmu_off_put_page to
make things a bit clearer, along with a comment explaining that this is
called in a context where none of the virtual mappings are accessible
(vmalloc, vmemmap, IOs, ...), and that in the case of sparsemem vmemmap
the caller must have taken care of getting the physical address of the
struct page and of ensuring it isn't split accross two vmemmap blocks.

Cheers,
Ben.


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html