Re: [PATCH v2] Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

2014-07-10 Thread Alexander Graf


On 09.07.14 00:59, Stewart Smith wrote:

> Hi!
>
> Thanks for review, much appreciated!
>
> Alexander Graf ag...@suse.de writes:
>> On 08.07.14 07:06, Stewart Smith wrote:
>>> @@ -1528,6 +1535,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
>>> 	int i, need_vpa_update;
>>> 	int srcu_idx;
>>> 	struct kvm_vcpu *vcpus_to_update[threads_per_core];
>>> +	phys_addr_t phy_addr, tmp;
>> Please put the variable declarations into the if () branch so that the
>> compiler can catch potential leaks :)
> ack. will fix.
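
For concreteness, the scoping change being requested looks something like
this (a minimal sketch, not the actual v3 hunk):

    if (cpu_has_feature(CPU_FTR_ARCH_207S) && vc->mpp_buffer) {
        /* Declared inside the branch, the temporaries cannot leak
         * into unrelated code in kvmppc_run_core(). */
        phys_addr_t phy_addr, tmp;

        phy_addr = virt_to_phys((void *)vc->mpp_buffer);
        /* ... mask the address and kick off the MPP load, as in
         * the hunk quoted below ... */
    }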


>>> @@ -1590,9 +1598,48 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
>>>
>>> 	srcu_idx = srcu_read_lock(&vc->kvm->srcu);
>>>
>>> +	/* If we have a saved list of L2/L3, restore it */
>>> +	if (cpu_has_feature(CPU_FTR_ARCH_207S) && vc->mpp_buffer) {
>>> +		phy_addr = virt_to_phys((void *)vc->mpp_buffer);
>>> +#if defined(CONFIG_PPC_4K_PAGES)
>>> +		phy_addr = (phy_addr + 8*4096) & ~(8*4096);
>> get_free_pages() is automatically aligned to the order, no?
> That's what Paul reckoned too, and then we've attempted to find anywhere
> that documents that behaviour. Happen to be able to point to docs/source
> that say this is part of the API?


Phew - it's probably buried somewhere. I could only find this document 
saying that we always get order-aligned allocations:


http://www.thehackademy.net/madchat/ebooks/Mem_virtuelle/linux-mm/zonealloc.html

Mel, do you happen to have any pointer to something that explicitly (or 
even properly implicitly) says that get_free_pages() returns 
order-aligned memory?

>>> +#endif
>>> +		tmp = phy_addr & PPC_MPPE_ADDRESS_MASK;
>>> +		tmp = tmp | PPC_MPPE_WHOLE_TABLE;
>>> +
>>> +		/* For sanity, abort any 'save' requests in progress */
>>> +		asm volatile(PPC_LOGMPP(R1) : : "r" (tmp));
>>> +
>>> +		/* Initiate a cache-load request */
>>> +		mtspr(SPRN_MPPR, tmp);
>>> +	}
>> In fact, this whole block up here could be a function, no?
> It could, perfectly happy for it to be one. Will fix.
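
As a sketch of that suggestion (the helper name is hypothetical; the body
just hoists the quoted hunk verbatim):

    static void kvmppc_start_restoring_l2_cache(struct kvmppc_vcore *vc)
    {
        phys_addr_t phy_addr, tmp;

        phy_addr = virt_to_phys((void *)vc->mpp_buffer);
    #if defined(CONFIG_PPC_4K_PAGES)
        phy_addr = (phy_addr + 8*4096) & ~(8*4096);
    #endif
        tmp = phy_addr & PPC_MPPE_ADDRESS_MASK;
        tmp |= PPC_MPPE_WHOLE_TABLE;

        /* For sanity, abort any 'save' requests in progress */
        asm volatile(PPC_LOGMPP(R1) : : "r" (tmp));

        /* Initiate a cache-load request */
        mtspr(SPRN_MPPR, tmp);
    }

so that the call site in kvmppc_run_core() reduces to:

    if (cpu_has_feature(CPU_FTR_ARCH_207S) && vc->mpp_buffer)
        kvmppc_start_restoring_l2_cache(vc);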


>>> +
>>> +	/* Allocate memory before switching out of guest so we don't
>>> +	   trash L2/L3 with memory allocation stuff */
>>> +	if (cpu_has_feature(CPU_FTR_ARCH_207S) && !vc->mpp_buffer) {
>>> +		vc->mpp_buffer = __get_free_pages(GFP_KERNEL|__GFP_ZERO,
>>> +						  MPP_BUFFER_ORDER);
>> get_order(64 * 1024)?
>>
>> Also, why allocate it here and not on vcore creation?
> There's also the possibility of saving/restoring part of the L3 cache
> as well, and I was envisioning a future patch which checks a flag in
> the vcore (maybe exposed via sysfs or whatever mechanism is applicable)
> to decide whether to save/restore L2 only or L2/L3, so it makes a bit
> more sense to allocate the buffer there rather than elsewhere.
>
> There's also no real reason to fail to create a vcore if we can't
> allocate a buffer for the L2/L3 cache contents - retrying later is
> perfectly harmless.


If we fail during core creation, just don't save/restore L2 cache
contents at all. I really prefer to have allocation and deallocation all
at init time - and such low-order allocations will most likely succeed.


Let's leave the L3 cache bits for later when we know whether it actually 
has an impact. I personally doubt it :).
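
For illustration, "allocate at init, tolerate failure" could look
something like this (sketch only; the helper name and MPP_BUFFER_SIZE are
assumptions, not the merged code):

    #define MPP_BUFFER_SIZE (64 * 1024)

    /* Called once from vcore creation. If the allocation fails,
     * vc->mpp_buffer stays 0 and the run loop simply skips the
     * logmpp/mtspr save/restore path - no hard failure needed. */
    static void kvmppc_alloc_mpp_buffer(struct kvmppc_vcore *vc)
    {
        if (!cpu_has_feature(CPU_FTR_ARCH_207S))
            return;

        vc->mpp_buffer = __get_free_pages(GFP_KERNEL | __GFP_ZERO,
                                          get_order(MPP_BUFFER_SIZE));
    }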



Alex



Re: [PATCH v2] Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

2014-07-10 Thread Mel Gorman
On Thu, Jul 10, 2014 at 01:05:47PM +0200, Alexander Graf wrote:
> On 09.07.14 00:59, Stewart Smith wrote:
>> Alexander Graf ag...@suse.de writes:
>>> On 08.07.14 07:06, Stewart Smith wrote:
>>>> [patch hunks snipped]
>>> get_free_pages() is automatically aligned to the order, no?
>> That's what Paul reckoned too, and then we've attempted to find anywhere
>> that documents that behaviour. Happen to be able to point to docs/source
>> that say this is part of the API?
>
> Phew - it's probably buried somewhere. I could only find this
> document saying that we always get order-aligned allocations:
>
> http://www.thehackademy.net/madchat/ebooks/Mem_virtuelle/linux-mm/zonealloc.html
>
> Mel, do you happen to have any pointer to something that explicitly
> (or even properly implicitly) says that get_free_pages() returns
> order-aligned memory?
>

I did not read the whole thread so I lack context and will just answer
this part.

There is no guarantee that pages are returned in PFN order for multiple
requests to the page allocator. This is the relevant comment in
rmqueue_bulk():

/*
 * Split buddy pages returned by expand() are received here
 * in physical page order. The page is added to the callers and
 * list and the list head then moves forward. From the callers
 * perspective, the linked list is ordered by page number in
 * some conditions. This is useful for IO devices that can
 * merge IO requests if the physical pages are ordered
 * properly.
 */

It will probably be true early in the lifetime of the system but the mileage
will vary on systems with a lot of uptime. If you depend on this behaviour
for correctness then you will have a bad day.

High-order page requests to the page allocator are guaranteed to be in physical
order. However, this does not apply to vmalloc() where allocations are
only guaranteed to be virtually contiguous.
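
Concretely (illustrative only):

    /* Two separate order-0 requests: nothing relates their physical
     * addresses; page_b may well have a lower PFN than page_a. */
    struct page *page_a = alloc_page(GFP_KERNEL);
    struct page *page_b = alloc_page(GFP_KERNEL);

    /* One order-2 request: four pages that are physically contiguous,
     * in ascending PFN order, aligned to a four-page boundary. */
    struct page *chunk = alloc_pages(GFP_KERNEL, 2);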

-- 
Mel Gorman
SUSE Labs


Re: [PATCH v2] Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

2014-07-10 Thread Alexander Graf


On 10.07.14 15:07, Mel Gorman wrote:

> On Thu, Jul 10, 2014 at 01:05:47PM +0200, Alexander Graf wrote:
>> Mel, do you happen to have any pointer to something that explicitly
>> (or even properly implicitly) says that get_free_pages() returns
>> order-aligned memory?
>
> There is no guarantee that pages are returned in PFN order for multiple
> requests to the page allocator. [rmqueue_bulk() comment snipped]
>
> It will probably be true early in the lifetime of the system but the mileage
> will vary on systems with a lot of uptime. If you depend on this behaviour
> for correctness then you will have a bad day.
>
> High-order page requests to the page allocator are guaranteed to be in
> physical order. However, this does not apply to vmalloc() where allocations
> are only guaranteed to be virtually contiguous.


Hrm, ok to be very concrete:

  Does __get_free_pages(..., 4); on a 4k page size system give me a 64k 
aligned pointer? :)



Alex



Re: [PATCH v2] Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

2014-07-10 Thread Mel Gorman
On Thu, Jul 10, 2014 at 03:17:16PM +0200, Alexander Graf wrote:
> On 10.07.14 15:07, Mel Gorman wrote:
> [earlier context snipped]
>> High-order page requests to the page allocator are guaranteed to be in
>> physical order. However, this does not apply to vmalloc() where
>> allocations are only guaranteed to be virtually contiguous.
>
> Hrm, ok, to be very concrete:
>
>   Does __get_free_pages(..., 4); on a 4k page size system give me a
> 64k aligned pointer? :)
>

Yes.
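
That is, on a 4k-page system an order-4 request returns 16 contiguous
pages (64k) starting on a 64k boundary, because the buddy allocator only
splits blocks at order-aligned boundaries. An illustrative check:

    unsigned long addr = __get_free_pages(GFP_KERNEL, 4); /* 16 x 4k */

    /* Holds for any successful order-4 allocation. */
    BUG_ON(addr & ((PAGE_SIZE << 4) - 1));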

-- 
Mel Gorman
SUSE Labs