Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-30 Thread Marc Zyngier
On Fri, Jul 25 2014 at  4:29:12 pm BST, Will Deacon <will.dea...@arm.com> wrote:
> If the physical address of GICV isn't page-aligned, then we end up
> creating a stage-2 mapping of the page containing it, which causes us to
> map neighbouring memory locations directly into the guest.
>
> As an example, consider a platform with GICV at physical 0x2c02f000
> running a 64k-page host kernel. If qemu maps this into the guest at
> 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> physical regions may cause UNPREDICTABLE behaviour, for example, on the
> Juno platform this will cause an SError exception to EL3, which brings
> down the entire physical CPU resulting in RCU stalls / HYP panics / host
> crashing / wasted weeks of debugging.
>
> SBSA recommends that systems alias the 4k GICV across the bounding 64k
> region, in which case GICV physical could be described as 0x2c020000 in
> the above scenario.
>
> This patch fixes the problem by failing the vgic probe if the physical
> base address or the size of GICV aren't page-aligned. Note that this
> generated a warning in dmesg about freeing enabled IRQs, so I had to
> move the IRQ enabling later in the probe.
>
> Cc: Christoffer Dall <christoffer.d...@linaro.org>
> Cc: Marc Zyngier <marc.zyng...@arm.com>
> Cc: Gleb Natapov <g...@kernel.org>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Joel Schopp <joel.sch...@amd.com>
> Cc: Don Dutile <ddut...@redhat.com>
> Acked-by: Peter Maydell <peter.mayd...@linaro.org>
> Signed-off-by: Will Deacon <will.dea...@arm.com>

Looks good to me:

Acked-by: Marc Zyngier marc.zyng...@arm.com

Christoffer, can you please take this as an urgent fix?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-30 Thread Christoffer Dall
On Wed, Jul 30, 2014 at 11:47:40AM +0100, Marc Zyngier wrote:
> On Fri, Jul 25 2014 at  4:29:12 pm BST, Will Deacon <will.dea...@arm.com> wrote:
> > If the physical address of GICV isn't page-aligned, then we end up
> > creating a stage-2 mapping of the page containing it, which causes us to
> > map neighbouring memory locations directly into the guest.
> >
> > As an example, consider a platform with GICV at physical 0x2c02f000
> > running a 64k-page host kernel. If qemu maps this into the guest at
> > 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> > map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> > physical regions may cause UNPREDICTABLE behaviour, for example, on the
> > Juno platform this will cause an SError exception to EL3, which brings
> > down the entire physical CPU resulting in RCU stalls / HYP panics / host
> > crashing / wasted weeks of debugging.
> >
> > SBSA recommends that systems alias the 4k GICV across the bounding 64k
> > region, in which case GICV physical could be described as 0x2c020000 in
> > the above scenario.
> >
> > This patch fixes the problem by failing the vgic probe if the physical
> > base address or the size of GICV aren't page-aligned. Note that this
> > generated a warning in dmesg about freeing enabled IRQs, so I had to
> > move the IRQ enabling later in the probe.
> >
> > Cc: Christoffer Dall <christoffer.d...@linaro.org>
> > Cc: Marc Zyngier <marc.zyng...@arm.com>
> > Cc: Gleb Natapov <g...@kernel.org>
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: Joel Schopp <joel.sch...@amd.com>
> > Cc: Don Dutile <ddut...@redhat.com>
> > Acked-by: Peter Maydell <peter.mayd...@linaro.org>
> > Signed-off-by: Will Deacon <will.dea...@arm.com>
>
> Looks good to me:
>
> Acked-by: Marc Zyngier <marc.zyng...@arm.com>
>
> Christoffer, can you please take this as an urgent fix?
>
Yes, sorry for the delay,

Applied to master and notified the KVM guys to try and get it into 3.16.

Thanks,
-Christoffer


[PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-25 Thread Will Deacon
If the physical address of GICV isn't page-aligned, then we end up
creating a stage-2 mapping of the page containing it, which causes us to
map neighbouring memory locations directly into the guest.

As an example, consider a platform with GICV at physical 0x2c02f000
running a 64k-page host kernel. If qemu maps this into the guest at
0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
physical regions may cause UNPREDICTABLE behaviour, for example, on the
Juno platform this will cause an SError exception to EL3, which brings
down the entire physical CPU resulting in RCU stalls / HYP panics / host
crashing / wasted weeks of debugging.

SBSA recommends that systems alias the 4k GICV across the bounding 64k
region, in which case GICV physical could be described as 0x2c020000 in
the above scenario.

This patch fixes the problem by failing the vgic probe if the physical
base address or the size of GICV aren't page-aligned. Note that this
generated a warning in dmesg about freeing enabled IRQs, so I had to
move the IRQ enabling later in the probe.

Cc: Christoffer Dall <christoffer.d...@linaro.org>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: Gleb Natapov <g...@kernel.org>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Joel Schopp <joel.sch...@amd.com>
Cc: Don Dutile <ddut...@redhat.com>
Acked-by: Peter Maydell <peter.mayd...@linaro.org>
Signed-off-by: Will Deacon <will.dea...@arm.com>
---

v1 -> v2: Added size alignment check and Peter's ack. Could this go in
  for 3.16 please?
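
The mapping arithmetic in the example above can be checked with a small
user-space sketch. It is only an illustration, not kernel code: page_base()
and exposed_below() are hypothetical helper names, and the page size is
passed in explicitly rather than taken from the kernel's PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: base address of the page that contains addr.
 * page_size must be a non-zero power of two.
 */
static uint64_t page_base(uint64_t addr, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/*
 * Hypothetical helper: number of bytes below addr that get mapped as well
 * when the whole page containing addr is mapped into the guest.
 */
static uint64_t exposed_below(uint64_t addr, uint64_t page_size)
{
	return addr - page_base(addr, page_size);
}
```

With a 64k page size, page_base(0x2c02f000, 0x10000) gives 0x2c020000 and
exposed_below(0x2c02f000, 0x10000) gives 0xf000, i.e. the 60k of neighbouring
memory that ends up mapped into the guest. With 4k pages nothing below GICV
is exposed.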

 virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 56ff9bebb577..476d3bf540a8 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1526,17 +1526,33 @@ int kvm_vgic_hyp_init(void)
 		goto out_unmap;
 	}
 
-	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
-		 vctrl_res.start, vgic_maint_irq);
-	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
-
 	if (of_address_to_resource(vgic_node, 3, &vcpu_res)) {
 		kvm_err("Cannot obtain VCPU resource\n");
 		ret = -ENXIO;
 		goto out_unmap;
 	}
+
+	if (!PAGE_ALIGNED(vcpu_res.start)) {
+		kvm_err("GICV physical address 0x%llx not page aligned\n",
+			(unsigned long long)vcpu_res.start);
+		ret = -ENXIO;
+		goto out_unmap;
+	}
+
+	if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
+		kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
+			(unsigned long long)resource_size(&vcpu_res),
+			PAGE_SIZE);
+		ret = -ENXIO;
+		goto out_unmap;
+	}
+
 	vgic_vcpu_base = vcpu_res.start;
 
+	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
+		 vctrl_res.start, vgic_maint_irq);
+	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
+
 	goto out;
 
 out_unmap:
-- 
2.0.1



Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-25 Thread Joel Schopp

On 07/25/2014 10:29 AM, Will Deacon wrote:
> If the physical address of GICV isn't page-aligned, then we end up
> creating a stage-2 mapping of the page containing it, which causes us to
> map neighbouring memory locations directly into the guest.
>
> As an example, consider a platform with GICV at physical 0x2c02f000
> running a 64k-page host kernel. If qemu maps this into the guest at
> 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> physical regions may cause UNPREDICTABLE behaviour, for example, on the
> Juno platform this will cause an SError exception to EL3, which brings
> down the entire physical CPU resulting in RCU stalls / HYP panics / host
> crashing / wasted weeks of debugging.
No denying this is a problem.

> SBSA recommends that systems alias the 4k GICV across the bounding 64k
> region, in which case GICV physical could be described as 0x2c020000 in
> the above scenario.
The problem with this patch is the gicv is really 8K.  The reason you
would map at a 60K offset (0xf000), and why we do on our SOC, is so that
the 8K gicv would pick up the last 4K from the first page and the first
4K from the next page.  With your patch it is impossible to map all 8K
of the gicv with 64K pages. 

My SOC which works fine with kvm now will go to not working with kvm
after this patch. 
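
The straddling layout described above can be illustrated with a short
sketch. It assumes only what the message states (an 8K GICV starting at a
60K offset, 64K pages); the enclosing 64K frame is taken to start at
offset 0 for simplicity, and pages_spanned() is a hypothetical helper, not
a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages of size page_size touched by the
 * byte range [start, start + size). size must be non-zero and page_size
 * must be a non-zero power of two.
 */
static uint64_t pages_spanned(uint64_t start, uint64_t size, uint64_t page_size)
{
	assert(size != 0);
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	uint64_t first = start / page_size;              /* page holding the first byte */
	uint64_t last = (start + size - 1) / page_size;  /* page holding the last byte */

	return last - first + 1;
}
```

pages_spanned(0xf000, 0x2000, 0x10000) returns 2: the region covers the
last 4K of one 64K page and the first 4K of the next. Neither its base nor
its size is 64K-aligned, so the probe check in the patch rejects it.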


> This patch fixes the problem by failing the vgic probe if the physical
> base address or the size of GICV aren't page-aligned. Note that this
> generated a warning in dmesg about freeing enabled IRQs, so I had to
> move the IRQ enabling later in the probe.
>
> Cc: Christoffer Dall <christoffer.d...@linaro.org>
> Cc: Marc Zyngier <marc.zyng...@arm.com>
> Cc: Gleb Natapov <g...@kernel.org>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Joel Schopp <joel.sch...@amd.com>
> Cc: Don Dutile <ddut...@redhat.com>
> Acked-by: Peter Maydell <peter.mayd...@linaro.org>
> Signed-off-by: Will Deacon <will.dea...@arm.com>
> ---
>
> v1 -> v2: Added size alignment check and Peter's ack. Could this go in
>   for 3.16 please?

>  virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 56ff9bebb577..476d3bf540a8 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1526,17 +1526,33 @@ int kvm_vgic_hyp_init(void)
>  		goto out_unmap;
>  	}
>  
> -	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
> -		 vctrl_res.start, vgic_maint_irq);
> -	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
> -
>  	if (of_address_to_resource(vgic_node, 3, &vcpu_res)) {
>  		kvm_err("Cannot obtain VCPU resource\n");
>  		ret = -ENXIO;
>  		goto out_unmap;
>  	}
> +
> +	if (!PAGE_ALIGNED(vcpu_res.start)) {
> +		kvm_err("GICV physical address 0x%llx not page aligned\n",
> +			(unsigned long long)vcpu_res.start);
> +		ret = -ENXIO;
> +		goto out_unmap;
> +	}
> +
> +	if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
> +		kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
> +			(unsigned long long)resource_size(&vcpu_res),
> +			PAGE_SIZE);
> +		ret = -ENXIO;
> +		goto out_unmap;
> +	}
> +
>  	vgic_vcpu_base = vcpu_res.start;
>  
> +	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
> +		 vctrl_res.start, vgic_maint_irq);
> +	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
> +
>  	goto out;
>  
>  out_unmap:



Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-25 Thread Peter Maydell
On 25 July 2014 16:56, Joel Schopp <joel.sch...@amd.com> wrote:
> The problem with this patch is the gicv is really 8K.  The reason you
> would map at a 60K offset (0xf000), and why we do on our SOC, is so that
> the 8K gicv would pick up the last 4K from the first page and the first
> 4K from the next page.  With your patch it is impossible to map all 8K
> of the gicv with 64K pages.
>
> My SOC which works fine with kvm now will go to not working with kvm
> after this patch.

Your SOC currently works by fluke because the guest doesn't
look at the last 4K of the GICC. If you're happy with it continuing
to work by fluke you could make your device tree say it had a
64K GICV region with a 64K-aligned base.

To make it work not by fluke but actually correctly requires
Marc's patchset, at a minimum.
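
The two device tree descriptions being discussed can be compared with a
user-space stand-in for the checks the patch adds. This is a sketch under
assumptions: gicv_region_ok() is a hypothetical name, the page size is a
parameter instead of the kernel's PAGE_SIZE, and the addresses used below
are illustrative, not taken from any real device tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two probe checks: both the start address
 * and the size of the GICV region must be multiples of the page size.
 * page_size must be a non-zero power of two.
 */
static bool gicv_region_ok(uint64_t start, uint64_t size, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	uint64_t mask = page_size - 1;

	return (start & mask) == 0 && (size & mask) == 0;
}
```

With 64K pages, gicv_region_ok(0x1000f000, 0x2000, 0x10000) is false (an 8K
region at a 60K offset), whereas gicv_region_ok(0x10000000, 0x10000, 0x10000)
is true (a 64K region with a 64K-aligned base). The same 8K region passes
when the page size is 4K.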

thanks
-- PMM


Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-25 Thread Will Deacon
On Fri, Jul 25, 2014 at 04:56:18PM +0100, Joel Schopp wrote:
 
> On 07/25/2014 10:29 AM, Will Deacon wrote:
> > If the physical address of GICV isn't page-aligned, then we end up
> > creating a stage-2 mapping of the page containing it, which causes us to
> > map neighbouring memory locations directly into the guest.
> >
> > As an example, consider a platform with GICV at physical 0x2c02f000
> > running a 64k-page host kernel. If qemu maps this into the guest at
> > 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> > map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> > physical regions may cause UNPREDICTABLE behaviour, for example, on the
> > Juno platform this will cause an SError exception to EL3, which brings
> > down the entire physical CPU resulting in RCU stalls / HYP panics / host
> > crashing / wasted weeks of debugging.
> No denying this is a problem.
> > SBSA recommends that systems alias the 4k GICV across the bounding 64k
> > region, in which case GICV physical could be described as 0x2c020000 in
> > the above scenario.
> The problem with this patch is the gicv is really 8K.  The reason you
> would map at a 60K offset (0xf000), and why we do on our SOC, is so that
> the 8K gicv would pick up the last 4K from the first page and the first
> 4K from the next page.  With your patch it is impossible to map all 8K
> of the gicv with 64K pages.

Please, help me with an alternative. If we drop the size alignment check,
then we can miss some dangerous cases such as the one highlighted previously
by Peter.

> My SOC which works fine with kvm now will go to not working with kvm
> after this patch.

Right, but my only alternative is to have CONFIG_KVM depend on !64K_PAGES,
which sucks for everybody. Your device-tree entry has to change *anyway*,
because as it stands we're mapping 60k of unknown stuff into the guest,
which the kernel needs to know is safe.

Will


Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-25 Thread Joel Schopp

On 07/25/2014 11:02 AM, Peter Maydell wrote:
> On 25 July 2014 16:56, Joel Schopp <joel.sch...@amd.com> wrote:
>> The problem with this patch is the gicv is really 8K.  The reason you
>> would map at a 60K offset (0xf000), and why we do on our SOC, is so that
>> the 8K gicv would pick up the last 4K from the first page and the first
>> 4K from the next page.  With your patch it is impossible to map all 8K
>> of the gicv with 64K pages.
>>
>> My SOC which works fine with kvm now will go to not working with kvm
>> after this patch.
> Your SOC currently works by fluke because the guest doesn't
> look at the last 4K of the GICC. If you're happy with it continuing
> to work by fluke you could make your device tree say it had a
> 64K GICV region with a 64K-aligned base.
>
> To make it work not by fluke but actually correctly requires
> Marc's patchset, at a minimum.

Since we aren't actually using the last 4K of the gicv at the moment I suppose
I could drop my objections to this patch and change my device tree until Marc's
patchset provides a proper solution for the gicv's second 4K that works for
everybody.

Acked-by: Joel Schopp joel.sch...@amd.com



Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-25 Thread Will Deacon
On Fri, Jul 25, 2014 at 05:24:18PM +0100, Joel Schopp wrote:
 
> On 07/25/2014 11:02 AM, Peter Maydell wrote:
> > On 25 July 2014 16:56, Joel Schopp <joel.sch...@amd.com> wrote:
> >> The problem with this patch is the gicv is really 8K.  The reason you
> >> would map at a 60K offset (0xf000), and why we do on our SOC, is so that
> >> the 8K gicv would pick up the last 4K from the first page and the first
> >> 4K from the next page.  With your patch it is impossible to map all 8K
> >> of the gicv with 64K pages.
> >>
> >> My SOC which works fine with kvm now will go to not working with kvm
> >> after this patch.
> > Your SOC currently works by fluke because the guest doesn't
> > look at the last 4K of the GICC. If you're happy with it continuing
> > to work by fluke you could make your device tree say it had a
> > 64K GICV region with a 64K-aligned base.
> >
> > To make it work not by fluke but actually correctly requires
> > Marc's patchset, at a minimum.
>
> Since we aren't actually using the last 4K of the gicv at the moment I
> suppose I could drop my objections to this patch and change my device
> tree until Marc's patchset provides a proper solution for the gicv's
> second 4K that works for everybody.
>
> Acked-by: Joel Schopp <joel.sch...@amd.com>

Thanks, Joel.

Will