Re: [RFC v2 06/10] KVM: arm/arm64: Update the physical timer interrupt level

2017-02-01 Thread Christoffer Dall
On Tue, Jan 31, 2017 at 05:00:03PM +, Marc Zyngier wrote:
> On 30/01/17 19:06, Christoffer Dall wrote:
> > On Mon, Jan 30, 2017 at 06:48:02PM +, Marc Zyngier wrote:
> >> On 30/01/17 18:41, Christoffer Dall wrote:
> >>> On Mon, Jan 30, 2017 at 05:50:03PM +, Marc Zyngier wrote:
>  On 30/01/17 15:02, Christoffer Dall wrote:
> > On Sun, Jan 29, 2017 at 03:21:06PM +, Marc Zyngier wrote:
> >> On Fri, Jan 27 2017 at 01:04:56 AM, Jintack Lim 
> >>  wrote:
> >>> Now that we maintain the EL1 physical timer register states of VMs,
> >>> update the physical timer interrupt level along with the virtual one.
> >>>
> >>> Note that the emulated EL1 physical timer is not mapped to any 
> >>> hardware
> >>> timer, so we call a proper vgic function.
> >>>
> >>> Signed-off-by: Jintack Lim 
> >>> ---
> >>>  virt/kvm/arm/arch_timer.c | 20 
> >>>  1 file changed, 20 insertions(+)
> >>>
> >>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >>> index 0f6e935..3b6bd50 100644
> >>> --- a/virt/kvm/arm/arch_timer.c
> >>> +++ b/virt/kvm/arm/arch_timer.c
> >>> @@ -180,6 +180,21 @@ static void kvm_timer_update_mapped_irq(struct 
> >>> kvm_vcpu *vcpu, bool new_level,
> >>>   WARN_ON(ret);
> >>>  }
> >>>  
> >>> +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool 
> >>> new_level,
> >>> +  struct arch_timer_context *timer)
> >>> +{
> >>> + int ret;
> >>> +
> >>> + BUG_ON(!vgic_initialized(vcpu->kvm));
> >>
> >> Although I've added my fair share of BUG_ON() in the code base, I've
> >> since reconsidered my position. If we get in a situation where the vgic
> >> is not initialized, maybe it would be better to just WARN_ON and return
> >> early rather than killing the whole box. Thoughts?
> >>
> >
> > The distinction to me is whether this will cause fatal crashes or
> > exploits down the road if we're working on uninitialized data.  If all
> > that can happen if the vgic is not initialized, is that the guest
> > doesn't see interrupts, for example, then a WARN_ON is appropriate.
> >
> > Which is the case here?
> >
> > That being said, do we need this at all?  This is in the critical path
> > and is actually measurable (I know this from my work on the other timer
> > series), so it's better to get rid of it if we can.  Can we simply
> > convince ourselves this will never happen, and is the code ever likely
> > to change so that it gets called with the vgic disabled later?
> 
>  That'd be the best course of action. I remember us reworking some of
>  that in the now defunct vgic-less series. Maybe we could salvage that
>  code, if only for the time we spent on it...
> 
> >>> Ah, we never merged it?  Were we waiting on a userspace implementation
> >>> or agreement on the ABI?
> >>
> >> We were waiting on the userspace side to be respun against the latest
> >> API, and there were some comments from Peter (IIRC) about supporting
> >> PPIs in general (the other timers and the PMU, for example).
> >>
> >> None of that happened, as the most vocal proponent of the series
> >> apparently lost interest.
> >>
> >>> There was definitely a useful cleanup with the whole enabled flag thing
> >>> on the timer I remember.
> >>
> >> Indeed. We should at least try to resurrect that bit.
> >>
> > 
> > It's probably worth it trying to resurrect the whole thing I think,
> > especially since I think the implementation ended up looking quite nice.
> 
> Indeed. My only concern is about the userspace counterpart, which didn't
> materialize when expected. Hopefully it will this time around!
> 
> > I can add a rebase of that to my list of never-ending timer rework.
> 
> We all know that you can do that while sleeping! ;-)
> 

Haha, maybe that will finally make the code right.

-Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC v2 06/10] KVM: arm/arm64: Update the physical timer interrupt level

2017-02-01 Thread Christoffer Dall
On Sun, Jan 29, 2017 at 03:21:06PM +, Marc Zyngier wrote:
> On Fri, Jan 27 2017 at 01:04:56 AM, Jintack Lim  
> wrote:
> > Now that we maintain the EL1 physical timer register states of VMs,
> > update the physical timer interrupt level along with the virtual one.
> >
> > Note that the emulated EL1 physical timer is not mapped to any hardware
> > timer, so we call a proper vgic function.
> >
> > Signed-off-by: Jintack Lim 
> > ---
> >  virt/kvm/arm/arch_timer.c | 20 
> >  1 file changed, 20 insertions(+)
> >
> > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> > index 0f6e935..3b6bd50 100644
> > --- a/virt/kvm/arm/arch_timer.c
> > +++ b/virt/kvm/arm/arch_timer.c
> > @@ -180,6 +180,21 @@ static void kvm_timer_update_mapped_irq(struct 
> > kvm_vcpu *vcpu, bool new_level,
> > WARN_ON(ret);
> >  }
> >  
> > +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
> > +struct arch_timer_context *timer)
> > +{
> > +   int ret;
> > +
> > +   BUG_ON(!vgic_initialized(vcpu->kvm));
> 
> Although I've added my fair share of BUG_ON() in the code base, I've
> since reconsidered my position. If we get in a situation where the vgic
> is not initialized, maybe it would be better to just WARN_ON and return
> early rather than killing the whole box. Thoughts?
> 

Could we help this series along by saying that since this BUG_ON already
exists in the kvm_timer_update_mapped_irq function, then it just
preserves functionality and it's up to someone else (me) to remove the
BUG_ON from both functions later in life?

Thanks,
-Christoffer


Re: [RFC v2 06/10] KVM: arm/arm64: Update the physical timer interrupt level

2017-02-01 Thread Jintack Lim
On Wed, Feb 1, 2017 at 3:04 AM, Christoffer Dall
 wrote:
> On Sun, Jan 29, 2017 at 03:21:06PM +, Marc Zyngier wrote:
>> On Fri, Jan 27 2017 at 01:04:56 AM, Jintack Lim  
>> wrote:
>> > Now that we maintain the EL1 physical timer register states of VMs,
>> > update the physical timer interrupt level along with the virtual one.
>> >
>> > Note that the emulated EL1 physical timer is not mapped to any hardware
>> > timer, so we call a proper vgic function.
>> >
>> > Signed-off-by: Jintack Lim 
>> > ---
>> >  virt/kvm/arm/arch_timer.c | 20 
>> >  1 file changed, 20 insertions(+)
>> >
>> > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> > index 0f6e935..3b6bd50 100644
>> > --- a/virt/kvm/arm/arch_timer.c
>> > +++ b/virt/kvm/arm/arch_timer.c
>> > @@ -180,6 +180,21 @@ static void kvm_timer_update_mapped_irq(struct 
>> > kvm_vcpu *vcpu, bool new_level,
>> > WARN_ON(ret);
>> >  }
>> >
>> > +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
>> > +struct arch_timer_context *timer)
>> > +{
>> > +   int ret;
>> > +
>> > +   BUG_ON(!vgic_initialized(vcpu->kvm));
>>
>> Although I've added my fair share of BUG_ON() in the code base, I've
>> since reconsidered my position. If we get in a situation where the vgic
>> is not initialized, maybe it would be better to just WARN_ON and return
>> early rather than killing the whole box. Thoughts?
>>
>
> Could we help this series along by saying that since this BUG_ON already
> exists in the kvm_timer_update_mapped_irq function, then it just
> preserves functionality and it's up to someone else (me) to remove the
> BUG_ON from both functions later in life?
>

Sounds good to me :) Thanks!

> Thanks,
> -Christoffer
>



Re: [RFC v2 06/10] KVM: arm/arm64: Update the physical timer interrupt level

2017-02-01 Thread Marc Zyngier
On 01/02/17 08:04, Christoffer Dall wrote:
> On Sun, Jan 29, 2017 at 03:21:06PM +, Marc Zyngier wrote:
>> On Fri, Jan 27 2017 at 01:04:56 AM, Jintack Lim  
>> wrote:
>>> Now that we maintain the EL1 physical timer register states of VMs,
>>> update the physical timer interrupt level along with the virtual one.
>>>
>>> Note that the emulated EL1 physical timer is not mapped to any hardware
>>> timer, so we call a proper vgic function.
>>>
>>> Signed-off-by: Jintack Lim 
>>> ---
>>>  virt/kvm/arm/arch_timer.c | 20 
>>>  1 file changed, 20 insertions(+)
>>>
>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>> index 0f6e935..3b6bd50 100644
>>> --- a/virt/kvm/arm/arch_timer.c
>>> +++ b/virt/kvm/arm/arch_timer.c
>>> @@ -180,6 +180,21 @@ static void kvm_timer_update_mapped_irq(struct 
>>> kvm_vcpu *vcpu, bool new_level,
>>> WARN_ON(ret);
>>>  }
>>>  
>>> +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
>>> +struct arch_timer_context *timer)
>>> +{
>>> +   int ret;
>>> +
>>> +   BUG_ON(!vgic_initialized(vcpu->kvm));
>>
>> Although I've added my fair share of BUG_ON() in the code base, I've
>> since reconsidered my position. If we get in a situation where the vgic
>> is not initialized, maybe it would be better to just WARN_ON and return
>> early rather than killing the whole box. Thoughts?
>>
> 
> Could we help this series along by saying that since this BUG_ON already
> exists in the kvm_timer_update_mapped_irq function, then it just
> preserves functionality and it's up to someone else (me) to remove the
> BUG_ON from both functions later in life?

Works for me.

M.
-- 
Jazz is not dead. It just smells funny...


Re: [RFC v2 06/10] KVM: arm/arm64: Update the physical timer interrupt level

2017-02-01 Thread Christoffer Dall
On Wed, Feb 01, 2017 at 03:40:10AM -0500, Jintack Lim wrote:
> On Wed, Feb 1, 2017 at 3:04 AM, Christoffer Dall
>  wrote:
> > On Sun, Jan 29, 2017 at 03:21:06PM +, Marc Zyngier wrote:
> >> On Fri, Jan 27 2017 at 01:04:56 AM, Jintack Lim  
> >> wrote:
> >> > Now that we maintain the EL1 physical timer register states of VMs,
> >> > update the physical timer interrupt level along with the virtual one.
> >> >
> >> > Note that the emulated EL1 physical timer is not mapped to any hardware
> >> > timer, so we call a proper vgic function.
> >> >
> >> > Signed-off-by: Jintack Lim 
> >> > ---
> >> >  virt/kvm/arm/arch_timer.c | 20 
> >> >  1 file changed, 20 insertions(+)
> >> >
> >> > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> > index 0f6e935..3b6bd50 100644
> >> > --- a/virt/kvm/arm/arch_timer.c
> >> > +++ b/virt/kvm/arm/arch_timer.c
> >> > @@ -180,6 +180,21 @@ static void kvm_timer_update_mapped_irq(struct 
> >> > kvm_vcpu *vcpu, bool new_level,
> >> > WARN_ON(ret);
> >> >  }
> >> >
> >> > +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
> >> > +struct arch_timer_context *timer)
> >> > +{
> >> > +   int ret;
> >> > +
> >> > +   BUG_ON(!vgic_initialized(vcpu->kvm));
> >>
> >> Although I've added my fair share of BUG_ON() in the code base, I've
> >> since reconsidered my position. If we get in a situation where the vgic
> >> is not initialized, maybe it would be better to just WARN_ON and return
> >> early rather than killing the whole box. Thoughts?
> >>
> >
> > Could we help this series along by saying that since this BUG_ON already
> > exists in the kvm_timer_update_mapped_irq function, then it just
> > preserves functionality and it's up to someone else (me) to remove the
> > BUG_ON from both functions later in life?
> >
> 
> Sounds good to me :) Thanks!
> 

So just as you thought you were getting off easy...

The reason we now have kvm_timer_update_irq and
kvm_timer_update_mapped_irq is that we have the following two vgic
functions:

kvm_vgic_inject_irq
kvm_vgic_inject_mapped_irq

But the only difference between the two is what they pass
as the mapped_irq argument to vgic_update_irq_pending.

What about if we just had this as a precursor patch:

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 6a084cd..91ecf48 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -175,7 +175,8 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, 
bool new_level)
timer->irq.level = new_level;
trace_kvm_timer_update_irq(vcpu->vcpu_id, timer->irq.irq,
   timer->irq.level);
-   ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
+
+   ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
 timer->irq.irq,
 timer->irq.level);
WARN_ON(ret);
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index dea12df..4c87fd0 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -336,8 +336,7 @@ bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq 
*irq)
 }
 
 static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
-  unsigned int intid, bool level,
-  bool mapped_irq)
+  unsigned int intid, bool level)
 {
struct kvm_vcpu *vcpu;
struct vgic_irq *irq;
@@ -357,11 +356,6 @@ static int vgic_update_irq_pending(struct kvm *kvm, int 
cpuid,
if (!irq)
return -EINVAL;
 
-   if (irq->hw != mapped_irq) {
-   vgic_put_irq(kvm, irq);
-   return -EINVAL;
-   }
-
spin_lock(&irq->irq_lock);
 
if (!vgic_validate_injection(irq, level)) {
@@ -399,13 +393,7 @@ static int vgic_update_irq_pending(struct kvm *kvm, int 
cpuid,
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
bool level)
 {
-   return vgic_update_irq_pending(kvm, cpuid, intid, level, false);
-}
-
-int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
-  bool level)
-{
-   return vgic_update_irq_pending(kvm, cpuid, intid, level, true);
+   return vgic_update_irq_pending(kvm, cpuid, intid, level);
 }
 
 int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)


That would make this patch simpler.  If so, I can send out the above
patch with a proper description.

Thanks,
-Christoffer


Re: [RFC v2 05/10] KVM: arm/arm64: Initialize the emulated EL1 physical timer

2017-02-01 Thread Marc Zyngier
On 30/01/17 19:04, Christoffer Dall wrote:
> On Mon, Jan 30, 2017 at 05:44:20PM +, Marc Zyngier wrote:

>> Eventually, we'll have to support an offset-able
>> physical counter to support nested virtualization, but this can come at
>> a later time.
>>
> Why do we need the offset-able physical counter for nested
> virtualization?  I would think for nested virt we just need to support
> respecting how the guest hypervisor programs CNTVOFF?

Ah, I see what you mean. Yes, once the guest hypervisor is in control of
its own CNTVOFF, we get everything we need. So let's just ignore this
for the time being, and we should be pretty good for this series.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [RFC v2 06/10] KVM: arm/arm64: Update the physical timer interrupt level

2017-02-01 Thread Marc Zyngier
On 01/02/17 10:07, Christoffer Dall wrote:
> On Wed, Feb 01, 2017 at 03:40:10AM -0500, Jintack Lim wrote:
>> On Wed, Feb 1, 2017 at 3:04 AM, Christoffer Dall
>>  wrote:
>>> On Sun, Jan 29, 2017 at 03:21:06PM +, Marc Zyngier wrote:
 On Fri, Jan 27 2017 at 01:04:56 AM, Jintack Lim  
 wrote:
> Now that we maintain the EL1 physical timer register states of VMs,
> update the physical timer interrupt level along with the virtual one.
>
> Note that the emulated EL1 physical timer is not mapped to any hardware
> timer, so we call a proper vgic function.
>
> Signed-off-by: Jintack Lim 
> ---
>  virt/kvm/arm/arch_timer.c | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 0f6e935..3b6bd50 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -180,6 +180,21 @@ static void kvm_timer_update_mapped_irq(struct 
> kvm_vcpu *vcpu, bool new_level,
> WARN_ON(ret);
>  }
>
> +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
> +struct arch_timer_context *timer)
> +{
> +   int ret;
> +
> +   BUG_ON(!vgic_initialized(vcpu->kvm));

 Although I've added my fair share of BUG_ON() in the code base, I've
 since reconsidered my position. If we get in a situation where the vgic
 is not initialized, maybe it would be better to just WARN_ON and return
 early rather than killing the whole box. Thoughts?

>>>
>>> Could we help this series along by saying that since this BUG_ON already
>>> exists in the kvm_timer_update_mapped_irq function, then it just
>>> preserves functionality and it's up to someone else (me) to remove the
>>> BUG_ON from both functions later in life?
>>>
>>
>> Sounds good to me :) Thanks!
>>
> 
> So just as you thought you were getting off easy...
> 
> The reason we now have kvm_timer_update_irq and
> kvm_timer_update_mapped_irq is that we have the following two vgic
> functions:
> 
> kvm_vgic_inject_irq
> kvm_vgic_inject_mapped_irq
> 
> But the only difference between the two is what they pass
> as the mapped_irq argument to vgic_update_irq_pending.
> 
> What about if we just had this as a precursor patch:
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 6a084cd..91ecf48 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -175,7 +175,8 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, 
> bool new_level)
>   timer->irq.level = new_level;
>   trace_kvm_timer_update_irq(vcpu->vcpu_id, timer->irq.irq,
>  timer->irq.level);
> - ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
> +
> + ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
>timer->irq.irq,
>timer->irq.level);
>   WARN_ON(ret);
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index dea12df..4c87fd0 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -336,8 +336,7 @@ bool vgic_queue_irq_unlock(struct kvm *kvm, struct 
> vgic_irq *irq)
>  }
>  
>  static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> -unsigned int intid, bool level,
> -bool mapped_irq)
> +unsigned int intid, bool level)
>  {
>   struct kvm_vcpu *vcpu;
>   struct vgic_irq *irq;
> @@ -357,11 +356,6 @@ static int vgic_update_irq_pending(struct kvm *kvm, int 
> cpuid,
>   if (!irq)
>   return -EINVAL;
>  
> - if (irq->hw != mapped_irq) {
> - vgic_put_irq(kvm, irq);
> - return -EINVAL;
> - }
> -
>   spin_lock(&irq->irq_lock);
>  
>   if (!vgic_validate_injection(irq, level)) {
> @@ -399,13 +393,7 @@ static int vgic_update_irq_pending(struct kvm *kvm, int 
> cpuid,
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>   bool level)
>  {
> - return vgic_update_irq_pending(kvm, cpuid, intid, level, false);
> -}
> -
> -int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int 
> intid,
> -bool level)
> -{
> - return vgic_update_irq_pending(kvm, cpuid, intid, level, true);
> + return vgic_update_irq_pending(kvm, cpuid, intid, level);
>  }
>  
>  int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
> 
> 
> That would make this patch simpler.  If so, I can send out the above
> patch with a proper description.

Indeed. And while you're at it, rename vgic_update_irq_pending to
kvm_vgic_inject_irq, as I don't think we need this simple stub?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...

[PATCH] KVM: arm/arm64: Remove kvm_vgic_inject_mapped_irq

2017-02-01 Thread Christoffer Dall
The only benefit of having kvm_vgic_inject_mapped_irq separate from
kvm_vgic_inject_irq is that we pass a boolean that we use for error
checking on the injection path.

While this could potentially help in some aspect of robustness, it's
also a little bit of a defensive move, and arguably callers into the
vgic should have made sure they have marked their virtual IRQs as mapped
if required.

Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/arch_timer.c |  3 ++-
 virt/kvm/arm/vgic/vgic.c  | 50 +++
 2 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 6a084cd..91ecf48 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -175,7 +175,8 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, 
bool new_level)
timer->irq.level = new_level;
trace_kvm_timer_update_irq(vcpu->vcpu_id, timer->irq.irq,
   timer->irq.level);
-   ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
+
+   ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
 timer->irq.irq,
 timer->irq.level);
WARN_ON(ret);
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index dea12df..654dfd4 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -335,9 +335,22 @@ bool vgic_queue_irq_unlock(struct kvm *kvm, struct 
vgic_irq *irq)
return true;
 }
 
-static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
-  unsigned int intid, bool level,
-  bool mapped_irq)
+/**
+ * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
+ * @kvm: The VM structure pointer
+ * @cpuid:   The CPU for PPIs
+ * @intid:   The INTID to inject a new state to.
+ * @level:   Edge-triggered:  true:  to trigger the interrupt
+ *   false: to ignore the call
+ *  Level-sensitive  true:  raise the input signal
+ *   false: lower the input signal
+ *
+ * The VGIC is not concerned with devices being active-LOW or active-HIGH for
+ * level-sensitive interrupts.  You can think of the level parameter as 1
+ * being HIGH and 0 being LOW and all devices being active-HIGH.
+ */
+int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
+   bool level)
 {
struct kvm_vcpu *vcpu;
struct vgic_irq *irq;
@@ -357,11 +370,6 @@ static int vgic_update_irq_pending(struct kvm *kvm, int 
cpuid,
if (!irq)
return -EINVAL;
 
-   if (irq->hw != mapped_irq) {
-   vgic_put_irq(kvm, irq);
-   return -EINVAL;
-   }
-
spin_lock(&irq->irq_lock);
 
if (!vgic_validate_injection(irq, level)) {
@@ -382,32 +390,6 @@ static int vgic_update_irq_pending(struct kvm *kvm, int 
cpuid,
return 0;
 }
 
-/**
- * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
- * @kvm: The VM structure pointer
- * @cpuid:   The CPU for PPIs
- * @intid:   The INTID to inject a new state to.
- * @level:   Edge-triggered:  true:  to trigger the interrupt
- *   false: to ignore the call
- *  Level-sensitive  true:  raise the input signal
- *   false: lower the input signal
- *
- * The VGIC is not concerned with devices being active-LOW or active-HIGH for
- * level-sensitive interrupts.  You can think of the level parameter as 1
- * being HIGH and 0 being LOW and all devices being active-HIGH.
- */
-int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
-   bool level)
-{
-   return vgic_update_irq_pending(kvm, cpuid, intid, level, false);
-}
-
-int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
-  bool level)
-{
-   return vgic_update_irq_pending(kvm, cpuid, intid, level, true);
-}
-
 int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
 {
struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
-- 
2.9.0



Re: [PATCH] KVM: arm/arm64: Remove kvm_vgic_inject_mapped_irq

2017-02-01 Thread Marc Zyngier
On 01/02/17 10:22, Christoffer Dall wrote:
> The only benefit of having kvm_vgic_inject_mapped_irq separate from
> kvm_vgic_inject_irq is that we pass a boolean that we use for error
> checking on the injection path.
> 
> While this could potentially help in some aspect of robustness, it's
> also a little bit of a defensive move, and arguably callers into the
> vgic should have made sure they have marked their virtual IRQs as mapped
> if required.
> 
> Signed-off-by: Christoffer Dall 


Acked-by: Marc Zyngier 

M.

-- 
Jazz is not dead. It just smells funny...


Re: [PATCH] KVM: arm/arm64: Remove kvm_vgic_inject_mapped_irq

2017-02-01 Thread Andre Przywara
Hi,

On 01/02/17 10:22, Christoffer Dall wrote:
> The only benefit of having kvm_vgic_inject_mapped_irq separate from
> kvm_vgic_inject_irq is that we pass a boolean that we use for error
> checking on the injection path.
> 
> While this could potentially help in some aspect of robustness, it's
> also a little bit of a defensive move, and arguably callers into the
> vgic should have made sure they have marked their virtual IRQs as mapped
> if required.

Yes, after looking again into the "new-vgic" patches I convinced myself
that this is fine. I think since we originally came from two different
implementations of inject_mapped_irq and inject_irq, we were just not
brave enough to actually squash them.
And I like seeing that confusing vgic_update_irq_pending() name go away.

Reviewed-by: Andre Przywara 

Thanks!
Andre

> Signed-off-by: Christoffer Dall 
> ---
>  virt/kvm/arm/arch_timer.c |  3 ++-
>  virt/kvm/arm/vgic/vgic.c  | 50 
> +++
>  2 files changed, 18 insertions(+), 35 deletions(-)
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 6a084cd..91ecf48 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -175,7 +175,8 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, 
> bool new_level)
>   timer->irq.level = new_level;
>   trace_kvm_timer_update_irq(vcpu->vcpu_id, timer->irq.irq,
>  timer->irq.level);
> - ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
> +
> + ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
>timer->irq.irq,
>timer->irq.level);
>   WARN_ON(ret);
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index dea12df..654dfd4 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -335,9 +335,22 @@ bool vgic_queue_irq_unlock(struct kvm *kvm, struct 
> vgic_irq *irq)
>   return true;
>  }
>  
> -static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> -unsigned int intid, bool level,
> -bool mapped_irq)
> +/**
> + * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
> + * @kvm: The VM structure pointer
> + * @cpuid:   The CPU for PPIs
> + * @intid:   The INTID to inject a new state to.
> + * @level:   Edge-triggered:  true:  to trigger the interrupt
> + * false: to ignore the call
> + *Level-sensitive  true:  raise the input signal
> + * false: lower the input signal
> + *
> + * The VGIC is not concerned with devices being active-LOW or active-HIGH for
> + * level-sensitive interrupts.  You can think of the level parameter as 1
> + * being HIGH and 0 being LOW and all devices being active-HIGH.
> + */
> +int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> + bool level)
>  {
>   struct kvm_vcpu *vcpu;
>   struct vgic_irq *irq;
> @@ -357,11 +370,6 @@ static int vgic_update_irq_pending(struct kvm *kvm, int 
> cpuid,
>   if (!irq)
>   return -EINVAL;
>  
> - if (irq->hw != mapped_irq) {
> - vgic_put_irq(kvm, irq);
> - return -EINVAL;
> - }
> -
>   spin_lock(&irq->irq_lock);
>  
>   if (!vgic_validate_injection(irq, level)) {
> @@ -382,32 +390,6 @@ static int vgic_update_irq_pending(struct kvm *kvm, int 
> cpuid,
>   return 0;
>  }
>  
> -/**
> - * kvm_vgic_inject_irq - Inject an IRQ from a device to the vgic
> - * @kvm: The VM structure pointer
> - * @cpuid:   The CPU for PPIs
> - * @intid:   The INTID to inject a new state to.
> - * @level:   Edge-triggered:  true:  to trigger the interrupt
> - * false: to ignore the call
> - *Level-sensitive  true:  raise the input signal
> - * false: lower the input signal
> - *
> - * The VGIC is not concerned with devices being active-LOW or active-HIGH for
> - * level-sensitive interrupts.  You can think of the level parameter as 1
> - * being HIGH and 0 being LOW and all devices being active-HIGH.
> - */
> -int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> - bool level)
> -{
> - return vgic_update_irq_pending(kvm, cpuid, intid, level, false);
> -}
> -
> -int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int 
> intid,
> -bool level)
> -{
> - return vgic_update_irq_pending(kvm, cpuid, intid, level, true);
> -}
> -
>  int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>  {
>   struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
> 


Re: [PATCH v2 2/3] arm64: kvm: reuse existing cache type/info related macros

2017-02-01 Thread Marc Zyngier
On 30/01/17 16:25, Sudeep Holla wrote:
> We already have various macros related to cache type and bitfields in
> CLIDR system register. We can replace some of the hardcoded values
> here using those existing macros.
> 
> This patch reuses those existing cache type/info related macros and
> replaces the hardcoded values. It also removes some of the comments
> that become trivial with the macro names.
> 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Signed-off-by: Sudeep Holla 

Acked-by: Marc Zyngier 

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH v2 3/3] arm64: kvm: add support for the extended 64bit ccsidr

2017-02-01 Thread Marc Zyngier
On 30/01/17 16:25, Sudeep Holla wrote:
> csselr and ccsidr are treated as 64-bit values already elsewhere in the
> kernel. It also aligns well with the architecture extensions that allow
> 64-bit format for ccsidr.
> 
> This patch upgrades the existing accesses to csselr and ccsidr from
> 32-bit to 64-bit in preparation for adding support for those extensions.
> It also adds a dedicated KVM_REG_ARM_DEMUX_ID_EXT_CCSIDR demux register
> to handle 64-bit ccsidr in KVM.
> 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Signed-off-by: Sudeep Holla 
> ---
>  arch/arm64/include/uapi/asm/kvm.h |   1 +
>  arch/arm64/kvm/sys_regs.c | 104 
> --
>  2 files changed, 77 insertions(+), 28 deletions(-)
> 
> v1->v2:
>   - Added dependency on cpu_supports_ccsidr_64b_format(PATCH 1/3)
>   - Added a new KVM_REG_ARM_DEMUX_ID_EXT_CCSIDR demux register id
> to support new 64bit CCSIDR
> 
> diff --git a/arch/arm64/include/uapi/asm/kvm.h 
> b/arch/arm64/include/uapi/asm/kvm.h
> index 3051f86a9b5f..8aa18e65e6a5 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -161,6 +161,7 @@ struct kvm_arch_memory_slot {
>  #define KVM_REG_ARM_DEMUX_ID_MASK0xFF00
>  #define KVM_REG_ARM_DEMUX_ID_SHIFT   8
>  #define KVM_REG_ARM_DEMUX_ID_CCSIDR  (0x00 << KVM_REG_ARM_DEMUX_ID_SHIFT)
> +#define KVM_REG_ARM_DEMUX_ID_EXT_CCSIDR  (0x01 << 
> KVM_REG_ARM_DEMUX_ID_SHIFT)
>  #define KVM_REG_ARM_DEMUX_VAL_MASK   0x00FF
>  #define KVM_REG_ARM_DEMUX_VAL_SHIFT  0
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 72656743b4cc..f9822ac6d9ab 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -58,15 +58,15 @@
>   */
> 
>  /* 3 bits per cache level, as per CLIDR, but non-existent caches always 0 */
> -static u32 cache_levels;
> +static u64 cache_levels;
> 
> -/* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
> +/* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_{EXT_,}CCSIDR */
>  #define CSSELR_MAX   ((MAX_CACHE_LEVEL - 1) << 1)
> 
>  /* Which cache CCSIDR represents depends on CSSELR value. */
> -static u32 get_ccsidr(u32 csselr)
> +static u64 get_ccsidr(u64 csselr)
>  {
> - u32 ccsidr;
> + u64 ccsidr;
> 
>   /* Make sure noone else changes CSSELR during this! */
>   local_irq_disable();
> @@ -1952,9 +1952,9 @@ static int set_invariant_sys_reg(u64 id, void __user 
> *uaddr)
>   return 0;
>  }
> 
> -static bool is_valid_cache(u32 val)
> +static bool is_valid_cache(u64 val)
>  {
> - u32 level, ctype;
> + u64 level, ctype;
> 
>   if (val >= CSSELR_MAX)
>   return false;
> @@ -1977,10 +1977,28 @@ static bool is_valid_cache(u32 val)
>   }
>  }
> 
> +static int demux_ccsidr_validate_get(u64 id, int size, u64 *val)
> +{
> + u64 cidx;
> +
> + if (KVM_REG_SIZE(id) != size)
> + return -ENOENT;
> +
> + cidx = (id & KVM_REG_ARM_DEMUX_VAL_MASK)
> + >> KVM_REG_ARM_DEMUX_VAL_SHIFT;
> + if (!is_valid_cache(cidx))
> + return -ENOENT;
> +
> + *val = get_ccsidr(cidx);
> + return 0;
> +}
> +
>  static int demux_c15_get(u64 id, void __user *uaddr)
>  {
> - u32 val;
> - u32 __user *uval = uaddr;
> + int ret;
> + u64 val;
> + u32 __user *uval;
> + u64 __user *uval64;
> 
>   /* Fail if we have unknown bits set. */
>   if (id & ~(KVM_REG_ARCH_MASK|KVM_REG_SIZE_MASK|KVM_REG_ARM_COPROC_MASK
> @@ -1989,14 +2007,17 @@ static int demux_c15_get(u64 id, void __user *uaddr)
> 
>   switch (id & KVM_REG_ARM_DEMUX_ID_MASK) {
>   case KVM_REG_ARM_DEMUX_ID_CCSIDR:
> - if (KVM_REG_SIZE(id) != 4)
> - return -ENOENT;
> - val = (id & KVM_REG_ARM_DEMUX_VAL_MASK)
> - >> KVM_REG_ARM_DEMUX_VAL_SHIFT;
> - if (!is_valid_cache(val))
> - return -ENOENT;
> -
> - return put_user(get_ccsidr(val), uval);
> + ret = demux_ccsidr_validate_get(id, sizeof(*uval), &val);
> + if (ret)
> + return ret;
> + uval = uaddr;
> + return put_user(val, uval);
> + case KVM_REG_ARM_DEMUX_ID_EXT_CCSIDR:
> + ret = demux_ccsidr_validate_get(id, sizeof(*uval64), &val);
> + if (ret)
> + return ret;
> + uval64 = uaddr;
> + return put_user(val, uval64);
>   default:
>   return -ENOENT;
>   }
> @@ -2004,8 +2025,10 @@ static int demux_c15_get(u64 id, void __user *uaddr)
> 
>  static int demux_c15_set(u64 id, void __user *uaddr)
>  {
> - u32 val, newval;
> - u32 __user *uval = uaddr;
> + int ret;
> + u64 val, newval;
> + u32 __user *uval;
> + u64 __user *uval64;
> 
>   /* Fail if we have unknown bits set. */
>   if (id & ~(KVM_REG_ARCH_MASK|KVM_REG_SIZE_MASK|KVM_REG_ARM_COPROC_MASK
> @@ -2014,18 +2

Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Christopher Covington
On 01/31/2017 12:56 PM, Marc Zyngier wrote:
> On 31/01/17 17:48, Christopher Covington wrote:
>> On 01/31/2017 07:37 AM, Mark Rutland wrote:
>>> On Wed, Jan 25, 2017 at 10:52:30AM -0500, Christopher Covington wrote:
 The Qualcomm Datacenter Technologies Falkor v1 CPU may allocate TLB entries
 using an incorrect ASID when TTBRx_EL1 is being updated. When the erratum
 is triggered, page table entries using the new translation table base
 address (BADDR) will be allocated into the TLB using the old ASID. All
 circumstances leading to the incorrect ASID being cached in the TLB arise
 when software writes TTBRx_EL1[ASID] and TTBRx_EL1[BADDR], a memory
 operation is in the process of performing a translation using the specific
 TTBRx_EL1 being written, and the memory operation uses a translation table
 descriptor designated as non-global. EL2 and EL3 code changing the EL1&0
 ASID is not subject to this erratum because hardware is prohibited from
 performing translations from an out-of-context translation regime.

 Consider the following pseudo code.

   write new BADDR and ASID values to TTBRx_EL1

 Replacing the above sequence with the one below will ensure that no TLB
 entries with an incorrect ASID are used by software.

   write reserved value to TTBRx_EL1[ASID]
   ISB
   write new value to TTBRx_EL1[BADDR]
   ISB
   write new value to TTBRx_EL1[ASID]
   ISB

 When the above sequence is used, page table entries using the new BADDR
 value may still be incorrectly allocated into the TLB using the reserved
 ASID. Yet this will not reduce functionality, since TLB entries incorrectly
 tagged with the reserved ASID will never be hit by a later instruction.
>>>
>>> Based on my understanding that entries allocated to the reserved ASID
>>> will not be used for subsequent page table walks (and so we don't have
>>> asynchronous behaviour to contend with), this sounds fine to me.
>>>
>>> Thanks for taking the time to clarify the details on that.
>>>
 Based on work by Shanker Donthineni 

 Signed-off-by: Christopher Covington 
 ---
  Documentation/arm64/silicon-errata.txt |  1 +
  arch/arm64/Kconfig | 11 +++
  arch/arm64/include/asm/assembler.h | 23 +++
  arch/arm64/include/asm/cpucaps.h   |  3 ++-
  arch/arm64/include/asm/mmu_context.h   |  8 +++-
  arch/arm64/kernel/cpu_errata.c |  7 +++
  arch/arm64/mm/context.c| 11 +++
  arch/arm64/mm/proc.S   |  1 +
  8 files changed, 63 insertions(+), 2 deletions(-)
>>>
>>> Don't we need to use pre_ttbr0_update_workaround in 
>>> for CONFIG_ARM64_SW_TTBR0_PAN? We implicitly switch to the reserved ASID
>>> for the empty table in __uaccess_ttbr0_disable.
>>>
>>> That also means we have to invalidate the reserved ASID so as to not
>>> accidentally hit while uaccess is disabled.
>>
>> The CPU in question (Falkor v1) has hardware PAN support. Do we need
>> to worry about including the workaround in the SW PAN code in that case?
> 
> Given that all ARMv8 CPUs can support SW_PAN, it is more likely to be
> enabled than the ARMv8.1 PAN. I'd vote for supporting the workaround in
> that case too, and hope that people do enable the HW version.

Okay, I'll do my best to add support for the SW PAN case. I rebased and
submitted v6 of the E1009 patch [1] so that it no longer depends on this
patch landing first, if you all are inclined to pick it up while work on
this E1003 patch continues.

1. https://patchwork.kernel.org/patch/9547923/

Thanks,
Christopher

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Will Deacon
On Wed, Feb 01, 2017 at 11:29:22AM -0500, Christopher Covington wrote:
> On 01/31/2017 12:56 PM, Marc Zyngier wrote:
> > Given that all ARMv8 CPUs can support SW_PAN, it is more likely to be
> > enabled than the ARMv8.1 PAN. I'd vote for supporting the workaround in
> > that case too, and hope that people do enable the HW version.
> 
> Okay, I'll do my best to add support for the SW PAN case. I rebased and
> submitted v6 of the E1009 patch [1] so that it no longer depends on this
> patch landing first, if you all are inclined to pick it up while work on
> this E1003 patch continues.

The alternative is not enabling SW_PAN (at runtime) if this erratum is
present, along with a warning stating that hardware-PAN should be
enabled in kconfig instead. Not sure what distributions will make of that
though.

> 1. https://patchwork.kernel.org/patch/9547923/

Yup, I queued that one locally and I'll push to -next tomorrow.

Thanks,

Will


Re: [PATCH v8 01/16] FDT: introduce global phandle allocation

2017-02-01 Thread André Przywara
Hi Marc, Will,

On 09/12/16 12:03, Marc Zyngier wrote:
> On 04/11/16 17:31, Andre Przywara wrote:
>> Allocating an FDT phandle (a unique identifier) using a static
>> variable in a static inline function defined in a header file works
>> only if all users are in the same source file. So trying to allocate
>> a handle from two different compilation units fails.
>> Introduce global phandle allocation and reference code to properly
>> allocate unique phandles.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  Makefile  |  1 +
>>  arm/fdt.c |  2 +-
>>  arm/gic.c |  2 +-
>>  include/kvm/fdt.h | 10 +-
>>  kvm-fdt.c | 26 ++
>>  5 files changed, 34 insertions(+), 7 deletions(-)
>>  create mode 100644 kvm-fdt.c
>>
>> diff --git a/Makefile b/Makefile
>> index 1f0196f..e4a4002 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -98,6 +98,7 @@ OBJS   += kvm-ipc.o
>>  OBJS+= builtin-sandbox.o
>>  OBJS+= virtio/mmio.o
>>  OBJS+= hw/i8042.o
>> +OBJS+= kvm-fdt.o
>>  
>>  # Translate uname -m into ARCH string
>>  ARCH ?= $(shell uname -m | sed -e s/i.86/i386/ -e s/ppc.*/powerpc/ \
>> diff --git a/arm/fdt.c b/arm/fdt.c
>> index 381d48f..8bcfffb 100644
>> --- a/arm/fdt.c
>> +++ b/arm/fdt.c
>> @@ -114,7 +114,7 @@ static int setup_fdt(struct kvm *kvm)
>>  {
>>  struct device_header *dev_hdr;
>>  u8 staging_fdt[FDT_MAX_SIZE];
>> -u32 gic_phandle = fdt__alloc_phandle();
>> +u32 gic_phandle = fdt__get_phandle(PHANDLE_GIC);
>>  u64 mem_reg_prop[]  = {
>>  cpu_to_fdt64(kvm->arch.memory_guest_start),
>>  cpu_to_fdt64(kvm->ram_size),
>> diff --git a/arm/gic.c b/arm/gic.c
>> index d6d6dd0..b60437e 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -222,7 +222,7 @@ void gic__generate_fdt_nodes(void *fdt, u32 phandle, 
>> enum irqchip_type type)
>>  _FDT(fdt_property_cell(fdt, "#interrupt-cells", GIC_FDT_IRQ_NUM_CELLS));
>>  _FDT(fdt_property(fdt, "interrupt-controller", NULL, 0));
>>  _FDT(fdt_property(fdt, "reg", reg_prop, sizeof(reg_prop)));
>> -_FDT(fdt_property_cell(fdt, "phandle", phandle));
>> +_FDT(fdt_property_cell(fdt, "phandle", fdt__get_phandle(PHANDLE_GIC)));
>>  _FDT(fdt_end_node(fdt));
>>  }
>>  
>> diff --git a/include/kvm/fdt.h b/include/kvm/fdt.h
>> index 53d85a4..cd2bb72 100644
>> --- a/include/kvm/fdt.h
>> +++ b/include/kvm/fdt.h
>> @@ -8,6 +8,10 @@
>>  #include 
>>  
>>  #define FDT_MAX_SIZE0x1
>> +#define FDT_INVALID_PHANDLE 0
>> +#define FDT_IS_VALID_PHANDLE(phandle) ((phandle) != FDT_INVALID_PHANDLE)
>> +
>> +enum phandles {PHANDLE_GIC, PHANDLES_MAX};
> 
> Another nit here: PHANDLE_GIC is pretty much ARM-specific, while FDT is
> supposed to be generic. Can't we move the enum to be architecture
> specific and not put this in an architecture agnostic one?

So while trying to find the best possible solution for this seemingly
simple problem, I was wondering why we had this allocation function in
the first place?
It seems a bit overkill to allocate a handle if we could just go with
static values.
I changed the first two patches now to have an enum per architecture to
list all possible handles and then just using those values directly
instead of going through another layer of indirection.

So is there anything that will require dynamic phandles in the future?
This version proposed here can't really cope with them anyway and in the
moment it's just about _one_ GIC phandle and _one_ ITS phandle, so a
static assignment is much simpler.

Will we need multiple ITSes for device passthrough? Or would it just be
one ITS for all hardware devices and one for emulated virtio?

Or do I miss anything else here?

Cheers,
Andre.

> 
> Thanks,
> 
>   M.
> 



Re: [PATCH v4 3/4] arm64: Use __tlbi() macros in KVM code

2017-02-01 Thread Punit Agrawal
Will Deacon  writes:

> On Wed, Jan 25, 2017 at 08:39:43PM +0100, Christoffer Dall wrote:
>> On Wed, Jan 25, 2017 at 10:52:31AM -0500, Christopher Covington wrote:
>> > Refactor the KVM code to use the __tlbi macros, which will allow an errata
>> > workaround that repeats tlbi dsb sequences to only change one location.
>> > This is not intended to change the generated assembly and comparing before
>> > and after vmlinux objdump shows no functional changes.
>> > 
>> > Signed-off-by: Christopher Covington 
>> 
>> Acked-by: Christoffer Dall 
>
> Thanks, I'll queue this one via arm64.

I just noticed this patch but there's been a similar patch from Mark
that I've been carrying as part of the KVM TLB monitoring series[0].

[0] http://www.mail-archive.com/kvmarm@lists.cs.columbia.edu/msg09359.html

>
> Will


Re: [PATCH v4 3/4] arm64: Use __tlbi() macros in KVM code

2017-02-01 Thread Will Deacon
On Wed, Feb 01, 2017 at 05:02:43PM +, Punit Agrawal wrote:
> Will Deacon  writes:
> 
> > On Wed, Jan 25, 2017 at 08:39:43PM +0100, Christoffer Dall wrote:
> >> On Wed, Jan 25, 2017 at 10:52:31AM -0500, Christopher Covington wrote:
> >> > Refactor the KVM code to use the __tlbi macros, which will allow an 
> >> > errata
> >> > workaround that repeats tlbi dsb sequences to only change one location.
> >> > This is not intended to change the generated assembly and comparing 
> >> > before
> >> > and after vmlinux objdump shows no functional changes.
> >> > 
> >> > Signed-off-by: Christopher Covington 
> >> 
> >> Acked-by: Christoffer Dall 
> >
> > Thanks, I'll queue this one via arm64.
> 
> I just noticed this patch but there's been a similar patch from Mark
> that I've been carrying as part of the KVM TLB monitoring series[0].
> 
> [0] http://www.mail-archive.com/kvmarm@lists.cs.columbia.edu/msg09359.html

Well I've already queued the one from Christopher. It's weird that Mark's
version appears to miss the local VMID case (but it does have the
flush_icache_all I wanted :).

Anyway, I'm not reverting anything, so you'll need to rebase when this
lot lands in mainline.

Will


Re: [PATCH v8 01/16] FDT: introduce global phandle allocation

2017-02-01 Thread Marc Zyngier
On Wed, Feb 01 2017 at 4:44:19 pm GMT, André Przywara wrote:

Hi Andre,

> Hi Marc, Will,
>
> On 09/12/16 12:03, Marc Zyngier wrote:
>> On 04/11/16 17:31, Andre Przywara wrote:
>>> Allocating an FDT phandle (a unique identifier) using a static
>>> variable in a static inline function defined in a header file works
>>> only if all users are in the same source file. So trying to allocate
>>> a handle from two different compilation units fails.
>>> Introduce global phandle allocation and reference code to properly
>>> allocate unique phandles.
>>>
>>> Signed-off-by: Andre Przywara 
>>> ---
>>>  Makefile  |  1 +
>>>  arm/fdt.c |  2 +-
>>>  arm/gic.c |  2 +-
>>>  include/kvm/fdt.h | 10 +-
>>>  kvm-fdt.c | 26 ++
>>>  5 files changed, 34 insertions(+), 7 deletions(-)
>>>  create mode 100644 kvm-fdt.c
>>>
>>> diff --git a/Makefile b/Makefile
>>> index 1f0196f..e4a4002 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -98,6 +98,7 @@ OBJS  += kvm-ipc.o
>>>  OBJS   += builtin-sandbox.o
>>>  OBJS   += virtio/mmio.o
>>>  OBJS   += hw/i8042.o
>>> +OBJS   += kvm-fdt.o
>>>  
>>>  # Translate uname -m into ARCH string
>>>  ARCH ?= $(shell uname -m | sed -e s/i.86/i386/ -e s/ppc.*/powerpc/ \
>>> diff --git a/arm/fdt.c b/arm/fdt.c
>>> index 381d48f..8bcfffb 100644
>>> --- a/arm/fdt.c
>>> +++ b/arm/fdt.c
>>> @@ -114,7 +114,7 @@ static int setup_fdt(struct kvm *kvm)
>>>  {
>>> struct device_header *dev_hdr;
>>> u8 staging_fdt[FDT_MAX_SIZE];
>>> -   u32 gic_phandle = fdt__alloc_phandle();
>>> +   u32 gic_phandle = fdt__get_phandle(PHANDLE_GIC);
>>> u64 mem_reg_prop[]  = {
>>> cpu_to_fdt64(kvm->arch.memory_guest_start),
>>> cpu_to_fdt64(kvm->ram_size),
>>> diff --git a/arm/gic.c b/arm/gic.c
>>> index d6d6dd0..b60437e 100644
>>> --- a/arm/gic.c
>>> +++ b/arm/gic.c
>>> @@ -222,7 +222,7 @@ void gic__generate_fdt_nodes(void *fdt, u32 phandle, 
>>> enum irqchip_type type)
>>> _FDT(fdt_property_cell(fdt, "#interrupt-cells", GIC_FDT_IRQ_NUM_CELLS));
>>> _FDT(fdt_property(fdt, "interrupt-controller", NULL, 0));
>>> _FDT(fdt_property(fdt, "reg", reg_prop, sizeof(reg_prop)));
>>> -   _FDT(fdt_property_cell(fdt, "phandle", phandle));
>>> +   _FDT(fdt_property_cell(fdt, "phandle", fdt__get_phandle(PHANDLE_GIC)));
>>> _FDT(fdt_end_node(fdt));
>>>  }
>>>  
>>> diff --git a/include/kvm/fdt.h b/include/kvm/fdt.h
>>> index 53d85a4..cd2bb72 100644
>>> --- a/include/kvm/fdt.h
>>> +++ b/include/kvm/fdt.h
>>> @@ -8,6 +8,10 @@
>>>  #include 
>>>  
>>>  #define FDT_MAX_SIZE   0x1
>>> +#define FDT_INVALID_PHANDLE 0
>>> +#define FDT_IS_VALID_PHANDLE(phandle) ((phandle) != FDT_INVALID_PHANDLE)
>>> +
>>> +enum phandles {PHANDLE_GIC, PHANDLES_MAX};
>> 
>> Another nit here: PHANDLE_GIC is pretty much ARM-specific, while FDT is
>> supposed to be generic. Can't we move the enum to be architecture
>> specific and not put this in an architecture agnostic one?
>
> So while trying to find the best possible solution for this seemingly
> simple problem, I was wondering why we had this allocation function in
> the first place?
> It seems a bit overkill to allocate a handle if we could just go with
> static values.
> I changed the first two patches now to have an enum per architecture to
> list all possible handles and then just using those values directly
> instead of going through another layer of indirection.

Yes, that probably makes sense, at least for the time being.

> So is there anything that will require dynamic phandles in the future?
> This version proposed here can't really cope with them anyway and in the
> moment it's just about _one_ GIC phandle and _one_ ITS phandle, so a
> static assignment is much simpler.
>
> Will we need multiple ITSes for device passthrough? Or would it just be
> one ITS for all hardware devices and one for emulated virtio?

Having multiple ITSes is definitely on the cards (pass-through and
emulated devices are one case, where one of them would be driven using
GICv4 and the other would be purely virtual). This would probably
translate into having multiple PCIe host controllers.

So maybe we don't need the full breadth of an allocator yet, but I reckon
that we shouldn't pretend that there is no use for it, forever.

Thanks,

M.
-- 
Jazz is not dead, it just smell funny.


Re: [PATCH v4 3/4] arm64: Use __tlbi() macros in KVM code

2017-02-01 Thread Punit Agrawal
Will Deacon  writes:

> On Wed, Feb 01, 2017 at 05:02:43PM +, Punit Agrawal wrote:
>> Will Deacon  writes:
>> 
>> > On Wed, Jan 25, 2017 at 08:39:43PM +0100, Christoffer Dall wrote:
>> >> On Wed, Jan 25, 2017 at 10:52:31AM -0500, Christopher Covington wrote:
>> >> > Refactor the KVM code to use the __tlbi macros, which will allow an 
>> >> > errata
>> >> > workaround that repeats tlbi dsb sequences to only change one location.
>> >> > This is not intended to change the generated assembly and comparing 
>> >> > before
>> >> > and after vmlinux objdump shows no functional changes.
>> >> > 
>> >> > Signed-off-by: Christopher Covington 
>> >> 
>> >> Acked-by: Christoffer Dall 
>> >
>> > Thanks, I'll queue this one via arm64.
>> 
>> I just noticed this patch but there's been a similar patch from Mark
>> that I've been carrying as part of the KVM TLB monitoring series[0].
>> 
>> [0] http://www.mail-archive.com/kvmarm@lists.cs.columbia.edu/msg09359.html
>
> Well I've already queued the one from Christopher. It's weird that Mark's
> version appears to miss the local VMID case (but it does have the
> flush_icache_all I wanted :).
>
> Anyway, I'm not reverting anything, so you'll need to rebase when this
> lot lands in mainline.

Sure. I'll drop the patch the next time around.

The patch is unrelated to that series and should've gone in when
we introduced the __tlbi() macro but got left out somehow.

>
> Will


[PATCH V8 00/10] Add UEFI 2.6 and ACPI 6.1 updates for RAS on ARM64

2017-02-01 Thread Tyler Baicar
When a memory error, CPU error, PCIe error, or other type of hardware error
that's covered by RAS occurs, firmware should populate the shared GHES memory
location with the proper GHES structures to notify the OS of the error.
For example, platforms that implement firmware first handling may implement
separate GHES sources for corrected errors and uncorrected errors. If the
error is an uncorrectable error, then the firmware will notify the OS
immediately since the error needs to be handled as soon as possible. The OS
will then be able to take the appropriate action, such as offlining a page.
If the error is a corrected error, then the firmware will not interrupt the
OS immediately. Instead, the OS will see and report the error the next time
its GHES timer expires. The kernel will first parse the GHES structures and
report the errors
through the kernel logs and then notify the user space through RAS trace
events. This allows user space applications such as RAS Daemon to see the
errors and report them however the user desires. This patchset extends the
kernel functionality for RAS errors based on updates in the UEFI 2.6 and
ACPI 6.1 specifications.

An example flow from firmware to user space could be:

                +---------------+
      +-------->|  GHES polling |----+
      |         |     source    |    |
+-----------+   +---------------+    |    +---------------+    +------------+
|           |                        +--->|  Kernel GHES  |    |            |
| Firmware  |                             |  CPER AER and |--->| RAS trace  |
|           |                        +--->|  EDAC drivers |    |   event    |
+-----------+   +---------------+    |    +---------------+    +------------+
      |         |    GHES sci   |    |
      +-------->|     source    |----+
                +---------------+

Add support for Generic Hardware Error Source (GHES) v2, which introduces the
capability for the OS to acknowledge the consumption of the error record
generated by the Reliability, Availability and Serviceability (RAS) controller.
This eliminates potential race conditions between the OS and the RAS controller.

Add support for the timestamp field added to the Generic Error Data Entry v3,
allowing the OS to log the time that the error is generated by the firmware,
rather than the time the error is consumed. This improves the correctness of
event sequences when analyzing error logs. The timestamp is added in
ACPI 6.1, reference Table 18-343 Generic Error Data Entry.

Add support for ARMv8 Common Platform Error Record (CPER) per the UEFI 2.6
specification. ARMv8-specific processor error information is reported as part
of the CPER records. This provides more detail for processor error logs and
can help describe ARMv8 cache, TLB, and bus errors.

Synchronous External Abort (SEA) represents a specific processor error condition
in ARM systems. A handler is added to recognize SEA errors, and a notifier is
added to parse and report the errors before the process is killed. Refer to
section N.2.1.1 in the Common Platform Error Record appendix of the UEFI 2.6
specification.

Currently the kernel ignores CPER records that are unrecognized.
On the other hand, the UEFI spec allows for non-standard (e.g. vendor
proprietary) error section types in CPER (Common Platform Error Record),
as defined in section N2.3 of UEFI version 2.5. Therefore, the user
is not able to see hardware error data for non-standard sections.

If the section Type field of a Generic Error Data Entry is unrecognized,
print out the raw data in the dmesg buffer, and also add a tracepoint
for reporting such hardware errors.

Currently even if an error status block's severity is fatal, the kernel
does not honor the severity level by panicking. With the firmware-first
model, the platform could inform the OS about a fatal hardware error
through the non-NMI GHES notification type. The OS should panic when a
hardware error record is received with this severity.

Add support to handle SEAs that occur while a KVM guest kernel is
running. Currently these are unsupported by the guest abort handling.

V8: Remove SEA notifier
Add FAR not valid bit check when populating the SEA error address
Move nmi_enter/exit() to architecture specific code
Add synchronize_rcu() usage to SEA handling
Make GHES_IOREMAP_PAGES always 2
Update ghes_ioremap_pfn_nmi() to work like ghes_ioremap_pfn_irq()
Remove the SEA print from handle_guest_sea()

V7: Update a couple prints for ARM processor errors
Add Print notifying if overflow occurred for ARM processor errors
Check for ARM configuration to allow the compiler to ignore ARM code
 on non-ARM systems
Use SEA acronym instead of spelling it out
Update fault_info prints to be more clear
Add NMI locking to SEA notification
Remove error info structure from ARM trace event since there can be
 a variable amount of these structures

V6: Change HEST_TYPE_GENERIC_V2 to IS_HEST_TYPE_GENERIC_V2 for readability
Move APEI helper defines from c

[PATCH V8 02/10] ras: acpi/apei: cper: generic error data entry v3 per ACPI 6.1

2017-02-01 Thread Tyler Baicar
Currently when a RAS error is reported it is not timestamped.
The ACPI 6.1 spec adds the timestamp field to the generic error
data entry v3 structure. The timestamp of when the firmware
generated the error is now being reported.

Signed-off-by: Tyler Baicar 
Signed-off-by: Jonathan (Zhixiong) Zhang 
Signed-off-by: Richard Ruigrok 
Signed-off-by: Naveen Kaje 
Reviewed-by: James Morse 
---
 drivers/acpi/apei/ghes.c|  9 ---
 drivers/firmware/efi/cper.c | 63 +++--
 include/acpi/ghes.h | 22 
 3 files changed, 77 insertions(+), 17 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 5e1ec41..b25e7cf 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -420,7 +420,8 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
int flags = -1;
int sec_sev = ghes_severity(gdata->error_severity);
struct cper_sec_mem_err *mem_err;
-   mem_err = (struct cper_sec_mem_err *)(gdata + 1);
+
+   mem_err = acpi_hest_generic_data_payload(gdata);
 
if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
return;
@@ -457,7 +458,8 @@ static void ghes_do_proc(struct ghes *ghes,
if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
 CPER_SEC_PLATFORM_MEM)) {
struct cper_sec_mem_err *mem_err;
-   mem_err = (struct cper_sec_mem_err *)(gdata+1);
+
+   mem_err = acpi_hest_generic_data_payload(gdata);
ghes_edac_report_mem_error(ghes, sev, mem_err);
 
arch_apei_report_mem_error(sev, mem_err);
@@ -467,7 +469,8 @@ static void ghes_do_proc(struct ghes *ghes,
else if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
  CPER_SEC_PCIE)) {
struct cper_sec_pcie *pcie_err;
-   pcie_err = (struct cper_sec_pcie *)(gdata+1);
+
+   pcie_err = acpi_hest_generic_data_payload(gdata);
if (sev == GHES_SEV_RECOVERABLE &&
sec_sev == GHES_SEV_RECOVERABLE &&
pcie_err->validation_bits & 
CPER_PCIE_VALID_DEVICE_ID &&
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index d425374..8fa4e23 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -32,6 +32,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #define INDENT_SP  " "
 
@@ -386,13 +389,37 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
pfx, pcie->bridge.secondary_status, pcie->bridge.control);
 }
 
+static void cper_estatus_print_section_v300(const char *pfx,
+   const struct acpi_hest_generic_data_v300 *gdata)
+{
+   __u8 hour, min, sec, day, mon, year, century, *timestamp;
+
+   if (gdata->validation_bits & ACPI_HEST_GEN_VALID_TIMESTAMP) {
+   timestamp = (__u8 *)&(gdata->time_stamp);
+   sec = bcd2bin(timestamp[0]);
+   min = bcd2bin(timestamp[1]);
+   hour = bcd2bin(timestamp[2]);
+   day = bcd2bin(timestamp[4]);
+   mon = bcd2bin(timestamp[5]);
+   year = bcd2bin(timestamp[6]);
+   century = bcd2bin(timestamp[7]);
+   printk("%stime: %7s %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
+   0x01 & *(timestamp + 3) ? "precise" : "", century,
+   year, mon, day, hour, min, sec);
+   }
+}
+
 static void cper_estatus_print_section(
-   const char *pfx, const struct acpi_hest_generic_data *gdata, int sec_no)
+   const char *pfx, struct acpi_hest_generic_data *gdata, int sec_no)
 {
uuid_le *sec_type = (uuid_le *)gdata->section_type;
__u16 severity;
char newpfx[64];
 
+   if (acpi_hest_generic_data_version(gdata) >= 3)
+   cper_estatus_print_section_v300(pfx,
+   (const struct acpi_hest_generic_data_v300 *)gdata);
+
severity = gdata->error_severity;
printk("%s""Error %d, type: %s\n", pfx, sec_no,
   cper_severity_str(severity));
@@ -403,14 +430,18 @@ static void cper_estatus_print_section(
 
snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
if (!uuid_le_cmp(*sec_type, CPER_SEC_PROC_GENERIC)) {
-   struct cper_sec_proc_generic *proc_err = (void *)(gdata + 1);
+   struct cper_sec_proc_generic *proc_err;
+
+   proc_err = acpi_hest_generic_data_payload(gdata);
printk("%s""section_type: general processor error\n", newpfx);
if (gdata->error_data_length >= sizeof(*proc_err))
cper_print_proc_generic(newpfx, proc_err);
else
goto err_section_too_small;
} els

[PATCH V8 01/10] acpi: apei: read ack upon ghes record consumption

2017-02-01 Thread Tyler Baicar
A RAS (Reliability, Availability, Serviceability) controller
may be a separate processor running in parallel with OS
execution, and may generate error records for consumption by
the OS. If the RAS controller produces multiple error records,
then they may be overwritten before the OS has consumed them.

The Generic Hardware Error Source (GHES) v2 structure
introduces the capability for the OS to acknowledge the
consumption of the error record generated by the RAS
controller. A RAS controller supporting GHESv2 shall wait for
the acknowledgment before writing a new error record, thus
eliminating the race condition.

Add support for parsing of GHESv2 sub-tables as well.

Signed-off-by: Tyler Baicar 
Signed-off-by: Jonathan (Zhixiong) Zhang 
Signed-off-by: Richard Ruigrok 
Signed-off-by: Naveen Kaje 
Reviewed-by: James Morse 
---
 drivers/acpi/apei/ghes.c | 49 +---
 drivers/acpi/apei/hest.c |  7 +--
 include/acpi/ghes.h  |  5 -
 3 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index e53bef6..5e1ec41 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -79,6 +80,10 @@
((struct acpi_hest_generic_status *)\
 ((struct ghes_estatus_node *)(estatus_node) + 1))
 
+#define IS_HEST_TYPE_GENERIC_V2(ghes)  \
+   ((struct acpi_hest_header *)ghes->generic)->type == \
+ACPI_HEST_TYPE_GENERIC_ERROR_V2
+
 /*
  * This driver isn't really modular, however for the time being,
  * continuing to use module_param is the easiest way to remain
@@ -248,10 +253,18 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic)
ghes = kzalloc(sizeof(*ghes), GFP_KERNEL);
if (!ghes)
return ERR_PTR(-ENOMEM);
+
ghes->generic = generic;
+   if (IS_HEST_TYPE_GENERIC_V2(ghes)) {
+   rc = apei_map_generic_address(
+   &ghes->generic_v2->read_ack_register);
+   if (rc)
+   goto err_free;
+   }
+
rc = apei_map_generic_address(&generic->error_status_address);
if (rc)
-   goto err_free;
+   goto err_unmap_read_ack_addr;
error_block_length = generic->error_block_length;
if (error_block_length > GHES_ESTATUS_MAX_SIZE) {
pr_warning(FW_WARN GHES_PFX
@@ -263,13 +276,17 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic)
ghes->estatus = kmalloc(error_block_length, GFP_KERNEL);
if (!ghes->estatus) {
rc = -ENOMEM;
-   goto err_unmap;
+   goto err_unmap_status_addr;
}
 
return ghes;
 
-err_unmap:
+err_unmap_status_addr:
apei_unmap_generic_address(&generic->error_status_address);
+err_unmap_read_ack_addr:
+   if (IS_HEST_TYPE_GENERIC_V2(ghes))
+   apei_unmap_generic_address(
+   &ghes->generic_v2->read_ack_register);
 err_free:
kfree(ghes);
return ERR_PTR(rc);
@@ -279,6 +296,9 @@ static void ghes_fini(struct ghes *ghes)
 {
kfree(ghes->estatus);
apei_unmap_generic_address(&ghes->generic->error_status_address);
+   if (IS_HEST_TYPE_GENERIC_V2(ghes))
+   apei_unmap_generic_address(
+   &ghes->generic_v2->read_ack_register);
 }
 
 static inline int ghes_severity(int severity)
@@ -648,6 +668,23 @@ static void ghes_estatus_cache_add(
rcu_read_unlock();
 }
 
+static int ghes_ack_error(struct acpi_hest_generic_v2 *generic_v2)
+{
+   int rc;
+   u64 val = 0;
+
+   rc = apei_read(&val, &generic_v2->read_ack_register);
+   if (rc)
+   return rc;
+   val &= generic_v2->read_ack_preserve <<
+   generic_v2->read_ack_register.bit_offset;
+   val |= generic_v2->read_ack_write <<
+   generic_v2->read_ack_register.bit_offset;
+   rc = apei_write(val, &generic_v2->read_ack_register);
+
+   return rc;
+}
+
 static int ghes_proc(struct ghes *ghes)
 {
int rc;
@@ -660,6 +697,12 @@ static int ghes_proc(struct ghes *ghes)
ghes_estatus_cache_add(ghes->generic, ghes->estatus);
}
ghes_do_proc(ghes, ghes->estatus);
+
+   if (IS_HEST_TYPE_GENERIC_V2(ghes)) {
+   rc = ghes_ack_error(ghes->generic_v2);
+   if (rc)
+   return rc;
+   }
 out:
ghes_clear_estatus(ghes);
return rc;
diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c
index 8f2a98e..456b488 100644
--- a/drivers/acpi/apei/hest.c
+++ b/drivers/acpi/apei/hest.c
@@ -52,6 +52,7 @@
[ACPI_HEST_TYPE_AER_ENDPOINT] = sizeof(struct acpi_hest_aer),
[ACPI_HEST_TYPE_AER_BRIDGE] = sizeof(struct acpi_hest_aer_bridge),
[ACPI_HEST_TYPE_G

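The read-ack update in ghes_ack_error() above — mask the old value with the preserve bits, then OR in the write bits, both shifted by the register's bit offset — can be sketched standalone. The register values below are illustrative only, not taken from any real HEST table:

```python
def read_ack_value(old, preserve, write, bit_offset):
    """Mirror the GHESv2 read-ack sequence: keep only the bits named by
    the preserve mask, then OR in the write mask, both field-relative."""
    val = old & (preserve << bit_offset)
    val |= write << bit_offset
    return val

# Illustrative: preserve nothing, write 0x1 into a field at bit 0.
print(hex(read_ack_value(0xDEADBEEF, 0x0, 0x1, 0)))  # 0x1
```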
[PATCH V8 04/10] arm64: exception: handle Synchronous External Abort

2017-02-01 Thread Tyler Baicar
SEA exceptions are often caused by an uncorrected hardware
error, and are handled when data abort and instruction abort
exception classes have specific values for their Fault Status
Code.
When an SEA occurs, report the error in the kernel logs before
killing the process.
Update fault_info[] with specific SEA faults so that the
new SEA handler is used.

Signed-off-by: Tyler Baicar 
Signed-off-by: Jonathan (Zhixiong) Zhang 
Signed-off-by: Naveen Kaje 
---
 arch/arm64/mm/fault.c | 45 +++--
 1 file changed, 35 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 156169c..9ae7e65 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -487,6 +487,31 @@ static int do_bad(unsigned long addr, unsigned int esr, struct pt_regs *regs)
return 1;
 }
 
+#define SEA_FnV_MASK   0x0400
+
+/*
+ * This abort handler deals with Synchronous External Abort.
+ * It calls notifiers, and then returns "fault".
+ */
+static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
+{
+   struct siginfo info;
+
+   pr_err("Synchronous External Abort: %s (0x%08x) at 0x%016lx\n",
+fault_name(esr), esr, addr);
+
+   info.si_signo = SIGBUS;
+   info.si_errno = 0;
+   info.si_code  = 0;
+   if (esr & SEA_FnV_MASK)
+   info.si_addr = 0;
+   else
+   info.si_addr  = (void __user *)addr;
+   arm64_notify_die("", regs, &info, esr);
+
+   return 0;
+}
+
 static const struct fault_info {
	int	(*fn)(unsigned long addr, unsigned int esr, struct pt_regs *regs);
int sig;
@@ -509,22 +534,22 @@ static int do_bad(unsigned long addr, unsigned int esr, struct pt_regs *regs)
	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 1 permission fault"	},
	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 2 permission fault"	},
	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 permission fault"	},
-	{ do_bad,		SIGBUS,  0,		"synchronous external abort"	},
+	{ do_sea,		SIGBUS,  0,		"synchronous external abort"	},
	{ do_bad,		SIGBUS,  0,		"unknown 17"			},
	{ do_bad,		SIGBUS,  0,		"unknown 18"			},
	{ do_bad,		SIGBUS,  0,		"unknown 19"			},
-	{ do_bad,		SIGBUS,  0,		"synchronous external abort (translation table walk)" },
-	{ do_bad,		SIGBUS,  0,		"synchronous external abort (translation table walk)" },
-	{ do_bad,		SIGBUS,  0,		"synchronous external abort (translation table walk)" },
-	{ do_bad,		SIGBUS,  0,		"synchronous external abort (translation table walk)" },
-	{ do_bad,		SIGBUS,  0,		"synchronous parity error"	},
+	{ do_sea,		SIGBUS,  0,		"level 0 (translation table walk)"	},
+	{ do_sea,		SIGBUS,  0,		"level 1 (translation table walk)"	},
+	{ do_sea,		SIGBUS,  0,		"level 2 (translation table walk)"	},
+	{ do_sea,		SIGBUS,  0,		"level 3 (translation table walk)"	},
+	{ do_sea,		SIGBUS,  0,		"synchronous parity or ECC error" },
	{ do_bad,		SIGBUS,  0,		"unknown 25"			},
	{ do_bad,		SIGBUS,  0,		"unknown 26"			},
	{ do_bad,		SIGBUS,  0,		"unknown 27"			},
-	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
-	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
-	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
-	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_sea,		SIGBUS,  0,		"level 0 synchronous parity error (translation table walk)" },
+	{ do_sea,		SIGBUS,  0,		"level 1 synchronous parity error (translation table walk)" },
+	{ do_sea,		SIGBUS,  0,		"level 2 synchronous parity error (translation table walk)" },
+	{ do_sea,		SIGBUS,  0,		"level 3 synchronous parity error (translation table walk)" },
	{ do_bad,		SIGBUS,  0,		"unknown 32"			},
	{ do_alignment_fault,	SIGBUS,  BUS_ADRALN,	"alignment fault"		},
{ do_bad,   SIGBUS,  0,  

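The fault_info[] change above swaps do_bad for do_sea at the SEA fault status codes, so the dispatch in do_mem_abort() picks up the new handler automatically. A minimal model of that table-driven dispatch — handler bodies, codes, and return values are illustrative, not the kernel's actual table:

```python
# Sketch: FSC-indexed dispatch, mirroring how fault_info[] maps a fault
# status code to a handler. Entries shown are a small illustrative subset.
def do_sea(addr, esr):
    return "sea"

def do_bad(addr, esr):
    return "bad"

fault_info = {
    0x10: (do_sea, "synchronous external abort"),
    0x11: (do_bad, "unknown 17"),
    0x18: (do_sea, "synchronous parity or ECC error"),
}

def do_mem_abort(addr, esr):
    fsc = esr & 0x3F  # fault status code field of the ESR
    fn, name = fault_info.get(fsc, (do_bad, "unknown"))
    return fn(addr, esr), name

print(do_mem_abort(0x1000, 0x10))  # ('sea', 'synchronous external abort')
```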
[PATCH V8 03/10] efi: parse ARM processor error

2017-02-01 Thread Tyler Baicar
Add support for the ARM Common Platform Error Record (CPER).
The UEFI 2.6 specification adds support for ARM-specific
processor error information to be reported as part of the
CPER records. This provides more detail for processor error logs.

Signed-off-by: Tyler Baicar 
Signed-off-by: Jonathan (Zhixiong) Zhang 
Signed-off-by: Naveen Kaje 
Reviewed-by: James Morse 
---
 drivers/firmware/efi/cper.c | 133 
 include/linux/cper.h|  54 ++
 2 files changed, 187 insertions(+)

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 8fa4e23..c2b0a12 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -110,12 +110,15 @@ void cper_print_bits(const char *pfx, unsigned int bits,
 static const char * const proc_type_strs[] = {
"IA32/X64",
"IA64",
+   "ARM",
 };
 
 static const char * const proc_isa_strs[] = {
"IA32",
"IA64",
"X64",
+   "ARM A32/T32",
+   "ARM A64",
 };
 
 static const char * const proc_error_type_strs[] = {
@@ -139,6 +142,18 @@ void cper_print_bits(const char *pfx, unsigned int bits,
"corrected",
 };
 
+static const char * const arm_reg_ctx_strs[] = {
+   "AArch32 general purpose registers",
+   "AArch32 EL1 context registers",
+   "AArch32 EL2 context registers",
+   "AArch32 secure context registers",
+   "AArch64 general purpose registers",
+   "AArch64 EL1 context registers",
+   "AArch64 EL2 context registers",
+   "AArch64 EL3 context registers",
+   "Misc. system register structure",
+};
+
 static void cper_print_proc_generic(const char *pfx,
const struct cper_sec_proc_generic *proc)
 {
@@ -184,6 +199,114 @@ static void cper_print_proc_generic(const char *pfx,
printk("%s""IP: 0x%016llx\n", pfx, proc->ip);
 }
 
+static void cper_print_proc_arm(const char *pfx,
+   const struct cper_sec_proc_arm *proc)
+{
+   int i, len, max_ctx_type;
+   struct cper_arm_err_info *err_info;
+   struct cper_arm_ctx_info *ctx_info;
+   char newpfx[64];
+
+   printk("%s""section length: %d\n", pfx, proc->section_length);
+   printk("%s""MIDR: 0x%016llx\n", pfx, proc->midr);
+
+   len = proc->section_length - (sizeof(*proc) +
+   proc->err_info_num * (sizeof(*err_info)));
+   if (len < 0) {
+   printk("%s""section length is too small\n", pfx);
+   printk("%s""firmware-generated error record is incorrect\n", 
pfx);
+   printk("%s""ERR_INFO_NUM is %d\n", pfx, proc->err_info_num);
+   return;
+   }
+
+   if (proc->validation_bits & CPER_ARM_VALID_MPIDR)
+   printk("%s""MPIDR: 0x%016llx\n", pfx, proc->mpidr);
+   if (proc->validation_bits & CPER_ARM_VALID_AFFINITY_LEVEL)
+   printk("%s""error affinity level: %d\n", pfx,
+   proc->affinity_level);
+   if (proc->validation_bits & CPER_ARM_VALID_RUNNING_STATE) {
+   printk("%s""running state: 0x%x\n", pfx, proc->running_state);
+   printk("%s""PSCI state: %d\n", pfx, proc->psci_state);
+   }
+
+   snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
+
+   err_info = (struct cper_arm_err_info *)(proc + 1);
+   for (i = 0; i < proc->err_info_num; i++) {
+   printk("%s""Error info structure %d:\n", pfx, i);
+   printk("%s""version:%d\n", newpfx, err_info->version);
+   printk("%s""length:%d\n", newpfx, err_info->length);
+   if (err_info->validation_bits &
+   CPER_ARM_INFO_VALID_MULTI_ERR) {
+   if (err_info->multiple_error == 0)
+   printk("%s""single error\n", newpfx);
+   else if (err_info->multiple_error == 1)
+   printk("%s""multiple errors\n", newpfx);
+   else
+   printk("%s""multiple errors count:%u\n",
+   newpfx, err_info->multiple_error);
+   }
+   if (err_info->validation_bits & CPER_ARM_INFO_VALID_FLAGS) {
+   if (err_info->flags & CPER_ARM_INFO_FLAGS_FIRST)
+   printk("%s""first error captured\n", newpfx);
+   if (err_info->flags & CPER_ARM_INFO_FLAGS_LAST)
+   printk("%s""last error captured\n", newpfx);
+   if (err_info->flags & CPER_ARM_INFO_FLAGS_PROPAGATED)
+   printk("%s""propagated error captured\n",
+  newpfx);
+   if (err_info->flags & CPER_ARM_INFO_FLAGS_OVERFLOW)
			printk("%s""overflow occurred, error info is incomplete\n",
+  newpfx);
+   }
+   

[PATCH V8 05/10] acpi: apei: handle SEA notification type for ARMv8

2017-02-01 Thread Tyler Baicar
The ARM APEI extension proposal added the SEA (Synchronous External
Abort) notification type for ARMv8.
Add a new GHES error source handling function for SEA. If an error
source's notification type is SEA, then this function can be registered
into the SEA exception handler. That way GHES will parse and report
SEA exceptions when they occur.

Signed-off-by: Tyler Baicar 
Signed-off-by: Jonathan (Zhixiong) Zhang 
Signed-off-by: Naveen Kaje 
---
 arch/arm64/Kconfig|  2 ++
 arch/arm64/mm/fault.c | 11 +++
 drivers/acpi/apei/Kconfig | 14 +
 drivers/acpi/apei/ghes.c  | 78 ++-
 include/acpi/ghes.h   |  2 ++
 5 files changed, 100 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1117421..f92778d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -53,6 +53,8 @@ config ARM64
select HANDLE_DOMAIN_IRQ
select HARDIRQS_SW_RESEND
select HAVE_ACPI_APEI if (ACPI && EFI)
+   select HAVE_ACPI_APEI_SEA if (ACPI && EFI)
+   select HAVE_NMI if HAVE_ACPI_APEI_SEA
select HAVE_ALIGNED_STRUCT_PAGE if SLUB
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_BITREVERSE
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 9ae7e65..5a5a096 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -41,6 +42,8 @@
 #include 
 #include 
 
+#include 
+
 static const char *fault_name(unsigned int esr);
 
 #ifdef CONFIG_KPROBES
@@ -500,6 +503,14 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
pr_err("Synchronous External Abort: %s (0x%08x) at 0x%016lx\n",
 fault_name(esr), esr, addr);
 
+   /*
+* synchronize_rcu() will wait for nmi_exit(), so no need to
+* rcu_read_lock().
+*/
+   nmi_enter();
+   ghes_notify_sea();
+   nmi_exit();
+
info.si_signo = SIGBUS;
info.si_errno = 0;
info.si_code  = 0;
diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index b0140c8..3786ff1 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -4,6 +4,20 @@ config HAVE_ACPI_APEI
 config HAVE_ACPI_APEI_NMI
bool
 
+config HAVE_ACPI_APEI_SEA
+   bool "APEI Synchronous External Abort logging/recovering support"
+   depends on ARM64
+   help
+ This option should be enabled if the system supports
+ firmware first handling of SEA (Synchronous External Abort).
+ SEA happens with certain faults of data abort or instruction
+ abort synchronous exceptions on ARMv8 systems. If a system
+ supports firmware first handling of SEA, the platform analyzes
+ and handles hardware error notifications with SEA, and it may then
+ form a HW error record for the OS to parse and handle. This
+ option allows the OS to look for such HW error record, and
+ take appropriate action.
+
 config ACPI_APEI
bool "ACPI Platform Error Interface (APEI)"
select MISC_FILESYSTEMS
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index b25e7cf..8756172 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -114,11 +114,7 @@
  * Two virtual pages are used, one for IRQ/PROCESS context, the other for
  * NMI context (optionally).
  */
-#ifdef CONFIG_HAVE_ACPI_APEI_NMI
 #define GHES_IOREMAP_PAGES   2
-#else
-#define GHES_IOREMAP_PAGES   1
-#endif
 #define GHES_IOREMAP_IRQ_PAGE(base)(base)
 #define GHES_IOREMAP_NMI_PAGE(base)((base) + PAGE_SIZE)
 
@@ -156,11 +152,14 @@ static void ghes_ioremap_exit(void)
 
 static void __iomem *ghes_ioremap_pfn_nmi(u64 pfn)
 {
-   unsigned long vaddr;
+   unsigned long vaddr, paddr;
+   pgprot_t prot;
 
vaddr = (unsigned long)GHES_IOREMAP_NMI_PAGE(ghes_ioremap_area->addr);
-   ioremap_page_range(vaddr, vaddr + PAGE_SIZE,
-  pfn << PAGE_SHIFT, PAGE_KERNEL);
+
+   paddr = pfn << PAGE_SHIFT;
+   prot = arch_apei_get_mem_attribute(paddr);
+   ioremap_page_range(vaddr, vaddr + PAGE_SIZE, paddr, prot);
 
return (void __iomem *)vaddr;
 }
@@ -767,6 +766,48 @@ static int ghes_notify_sci(struct notifier_block *this,
.notifier_call = ghes_notify_sci,
 };
 
+#ifdef CONFIG_HAVE_ACPI_APEI_SEA
+static LIST_HEAD(ghes_sea);
+
+void ghes_notify_sea(void)
+{
+   struct ghes *ghes;
+
+   list_for_each_entry_rcu(ghes, &ghes_sea, list) {
+   ghes_proc(ghes);
+   }
+}
+
+static int ghes_sea_add(struct ghes *ghes)
+{
+   mutex_lock(&ghes_list_mutex);
+   list_add_rcu(&ghes->list, &ghes_sea);
+   mutex_unlock(&ghes_list_mutex);
+   return 0;
+}
+
+static void ghes_sea_remove(struct ghes *ghes)
+{
+   mutex_lock(&ghes_list_mutex);
+   list_del_rcu(&ghes->list);
+   mutex_unlock(&ghes_list_mutex);
+  

[PATCH V8 06/10] acpi: apei: panic OS with fatal error status block

2017-02-01 Thread Tyler Baicar
From: "Jonathan (Zhixiong) Zhang" 

Currently, even if an error status block's severity is fatal, the
kernel does not honor the severity level and does not panic.

With the firmware first model, the platform could inform the OS about a
fatal hardware error through the non-NMI GHES notification type. The OS
should panic when a hardware error record is received with this
severity.

If the severity is fatal, call panic() after the CPER data in the
error status block is printed and before each error section is handled.

Signed-off-by: Jonathan (Zhixiong) Zhang 
Signed-off-by: Tyler Baicar 
---
 drivers/acpi/apei/ghes.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 8756172..86c1f15 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -133,6 +133,8 @@
static struct ghes_estatus_cache *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
 static atomic_t ghes_estatus_cache_alloced;
 
+static int ghes_panic_timeout __read_mostly = 30;
+
 static int ghes_ioremap_init(void)
 {
ghes_ioremap_area = __get_vm_area(PAGE_SIZE * GHES_IOREMAP_PAGES,
@@ -687,6 +689,13 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *generic_v2)
return rc;
 }
 
+static void __ghes_call_panic(void)
+{
+   if (panic_timeout == 0)
+   panic_timeout = ghes_panic_timeout;
+   panic("Fatal hardware error!");
+}
+
 static int ghes_proc(struct ghes *ghes)
 {
int rc;
@@ -698,6 +707,10 @@ static int ghes_proc(struct ghes *ghes)
if (ghes_print_estatus(NULL, ghes->generic, ghes->estatus))
ghes_estatus_cache_add(ghes->generic, ghes->estatus);
}
+   if (ghes_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC) {
+   __ghes_call_panic();
+   }
+
ghes_do_proc(ghes, ghes->estatus);
 
if (IS_HEST_TYPE_GENERIC_V2(ghes)) {
@@ -828,8 +841,6 @@ static inline void ghes_sea_remove(struct ghes *ghes)
 
 static LIST_HEAD(ghes_nmi);
 
-static int ghes_panic_timeout  __read_mostly = 30;
-
 static void ghes_proc_in_irq(struct irq_work *irq_work)
 {
struct llist_node *llnode, *next;
@@ -922,9 +933,7 @@ static void __ghes_panic(struct ghes *ghes)
__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
 
/* reboot to log the error! */
-   if (panic_timeout == 0)
-   panic_timeout = ghes_panic_timeout;
-   panic("Fatal hardware error!");
+   __ghes_call_panic();
 }
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH V8 09/10] trace, ras: add ARM processor error trace event

2017-02-01 Thread Tyler Baicar
Currently there are trace events for the various RAS
errors with the exception of ARM processor type errors.
Add a new trace event for such errors so that the user
will know when they occur. These trace events are
consistent with the ARM processor error section type
defined in UEFI 2.6 spec section N.2.4.4.

Signed-off-by: Tyler Baicar 
Acked-by: Steven Rostedt 
---
 drivers/acpi/apei/ghes.c|  7 ++-
 drivers/firmware/efi/cper.c |  1 +
 drivers/ras/ras.c   |  1 +
 include/ras/ras_event.h | 34 ++
 4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index a989345..013faf0 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -512,7 +512,12 @@ static void ghes_do_proc(struct ghes *ghes,
 
}
 #endif
-   else {
+   else if (!uuid_le_cmp(sec_type, CPER_SEC_PROC_ARM)) {
+   struct cper_sec_proc_arm *arm_err;
+
+   arm_err = acpi_hest_generic_data_payload(gdata);
+   trace_arm_event(arm_err);
+   } else {
void *unknown_err = acpi_hest_generic_data_payload(gdata);
trace_unknown_sec_event(&sec_type,
fru_id, fru_text, sec_sev,
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 48cb8ee..0ec678e 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define INDENT_SP  " "
 
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index fb2500b..8ba5a94 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -28,3 +28,4 @@ static int __init ras_init(void)
 #endif
 EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
 EXPORT_TRACEPOINT_SYMBOL_GPL(unknown_sec_event);
+EXPORT_TRACEPOINT_SYMBOL_GPL(arm_event);
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index 5861b6f..b36db48 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -162,6 +162,40 @@
 );
 
 /*
+ * ARM Processor Events Report
+ *
+ * This event is generated when hardware detects an ARM processor error
+ * has occurred. UEFI 2.6 spec section N.2.4.4.
+ */
+TRACE_EVENT(arm_event,
+
+   TP_PROTO(const struct cper_sec_proc_arm *proc),
+
+   TP_ARGS(proc),
+
+   TP_STRUCT__entry(
+   __field(u64, mpidr)
+   __field(u64, midr)
+   __field(u32, running_state)
+   __field(u32, psci_state)
+   __field(u8, affinity)
+   ),
+
+   TP_fast_assign(
+   __entry->affinity = proc->affinity_level;
+   __entry->mpidr = proc->mpidr;
+   __entry->midr = proc->midr;
+   __entry->running_state = proc->running_state;
+   __entry->psci_state = proc->psci_state;
+   ),
+
+   TP_printk("affinity level: %d; MPIDR: %016llx; MIDR: %016llx; "
+ "running state: %d; PSCI state: %d",
+ __entry->affinity, __entry->mpidr, __entry->midr,
+ __entry->running_state, __entry->psci_state)
+);
+
+/*
  * Unknown Section Report
  *
  * This event is generated when hardware detected a hardware


[PATCH V8 07/10] efi: print unrecognized CPER section

2017-02-01 Thread Tyler Baicar
UEFI spec allows for non-standard section in Common Platform Error
Record. This is defined in section N.2.3 of UEFI version 2.5.

Currently, if the CPER section's type (UUID) does not match
one of the section types that the kernel knows how to parse, the
section is skipped. Therefore, the user is not able to see
such CPER data, for instance, the error record of a non-standard section.

For the above case, this change prints out the raw data in
hex to the dmesg buffer. The data length is taken from the Error Data
Length field of the Generic Error Data Entry.

The following is sample output from dmesg:
[  115.771702] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[  115.779042] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
[  115.787456] {1}[Hardware Error]: event severity: corrected
[  115.792927] {1}[Hardware Error]:  Error 0, type: corrected
[  115.798415] {1}[Hardware Error]:  fru_id: ----
[  115.805596] {1}[Hardware Error]:  fru_text:
[  115.816105] {1}[Hardware Error]:  section type: d2e2621c-f936-468d-0d84-15a4ed015c8b
[  115.823880] {1}[Hardware Error]:  section length: 88
[  115.828779] {1}[Hardware Error]:   : 0101 0002 5f434345 525f4543
[  115.836153] {1}[Hardware Error]:   0010: 574d   
[  115.843531] {1}[Hardware Error]:   0020:    
[  115.850908] {1}[Hardware Error]:   0030:    
[  115.858288] {1}[Hardware Error]:   0040: fe80  0004 5f434345
[  115.865665] {1}[Hardware Error]:   0050: 525f4543 574d
Signed-off-by: Tyler Baicar 
Signed-off-by: Jonathan (Zhixiong) Zhang 
---
 drivers/firmware/efi/cper.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index c2b0a12..48cb8ee 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -591,8 +591,16 @@ static void cper_estatus_print_section(
cper_print_proc_arm(newpfx, arm_err);
else
goto err_section_too_small;
-   } else
-   printk("%s""section type: unknown, %pUl\n", newpfx, sec_type);
+   } else {
+   const void *unknown_err;
+
+   unknown_err = acpi_hest_generic_data_payload(gdata);
+   printk("%ssection type: %pUl\n", newpfx, sec_type);
+   printk("%ssection length: %d\n", newpfx,
+  gdata->error_data_length);
+   print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
+  unknown_err, gdata->error_data_length, 0);
+   }
 
return;
 


[PATCH V8 08/10] ras: acpi / apei: generate trace event for unrecognized CPER section

2017-02-01 Thread Tyler Baicar
UEFI spec allows for non-standard section in Common Platform Error
Record. This is defined in section N.2.3 of UEFI version 2.5.

Currently, if the CPER section's type (UUID) does not match
any section type that the kernel knows how to parse, no trace event
is generated for that section. The user is therefore unable to know
that such a hardware error occurred, including errors recorded in a
non-standard section.

This commit generates a trace event containing the raw error data
for unrecognized CPER sections.

Signed-off-by: Tyler Baicar 
Signed-off-by: Jonathan (Zhixiong) Zhang 
---
 drivers/acpi/apei/ghes.c | 22 --
 drivers/ras/ras.c|  1 +
 include/ras/ras_event.h  | 45 +
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 86c1f15..a989345 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -44,11 +44,13 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 #include "apei-internal.h"
 
@@ -452,11 +454,21 @@ static void ghes_do_proc(struct ghes *ghes,
 {
int sev, sec_sev;
struct acpi_hest_generic_data *gdata;
+   uuid_le sec_type;
+   uuid_le *fru_id = &NULL_UUID_LE;
+   char *fru_text = "";
 
sev = ghes_severity(estatus->error_severity);
apei_estatus_for_each_section(estatus, gdata) {
sec_sev = ghes_severity(gdata->error_severity);
-   if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
+   sec_type = *(uuid_le *)gdata->section_type;
+
+   if (gdata->validation_bits & CPER_SEC_VALID_FRU_ID)
+   fru_id = (uuid_le *)gdata->fru_id;
+   if (gdata->validation_bits & CPER_SEC_VALID_FRU_TEXT)
+   fru_text = gdata->fru_text;
+
+   if (!uuid_le_cmp(sec_type,
 CPER_SEC_PLATFORM_MEM)) {
struct cper_sec_mem_err *mem_err;
 
@@ -467,7 +479,7 @@ static void ghes_do_proc(struct ghes *ghes,
ghes_handle_memory_failure(gdata, sev);
}
 #ifdef CONFIG_ACPI_APEI_PCIEAER
-   else if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
+   else if (!uuid_le_cmp(sec_type,
  CPER_SEC_PCIE)) {
struct cper_sec_pcie *pcie_err;
 
@@ -500,6 +512,12 @@ static void ghes_do_proc(struct ghes *ghes,
 
}
 #endif
+   else {
+		else {
+			void *unknown_err = acpi_hest_generic_data_payload(gdata);
+   trace_unknown_sec_event(&sec_type,
+   fru_id, fru_text, sec_sev,
+   unknown_err, gdata->error_data_length);
+   }
}
 }
 
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index b67dd36..fb2500b 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -27,3 +27,4 @@ static int __init ras_init(void)
 EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event);
 #endif
 EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
+EXPORT_TRACEPOINT_SYMBOL_GPL(unknown_sec_event);
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index 1791a12..5861b6f 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -162,6 +162,51 @@
 );
 
 /*
+ * Unknown Section Report
+ *
+ * This event is generated when hardware detected a hardware
+ * error event, which may be of non-standard section as defined
+ * in UEFI spec appendix "Common Platform Error Record", or may
+ * be of sections for which TRACE_EVENT is not defined.
+ *
+ */
+TRACE_EVENT(unknown_sec_event,
+
+   TP_PROTO(const uuid_le *sec_type,
+const uuid_le *fru_id,
+const char *fru_text,
+const u8 sev,
+const u8 *err,
+const u32 len),
+
+   TP_ARGS(sec_type, fru_id, fru_text, sev, err, len),
+
+   TP_STRUCT__entry(
+   __array(char, sec_type, 16)
+   __array(char, fru_id, 16)
+   __string(fru_text, fru_text)
+   __field(u8, sev)
+   __field(u32, len)
+   __dynamic_array(u8, buf, len)
+   ),
+
+   TP_fast_assign(
+   memcpy(__entry->sec_type, sec_type, sizeof(uuid_le));
+   memcpy(__entry->fru_id, fru_id, sizeof(uuid_le));
+   __assign_str(fru_text, fru_text);
+   __entry->sev = sev;
+   __entry->len = len;
+   memcpy(__get_dynamic_array(buf), err, len);
+   ),
+
+	TP_printk("severity: %d; sec type:%pU; FRU: %pU %s; data len:%d; raw data:%s",
+ __entry->sev, __entry->sec_type,
+ __entry->fru_id, __get_str(fru_text),
+ __entry->len,
+ __print_hex(__get_dynamic_array(buf), __entry->len))
+);
+
+/*
  * PCIe AER Trace eve

[PATCH V8 10/10] arm/arm64: KVM: add guest SEA support

2017-02-01 Thread Tyler Baicar
Currently external aborts are unsupported by the guest abort
handling. Add handling for SEAs so that the host kernel reports
SEAs which occur in the guest kernel.

Signed-off-by: Tyler Baicar 
---
 arch/arm/include/asm/kvm_arm.h   |  1 +
 arch/arm/include/asm/system_misc.h   |  5 +
 arch/arm/kvm/mmu.c   | 18 --
 arch/arm64/include/asm/kvm_arm.h |  1 +
 arch/arm64/include/asm/system_misc.h |  2 ++
 arch/arm64/mm/fault.c| 16 
 6 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
index e22089f..33a77509 100644
--- a/arch/arm/include/asm/kvm_arm.h
+++ b/arch/arm/include/asm/kvm_arm.h
@@ -187,6 +187,7 @@
 #define FSC_FAULT  (0x04)
 #define FSC_ACCESS (0x08)
 #define FSC_PERM   (0x0c)
+#define FSC_EXTABT (0x10)
 
 /* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */
 #define HPFAR_MASK (~0xf)
diff --git a/arch/arm/include/asm/system_misc.h b/arch/arm/include/asm/system_misc.h
index a3d61ad..ea45d94 100644
--- a/arch/arm/include/asm/system_misc.h
+++ b/arch/arm/include/asm/system_misc.h
@@ -24,4 +24,9 @@
 
 #endif /* !__ASSEMBLY__ */
 
+static inline int handle_guest_sea(unsigned long addr, unsigned int esr)
+{
+   return -1;
+}
+
 #endif /* __ASM_ARM_SYSTEM_MISC_H */
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index a5265ed..04f1dd50 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "trace.h"
 
@@ -1444,8 +1445,21 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
/* Check the stage-2 fault is trans. fault or write fault */
fault_status = kvm_vcpu_trap_get_fault_type(vcpu);
-   if (fault_status != FSC_FAULT && fault_status != FSC_PERM &&
-   fault_status != FSC_ACCESS) {
+
+   /* The host kernel will handle the synchronous external abort. There
+* is no need to pass the error into the guest.
+*/
+   if (fault_status == FSC_EXTABT) {
+   if(handle_guest_sea((unsigned long)fault_ipa,
+   kvm_vcpu_get_hsr(vcpu))) {
+			kvm_err("Failed to handle guest SEA, FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n",
+   kvm_vcpu_trap_get_class(vcpu),
+   (unsigned long)kvm_vcpu_trap_get_fault(vcpu),
+   (unsigned long)kvm_vcpu_get_hsr(vcpu));
+   return -EFAULT;
+   }
+   } else if (fault_status != FSC_FAULT && fault_status != FSC_PERM &&
+  fault_status != FSC_ACCESS) {
kvm_err("Unsupported FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n",
kvm_vcpu_trap_get_class(vcpu),
(unsigned long)kvm_vcpu_trap_get_fault(vcpu),
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 2a2752b..2b11d59 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -201,6 +201,7 @@
 #define FSC_FAULT  ESR_ELx_FSC_FAULT
 #define FSC_ACCESS ESR_ELx_FSC_ACCESS
 #define FSC_PERM   ESR_ELx_FSC_PERM
+#define FSC_EXTABT ESR_ELx_FSC_EXTABT
 
 /* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */
 #define HPFAR_MASK (~UL(0xf))
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index bc81243..5b2cecd1 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -58,4 +58,6 @@ void hook_debug_fault_code(int nr, int (*fn)(unsigned long, unsigned int,
 
 #endif /* __ASSEMBLY__ */
 
+int handle_guest_sea(unsigned long addr, unsigned int esr);
+
 #endif /* __ASM_SYSTEM_MISC_H */
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 5a5a096..edd0c4f 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -602,6 +602,22 @@ static const char *fault_name(unsigned int esr)
 }
 
 /*
+ * Handle Synchronous External Aborts that occur in a guest kernel.
+ */
+int handle_guest_sea(unsigned long addr, unsigned int esr)
+{
+   /*
+* synchronize_rcu() will wait for nmi_exit(), so no need to
+* rcu_read_lock().
+*/
+   nmi_enter();
+   ghes_notify_sea();
+   nmi_exit();
+
+   return 0;
+}
+
+/*
  * Dispatch a data abort to the relevant handler.
  */
 asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,

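The stage-2 triage the patch above adds to kvm_handle_guest_abort() can be summarized as a small decision function. The FSC_* values mirror the defines in the patch; the return strings are purely illustrative:

```python
FSC_FAULT, FSC_ACCESS, FSC_PERM, FSC_EXTABT = 0x04, 0x08, 0x0C, 0x10

def triage(fault_status):
    # SEA: the host consumes the error via ghes_notify_sea();
    # nothing is injected into the guest.
    if fault_status == FSC_EXTABT:
        return "host handles SEA"
    # Translation, permission, and access-flag faults take the
    # normal stage-2 fault path.
    if fault_status in (FSC_FAULT, FSC_PERM, FSC_ACCESS):
        return "normal stage-2 fault handling"
    return "unsupported FSC"

print(triage(FSC_EXTABT))  # host handles SEA
print(triage(0x20))        # unsupported FSC
```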

Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Catalin Marinas
On Wed, Feb 01, 2017 at 04:33:58PM +, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 11:29:22AM -0500, Christopher Covington wrote:
> > On 01/31/2017 12:56 PM, Marc Zyngier wrote:
> > > Given that all ARMv8 CPUs can support SW_PAN, it is more likely to be
> > > enabled than the ARMv8.1 PAN. I'd vote for supporting the workaround in
> > > that case too, and hope that people do enable the HW version.
> > 
> > Okay, I'll do my best to add support for the SW PAN case. I rebased and
> > submitted v6 of the E1009 patch [1] so that it no longer depends on this
> > patch landing first, if you all are inclined to pick it up while work on
> > this E1003 patch continues.
> 
> The alternative is not enabling SW_PAN (at runtime) if this errata is
> present, along with a warning stating that hardware-PAN should be
> enabled in kconfig instead. Not sure what distributions will make of that
> though.

The problem with this patch is that when ARM64_SW_TTBR0_PAN is enabled
and in the absence of hardware PAN (or ARM64_PAN disabled),
cpu_do_switch_mm is no longer called for user process switching, so the
workaround is pretty much useless.

I'm ok with adding the Kconfig dependency below to
QCOM_FALKOR_ERRATUM_1003:

depends on !ARM64_SW_TTBR0_PAN || ARM64_PAN

together with a run-time warning if ARM64_SW_TTBR0_PAN is being used.

I'm not keen on adding the workaround to the uaccess macros. They are
complex enough already and people who do want SW PAN (and single Image)
would end up with 6-8 extra unnecessary nops on each side of a uaccess
(and nops are not entirely free).

-- 
Catalin


Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Will Deacon
On Wed, Feb 01, 2017 at 05:36:09PM +, Catalin Marinas wrote:
> On Wed, Feb 01, 2017 at 04:33:58PM +, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 11:29:22AM -0500, Christopher Covington wrote:
> > > On 01/31/2017 12:56 PM, Marc Zyngier wrote:
> > > > Given that all ARMv8 CPUs can support SW_PAN, it is more likely to be
> > > > enabled than the ARMv8.1 PAN. I'd vote for supporting the workaround in
> > > > that case too, and hope that people do enable the HW version.
> > > 
> > > Okay, I'll do my best to add support for the SW PAN case. I rebased and
> > > submitted v6 of the E1009 patch [1] so that it no longer depends on this
> > > patch landing first, if you all are inclined to pick it up while work on
> > > this E1003 patch continues.
> > 
> > The alternative is not enabling SW_PAN (at runtime) if this errata is
> > present, along with a warning stating that hardware-PAN should be
> > enabled in kconfig instead. Not sure what distributions will make of that
> > though.
> 
> The problem with this patch is that when ARM64_SW_TTBR0_PAN is enabled
> and in the absence of hardware PAN (or ARM64_PAN disabled),
> cpu_do_switch_mm is no longer called for user process switching, so the
> workaround is pretty much useless.

Oh, I see what you mean now.

> I'm ok with adding the Kconfig dependency below to
> QCOM_FALKOR_ERRATUM_1003:
> 
>   depends on !ARM64_SW_TTBR0_PAN || ARM64_PAN
> 
> together with a run-time warning if ARM64_SW_TTBR0_PAN is being used.

That makes it look like hardware-PAN is the cause of the erratum. Maybe
just select ARM64_PAN if the erratum workaround is selected, then
runtime warning if we find that the h/w doesn't have PAN but does have
the erratum (which should never fire)?

Will


[RFC v3 06/10] KVM: arm/arm64: Update the physical timer interrupt level

2017-02-01 Thread Jintack Lim
Now that we maintain the EL1 physical timer register states of VMs,
update the physical timer interrupt level along with the virtual one.

Signed-off-by: Jintack Lim 
---
 virt/kvm/arm/arch_timer.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index ae38703..1b086fd6 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -186,6 +186,7 @@ static int kvm_timer_update_state(struct kvm_vcpu *vcpu)
 {
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
+   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
 
/*
 * If userspace modified the timer registers via SET_ONE_REG before
@@ -199,6 +200,9 @@ static int kvm_timer_update_state(struct kvm_vcpu *vcpu)
if (kvm_timer_should_fire(vtimer) != vtimer->irq.level)
kvm_timer_update_irq(vcpu, !vtimer->irq.level, vtimer);
 
+   if (kvm_timer_should_fire(ptimer) != ptimer->irq.level)
+   kvm_timer_update_irq(vcpu, !ptimer->irq.level, ptimer);
+
return 0;
 }
 
-- 
1.9.1




[RFC v3 05/10] KVM: arm/arm64: Initialize the emulated EL1 physical timer

2017-02-01 Thread Jintack Lim
Initialize the emulated EL1 physical timer with the default irq number.

Signed-off-by: Jintack Lim 
---
 arch/arm/kvm/reset.c | 9 -
 arch/arm64/kvm/reset.c   | 9 -
 include/kvm/arm_arch_timer.h | 3 ++-
 virt/kvm/arm/arch_timer.c| 9 +++--
 4 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
index 4b5e802..1da8b2d 100644
--- a/arch/arm/kvm/reset.c
+++ b/arch/arm/kvm/reset.c
@@ -37,6 +37,11 @@
.usr_regs.ARM_cpsr = SVC_MODE | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT,
 };
 
+static const struct kvm_irq_level cortexa_ptimer_irq = {
+   { .irq = 30 },
+   .level = 1,
+};
+
 static const struct kvm_irq_level cortexa_vtimer_irq = {
{ .irq = 27 },
.level = 1,
@@ -58,6 +63,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 {
struct kvm_regs *reset_regs;
const struct kvm_irq_level *cpu_vtimer_irq;
+   const struct kvm_irq_level *cpu_ptimer_irq;
 
switch (vcpu->arch.target) {
case KVM_ARM_TARGET_CORTEX_A7:
@@ -65,6 +71,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
reset_regs = &cortexa_regs_reset;
vcpu->arch.midr = read_cpuid_id();
cpu_vtimer_irq = &cortexa_vtimer_irq;
+   cpu_ptimer_irq = &cortexa_ptimer_irq;
break;
default:
return -ENODEV;
@@ -77,5 +84,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
kvm_reset_coprocs(vcpu);
 
/* Reset arch_timer context */
-   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
+   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq, cpu_ptimer_irq);
 }
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index e95d4f6..d9e9697 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -46,6 +46,11 @@
COMPAT_PSR_I_BIT | COMPAT_PSR_F_BIT),
 };
 
+static const struct kvm_irq_level default_ptimer_irq = {
+   .irq= 30,
+   .level  = 1,
+};
+
 static const struct kvm_irq_level default_vtimer_irq = {
.irq= 27,
.level  = 1,
@@ -104,6 +109,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, 
long ext)
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 {
const struct kvm_irq_level *cpu_vtimer_irq;
+   const struct kvm_irq_level *cpu_ptimer_irq;
const struct kvm_regs *cpu_reset;
 
switch (vcpu->arch.target) {
@@ -117,6 +123,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
}
 
cpu_vtimer_irq = &default_vtimer_irq;
+   cpu_ptimer_irq = &default_ptimer_irq;
break;
}
 
@@ -130,5 +137,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
kvm_pmu_vcpu_reset(vcpu);
 
/* Reset timer */
-   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
+   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq, cpu_ptimer_irq);
 }
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 6445a3d..f1d2fba0 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -58,7 +58,8 @@ struct arch_timer_cpu {
 int kvm_timer_hyp_init(void);
 int kvm_timer_enable(struct kvm_vcpu *vcpu);
 int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
-const struct kvm_irq_level *irq);
+const struct kvm_irq_level *virt_irq,
+const struct kvm_irq_level *phys_irq);
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
 void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
 void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index c42bca5..ae38703 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -327,9 +327,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 }
 
 int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
-const struct kvm_irq_level *irq)
+const struct kvm_irq_level *virt_irq,
+const struct kvm_irq_level *phys_irq)
 {
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
+   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
 
/*
 * The vcpu timer irq number cannot be determined in
@@ -337,7 +339,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 * kvm_vcpu_set_target(). To handle this, we determine
 * vcpu timer irq number when the vcpu is reset.
 */
-   vtimer->irq.irq = irq->irq;
+   vtimer->irq.irq = virt_irq->irq;
+   ptimer->irq.irq = phys_irq->irq;
 
/*
 * The bits in CNTV_CTL are architecturally reset to UNKNOWN for ARMv8
@@ -346,6 +349,7 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 * the ARMv7 architecture.
 */
vtimer->cnt_ctl = 0;
+   ptimer->cnt_ctl = 0;
kvm_timer_update_state(vcpu);
 
return 0;
@@ -369,6 +373,7 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 
/* 

[RFC v3 10/10] KVM: arm/arm64: Emulate the EL1 phys timer registers

2017-02-01 Thread Jintack Lim
Emulate read and write operations to CNTP_TVAL, CNTP_CVAL and CNTP_CTL.
Now VMs are able to use the EL1 physical timer.

Signed-off-by: Jintack Lim 
---
 arch/arm64/kvm/sys_regs.c| 37 ++---
 include/kvm/arm_arch_timer.h |  2 ++
 virt/kvm/arm/arch_timer.c|  2 +-
 3 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 1cd3464..0e26f8c 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -824,7 +824,14 @@ static bool access_cntp_tval(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
-   kvm_inject_undefined(vcpu);
+   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
+   u64 now = kvm_phys_timer_read();
+
+   if (p->is_write)
+   ptimer->cnt_cval = p->regval + now;
+   else
+   p->regval = ptimer->cnt_cval - now;
+
return true;
 }
 
@@ -832,7 +839,25 @@ static bool access_cntp_ctl(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
-   kvm_inject_undefined(vcpu);
+   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
+
+   if (p->is_write) {
+   /* ISTATUS bit is read-only */
+   ptimer->cnt_ctl = p->regval & ~ARCH_TIMER_CTRL_IT_STAT;
+   } else {
+   u64 now = kvm_phys_timer_read();
+
+   p->regval = ptimer->cnt_ctl;
+   /*
+* Set the ISTATUS bit if the timer has expired.
+* Note that according to the ARMv8 ARM Issue A.k, the ISTATUS bit is
+* UNKNOWN when the ENABLE bit is 0, so we choose to set the ISTATUS
+* bit regardless of the ENABLE bit for implementation convenience.
+*/
+   if (ptimer->cnt_cval <= now)
+   p->regval |= ARCH_TIMER_CTRL_IT_STAT;
+   }
+
return true;
 }
 
@@ -840,7 +865,13 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
-   kvm_inject_undefined(vcpu);
+   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
+
+   if (p->is_write)
+   ptimer->cnt_cval = p->regval;
+   else
+   p->regval = ptimer->cnt_cval;
+
return true;
 }
 
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index f1d2fba0..fe797d6 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -72,6 +72,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 void kvm_timer_schedule(struct kvm_vcpu *vcpu);
 void kvm_timer_unschedule(struct kvm_vcpu *vcpu);
 
+u64 kvm_phys_timer_read(void);
+
 void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu);
 
 void kvm_timer_init_vhe(void);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 1806e5e..93c811c 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -40,7 +40,7 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
vcpu_vtimer(vcpu)->active_cleared_last = false;
 }
 
-static u64 kvm_phys_timer_read(void)
+u64 kvm_phys_timer_read(void)
 {
return timecounter->cc->read(timecounter->cc);
 }
-- 
1.9.1
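
As a side note for reviewers, the TVAL arithmetic in access_cntp_tval() above can be modelled in isolation. The helper names below are illustrative only, and the sketch ignores that the architectural TVAL is a signed 32-bit value:

```c
#include <assert.h>
#include <stdint.h>

/* CNTP_TVAL_EL0 is a down-counter view of CNTP_CVAL_EL0: a write of
 * TVAL sets CVAL = now + TVAL, and a read returns CVAL - now, which
 * is what the trap handler computes via kvm_phys_timer_read(). */
static uint64_t cval_after_tval_write(uint64_t tval, uint64_t now)
{
	return tval + now;
}

static uint64_t tval_on_read(uint64_t cval, uint64_t now)
{
	return cval - now;
}
```

So a guest that writes TVAL and reads it back some ticks later sees the value decremented by the elapsed counter ticks, as expected of a down-counter.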




[RFC v3 04/10] KVM: arm/arm64: Add the EL1 physical timer context

2017-02-01 Thread Jintack Lim
Add the EL1 physical timer context.

Signed-off-by: Jintack Lim 
---
 include/kvm/arm_arch_timer.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index f46fa3b..6445a3d 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -40,6 +40,7 @@ struct arch_timer_context {
 
 struct arch_timer_cpu {
struct arch_timer_context   vtimer;
+   struct arch_timer_context   ptimer;
 
/* Background timer used when the guest is not running */
struct hrtimer  timer;
@@ -75,4 +76,5 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 void kvm_timer_init_vhe(void);
 
 #define vcpu_vtimer(v) (&(v)->arch.timer_cpu.vtimer)
+#define vcpu_ptimer(v) (&(v)->arch.timer_cpu.ptimer)
 #endif
-- 
1.9.1




[RFC v3 08/10] KVM: arm/arm64: Set up a background timer for the physical timer emulation

2017-02-01 Thread Jintack Lim
Set a background timer for the EL1 physical timer emulation while VMs
are running, so that VMs get the physical timer interrupts in a timely
manner.

Schedule the background timer on entry to the VM and cancel it on exit.
This has no performance impact on guest OSes that currently use only
the virtual timer, since for them the physical timer is never
enabled.

Signed-off-by: Jintack Lim 
---
 virt/kvm/arm/arch_timer.c | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 89bdb79..1806e5e 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -229,6 +229,22 @@ static int kvm_timer_update_state(struct kvm_vcpu *vcpu)
return 0;
 }
 
+/* Schedule the background timer for the emulated timer. */
+static void kvm_timer_emulate(struct kvm_vcpu *vcpu,
+ struct arch_timer_context *timer_ctx)
+{
+   struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+
+   if (kvm_timer_should_fire(timer_ctx))
+   return;
+
+   if (!kvm_timer_irq_can_fire(timer_ctx))
+   return;
+
+   /*  The timer has not yet expired, schedule a background timer */
+   timer_arm(timer, kvm_timer_compute_delta(timer_ctx));
+}
+
 /*
  * Schedule the background timer before calling kvm_vcpu_block, so that this
  * thread is removed from its waitqueue and made runnable when there's a timer
@@ -286,6 +302,9 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
if (kvm_timer_update_state(vcpu))
return;
 
+   /* Set the background timer for the physical timer emulation. */
+   kvm_timer_emulate(vcpu, vcpu_ptimer(vcpu));
+
/*
* If we enter the guest with the virtual input level to the VGIC
* asserted, then we have already told the VGIC what we need to, and
@@ -348,7 +367,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 {
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
-   BUG_ON(timer_is_armed(timer));
+   /*
+* This is to cancel the background timer for the physical timer
+* emulation if it is set.
+*/
+   timer_disarm(timer);
 
/*
 * The guest could have modified the timer registers or the timer
-- 
1.9.1
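
The early-outs in kvm_timer_emulate() can be read as a single predicate; the sketch below models that decision standalone (the function name is hypothetical, the CTL bit values match the kernel's arch timer header):

```c
#include <assert.h>
#include <stdint.h>

/* CNT*_CTL bits, as in include/clocksource/arm_arch_timer.h */
#define ARCH_TIMER_CTRL_ENABLE		(1U << 0)
#define ARCH_TIMER_CTRL_IT_MASK		(1U << 1)

/* Arm the background hrtimer only when the emulated timer is enabled
 * and unmasked but has not yet reached its compare value. */
static int should_arm_bg_timer(uint32_t cnt_ctl, uint64_t cval, uint64_t now)
{
	int can_fire = !(cnt_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
			(cnt_ctl & ARCH_TIMER_CTRL_ENABLE);

	if (can_fire && cval <= now)
		return 0;	/* already fired: inject the interrupt instead */
	if (!can_fire)
		return 0;	/* disabled or masked: it can never fire */
	return 1;		/* fires in the future: arm the soft timer */
}
```

This also makes the claim in the commit message concrete: a guest that never enables the physical timer never satisfies the last case, so no background timer is ever armed for it.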




[RFC v3 01/10] KVM: arm/arm64: Abstract virtual timer context into separate structure

2017-02-01 Thread Jintack Lim
Abstract virtual timer context into a separate structure and change all
callers referring to timer registers, irq state and so on. No change in
functionality.

This is about to become very handy when adding the EL1 physical timer.

Signed-off-by: Jintack Lim 
Acked-by: Christoffer Dall 
Acked-by: Marc Zyngier 
---
 include/kvm/arm_arch_timer.h | 27 -
 virt/kvm/arm/arch_timer.c| 69 +++-
 virt/kvm/arm/hyp/timer-sr.c  | 10 ---
 3 files changed, 56 insertions(+), 50 deletions(-)

diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 5c970ce..daad3c1 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -28,15 +28,20 @@ struct arch_timer_kvm {
u64 cntvoff;
 };
 
-struct arch_timer_cpu {
+struct arch_timer_context {
/* Registers: control register, timer value */
-   u32 cntv_ctl;   /* Saved/restored */
-   u64 cntv_cval;  /* Saved/restored */
+   u32 cnt_ctl;
+   u64 cnt_cval;
+
+   /* Timer IRQ */
+   struct kvm_irq_levelirq;
+
+   /* Active IRQ state caching */
+   boolactive_cleared_last;
+};
 
-   /*
-* Anything that is not used directly from assembly code goes
-* here.
-*/
+struct arch_timer_cpu {
+   struct arch_timer_context   vtimer;
 
/* Background timer used when the guest is not running */
struct hrtimer  timer;
@@ -47,12 +52,6 @@ struct arch_timer_cpu {
/* Background timer active */
boolarmed;
 
-   /* Timer IRQ */
-   struct kvm_irq_levelirq;
-
-   /* Active IRQ state caching */
-   boolactive_cleared_last;
-
/* Is the timer enabled */
boolenabled;
 };
@@ -77,4 +76,6 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu);
 
 void kvm_timer_init_vhe(void);
+
+#define vcpu_vtimer(v) (&(v)->arch.timer_cpu.vtimer)
 #endif
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 91ecf48..d3556b3 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -37,7 +37,7 @@
 
 void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
 {
-   vcpu->arch.timer_cpu.active_cleared_last = false;
+   vcpu_vtimer(vcpu)->active_cleared_last = false;
 }
 
 static u64 kvm_phys_timer_read(void)
@@ -102,7 +102,7 @@ static u64 kvm_timer_compute_delta(struct kvm_vcpu *vcpu)
 {
u64 cval, now;
 
-   cval = vcpu->arch.timer_cpu.cntv_cval;
+   cval = vcpu_vtimer(vcpu)->cnt_cval;
now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
 
if (now < cval) {
@@ -144,21 +144,21 @@ static enum hrtimer_restart kvm_timer_expire(struct 
hrtimer *hrt)
 
 static bool kvm_timer_irq_can_fire(struct kvm_vcpu *vcpu)
 {
-   struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
 
-   return !(timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
-   (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE);
+   return !(vtimer->cnt_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
+   (vtimer->cnt_ctl & ARCH_TIMER_CTRL_ENABLE);
 }
 
 bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
 {
-   struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
u64 cval, now;
 
if (!kvm_timer_irq_can_fire(vcpu))
return false;
 
-   cval = timer->cntv_cval;
+   cval = vtimer->cnt_cval;
now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
 
return cval <= now;
@@ -167,18 +167,18 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
 static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level)
 {
int ret;
-   struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
 
BUG_ON(!vgic_initialized(vcpu->kvm));
 
-   timer->active_cleared_last = false;
-   timer->irq.level = new_level;
-   trace_kvm_timer_update_irq(vcpu->vcpu_id, timer->irq.irq,
-  timer->irq.level);
+   vtimer->active_cleared_last = false;
+   vtimer->irq.level = new_level;
+   trace_kvm_timer_update_irq(vcpu->vcpu_id, vtimer->irq.irq,
+  vtimer->irq.level);
 
ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
-timer->irq.irq,
-timer->irq.level);
+vtimer->irq.irq,
+vtimer->irq.level);
WARN_ON(ret);
 }
 
@@ -189,18 +189,19 @@ st

[RFC v3 07/10] KVM: arm/arm64: Set a background timer to the earliest timer expiration

2017-02-01 Thread Jintack Lim
When scheduling a background timer, consider both the virtual and the
physical timer and pick the earliest expiration time.

Signed-off-by: Jintack Lim 
---
 arch/arm/kvm/arm.c|  3 ++-
 virt/kvm/arm/arch_timer.c | 53 +++
 2 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 0ecd6cf..21c493a 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -300,7 +300,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 
 int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 {
-   return kvm_timer_should_fire(vcpu_vtimer(vcpu));
+   return kvm_timer_should_fire(vcpu_vtimer(vcpu)) ||
+  kvm_timer_should_fire(vcpu_ptimer(vcpu));
 }
 
 void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 1b086fd6..89bdb79 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -118,6 +118,35 @@ static u64 kvm_timer_compute_delta(struct 
arch_timer_context *timer_ctx)
return 0;
 }
 
+static bool kvm_timer_irq_can_fire(struct arch_timer_context *timer_ctx)
+{
+   return !(timer_ctx->cnt_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
+   (timer_ctx->cnt_ctl & ARCH_TIMER_CTRL_ENABLE);
+}
+
+/*
+ * Returns the earliest expiration time in ns among guest timers.
+ * Note that it will return 0 if none of the timers can fire.
+ */
+static u64 kvm_timer_earliest_exp(struct kvm_vcpu *vcpu)
+{
+   u64 min_virt = ULLONG_MAX, min_phys = ULLONG_MAX;
+   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
+   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
+
+   if (kvm_timer_irq_can_fire(vtimer))
+   min_virt = kvm_timer_compute_delta(vtimer);
+
+   if (kvm_timer_irq_can_fire(ptimer))
+   min_phys = kvm_timer_compute_delta(ptimer);
+
+   /* If none of the timers can fire, return 0 */
+   if ((min_virt == ULLONG_MAX) && (min_phys == ULLONG_MAX))
+   return 0;
+
+   return min(min_virt, min_phys);
+}
+
 static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
 {
struct arch_timer_cpu *timer;
@@ -132,7 +161,7 @@ static enum hrtimer_restart kvm_timer_expire(struct hrtimer 
*hrt)
 * PoV (NTP on the host may have forced it to expire
 * early). If we should have slept longer, restart it.
 */
-   ns = kvm_timer_compute_delta(vcpu_vtimer(vcpu));
+   ns = kvm_timer_earliest_exp(vcpu);
if (unlikely(ns)) {
hrtimer_forward_now(hrt, ns_to_ktime(ns));
return HRTIMER_RESTART;
@@ -142,12 +171,6 @@ static enum hrtimer_restart kvm_timer_expire(struct 
hrtimer *hrt)
return HRTIMER_NORESTART;
 }
 
-static bool kvm_timer_irq_can_fire(struct arch_timer_context *timer_ctx)
-{
-   return !(timer_ctx->cnt_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
-   (timer_ctx->cnt_ctl & ARCH_TIMER_CTRL_ENABLE);
-}
-
 bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx)
 {
u64 cval, now;
@@ -215,26 +238,30 @@ void kvm_timer_schedule(struct kvm_vcpu *vcpu)
 {
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
+   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
 
BUG_ON(timer_is_armed(timer));
 
/*
-* No need to schedule a background timer if the guest timer has
+* No need to schedule a background timer if any guest timer has
 * already expired, because kvm_vcpu_block will return before putting
 * the thread to sleep.
 */
-   if (kvm_timer_should_fire(vtimer))
+   if (kvm_timer_should_fire(vtimer) || kvm_timer_should_fire(ptimer))
return;
 
/*
-* If the timer is not capable of raising interrupts (disabled or
+* If both timers are not capable of raising interrupts (disabled or
 * masked), then there's no more work for us to do.
 */
-   if (!kvm_timer_irq_can_fire(vtimer))
+   if (!kvm_timer_irq_can_fire(vtimer) && !kvm_timer_irq_can_fire(ptimer))
return;
 
-   /*  The timer has not yet expired, schedule a background timer */
-   timer_arm(timer, kvm_timer_compute_delta(vtimer));
+   /*
+* The guest timers have not yet expired, schedule a background timer.
+* Set the earliest expiration time among the guest timers.
+*/
+   timer_arm(timer, kvm_timer_earliest_exp(vcpu));
 }
 
 void kvm_timer_unschedule(struct kvm_vcpu *vcpu)
-- 
1.9.1
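
For reference, the selection logic in kvm_timer_earliest_exp() reduces to the standalone sketch below, parameterized on each timer's state rather than on a vcpu (names hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define NO_TIMER	UINT64_MAX

/* Pick the smaller of the two deltas, treating a timer that cannot
 * fire as infinitely far away; return 0 when neither can fire,
 * matching the convention in the patch above. */
static uint64_t earliest_exp(int virt_can_fire, uint64_t virt_delta,
			     int phys_can_fire, uint64_t phys_delta)
{
	uint64_t min_virt = virt_can_fire ? virt_delta : NO_TIMER;
	uint64_t min_phys = phys_can_fire ? phys_delta : NO_TIMER;

	if (min_virt == NO_TIMER && min_phys == NO_TIMER)
		return 0;

	return min_virt < min_phys ? min_virt : min_phys;
}
```

The 0-means-nothing-pending convention matters for the caller: kvm_timer_expire() treats a zero delta as "do not restart the hrtimer".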




[RFC v3 09/10] KVM: arm64: Add the EL1 physical timer access handler

2017-02-01 Thread Jintack Lim
KVM traps EL1 physical timer accesses from VMs but does not handle
those traps, which results in terminating the VM. Instead, set a
handler for EL1 physical timer accesses that, as an intermediate step,
injects an undefined exception.

Signed-off-by: Jintack Lim 
---
 arch/arm64/kvm/sys_regs.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index caa47ce..1cd3464 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -820,6 +820,30 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct 
sys_reg_params *p,
  CRm((0b1100 | (((n) >> 3) & 0x3))), Op2(((n) & 0x7)), \
  access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), }
 
+static bool access_cntp_tval(struct kvm_vcpu *vcpu,
+   struct sys_reg_params *p,
+   const struct sys_reg_desc *r)
+{
+   kvm_inject_undefined(vcpu);
+   return true;
+}
+
+static bool access_cntp_ctl(struct kvm_vcpu *vcpu,
+   struct sys_reg_params *p,
+   const struct sys_reg_desc *r)
+{
+   kvm_inject_undefined(vcpu);
+   return true;
+}
+
+static bool access_cntp_cval(struct kvm_vcpu *vcpu,
+   struct sys_reg_params *p,
+   const struct sys_reg_desc *r)
+{
+   kvm_inject_undefined(vcpu);
+   return true;
+}
+
 /*
  * Architected system registers.
  * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
@@ -1029,6 +1053,16 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, 
struct sys_reg_params *p,
{ Op0(0b11), Op1(0b011), CRn(0b1101), CRm(0b), Op2(0b011),
  NULL, reset_unknown, TPIDRRO_EL0 },
 
+   /* CNTP_TVAL_EL0 */
+   { Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b000),
+ access_cntp_tval },
+   /* CNTP_CTL_EL0 */
+   { Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b001),
+ access_cntp_ctl },
+   /* CNTP_CVAL_EL0 */
+   { Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b010),
+ access_cntp_cval },
+
/* PMEVCNTRn_EL0 */
PMU_PMEVCNTR_EL0(0),
PMU_PMEVCNTR_EL0(1),
-- 
1.9.1




[RFC v3 00/10] Provide the EL1 physical timer to the VM

2017-02-01 Thread Jintack Lim
The ARM architecture defines the EL1 physical timer and the virtual timer,
and it is reasonable for an OS to expect to be able to access both.
However, the current KVM implementation does not provide the EL1 physical
timer to VMs but terminates VMs on access to the timer.

This patch series enables VMs to use the EL1 physical timer through
trap-and-emulate.  The KVM host emulates each EL1 physical timer register
access and sets up the background timer accordingly.  When the background
timer expires, the KVM host injects EL1 physical timer interrupts to the
VM.  Alternatively, it's also possible to allow VMs to access the EL1
physical timer without trapping.  However, this requires somehow using the
EL2 physical timer for the Linux host while running the VM instead of the
EL1 physical timer.  Right now I just implemented trap-and-emulate because
this was straightforward to do, and I leave it to future work to determine
if transferring the EL1 physical timer state to the EL2 timer provides any
performance benefit.

This feature will be useful for any OS that wishes to access the EL1
physical timer. Nested virtualization is one of those use cases. A nested
hypervisor running inside a VM would think it has full access to the
hardware and naturally tries to use the EL1 physical timer as Linux would
do. Other nested hypervisors may try to use the EL2 physical timer as Xen
would do, but supporting the EL2 physical timer to the VM is out of scope
of this patch series. This patch series will make it easy to add the EL2
timer support in the future, though.

Note that Linux VMs booting in EL1 are unaffected by this patch
series: they continue to use only the virtual timer, so the
trap-and-emulate approach introduces no performance degradation for
them.
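
As a rough sketch of the trap-and-emulate flow the series implements (structure and names here are illustrative only, condensed from the individual patches): the host records the trapped register value, then later compares the physical counter against it to decide the interrupt line level.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-in for the emulated EL1 physical timer context. */
struct emul_ptimer {
	uint32_t cnt_ctl;	/* CNTP_CTL_EL0: bit 0 is ENABLE */
	uint64_t cnt_cval;	/* CNTP_CVAL_EL0 */
	int irq_level;		/* level of the injected timer interrupt */
};

/* A trapped guest write just updates the shadow register... */
static void trap_cval_write(struct emul_ptimer *t, uint64_t val)
{
	t->cnt_cval = val;
}

/* ...and on the next state update the host compares the physical
 * counter against it to set the (emulated) interrupt line. */
static void update_state(struct emul_ptimer *t, uint64_t now)
{
	int enabled = t->cnt_ctl & 1;

	t->irq_level = enabled && now >= t->cnt_cval;
}
```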

v2 => v3:
 - Rebase on kvmarm/queue
 - Take kvm->lock to synchronize cntvoff across all vtimers
 - Remove unnecessary function parameters
 - Add comments

v1 => v2:
 - Rebase on kvm-arm-for-4.10-rc4
 - To make it simple, schedule the background timer for the EL1 physical timer
   emulation on every entry to the VM and cancel it on exit.
 - Change timer_context structure to have cntvoff and restore enable field back
   to arch_timer_cpu structure

Jintack Lim (10):
  KVM: arm/arm64: Abstract virtual timer context into separate structure
  KVM: arm/arm64: Move cntvoff to each timer context
  KVM: arm/arm64: Decouple kvm timer functions from virtual timer
  KVM: arm/arm64: Add the EL1 physical timer context
  KVM: arm/arm64: Initialize the emulated EL1 physical timer
  KVM: arm/arm64: Update the physical timer interrupt level
  KVM: arm/arm64: Set a background timer to the earliest timer
expiration
  KVM: arm/arm64: Set up a background timer for the physical timer
emulation
  KVM: arm64: Add the EL1 physical timer access handler
  KVM: arm/arm64: Emulate the EL1 phys timer registers

 arch/arm/include/asm/kvm_host.h   |   3 -
 arch/arm/kvm/arm.c|   4 +-
 arch/arm/kvm/reset.c  |   9 +-
 arch/arm64/include/asm/kvm_host.h |   3 -
 arch/arm64/kvm/reset.c|   9 +-
 arch/arm64/kvm/sys_regs.c |  65 +
 include/kvm/arm_arch_timer.h  |  39 
 virt/kvm/arm/arch_timer.c | 193 ++
 virt/kvm/arm/hyp/timer-sr.c   |  13 +--
 9 files changed, 242 insertions(+), 96 deletions(-)

-- 
1.9.1




[RFC v3 03/10] KVM: arm/arm64: Decouple kvm timer functions from virtual timer

2017-02-01 Thread Jintack Lim
Now that we have a separate structure for timer context, make functions
generic so that they can work with any timer context, not just the
virtual timer context.  This does not change the virtual timer
functionality.

Signed-off-by: Jintack Lim 
Acked-by: Marc Zyngier 
---
 arch/arm/kvm/arm.c   |  2 +-
 include/kvm/arm_arch_timer.h |  2 +-
 virt/kvm/arm/arch_timer.c| 54 
 3 files changed, 27 insertions(+), 31 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index f93f2171..0ecd6cf 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -300,7 +300,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 
 int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 {
-   return kvm_timer_should_fire(vcpu);
+   return kvm_timer_should_fire(vcpu_vtimer(vcpu));
 }
 
 void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 2c8560b..f46fa3b 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -66,7 +66,7 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
 int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
 
-bool kvm_timer_should_fire(struct kvm_vcpu *vcpu);
+bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx);
 void kvm_timer_schedule(struct kvm_vcpu *vcpu);
 void kvm_timer_unschedule(struct kvm_vcpu *vcpu);
 
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 19fbaaf..c42bca5 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -98,13 +98,12 @@ static void kvm_timer_inject_irq_work(struct work_struct 
*work)
kvm_vcpu_kick(vcpu);
 }
 
-static u64 kvm_timer_compute_delta(struct kvm_vcpu *vcpu)
+static u64 kvm_timer_compute_delta(struct arch_timer_context *timer_ctx)
 {
u64 cval, now;
-   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
 
-   cval = vtimer->cnt_cval;
-   now = kvm_phys_timer_read() - vtimer->cntvoff;
+   cval = timer_ctx->cnt_cval;
+   now = kvm_phys_timer_read() - timer_ctx->cntvoff;
 
if (now < cval) {
u64 ns;
@@ -133,7 +132,7 @@ static enum hrtimer_restart kvm_timer_expire(struct hrtimer 
*hrt)
 * PoV (NTP on the host may have forced it to expire
 * early). If we should have slept longer, restart it.
 */
-   ns = kvm_timer_compute_delta(vcpu);
+   ns = kvm_timer_compute_delta(vcpu_vtimer(vcpu));
if (unlikely(ns)) {
hrtimer_forward_now(hrt, ns_to_ktime(ns));
return HRTIMER_RESTART;
@@ -143,43 +142,39 @@ static enum hrtimer_restart kvm_timer_expire(struct 
hrtimer *hrt)
return HRTIMER_NORESTART;
 }
 
-static bool kvm_timer_irq_can_fire(struct kvm_vcpu *vcpu)
+static bool kvm_timer_irq_can_fire(struct arch_timer_context *timer_ctx)
 {
-   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
-
-   return !(vtimer->cnt_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
-   (vtimer->cnt_ctl & ARCH_TIMER_CTRL_ENABLE);
+   return !(timer_ctx->cnt_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
+   (timer_ctx->cnt_ctl & ARCH_TIMER_CTRL_ENABLE);
 }
 
-bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
+bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx)
 {
-   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
u64 cval, now;
 
-   if (!kvm_timer_irq_can_fire(vcpu))
+   if (!kvm_timer_irq_can_fire(timer_ctx))
return false;
 
-   cval = vtimer->cnt_cval;
-   now = kvm_phys_timer_read() - vtimer->cntvoff;
+   cval = timer_ctx->cnt_cval;
+   now = kvm_phys_timer_read() - timer_ctx->cntvoff;
 
return cval <= now;
 }
 
-static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level)
+static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
+struct arch_timer_context *timer_ctx)
 {
int ret;
-   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
 
BUG_ON(!vgic_initialized(vcpu->kvm));
 
-   vtimer->active_cleared_last = false;
-   vtimer->irq.level = new_level;
-   trace_kvm_timer_update_irq(vcpu->vcpu_id, vtimer->irq.irq,
-  vtimer->irq.level);
+   timer_ctx->active_cleared_last = false;
+   timer_ctx->irq.level = new_level;
+   trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_ctx->irq.irq,
+  timer_ctx->irq.level);
 
-   ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
-vtimer->irq.irq,
-vtimer->irq.level);
+   ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id, timer_ctx->irq.irq,
+ timer_ctx->irq.level);
WARN_ON(ret);
 }
 
@@ -201,8 +196,8 @@ static int kvm_timer_update_state(struct kvm

[RFC v3 02/10] KVM: arm/arm64: Move cntvoff to each timer context

2017-02-01 Thread Jintack Lim
Make cntvoff per timer context. This is helpful for abstracting the kvm
timer functions to work on a timer context without considering the
timer type (e.g. physical timer or virtual timer).

This would also pave the way for adjusting cntvoff on a per-CPU basis,
should that ever make sense.

Signed-off-by: Jintack Lim 
---
 arch/arm/include/asm/kvm_host.h   |  3 ---
 arch/arm/kvm/arm.c|  1 -
 arch/arm64/include/asm/kvm_host.h |  3 ---
 include/kvm/arm_arch_timer.h  |  9 +++--
 virt/kvm/arm/arch_timer.c | 31 +--
 virt/kvm/arm/hyp/timer-sr.c   |  3 +--
 6 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index d5423ab..cc495d79 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -60,9 +60,6 @@ struct kvm_arch {
/* The last vcpu id that ran on each physical CPU */
int __percpu *last_vcpu_ran;
 
-   /* Timer */
-   struct arch_timer_kvm   timer;
-
/*
 * Anything that is not used directly from assembly code goes
 * here.
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9d74464..f93f2171 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -135,7 +135,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
goto out_free_stage2_pgd;
 
kvm_vgic_early_init(kvm);
-   kvm_timer_init(kvm);
 
/* Mark the initial VMID generation invalid */
kvm->arch.vmid_gen = 0;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e505038..4a758cb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -70,9 +70,6 @@ struct kvm_arch {
 
/* Interrupt controller */
struct vgic_distvgic;
-
-   /* Timer */
-   struct arch_timer_kvm   timer;
 };
 
 #define KVM_NR_MEM_OBJS 40
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index daad3c1..2c8560b 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -23,11 +23,6 @@
 #include 
 #include 
 
-struct arch_timer_kvm {
-   /* Virtual offset */
-   u64 cntvoff;
-};
-
 struct arch_timer_context {
/* Registers: control register, timer value */
u32 cnt_ctl;
@@ -38,6 +33,9 @@ struct arch_timer_context {
 
/* Active IRQ state caching */
boolactive_cleared_last;
+
+   /* Virtual offset */
+   u64 cntvoff;
 };
 
 struct arch_timer_cpu {
@@ -58,7 +56,6 @@ struct arch_timer_cpu {
 
 int kvm_timer_hyp_init(void);
 int kvm_timer_enable(struct kvm_vcpu *vcpu);
-void kvm_timer_init(struct kvm *kvm);
 int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 const struct kvm_irq_level *irq);
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index d3556b3..19fbaaf 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -101,9 +101,10 @@ static void kvm_timer_inject_irq_work(struct work_struct *work)
 static u64 kvm_timer_compute_delta(struct kvm_vcpu *vcpu)
 {
u64 cval, now;
+   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
 
-   cval = vcpu_vtimer(vcpu)->cnt_cval;
-   now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
+   cval = vtimer->cnt_cval;
+   now = kvm_phys_timer_read() - vtimer->cntvoff;
 
if (now < cval) {
u64 ns;
@@ -159,7 +160,7 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
return false;
 
cval = vtimer->cnt_cval;
-   now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
+   now = kvm_phys_timer_read() - vtimer->cntvoff;
 
return cval <= now;
 }
@@ -354,10 +355,25 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
return 0;
 }
 
+/* Make the updates of cntvoff for all vtimer contexts atomic */
+static void update_vtimer_cntvoff(struct kvm *kvm, u64 cntvoff)
+{
+   int i;
+   struct kvm_vcpu *vcpu;
+
+   mutex_lock(&kvm->lock);
+   kvm_for_each_vcpu(i, vcpu, kvm)
+   vcpu_vtimer(vcpu)->cntvoff = cntvoff;
+   mutex_unlock(&kvm->lock);
+}
+
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 {
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
+   /* Synchronize cntvoff across all vtimers of a VM. */
+   update_vtimer_cntvoff(vcpu->kvm, kvm_phys_timer_read());
+
INIT_WORK(&timer->expired, kvm_timer_inject_irq_work);
hrtimer_init(&timer->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
timer->timer.function = kvm_timer_expire;
@@ -377,7 +393,7 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value)
vtimer->cnt_ctl = value;
break;
case KVM_REG_ARM_TIMER

Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Catalin Marinas
On Wed, Feb 01, 2017 at 05:41:05PM +, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 05:36:09PM +, Catalin Marinas wrote:
> > On Wed, Feb 01, 2017 at 04:33:58PM +, Will Deacon wrote:
> > > On Wed, Feb 01, 2017 at 11:29:22AM -0500, Christopher Covington wrote:
> > > > On 01/31/2017 12:56 PM, Marc Zyngier wrote:
> > > > > Given that all ARMv8 CPUs can support SW_PAN, it is more likely to be
> > > > > > enabled than the ARMv8.1 PAN. I'd vote for supporting the workaround in
> > > > > > that case too, and hope that people do enable the HW version.
> > > > 
> > > > Okay, I'll do my best to add support for the SW PAN case. I rebased and
> > > > submitted v6 of the E1009 patch [1] so that it no longer depends on this
> > > > patch landing first, if you all are inclined to pick it up while work on
> > > > this E1003 patch continues.
> > > 
> > > The alternative is not enabling SW_PAN (at runtime) if this errata is
> > > present, along with a warning stating that hardware-PAN should be
> > > > enabled in kconfig instead. Not sure what distributions will make of
> > > > that though.
> > 
> > The problem with this patch is that when ARM64_SW_TTBR0_PAN is enabled
> > and in the absence of hardware PAN (or ARM64_PAN disabled),
> > cpu_do_switch_mm is no longer called for user process switching, so the
> > workaround is pretty much useless.
> 
> Oh, I see what you mean now.
> 
> > I'm ok with adding the Kconfig dependency below to
> > QCOM_FALKOR_ERRATUM_1003:
> > 
> > depends on !ARM64_SW_TTBR0_PAN || ARM64_PAN
> > 
> > together with a run-time warning if ARM64_SW_TTBR0_PAN is being used.
> 
> That makes it look like hardware-PAN is the cause of the erratum.

With the right Kconfig comment we could make this clearer.

> Maybe
> just select ARM64_PAN if the erratum workaround is selected, then
> runtime warning if we find that the h/w doesn't have PAN but does have
> the erratum (which should never fire)?

You still need this workaround even if you don't want any PAN (both sw
and hw PAN disabled). I wouldn't want to select ARM64_PAN since it's not
a dependency. It's more like if you do need a PAN, make sure you only
use the hw one.

-- 
Catalin
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Catalin Marinas
On Wed, Feb 01, 2017 at 05:49:34PM +, Catalin Marinas wrote:
> On Wed, Feb 01, 2017 at 05:41:05PM +, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 05:36:09PM +, Catalin Marinas wrote:
> > > On Wed, Feb 01, 2017 at 04:33:58PM +, Will Deacon wrote:
> > > > On Wed, Feb 01, 2017 at 11:29:22AM -0500, Christopher Covington wrote:
> > > > > On 01/31/2017 12:56 PM, Marc Zyngier wrote:
> > > > > > Given that all ARMv8 CPUs can support SW_PAN, it is more likely to 
> > > > > > be
> > > > > > enabled than the ARMv8.1 PAN. I'd vote for supporting the 
> > > > > > workaround in
> > > > > > that case too, and hope that people do enable the HW version.
> > > > > 
> > > > > Okay, I'll do my best to add support for the SW PAN case. I rebased 
> > > > > and
> > > > > submitted v6 of the E1009 patch [1] so that it no longer depends on 
> > > > > this
> > > > > patch landing first, if you all are inclined to pick it up while work 
> > > > > on
> > > > > this E1003 patch continues.
> > > > 
> > > > The alternative is not enabling SW_PAN (at runtime) if this errata is
> > > > present, along with a warning stating that hardware-PAN should be
> > > > enabled in kconfig instead. Not sure what distributions will make of 
> > > > that
> > > > though.
> > > 
> > > The problem with this patch is that when ARM64_SW_TTBR0_PAN is enabled
> > > and in the absence of hardware PAN (or ARM64_PAN disabled),
> > > cpu_do_switch_mm is no longer called for user process switching, so the
> > > workaround is pretty much useless.
> > 
> > Oh, I see what you mean now.
> > 
> > > I'm ok with adding the Kconfig dependency below to
> > > QCOM_FALKOR_ERRATUM_1003:
> > > 
> > >   depends on !ARM64_SW_TTBR0_PAN || ARM64_PAN
> > > 
> > > together with a run-time warning if ARM64_SW_TTBR0_PAN is being used.
> > 
> > That makes it look like hardware-PAN is the cause of the erratum.
> 
> With the right Kconfig comment we could make this clearer.
> 
> > Maybe
> > just select ARM64_PAN if the erratum workaround is selected, then
> > runtime warning if we find that the h/w doesn't have PAN but does have
> > the erratum (which should never fire)?
> 
> You still need this workaround even if you don't want any PAN (both sw
> and hw PAN disabled). I wouldn't want to select ARM64_PAN since it's not
> a dependency. It's more like if you do need a PAN, make sure you only
> use the hw one.

Alternatively, your select idea could be refined to:

select ARM64_PAN if ARM64_SW_TTBR0_PAN

but we still need a proper comment as people would wonder what this is
for.
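Putting this subthread together, the Kconfig entry being discussed might end up looking roughly like the following (a sketch only: the prompt text and help wording are assumptions, not the merged patch; the `select` line is the one proposed above):

```kconfig
config QCOM_FALKOR_ERRATUM_1003
	bool "Falkor E1003: Incorrect translation due to ASID change"
	default y
	# The workaround lives in cpu_do_switch_mm(), which SW PAN bypasses
	# for user address-space switches, so make sure HW PAN is used
	# whenever any PAN is requested on affected parts.
	select ARM64_PAN if ARM64_SW_TTBR0_PAN
	help
	  On Falkor v1, an incorrect ASID may be cached in the TLB when ASID
	  and BADDR are changed together in TTBRx_EL1. The workaround is
	  applied in the context-switch path, hence the PAN interaction
	  noted above.

	  If unsure, say Y.
```

With `select ... if ...`, the erratum workaround itself stays selectable even when all PAN options are off, which addresses Catalin's objection to an unconditional select.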

-- 
Catalin


Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Will Deacon
On Wed, Feb 01, 2017 at 05:49:34PM +, Catalin Marinas wrote:
> On Wed, Feb 01, 2017 at 05:41:05PM +, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 05:36:09PM +, Catalin Marinas wrote:
> > > On Wed, Feb 01, 2017 at 04:33:58PM +, Will Deacon wrote:
> > > > On Wed, Feb 01, 2017 at 11:29:22AM -0500, Christopher Covington wrote:
> > > > > On 01/31/2017 12:56 PM, Marc Zyngier wrote:
> > > > > > Given that all ARMv8 CPUs can support SW_PAN, it is more likely to 
> > > > > > be
> > > > > > enabled than the ARMv8.1 PAN. I'd vote for supporting the 
> > > > > > workaround in
> > > > > > that case too, and hope that people do enable the HW version.
> > > > > 
> > > > > Okay, I'll do my best to add support for the SW PAN case. I rebased 
> > > > > and
> > > > > submitted v6 of the E1009 patch [1] so that it no longer depends on 
> > > > > this
> > > > > patch landing first, if you all are inclined to pick it up while work 
> > > > > on
> > > > > this E1003 patch continues.
> > > > 
> > > > The alternative is not enabling SW_PAN (at runtime) if this errata is
> > > > present, along with a warning stating that hardware-PAN should be
> > > > enabled in kconfig instead. Not sure what distributions will make of 
> > > > that
> > > > though.
> > > 
> > > The problem with this patch is that when ARM64_SW_TTBR0_PAN is enabled
> > > and in the absence of hardware PAN (or ARM64_PAN disabled),
> > > cpu_do_switch_mm is no longer called for user process switching, so the
> > > workaround is pretty much useless.
> > 
> > Oh, I see what you mean now.
> > 
> > > I'm ok with adding the Kconfig dependency below to
> > > QCOM_FALKOR_ERRATUM_1003:
> > > 
> > >   depends on !ARM64_SW_TTBR0_PAN || ARM64_PAN
> > > 
> > > together with a run-time warning if ARM64_SW_TTBR0_PAN is being used.
> > 
> > That makes it look like hardware-PAN is the cause of the erratum.
> 
> With the right Kconfig comment we could make this clearer.

It's not just a comment though, the kconfig option for the workaround
will disappear from menuconfig as long as the dependencies aren't met.
The dependency is really that SW_PAN depends on !ERRATUM_1003, but that
doesn't work for the distributions.

> > Maybe
> > just select ARM64_PAN if the erratum workaround is selected, then
> > runtime warning if we find that the h/w doesn't have PAN but does have
> > the erratum (which should never fire)?
> 
> You still need this workaround even if you don't want any PAN (both sw
> and hw PAN disabled). I wouldn't want to select ARM64_PAN since it's not
> a dependency. It's more like if you do need a PAN, make sure you only
> use the hw one.

True, in the case that all PAN options are disabled we still want this
to work. How about:

  select ARM64_PAN if ARM64_SW_TTBR0_PAN

?

In fact, what's the reason for supporting SW_PAN and ARM64_PAN as a
config combination? Why not just have "PAN" that enables them both and
uses the hardware feature if it's there?

Will


Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Catalin Marinas
On Wed, Feb 01, 2017 at 05:59:48PM +, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 05:49:34PM +, Catalin Marinas wrote:
> > On Wed, Feb 01, 2017 at 05:41:05PM +, Will Deacon wrote:
> > > On Wed, Feb 01, 2017 at 05:36:09PM +, Catalin Marinas wrote:
> > > > On Wed, Feb 01, 2017 at 04:33:58PM +, Will Deacon wrote:
> > > > > On Wed, Feb 01, 2017 at 11:29:22AM -0500, Christopher Covington wrote:
> > > > > > On 01/31/2017 12:56 PM, Marc Zyngier wrote:
> > > > > > > Given that all ARMv8 CPUs can support SW_PAN, it is more likely 
> > > > > > > to be
> > > > > > > enabled than the ARMv8.1 PAN. I'd vote for supporting the 
> > > > > > > workaround in
> > > > > > > that case too, and hope that people do enable the HW version.
> > > > > > 
> > > > > > Okay, I'll do my best to add support for the SW PAN case. I rebased 
> > > > > > and
> > > > > > submitted v6 of the E1009 patch [1] so that it no longer depends on 
> > > > > > this
> > > > > > patch landing first, if you all are inclined to pick it up while 
> > > > > > work on
> > > > > > this E1003 patch continues.
> > > > > 
> > > > > The alternative is not enabling SW_PAN (at runtime) if this errata is
> > > > > present, along with a warning stating that hardware-PAN should be
> > > > > enabled in kconfig instead. Not sure what distributions will make of 
> > > > > that
> > > > > though.
> > > > 
> > > > The problem with this patch is that when ARM64_SW_TTBR0_PAN is enabled
> > > > and in the absence of hardware PAN (or ARM64_PAN disabled),
> > > > cpu_do_switch_mm is no longer called for user process switching, so the
> > > > workaround is pretty much useless.
> > > 
> > > Oh, I see what you mean now.
> > > 
> > > > I'm ok with adding the Kconfig dependency below to
> > > > QCOM_FALKOR_ERRATUM_1003:
> > > > 
> > > > depends on !ARM64_SW_TTBR0_PAN || ARM64_PAN
> > > > 
> > > > together with a run-time warning if ARM64_SW_TTBR0_PAN is being used.
> > > 
> > > That makes it look like hardware-PAN is the cause of the erratum.
> > 
> > With the right Kconfig comment we could make this clearer.
> 
> It's not just a comment though, the kconfig option for the workaround
> will disappear from menuconfig as long as the dependencies aren't met.
> The dependency is really that SW_PAN depends on !ERRATUM_1003, but that
> doesn't work for the distributions.

I agree.

> > > Maybe
> > > just select ARM64_PAN if the erratum workaround is selected, then
> > > runtime warning if we find that the h/w doesn't have PAN but does have
> > > the erratum (which should never fire)?
> > 
> > You still need this workaround even if you don't want any PAN (both sw
> > and hw PAN disabled). I wouldn't want to select ARM64_PAN since it's not
> > a dependency. It's more like if you do need a PAN, make sure you only
> > use the hw one.
> 
> True, in the case that all PAN options are disabled we still want this
> to work. How about:
> 
>   select ARM64_PAN if ARM64_SW_TTBR0_PAN

As I replied to myself, the above would work for me as well, so let's go
for this.

> In fact, what's the reason for supporting SW_PAN and ARM64_PAN as a
> config combination? Why not just have "PAN" that enables them both and
> uses the hardware feature if it's there?

Because SW PAN has a non-trivial performance hit. You would enable SW
PAN only if you are paranoid about security. HW PAN, OTOH, is very cheap
and I wouldn't want to miss enabling it in a single Image supporting
ARMv8.0 and ARMv8.1 just because SW PAN is slow on ARMv8.0.

IOW, ARM64_PAN is default y while ARM64_SW_TTBR0_PAN is default n.

-- 
Catalin


Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Will Deacon
On Wed, Feb 01, 2017 at 06:22:44PM +, Catalin Marinas wrote:
> On Wed, Feb 01, 2017 at 05:59:48PM +, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 05:49:34PM +, Catalin Marinas wrote:
> > > On Wed, Feb 01, 2017 at 05:41:05PM +, Will Deacon wrote:
> > > > Maybe
> > > > just select ARM64_PAN if the erratum workaround is selected, then
> > > > runtime warning if we find that the h/w doesn't have PAN but does have
> > > > the erratum (which should never fire)?
> > > 
> > > You still need this workaround even if you don't want any PAN (both sw
> > > and hw PAN disabled). I wouldn't want to select ARM64_PAN since it's not
> > > a dependency. It's more like if you do need a PAN, make sure you only
> > > use the hw one.
> > 
> > True, in the case that all PAN options are disabled we still want this
> > to work. How about:
> > 
> >   select ARM64_PAN if ARM64_SW_TTBR0_PAN
> 
> As I replied to myself, the above would work for me as well, so let's go
> for this.
> 
> > In fact, what's the reason for supporting SW_PAN and ARM64_PAN as a
> > config combination? Why not just have "PAN" that enables them both and
> > uses the hardware feature if it's there?
> 
> Because SW PAN has a non-trivial performance hit. You would enable SW
> PAN only if you are paranoid about security. HW PAN, OTOH, is very cheap
> and I wouldn't want to miss enabling it in a single Image supporting
> ARMv8.0 and ARMv8.1 just because SW PAN is slow on ARMv8.0.
> 
> IOW, ARM64_PAN is default y while ARM64_SW_TTBR0_PAN is default n.

Ok, in that case, then how about another permutation: we make
ARM64_SW_TTBR0_PAN depend on ARM64_PAN? Then when you select "PAN Support"
you get a new menu option underneath it for the emulation? I think that
solves the erratum case and the use-case above.

Will


Re: [PATCH v4 2/4] arm64: Work around Falkor erratum 1003

2017-02-01 Thread Catalin Marinas
On Wed, Feb 01, 2017 at 06:34:01PM +, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 06:22:44PM +, Catalin Marinas wrote:
> > On Wed, Feb 01, 2017 at 05:59:48PM +, Will Deacon wrote:
> > > On Wed, Feb 01, 2017 at 05:49:34PM +, Catalin Marinas wrote:
> > > > On Wed, Feb 01, 2017 at 05:41:05PM +, Will Deacon wrote:
> > > > > Maybe
> > > > > just select ARM64_PAN if the erratum workaround is selected, then
> > > > > runtime warning if we find that the h/w doesn't have PAN but does have
> > > > > the erratum (which should never fire)?
> > > > 
> > > > You still need this workaround even if you don't want any PAN (both sw
> > > > and hw PAN disabled). I wouldn't want to select ARM64_PAN since it's not
> > > > a dependency. It's more like if you do need a PAN, make sure you only
> > > > use the hw one.
> > > 
> > > True, in the case that all PAN options are disabled we still want this
> > > to work. How about:
> > > 
> > >   select ARM64_PAN if ARM64_SW_TTBR0_PAN
> > 
> > As I replied to myself, the above would work for me as well, so let's go
> > for this.
> > 
> > > In fact, what's the reason for supporting SW_PAN and ARM64_PAN as a
> > > config combination? Why not just have "PAN" that enables them both and
> > > uses the hardware feature if it's there?
> > 
> > Because SW PAN has a non-trivial performance hit. You would enable SW
> > PAN only if you are paranoid about security. HW PAN, OTOH, is very cheap
> > and I wouldn't want to miss enabling it in a single Image supporting
> > ARMv8.0 and ARMv8.1 just because SW PAN is slow on ARMv8.0.
> > 
> > IOW, ARM64_PAN is default y while ARM64_SW_TTBR0_PAN is default n.
> 
> Ok, in that case, then how about another permutation: we make
> ARM64_SW_TTBR0_PAN depend on ARM64_PAN? Then when you select "PAN Support"
> you get a new menu option underneath it for the emulation? I think that
> solves the erratum case and the use-case above.

The problem is that ARM64_PAN is an ARMv8.1 feature and currently
grouped accordingly in Kconfig. ARM64_SW_TTBR0_PAN is complementary (and
even not recommended on ARMv8.1). We can do this if we break the ARMv8.x
feature grouping, but just for this erratum I don't think it's worth it.

-- 
Catalin


Re: [PATCH V8 05/10] acpi: apei: handle SEA notification type for ARMv8

2017-02-01 Thread kbuild test robot
Hi Tyler,

[auto build test ERROR on pm/linux-next]
[also build test ERROR on v4.10-rc6 next-20170201]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Tyler-Baicar/Add-UEFI-2-6-and-ACPI-6-1-updates-for-RAS-on-ARM64/20170202-020320
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm64 

All errors (new ones prefixed by >>):

   arch/arm64/mm/built-in.o: In function `do_sea':
>> arch/arm64/mm/fault.c:511: undefined reference to `ghes_notify_sea'
   arch/arm64/mm/fault.c:511:(.text+0x1868): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ghes_notify_sea'

vim +511 arch/arm64/mm/fault.c

   505  
   506  /*
   507   * synchronize_rcu() will wait for nmi_exit(), so no need to
   508   * rcu_read_lock().
   509   */
   510  nmi_enter();
 > 511  ghes_notify_sea();
   512  nmi_exit();
   513  
   514  info.si_signo = SIGBUS;

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH V8 08/10] ras: acpi / apei: generate trace event for unrecognized CPER section

2017-02-01 Thread kbuild test robot
Hi Tyler,

[auto build test ERROR on pm/linux-next]
[also build test ERROR on v4.10-rc6 next-20170201]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Tyler-Baicar/Add-UEFI-2-6-and-ACPI-6-1-updates-for-RAS-on-ARM64/20170202-020320
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: x86_64-randconfig-r0-02020102 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `ghes_do_proc.isra.9':
>> ghes.c:(.text+0x92f3a): undefined reference to `__tracepoint_unknown_sec_event'
   ghes.c:(.text+0x92f73): undefined reference to `__tracepoint_unknown_sec_event'
   ghes.c:(.text+0x92fdb): undefined reference to `__tracepoint_unknown_sec_event'



.config.gz
Description: application/gzip


Re: [PATCH V8 09/10] trace, ras: add ARM processor error trace event

2017-02-01 Thread kbuild test robot
Hi Tyler,

[auto build test ERROR on pm/linux-next]
[also build test ERROR on v4.10-rc6 next-20170201]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Tyler-Baicar/Add-UEFI-2-6-and-ACPI-6-1-updates-for-RAS-on-ARM64/20170202-020320
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: x86_64-randconfig-r0-02020102 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `ghes_do_proc.isra.9':
>> ghes.c:(.text+0x92fda): undefined reference to `__tracepoint_arm_event'
   ghes.c:(.text+0x9300a): undefined reference to `__tracepoint_arm_event'
   ghes.c:(.text+0x93054): undefined reference to `__tracepoint_arm_event'
   ghes.c:(.text+0x93083): undefined reference to `__tracepoint_unknown_sec_event'
   ghes.c:(.text+0x930b8): undefined reference to `__tracepoint_unknown_sec_event'
   ghes.c:(.text+0x93121): undefined reference to `__tracepoint_unknown_sec_event'



.config.gz
Description: application/gzip


Re: [PATCH V8 09/10] trace, ras: add ARM processor error trace event

2017-02-01 Thread Steven Rostedt
On Wed,  1 Feb 2017 10:16:52 -0700
Tyler Baicar  wrote:

> Currently there are trace events for the various RAS
> errors with the exception of ARM processor type errors.
> Add a new trace event for such errors so that the user
> will know when they occur. These trace events are
> consistent with the ARM processor error section type
> defined in UEFI 2.6 spec section N.2.4.4.
> 
> Signed-off-by: Tyler Baicar 
> Acked-by: Steven Rostedt 
> ---
>  drivers/acpi/apei/ghes.c|  7 ++-
>  drivers/firmware/efi/cper.c |  1 +
>  drivers/ras/ras.c   |  1 +
>  include/ras/ras_event.h | 34 ++
>  4 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index a989345..013faf0 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -512,7 +512,12 @@ static void ghes_do_proc(struct ghes *ghes,
>  
>   }
>  #endif
> - else {
> + else if (!uuid_le_cmp(sec_type, CPER_SEC_PROC_ARM)) {
> + struct cper_sec_proc_arm *arm_err;
> +
> + arm_err = acpi_hest_generic_data_payload(gdata);
> + trace_arm_event(arm_err);

According to the kbuild failure, I'm guessing this file requires a:

 #include 

-- Steve

> + } else {
>   void *unknown_err = 
> acpi_hest_generic_data_payload(gdata);
>   trace_unknown_sec_event(&sec_type,
>   fru_id, fru_text, sec_sev,
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 48cb8ee..0ec678e 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define INDENT_SP" "
>  
> diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
> index fb2500b..8ba5a94 100644
> --- a/drivers/ras/ras.c
> +++ b/drivers/ras/ras.c
> @@ -28,3 +28,4 @@ static int __init ras_init(void)
>  #endif
>  EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(unknown_sec_event);
> +EXPORT_TRACEPOINT_SYMBOL_GPL(arm_event);
> diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
> index 5861b6f..b36db48 100644
> --- a/include/ras/ras_event.h
> +++ b/include/ras/ras_event.h
> @@ -162,6 +162,40 @@
>  );
>  
>  /*
> + * ARM Processor Events Report
> + *
> + * This event is generated when hardware detects an ARM processor error
> + * has occurred. UEFI 2.6 spec section N.2.4.4.
> + */
> +TRACE_EVENT(arm_event,
> +
> + TP_PROTO(const struct cper_sec_proc_arm *proc),
> +
> + TP_ARGS(proc),
> +
> + TP_STRUCT__entry(
> + __field(u64, mpidr)
> + __field(u64, midr)
> + __field(u32, running_state)
> + __field(u32, psci_state)
> + __field(u8, affinity)
> + ),
> +
> + TP_fast_assign(
> + __entry->affinity = proc->affinity_level;
> + __entry->mpidr = proc->mpidr;
> + __entry->midr = proc->midr;
> + __entry->running_state = proc->running_state;
> + __entry->psci_state = proc->psci_state;
> + ),
> +
> + TP_printk("affinity level: %d; MPIDR: %016llx; MIDR: %016llx; "
> +   "running state: %d; PSCI state: %d",
> +   __entry->affinity, __entry->mpidr, __entry->midr,
> +   __entry->running_state, __entry->psci_state)
> +);
> +
> +/*
>   * Unknown Section Report
>   *
>   * This event is generated when hardware detected a hardware
