On 03.01.2014, at 03:51, Scott Wood <scottw...@freescale.com> wrote:

> On Sun, 2013-12-29 at 16:43 +0100, Alexander Graf wrote:
>> On 11.07.2013, at 00:47, Scott Wood <scottw...@freescale.com> wrote:
>>> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
>>> index ddfaf56..c13caa0 100644
>>> --- a/arch/powerpc/kvm/book3s_pr.c
>>> +++ b/arch/powerpc/kvm/book3s_pr.c
>>> @@ -884,14 +884,11 @@ program_interrupt:
>>>              * and if we really did time things so badly, then we just exit
>>>              * again due to a host external interrupt.
>>>              */
>>> -           local_irq_disable();
>>>             s = kvmppc_prepare_to_enter(vcpu);
>>> -           if (s <= 0) {
>>> -                   local_irq_enable();
>>> +           if (s <= 0)
>>>                     r = s;
>>> -           } else {
>>> +           else
>>>                     kvmppc_fix_ee_before_entry();
>>> -           }
>>>     }
>> 
>> Please add a comment here stating that interrupts are hard disabled at this 
>> point.
> 
> OK.
> 
>>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>>> index 4e05f8c..2f7a221 100644
>>> --- a/arch/powerpc/kvm/powerpc.c
>>> +++ b/arch/powerpc/kvm/powerpc.c
>>> @@ -64,12 +64,14 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>>> {
>>>     int r = 1;
>>> 
>>> -   WARN_ON_ONCE(!irqs_disabled());
>>> +   WARN_ON(irqs_disabled());
>>> +   hard_irq_disable();
>>> +
>>>     while (true) {
>>>             if (need_resched()) {
>>>                     local_irq_enable();
>> 
>> One thing I find reasonably tricky in this approach is that we run
>> local_irq_enable, but hard_irq_disable. I did dig through the code and
>> it should work just fine, but I think we should make sure to note that
>> this is intended and doesn't just work by accident.
>> 
>> Just add a comment above the first call to hard_irq_disable() that
>> explains that local_irq_enable() will properly unroll hard_irq_disable.
>> That way the next person reading this doesn't have to dig as deeply.
> 
> There is no hard_irq_enable() -- only __hard_irq_enable() that doesn't
> update local_paca->irq_happened.
> 
> This is normal use of the API.  If there does need to be a comment, it
> should go in hw_irq.h. :-)

Yeah, it's always confusing to me in other places too :). But there are only so 
many places that have to deal with really hard disabled interrupts.

> 
>>> #ifdef CONFIG_PPC64
>>> -           /* lazy EE magic */
>>> -           hard_irq_disable();
>>> -           if (lazy_irq_pending()) {
>>> -                   /* Got an interrupt in between, try again */
>>> -                   local_irq_enable();
>>> -                   local_irq_disable();
>>> -                   kvm_guest_exit();
>>> -                   continue;
>>> -           }
>>> +           WARN_ON(lazy_irq_pending());
>>> #endif
>>> +           /* Can't use irqs_disabled() because we want hard irq state */
>>> +           WARN_ON(mfmsr() & MSR_EE);
>> 
>> The reason for lazy EE is that mfmsr() is supposed to be slow.
> 
> Not just mtmsr?
> 
> And when I complained about wrtee not being all that slow on our
> hardware, Ben said it was also for perf coverage. :-)
> 
>> Couldn't we check for the internal hard disable flag instead? Just create a 
>> new
>> helper in hw_irq.h that tells us
>> 
>>  local_paca->irq_happened & PACA_IRQ_HARD_DIS
>> 
>> on PPC64 and
>> 
>>  !(mfmsr() & MSR_EE)
>> 
>> on PPC32. We can then just use that helper here and not run "expensive" 
>> mfmsr() calls.
> 
>> I'm not sure whether it's actually measurable overhead in the context
>> of the whole world switch, but why diverge from the rest of the system?
>> If you think a new helper is overkill, I'd be fine with a simple #ifdef
>> here just as well.
> 
> I'd hope a mfmsr wouldn't be so slow on any hardware as to be a big deal
> here, though it'd also be nice if there were a Linux-wide way of
> specifying whether a particular WARN should be always checked, or only
> if the kernel is built with some debug option...
> 
> The "rest of the system" is irqs_disabled() and there's already a
> comment explaining why we diverge from that.
> 
> Maybe we should just remove that check; we'd still at least have the
> 64-bit check in kvmppc_fix_ee_before_entry.

Well, as I mentioned we're already so deep down in the guts of how interrupt 
handling works that it's perfectly fine to check paca->irq_happened directly 
here if we care.

> 
>>>             kvm_guest_enter();
>>> -           break;
>>> +           return r;
>> 
>> This must be 0 at this point, right?
> 
> No, it'll be 1.
> 
>> Either way, I prefer to keep the code easily understandable. How about
>> you keep the break and just add an if (r) around the local_irq_enable()
>> below? Then add a comment stating that the return value tells us
>> whether entry is ok to proceed and interrupts are disabled (0) or
>> something didn't work out and interrupts are enabled (!0).
> 
> How is it more understandable to overload what looks like a single code
> path with divergent cases that have little to nothing in common?  Would
> it help to add a comment on the return-to-host code path indicating that
> it is only used for returning to the host?

Yup :)

> I'd be fine with replacing "return r" with "return 1" (and getting rid
> of the initialization of r at the top of the function, unless GCC
> complains inappropriately).

Works fine for me.


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to