Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Philippe Gerum wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> This is an attempt to fix the broken root domain state adjustment in
>>>>>>>> __ipipe_handle_exception. The patch below fixes the issues recently
>>>>>>>> reported by Roman Pisl. It also currently makes much more sense to me
>>>>>>>> than what we have so far.
>>>>>>>>
>>>>>>>> In short, this patch propagates the hardware irq state into the root
>>>>>>>> domain's stall flag shortly before calling into the Linux handler, and
>>>>>>>> only then. This avoids spurious root domain stalls that end up over the
>>>>>>>> wrong Linux context due to context switches between entry and exit of
>>>>>>>> __ipipe_handle_exception. This patch also drops the bogus
>>>>>>>> local_irq_save/restore pair, which doesn't account for Linux irq state
>>>>>>>> changes inside its fault handler.
>>>>>>>>
>>>>>>> Actually, it is not bogus at all; it is even mandatory on x86_64, given
>>>>>>> that we don't branch to any sysretq/iretq emulation, unlike with x86_32.
>>>>>>> So if we don't restore the stall bit for the root domain properly there,
>>>>>>> we could end up running with interrupts off in user-space.
>>>>>>>
>>>>>>> However, the way the interrupt state is currently saved is wrong: we
>>>>>>> should not local_irq_disable() over non-root domains. Here is some
>>>>>>> on-line documentation to explain why:
>>>>>>>
>>>>>>> The main difference between x86_32 and x86_64 is that the former does
>>>>>>> virtualize the interrupt state in entry_32.S, unlike the latter. For
>>>>>>> that reason, x86_64 does not require (actually, we should not be doing)
>>>>>>> any fixup. So, to sum up:
>>>>>>>
>>>>>>> - we use fixup_if() to restore the virtual interrupt state properly
>>>>>>> when control is given back to the code that triggered the
>>>>>>> fault/exception (x86_32). We need to do that because of task migrations
>>>>>>> between primary and secondary modes.
>>>>>>>
>>>>>>> - we must clear the virtual interrupt flag before calling the I-pipe
>>>>>>> handler / Linux regular exception handler, because our callee may/must
>>>>>>> run in the root domain as well, and expect that interrupt state to
>>>>>>> reflect the hw one, as set by the x86 exception gate / fault prologue
>>>>>>> in entry_*.S.
>>>>>>>
>>>>>>> - because of the above, we must use
>>>>>>> local_irq_save()/local_irq_restore_nosync() in our fault handler to
>>>>>>> make sure to restore the virtual interrupt flag properly between this
>>>>>>> routine and the exception return statement (i.e. during the Linux
>>>>>>> fault epilogue in entry_*.S).
>>>>>> OK, if there is a reason to enforce a stalled root domain while calling
>>>>>> into the exception hook, this makes some sense. But I don't think it is
>>>>>> formally correct to save the root state on entry and blindly restore it
>>>>>> _after_ calling the Linux handler. I rather think we should keep the
>>>>>> state that Linux leaves behind to remain transparent to it. Maybe no
>>>>>> practical issue ATM, but it makes the code at least illogical.
>>>>>>
>>>>> Please re-read the explanations, and you will find the logic. I cannot do
>>>>> anything more than re-hash what I just said. What you perceive as
>>>>> illogical is actually the only sane way to do this. Formally speaking, a
>>>>> Linux fault handler may NOT alter the interrupt state blindly, so we must
>>>>> be able to assume that we ought to restore it the way the lower code set
>>>>> it.
>>>> I got your first and second points, but they don't imply to me that the
>>>> third must be correct as well. "...to make sure to restore the virtual
>>>> interrupt flag properly" is not a clear explanation (to me) of why we
>>>> have to restore the flag across calls to the _Linux_ handler. We
>>>> can demand that the hook handler leaves the root state untouched, but
>>>> requiring the same from Linux is a restriction that you don't find in
>>>> the ipipe-less case, nor do I see the reason for this under ipipe control.
>>>>
>>> To make my question a bit more concrete (and to help me write the right
>>> comments around these lines): what makes the following change bogus,
>>> i.e. which scenario will fail?
>>>
>>> Index: b/arch/x86/kernel/ipipe.c
>>> ===================================================================
>>> --- a/arch/x86/kernel/ipipe.c
>>> +++ b/arch/x86/kernel/ipipe.c
>>> @@ -685,7 +685,9 @@ int __ipipe_handle_exception(struct pt_r
>>>     }
>>>  
>>>     __ipipe_std_extable[vector](regs, error_code);
>>> -   local_irq_restore_nosync(flags);
>>> +
>>> +   __fixup_if(test_bit(IPIPE_STALL_FLAG, &ipipe_root_cpudom_var(status)),
>>> +              regs);
>>>  
>>>     return 0;
>>>  }
>>>
>>
>> This would break the interrupt state on x86_64, because it is not 
>> virtualized by
> 
> Hmm, __fixup_if is void on x86_64, so this was way off from what I was
> trying to express.
> 
>> the low level code (latency-wise, this is not worth the burden). So your
>> exception path would stall the root domain and never unstall it, because
>> you do not have any iretq/sysretq emulation; actually, you do not have any
>> fixup. This would work on x86_32 for the converse reason, though.
> 
> OK, I finally understood this difference. (I guess it comes from a
> different code structure of entry_32.S compared to entry_64.S, right? So
> nothing we could unify on our own?)
>

Exactly, or at least unifying this would be overkill: fully virtualizing the
interrupt state in entry_64.S would bring basically no gain, and maybe worse
performance.

> To clarify this for me: For 32 bit, the pipeline state after iret/sysret
> is calculated in the entry layer (in __ipipe_unstall_iret_root more
> precisely), so the local_irq_restore_nosync is actually of minor
> importance here, isn't it? Couldn't we skip it for this arch then?
> 

Yes, this would work with the current upstream code; but at the same time, not
restoring the stall bit the way we got it on entry to the exception handler
would break the assumption, which the I-pipe may rely on in further revisions,
that no routine will interfere with the root domain state.
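
For reference, the x86_32 iret emulation mentioned here boils down to something
like the sketch below. This is only an illustration of the idea, not the actual
I-pipe source: the raw_irqs_disabled_flags() test and the note about replaying
pending interrupts are my assumptions; only __ipipe_unstall_iret_root,
IPIPE_STALL_FLAG and ipipe_root_cpudom_var() are taken from this thread.

/*
 * Emulated iret for the root domain on x86_32 (sketch): derive the
 * virtual interrupt state from the IF bit in the saved EFLAGS, so
 * whatever the C-level handlers left in the stall flag is recomputed
 * here anyway.
 */
asmlinkage void __ipipe_unstall_iret_root(struct pt_regs regs)
{
        /* Hardware interrupts are off across the emulated iret. */
        if (raw_irqs_disabled_flags(regs.flags))
                __set_bit(IPIPE_STALL_FLAG, &ipipe_root_cpudom_var(status));
        else {
                __clear_bit(IPIPE_STALL_FLAG, &ipipe_root_cpudom_var(status));
                /* A real implementation would also replay interrupts
                 * logged while the root domain was stalled. */
        }
}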

> On 64 bit, we have to set the right pipeline state before returning to
> the entry layer because it won't be touched there at all. We currently
> do this based on the state found on exception entry, but we could also
> do it based on the regs.flags state that the Linux handler left behind
> (like we do on 32 bit). But the scenario I had in mind where this would
> actually make a difference turned out to be a red herring. I don't think
> Linux modifies regs.flags in its exception handlers. The ipipe pattern
> remains inconsistent IMHO, but it is practically irrelevant.
> 

I don't think it is inconsistent; it is just about defining who owns the
interrupt state. When the I-pipe is enabled, the fundamental design choice is
that our low level code has precedence over what the upstream code does.
Obviously, we try to do that in a way that keeps Linux happy, but if you
consider the way x86_64 works for us, we actually choose to enforce interrupt
protection in the epilogue code using the hw state, like in the I-pipe-less
case, instead of relying on a virtualized state there, with some exceptions
such as calling preempt_schedule_irq() with (virtualized) interrupts off.
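
To make the x86_64 side concrete, the pattern under discussion amounts to
roughly the sketch below. This is not the actual tree content: the argument
list is abridged in the hunk quoted earlier, so the signature, the omitted
notifier/root-domain handling and the comments are my assumptions; only
__ipipe_std_extable[], local_irq_save() and local_irq_restore_nosync() come
from the quoted code.

int __ipipe_handle_exception(struct pt_regs *regs, long error_code, int vector)
{
        unsigned long flags;

        /* Snapshot and stall the root domain's virtual interrupt state,
         * which at this point mirrors the hw state set up by the
         * exception gate / fault prologue in entry_64.S. */
        local_irq_save(flags);

        /* Run the regular Linux fault handler; it sees an interrupt
         * state that reflects the hw one, as it would without the
         * I-pipe. */
        __ipipe_std_extable[vector](regs, error_code);

        /* No iretq/sysretq emulation will fix things up later on
         * x86_64, so put the stall bit back the way we found it before
         * returning to the fault epilogue. */
        local_irq_restore_nosync(flags);

        return 0;
}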

> 
> A final question, to explain the __fixup_if in __ipipe_handle_exception:
> is that due to the scenario where we migrate to the root domain while
> running the notify handler? We may return from that migration with some
> IF state in regs.flags that no longer matches the one found on exception
> entry, correct?

Correct.
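
For the record, __fixup_if() on x86_32 then plausibly amounts to something like
the sketch below. This is a guess at its semantics inferred from the call site
in the hunk above, not the actual implementation, and it is a no-op on x86_64
as noted earlier in the thread.

/* Propagate the root domain's stall state into the saved EFLAGS, so the
 * x86_32 iret emulation restores a matching virtual interrupt state even
 * if a domain migration changed that state after exception entry. */
static inline void __fixup_if(int stalled, struct pt_regs *regs)
{
        if (stalled)
                regs->flags &= ~X86_EFLAGS_IF;
        else
                regs->flags |= X86_EFLAGS_IF;
}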

> 
> Will stuff all this into a few lines of comments soon.
> 
> Thanks for your patience,
> Jan
> 


-- 
Philippe.
