Re: xen/evtchn and forced threaded irq

2019-02-22 Thread Jan Beulich
>>> On 20.02.19 at 23:03,  wrote:
> On 2/20/19 9:46 PM, Boris Ostrovsky wrote:
>> On 2/20/19 3:46 PM, Julien Grall wrote:
>>> On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
>>>> On 2/20/19 1:05 PM, Julien Grall wrote:
>>>> Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
>>>> implemented as a ring but not necessarily as Xen shared ring (if that's
>>>> what you were referring to).

I'm sort of lost here with the mixed references to "interrupts" and
"events". Interrupts can be shared (and have a per-instance
(irq,data) tuple), but there should be at most one interrupt
underlying the event channel systems, shouldn't there? Event
channels otoh can't be shared, and hence there's only one
"data" item to be associated with each channel, at which point a
plain counter ought to do.

>>> The underlying question is what happen if you miss an interrupt. Is it
>>> going to be ok?
>> 
>> This I am not sure about. I thought yes since we are signaling the
>> process only once.
> 
> I have CCed Andrew and Jan to see if they can help here.

What meaning of "miss" do you use here? As long as it only means
delay (i.e. miss an instance, but get notified as soon as feasible
even if there is no further event coming from the source) - that
would be okay, I think. But if you truly mean "miss" (i.e. event lost
altogether, with silence resulting if there's no further event), then
no, this would not be tolerable.
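
To make the distinction concrete, here is a minimal single-threaded sketch
in C of "delayed but not lost" delivery. It is only a model under stated
assumptions: struct chan_model and the chan_* helpers are hypothetical names
made up for illustration, not Xen or Linux code. The pending flag is sticky,
so an event raised while the channel is masked is still delivered on unmask,
even if the source never signals again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one event channel: a sticky pending flag and a mask. */
struct chan_model {
    bool pending;        /* set by the source, cleared only on delivery */
    bool masked;         /* while set, delivery is postponed, not dropped */
    unsigned delivered;  /* number of notifications handed to the consumer */
};

/* Deliver if something is pending and the channel is not masked. */
void chan_try_deliver(struct chan_model *c)
{
    if (c->pending && !c->masked) {
        c->pending = false;
        c->delivered++;
    }
}

/* Source side: events raised while one is still pending coalesce. */
void chan_raise(struct chan_model *c)
{
    c->pending = true;
    chan_try_deliver(c);
}

void chan_mask(struct chan_model *c)
{
    c->masked = true;
}

/* Unmasking re-checks the pending flag, so a postponed event still arrives. */
void chan_unmask(struct chan_model *c)
{
    c->masked = false;
    chan_try_deliver(c);
}

/* Usage: two raises while masked yield exactly one late delivery. */
void chan_model_demo(void)
{
    struct chan_model c = { false, false, 0 };

    chan_mask(&c);
    chan_raise(&c);
    chan_raise(&c);
    assert(c.delivered == 0);  /* postponed ... */
    chan_unmask(&c);
    assert(c.delivered == 1);  /* ... but not lost */
}
```

The intolerable case would be any path that clears `pending` without a
matching delivery.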

Jan




Re: xen/evtchn and forced threaded irq

2019-02-21 Thread Juergen Gross
On 19/02/2019 18:31, Julien Grall wrote:
> Hi all,
> 
> I have been looking at using Linux RT in Dom0. Once the guest is started,
> the console ends up with a lot of warnings (see trace below).
> 
> After some investigation, this is because the irq handler will now be
> threaded.
> I can reproduce the same error with vanilla Linux when passing the option
> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7, which
> has no RT support).
> 
> FWIW, the interrupt for port 6 is used for the guest to communicate with
> xenstore.
> 
> From my understanding, this is happening because the interrupt handler is now
> run in a thread. So we can have the following happening.
> 
>   Interrupt context        | Interrupt thread
>                            |
>   receive interrupt port 6 |
>   clear the evtchn port    |
>   set IRQF_RUNTHREAD       |
>   kick interrupt thread    |
>                            |    clear IRQF_RUNTHREAD
>                            |    call evtchn_interrupt
>   receive interrupt port 6 |
>   clear the evtchn port    |
>   set IRQF_RUNTHREAD       |
>   kick interrupt thread    |
>                            |    disable interrupt port 6
>                            |    evtchn->enabled = false
>                            |    []
>                            |
>                            |    *** Handling the second interrupt ***
>                            |    clear IRQF_RUNTHREAD
>                            |    call evtchn_interrupt
>                            |    WARN(...)
> 
> I am not entirely sure how to fix this. I have two solutions in mind:
> 
> 1) Prevent the interrupt handler from being threaded. We would also need to
> switch from spin_lock to raw_spin_lock as the former may sleep on RT-Linux.
> 
> 2) Remove the warning

3) Split the handler (RT-only?) to a non-threaded part containing
   everything until the "evtchn->enabled = false" and only then
   kick the thread doing the rest, including the spin_lock().
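
A rough sketch of option 3 as a user-space C model, to show where the split
would sit. All names (struct split_evtchn_model, split_hard_part() and so
on) are hypothetical; only `enabled` mirrors evtchn->enabled from the
discussion. In the kernel this would presumably map to a primary handler
returning IRQ_WAKE_THREAD plus a thread function registered through
request_threaded_irq(), but that mapping is an assumption, not tested code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-port state; only `enabled` corresponds to a field
 * named in this thread (evtchn->enabled). */
struct split_evtchn_model {
    bool enabled;
    unsigned thread_kicks;  /* times the threaded part was woken */
    unsigned queued;        /* times the port was put on the ring */
    unsigned early_exits;   /* interrupts seen while already disabled */
};

/* Non-threaded part: everything up to "evtchn->enabled = false".
 * Returns true when the threaded part should be kicked. */
bool split_hard_part(struct split_evtchn_model *e)
{
    if (!e->enabled) {
        e->early_exits++;   /* no WARN, and no second kick */
        return false;
    }
    e->enabled = false;     /* stands in for disabling the port as well */
    e->thread_kicks++;
    return true;
}

/* Threaded part: the rest of the handler, where a sleeping lock is fine. */
void split_thread_part(struct split_evtchn_model *e)
{
    assert(!e->enabled);    /* in this model the hard part always ran first */
    e->queued++;            /* stands in for the ring update and wake-up */
}

/* User space writing the port back re-enables it. */
void split_reenable(struct split_evtchn_model *e)
{
    e->enabled = true;
}
```

With this split, a second interrupt arriving before the thread has run takes
the early exit in split_hard_part(), so the thread is kicked once per enable
cycle. Whether that early exit is ever reached depends on how quickly the
port gets masked, which the sketch does not model.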

Juergen


Re: xen/evtchn and forced threaded irq

2019-02-20 Thread Julien Grall

Hi Boris,

On 2/20/19 9:46 PM, Boris Ostrovsky wrote:

On 2/20/19 3:46 PM, Julien Grall wrote:

(+ Andrew and Jan for feedback on the event channel interrupt)

Hi Boris,

Thank you for your feedback.

On 2/20/19 8:04 PM, Boris Ostrovsky wrote:

On 2/20/19 1:05 PM, Julien Grall wrote:

Hi,

On 20/02/2019 17:07, Boris Ostrovsky wrote:

On 2/20/19 9:15 AM, Julien Grall wrote:

Hi Boris,

Thank you for your answer.

On 20/02/2019 00:02, Boris Ostrovsky wrote:

On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:

Hi all,

I have been looking at using Linux RT in Dom0. Once the guest is
started,
the console is ending to have a lot of warning (see trace below).

After some investigation, this is because the irq handler will now
be threaded.
I can reproduce the same error with the vanilla Linux when passing
the option
'threadirqs' on the command line (the trace below is from 5.0.0-rc7
that has
not RT support).

FWIW, the interrupt for port 6 is used to for the guest to
communicate with
xenstore.

    From my understanding, this is happening because the interrupt
handler is now
run in a thread. So we can have the following happening.

   Interrupt context    | Interrupt thread
    |
   receive interrupt port 6 |
   clear the evtchn port    |
   set IRQF_RUNTHREAD    |
   kick interrupt thread    |
    |    clear IRQF_RUNTHREAD
    |    call evtchn_interrupt
   receive interrupt port 6 |
   clear the evtchn port    |
   set IRQF_RUNTHREAD   |
   kick interrupt thread    |
    |    disable interrupt port 6
    |    evtchn->enabled = false
    |    []
    |
    |    *** Handling the second
interrupt ***
    |    clear IRQF_RUNTHREAD
    |    call evtchn_interrupt
    |    WARN(...)

I am not entirely sure how to fix this. I have two solutions in
mind:

1) Prevent the interrupt handler to be threaded. We would also
need to
switch from spin_lock to raw_spin_lock as the former may sleep on
RT-Linux.

2) Remove the warning


I think access to evtchn->enabled is racy so (with or without the
warning) we can't use it reliably.


Thinking about it, it would not be the only issue. The ring is sized
to contain only one instance of the same event. So if you receive
twice the event, you may overflow the ring.


Hm... That's another argument in favor of "unthreading" the handler.


I first thought it would be possible to unthread it. However,
wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
so this cannot be used in an interrupt context.

So I think "unthreading" the handler is not an option here.


That sounds like a different problem. I.e. there are two issues:
* threaded interrupts don't work properly (races, ring overflow)
* evtchn_interrupt() (threaded or not) has spin_lock(), which is not
going to work for RT


I am afraid that's not correct, you can use spin_lock() in threaded
interrupt handler.


In non-RT handler -- yes, but not in an RT one (in fact, isn't this what
you yourself said above?)


In RT-linux, interrupt handlers are threaded by default. So the handler 
will not run in the interrupt context. Hence, it will be safe to call 
spin_lock.


However, if you force the handler to not be threaded (IRQF_NO_THREAD), 
it will run in interrupt context.










Another alternative could be to queue the irq if !evtchn->enabled
and
handle it in evtchn_write() (which is where irq is supposed to be
re-enabled).

What do you mean by queue? Is it queueing in the ring?



No, I was thinking about having a new structure for deferred
interrupts.


Hmmm, I am not entirely sure what would be the structure here. Could
you expand your thinking?


Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
implemented as a ring but not necessarily as Xen shared ring (if that's
what you were referring to).


The underlying question is what happen if you miss an interrupt. Is it
going to be ok?


This I am not sure about. I thought yes since we are signaling the
process only once.


I have CCed Andrew and Jan to see if they can help here.

Cheers,

--
Julien Grall


Re: xen/evtchn and forced threaded irq

2019-02-20 Thread Boris Ostrovsky
On 2/20/19 3:46 PM, Julien Grall wrote:
> (+ Andrew and Jan for feedback on the event channel interrupt)
>
> Hi Boris,
>
> Thank you for your feedback.
>
> On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
>> On 2/20/19 1:05 PM, Julien Grall wrote:
>>> Hi,
>>>
>>> On 20/02/2019 17:07, Boris Ostrovsky wrote:
>>>> On 2/20/19 9:15 AM, Julien Grall wrote:
>>>>> Hi Boris,
>>>>>
>>>>> Thank you for your answer.
>>>>>
>>>>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>>>>> On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>>>>> started, the console ends up with a lot of warnings (see trace
>>>>>>> below).
>>>>>>>
>>>>>>> After some investigation, this is because the irq handler will now
>>>>>>> be threaded. I can reproduce the same error with vanilla Linux when
>>>>>>> passing the option 'threadirqs' on the command line (the trace
>>>>>>> below is from 5.0.0-rc7, which has no RT support).
>>>>>>>
>>>>>>> FWIW, the interrupt for port 6 is used for the guest to communicate
>>>>>>> with xenstore.
>>>>>>>
>>>>>>> From my understanding, this is happening because the interrupt
>>>>>>> handler is now run in a thread. So we can have the following
>>>>>>> happening.
>>>>>>>
>>>>>>>   Interrupt context        | Interrupt thread
>>>>>>>                            |
>>>>>>>   receive interrupt port 6 |
>>>>>>>   clear the evtchn port    |
>>>>>>>   set IRQF_RUNTHREAD       |
>>>>>>>   kick interrupt thread    |
>>>>>>>                            |    clear IRQF_RUNTHREAD
>>>>>>>                            |    call evtchn_interrupt
>>>>>>>   receive interrupt port 6 |
>>>>>>>   clear the evtchn port    |
>>>>>>>   set IRQF_RUNTHREAD       |
>>>>>>>   kick interrupt thread    |
>>>>>>>                            |    disable interrupt port 6
>>>>>>>                            |    evtchn->enabled = false
>>>>>>>                            |    []
>>>>>>>                            |
>>>>>>>                            |    *** Handling the second interrupt ***
>>>>>>>                            |    clear IRQF_RUNTHREAD
>>>>>>>                            |    call evtchn_interrupt
>>>>>>>                            |    WARN(...)
>>>>>>>
>>>>>>> I am not entirely sure how to fix this. I have two solutions in
>>>>>>> mind:
>>>>>>>
>>>>>>> 1) Prevent the interrupt handler from being threaded. We would also
>>>>>>> need to switch from spin_lock to raw_spin_lock as the former may
>>>>>>> sleep on RT-Linux.
>>>>>>>
>>>>>>> 2) Remove the warning
>>>>>>
>>>>>> I think access to evtchn->enabled is racy so (with or without the
>>>>>> warning) we can't use it reliably.
>>>>>
>>>>> Thinking about it, it would not be the only issue. The ring is sized
>>>>> to contain only one instance of the same event. So if you receive
>>>>> the event twice, you may overflow the ring.
>>>>
>>>> Hm... That's another argument in favor of "unthreading" the handler.
>>>
>>> I first thought it would be possible to unthread it. However,
>>> wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
>>> so this cannot be used in an interrupt context.
>>>
>>> So I think "unthreading" the handler is not an option here.
>>
>> That sounds like a different problem. I.e. there are two issues:
>> * threaded interrupts don't work properly (races, ring overflow)
>> * evtchn_interrupt() (threaded or not) has spin_lock(), which is not
>> going to work for RT
>
> I am afraid that's not correct, you can use spin_lock() in threaded
> interrupt handler.

In non-RT handler -- yes, but not in an RT one (in fact, isn't this what
you yourself said above?)

>
>> The first can be fixed by using non-threaded handlers.
>
> The two are somewhat related, if you use a non-threaded handler then
> you are not going to help the RT case.
>
> In general, the unthreaded solution should be used in the last resort.
>
>>>

>>>>>
>>>>>>
>>>>>> Another alternative could be to queue the irq if !evtchn->enabled and
>>>>>> handle it in evtchn_write() (which is where irq is supposed to be
>>>>>> re-enabled).
>>>>> What do you mean by queue? Is it queueing in the ring?
>>>>
>>>>
>>>> No, I was thinking about having a new structure for deferred
>>>> interrupts.
>>>
>>> Hmmm, I am not entirely sure what would be the structure here. Could
>>> you expand your thinking?
>>
>> Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
>> implemented as a ring but not necessarily as Xen shared ring (if that's
>> what you were referring to).
>
> The underlying question is what happen if you miss an interrupt. Is it
> going to be ok?

This I am not sure about. I thought yes since we are signaling the
process only once.

-boris

> If no, then we 

Re: xen/evtchn and forced threaded irq

2019-02-20 Thread Julien Grall

(+ Andrew and Jan for feedback on the event channel interrupt)

Hi Boris,

Thank you for your feedback.

On 2/20/19 8:04 PM, Boris Ostrovsky wrote:

On 2/20/19 1:05 PM, Julien Grall wrote:

Hi,

On 20/02/2019 17:07, Boris Ostrovsky wrote:

On 2/20/19 9:15 AM, Julien Grall wrote:

Hi Boris,

Thank you for your answer.

On 20/02/2019 00:02, Boris Ostrovsky wrote:

On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:

Hi all,

I have been looking at using Linux RT in Dom0. Once the guest is
started,
the console is ending to have a lot of warning (see trace below).

After some investigation, this is because the irq handler will now
be threaded.
I can reproduce the same error with the vanilla Linux when passing
the option
'threadirqs' on the command line (the trace below is from 5.0.0-rc7
that has
not RT support).

FWIW, the interrupt for port 6 is used to for the guest to
communicate with
xenstore.

   From my understanding, this is happening because the interrupt
handler is now
run in a thread. So we can have the following happening.

  Interrupt context    | Interrupt thread
   |
  receive interrupt port 6 |
  clear the evtchn port    |
  set IRQF_RUNTHREAD    |
  kick interrupt thread    |
   |    clear IRQF_RUNTHREAD
   |    call evtchn_interrupt
  receive interrupt port 6 |
  clear the evtchn port    |
  set IRQF_RUNTHREAD   |
  kick interrupt thread    |
   |    disable interrupt port 6
   |    evtchn->enabled = false
   |    []
   |
   |    *** Handling the second
interrupt ***
   |    clear IRQF_RUNTHREAD
   |    call evtchn_interrupt
   |    WARN(...)

I am not entirely sure how to fix this. I have two solutions in mind:

1) Prevent the interrupt handler to be threaded. We would also
need to
switch from spin_lock to raw_spin_lock as the former may sleep on
RT-Linux.

2) Remove the warning


I think access to evtchn->enabled is racy so (with or without the
warning) we can't use it reliably.


Thinking about it, it would not be the only issue. The ring is sized
to contain only one instance of the same event. So if you receive
twice the event, you may overflow the ring.


Hm... That's another argument in favor of "unthreading" the handler.


I first thought it would be possible to unthread it. However,
wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
so this cannot be used in an interrupt context.

So I think "unthreading" the handler is not an option here.


That sounds like a different problem. I.e. there are two issues:
* threaded interrupts don't work properly (races, ring overflow)
* evtchn_interrupt() (threaded or not) has spin_lock(), which is not
going to work for RT


I am afraid that's not correct: you can use spin_lock() in a threaded
interrupt handler.



The first can be fixed by using non-threaded handlers.


The two are somewhat related: if you use a non-threaded handler then you
are not going to help the RT case.


In general, the unthreaded solution should be used as a last resort.









Another alternative could be to queue the irq if !evtchn->enabled and
handle it in evtchn_write() (which is where irq is supposed to be
re-enabled).

What do you mean by queue? Is it queueing in the ring?



No, I was thinking about having a new structure for deferred interrupts.


Hmmm, I am not entirely sure what would be the structure here. Could
you expand your thinking?


Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
implemented as a ring but not necessarily as Xen shared ring (if that's
what you were referring to).


The underlying question is what happens if you miss an interrupt. Is it
going to be ok? If not, then we have to record everything and can't
kill/send an error to the user app because it is not its fault.


This means a FIFO would not be viable. How do you size it? A static (i.e.
pre-defined) size is not going to be possible because you don't know how
many interrupts you are going to receive before the thread handler runs.
You can't make it grow dynamically either, as it may become quite big for
the same reason.


After discussing with my team, a solution that came up would be to introduce
one atomic field per event to record the number of events received. I
will explore that solution tomorrow.
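
A minimal sketch of what such a per-event counter could look like, written
as user-space C11 rather than kernel code. The names evt_count, evt_record()
and evt_drain(), as well as the port limit, are assumptions for illustration
only; a kernel version would presumably use atomic_t instead of
<stdatomic.h>. Unlike a FIFO, the storage is fixed no matter how many
interrupts arrive before the thread runs.

```c
#include <assert.h>
#include <stdatomic.h>

#define EVT_MODEL_NR_PORTS 8   /* hypothetical limit, for the sketch only */

/* One counter per event; static storage starts out as zero. */
static atomic_uint evt_count[EVT_MODEL_NR_PORTS];

/* Interrupt side: record one more occurrence, without locks or allocation. */
void evt_record(unsigned port)
{
    assert(port < EVT_MODEL_NR_PORTS);
    atomic_fetch_add(&evt_count[port], 1);
}

/* Thread side: fetch everything recorded so far and reset to zero in a
 * single atomic step, so an increment racing with the drain is either
 * returned now or kept for the next drain. */
unsigned evt_drain(unsigned port)
{
    assert(port < EVT_MODEL_NR_PORTS);
    return atomic_exchange(&evt_count[port], 0);
}
```

For example, two evt_record(6) calls followed by evt_drain(6) return 2, and
a second evt_drain(6) returns 0. The count only tells the thread how many
events it missed; what to do with that number is a separate question.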


Cheers,

--
Julien Grall


Re: xen/evtchn and forced threaded irq

2019-02-20 Thread Boris Ostrovsky
On 2/20/19 1:05 PM, Julien Grall wrote:
> Hi,
>
> On 20/02/2019 17:07, Boris Ostrovsky wrote:
>> On 2/20/19 9:15 AM, Julien Grall wrote:
>>> Hi Boris,
>>>
>>> Thank you for your answer.
>>>
>>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>>> On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:
>>>>> Hi all,
>>>>>
>>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>>> started, the console ends up with a lot of warnings (see trace below).
>>>>>
>>>>> After some investigation, this is because the irq handler will now
>>>>> be threaded. I can reproduce the same error with vanilla Linux when
>>>>> passing the option 'threadirqs' on the command line (the trace below
>>>>> is from 5.0.0-rc7, which has no RT support).
>>>>>
>>>>> FWIW, the interrupt for port 6 is used for the guest to communicate
>>>>> with xenstore.
>>>>>
>>>>> From my understanding, this is happening because the interrupt
>>>>> handler is now run in a thread. So we can have the following happening.
>>>>>
>>>>>   Interrupt context        | Interrupt thread
>>>>>                            |
>>>>>   receive interrupt port 6 |
>>>>>   clear the evtchn port    |
>>>>>   set IRQF_RUNTHREAD       |
>>>>>   kick interrupt thread    |
>>>>>                            |    clear IRQF_RUNTHREAD
>>>>>                            |    call evtchn_interrupt
>>>>>   receive interrupt port 6 |
>>>>>   clear the evtchn port    |
>>>>>   set IRQF_RUNTHREAD       |
>>>>>   kick interrupt thread    |
>>>>>                            |    disable interrupt port 6
>>>>>                            |    evtchn->enabled = false
>>>>>                            |    []
>>>>>                            |
>>>>>                            |    *** Handling the second interrupt ***
>>>>>                            |    clear IRQF_RUNTHREAD
>>>>>                            |    call evtchn_interrupt
>>>>>                            |    WARN(...)
>>>>>
>>>>> I am not entirely sure how to fix this. I have two solutions in mind:
>>>>>
>>>>> 1) Prevent the interrupt handler from being threaded. We would also
>>>>> need to switch from spin_lock to raw_spin_lock as the former may sleep
>>>>> on RT-Linux.
>>>>>
>>>>> 2) Remove the warning
>>>>
>>>> I think access to evtchn->enabled is racy so (with or without the
>>>> warning) we can't use it reliably.
>>>
>>> Thinking about it, it would not be the only issue. The ring is sized
>>> to contain only one instance of the same event. So if you receive
>>> twice the event, you may overflow the ring.
>>
>> Hm... That's another argument in favor of "unthreading" the handler.
>
> I first thought it would be possible to unthread it. However,
> wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
> so this cannot be used in an interrupt context.
>
> So I think "unthreading" the handler is not an option here.

That sounds like a different problem. I.e. there are two issues:
* threaded interrupts don't work properly (races, ring overflow)
* evtchn_interrupt() (threaded or not) has spin_lock(), which is not
going to work for RT

The first can be fixed by using non-threaded handlers.

>
>>
>>>

>>>> Another alternative could be to queue the irq if !evtchn->enabled and
>>>> handle it in evtchn_write() (which is where irq is supposed to be
>>>> re-enabled).
>>> What do you mean by queue? Is it queueing in the ring?
>>
>>
>> No, I was thinking about having a new structure for deferred interrupts.
>
> Hmmm, I am not entirely sure what would be the structure here. Could
> you expand your thinking?

Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
implemented as a ring but not necessarily as Xen shared ring (if that's
what you were referring to).

-boris




Re: xen/evtchn and forced threaded irq

2019-02-20 Thread Julien Grall

Hi,

On 20/02/2019 17:07, Boris Ostrovsky wrote:

On 2/20/19 9:15 AM, Julien Grall wrote:

Hi Boris,

Thank you for your answer.

On 20/02/2019 00:02, Boris Ostrovsky wrote:

On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:

Hi all,

I have been looking at using Linux RT in Dom0. Once the guest is
started,
the console is ending to have a lot of warning (see trace below).

After some investigation, this is because the irq handler will now
be threaded.
I can reproduce the same error with the vanilla Linux when passing
the option
'threadirqs' on the command line (the trace below is from 5.0.0-rc7
that has
not RT support).

FWIW, the interrupt for port 6 is used to for the guest to
communicate with
xenstore.

  From my understanding, this is happening because the interrupt
handler is now
run in a thread. So we can have the following happening.

     Interrupt context    | Interrupt thread
  |
     receive interrupt port 6 |
     clear the evtchn port    |
     set IRQF_RUNTHREAD    |
     kick interrupt thread    |
  |    clear IRQF_RUNTHREAD
  |    call evtchn_interrupt
     receive interrupt port 6 |
     clear the evtchn port    |
     set IRQF_RUNTHREAD   |
     kick interrupt thread    |
  |    disable interrupt port 6
  |    evtchn->enabled = false
  |    []
  |
  |    *** Handling the second
interrupt ***
  |    clear IRQF_RUNTHREAD
  |    call evtchn_interrupt
  |    WARN(...)

I am not entirely sure how to fix this. I have two solutions in mind:

1) Prevent the interrupt handler to be threaded. We would also need to
switch from spin_lock to raw_spin_lock as the former may sleep on
RT-Linux.

2) Remove the warning


I think access to evtchn->enabled is racy so (with or without the
warning) we can't use it reliably.


Thinking about it, it would not be the only issue. The ring is sized
to contain only one instance of the same event. So if you receive
twice the event, you may overflow the ring.


Hm... That's another argument in favor of "unthreading" the handler.


I first thought it would be possible to unthread it. However, 
wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep, so this 
cannot be used in an interrupt context.


So I think "unthreading" the handler is not an option here.







Another alternative could be to queue the irq if !evtchn->enabled and
handle it in evtchn_write() (which is where irq is supposed to be
re-enabled).

What do you mean by queue? Is it queueing in the ring?



No, I was thinking about having a new structure for deferred interrupts.


Hmmm, I am not entirely sure what would be the structure here. Could you expand 
your thinking?


Cheers,

--
Julien Grall


Re: xen/evtchn and forced threaded irq

2019-02-20 Thread Boris Ostrovsky
On 2/20/19 9:15 AM, Julien Grall wrote:
> Hi Boris,
>
> Thank you for your answer.
>
> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>> On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:
>>> Hi all,
>>>
>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>> started,
>>> the console is ending to have a lot of warning (see trace below).
>>>
>>> After some investigation, this is because the irq handler will now
>>> be threaded.
>>> I can reproduce the same error with the vanilla Linux when passing
>>> the option
>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>> that has
>>> not RT support).
>>>
>>> FWIW, the interrupt for port 6 is used to for the guest to
>>> communicate with
>>> xenstore.
>>>
>>>  From my understanding, this is happening because the interrupt
>>> handler is now
>>> run in a thread. So we can have the following happening.
>>>
>>>     Interrupt context    | Interrupt thread
>>>  |
>>>     receive interrupt port 6 |
>>>     clear the evtchn port    |
>>>     set IRQF_RUNTHREAD    |
>>>     kick interrupt thread    |
>>>  |    clear IRQF_RUNTHREAD
>>>  |    call evtchn_interrupt
>>>     receive interrupt port 6 |
>>>     clear the evtchn port    |
>>>     set IRQF_RUNTHREAD   |
>>>     kick interrupt thread    |
>>>  |    disable interrupt port 6
>>>  |    evtchn->enabled = false
>>>  |    []
>>>  |
>>>  |    *** Handling the second
>>> interrupt ***
>>>  |    clear IRQF_RUNTHREAD
>>>  |    call evtchn_interrupt
>>>  |    WARN(...)
>>>
>>> I am not entirely sure how to fix this. I have two solutions in mind:
>>>
>>> 1) Prevent the interrupt handler to be threaded. We would also need to
>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>> RT-Linux.
>>>
>>> 2) Remove the warning
>>
>> I think access to evtchn->enabled is racy so (with or without the
>> warning) we can't use it reliably.
>
> Thinking about it, it would not be the only issue. The ring is sized
> to contain only one instance of the same event. So if you receive
> twice the event, you may overflow the ring.

Hm... That's another argument in favor of "unthreading" the handler.

>
>>
>> Another alternative could be to queue the irq if !evtchn->enabled and
>> handle it in evtchn_write() (which is where irq is supposed to be
>> re-enabled).
> What do you mean by queue? Is it queueing in the ring? 


No, I was thinking about having a new structure for deferred interrupts.

-boris

> If so, wouldn't it be racy as described above?





Re: xen/evtchn and forced threaded irq

2019-02-20 Thread Julien Grall

Hi Boris,

Thank you for your answer.

On 20/02/2019 00:02, Boris Ostrovsky wrote:
> On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:
>> Hi all,
>>
>> I have been looking at using Linux RT in Dom0. Once the guest is started,
>> the console ends up with a lot of warnings (see trace below).
>>
>> After some investigation, this is because the irq handler will now be
>> threaded.
>> I can reproduce the same error with vanilla Linux when passing the option
>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7, which
>> has no RT support).
>>
>> FWIW, the interrupt for port 6 is used for the guest to communicate with
>> xenstore.
>>
>> From my understanding, this is happening because the interrupt handler is
>> now run in a thread. So we can have the following happening.
>>
>>   Interrupt context        | Interrupt thread
>>                            |
>>   receive interrupt port 6 |
>>   clear the evtchn port    |
>>   set IRQF_RUNTHREAD       |
>>   kick interrupt thread    |
>>                            |    clear IRQF_RUNTHREAD
>>                            |    call evtchn_interrupt
>>   receive interrupt port 6 |
>>   clear the evtchn port    |
>>   set IRQF_RUNTHREAD       |
>>   kick interrupt thread    |
>>                            |    disable interrupt port 6
>>                            |    evtchn->enabled = false
>>                            |    []
>>                            |
>>                            |    *** Handling the second interrupt ***
>>                            |    clear IRQF_RUNTHREAD
>>                            |    call evtchn_interrupt
>>                            |    WARN(...)
>>
>> I am not entirely sure how to fix this. I have two solutions in mind:
>>
>> 1) Prevent the interrupt handler from being threaded. We would also need to
>> switch from spin_lock to raw_spin_lock as the former may sleep on RT-Linux.
>>
>> 2) Remove the warning
>
> I think access to evtchn->enabled is racy so (with or without the warning) we
> can't use it reliably.

Thinking about it, it would not be the only issue. The ring is sized to contain
only one instance of the same event. So if you receive the event twice, you may
overflow the ring.

> Another alternative could be to queue the irq if !evtchn->enabled and handle it
> in evtchn_write() (which is where irq is supposed to be re-enabled).

What do you mean by queue? Is it queueing in the ring? If so, wouldn't it be
racy as described above?


Cheers,

--
Julien Grall


Re: xen/evtchn and forced threaded irq

2019-02-19 Thread Boris Ostrovsky
On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:
> Hi all,
> 
> I have been looking at using Linux RT in Dom0. Once the guest is started,
> the console ends up with a lot of warnings (see trace below).
> 
> After some investigation, this is because the irq handler will now be
> threaded.
> I can reproduce the same error with vanilla Linux when passing the option
> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7, which
> has no RT support).
> 
> FWIW, the interrupt for port 6 is used for the guest to communicate with
> xenstore.
> 
> From my understanding, this is happening because the interrupt handler is now
> run in a thread. So we can have the following happening.
> 
>   Interrupt context        | Interrupt thread
>                            |
>   receive interrupt port 6 |
>   clear the evtchn port    |
>   set IRQF_RUNTHREAD       |
>   kick interrupt thread    |
>                            |    clear IRQF_RUNTHREAD
>                            |    call evtchn_interrupt
>   receive interrupt port 6 |
>   clear the evtchn port    |
>   set IRQF_RUNTHREAD       |
>   kick interrupt thread    |
>                            |    disable interrupt port 6
>                            |    evtchn->enabled = false
>                            |    []
>                            |
>                            |    *** Handling the second interrupt ***
>                            |    clear IRQF_RUNTHREAD
>                            |    call evtchn_interrupt
>                            |    WARN(...)
> 
> I am not entirely sure how to fix this. I have two solutions in mind:
> 
> 1) Prevent the interrupt handler from being threaded. We would also need to
> switch from spin_lock to raw_spin_lock as the former may sleep on RT-Linux.
> 
> 2) Remove the warning

I think access to evtchn->enabled is racy so (with or without the warning) we 
can't use it reliably.

Another alternative could be to queue the irq if !evtchn->enabled and handle it 
in evtchn_write() (which is where irq is supposed to be re-enabled).
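
Roughly what I have in mind, as a user-space C model rather than a patch.
Every name here (struct defer_model, defer_interrupt(), defer_write(),
DEFER_NR_PORTS) is hypothetical and only stands in for the corresponding
driver paths. The model is single-threaded, so it says nothing about the
locking the real code would need, and it assumes user space reads a port
before writing it back.

```c
#include <assert.h>
#include <stdbool.h>

#define DEFER_NR_PORTS 4   /* hypothetical, for the sketch only */

struct defer_port {
    bool enabled;    /* plays the role of evtchn->enabled */
    bool deferred;   /* an interrupt arrived while the port was disabled */
};

struct defer_model {
    struct defer_port port[DEFER_NR_PORTS];
    unsigned ring[DEFER_NR_PORTS];  /* one slot per port */
    unsigned prod, cons;
};

static void defer_enqueue(struct defer_model *u, unsigned port)
{
    assert(u->prod - u->cons < DEFER_NR_PORTS);  /* would be a ring overflow */
    u->ring[u->prod++ % DEFER_NR_PORTS] = port;
}

/* Handler: the first interrupt disables the port and is queued; a further
 * one is remembered instead of triggering the warning. */
void defer_interrupt(struct defer_model *u, unsigned port)
{
    struct defer_port *p = &u->port[port];

    if (!p->enabled) {
        p->deferred = true;
        return;
    }
    p->enabled = false;
    defer_enqueue(u, port);
}

/* Write path: re-enable the port and replay a deferred interrupt, if any. */
void defer_write(struct defer_model *u, unsigned port)
{
    struct defer_port *p = &u->port[port];

    p->enabled = true;
    if (p->deferred) {
        p->deferred = false;
        defer_interrupt(u, port);
    }
}

/* Read path: pop one port; returns false when the ring is empty. */
bool defer_read(struct defer_model *u, unsigned *port)
{
    if (u->cons == u->prod)
        return false;
    *port = u->ring[u->cons++ % DEFER_NR_PORTS];
    return true;
}
```

Any number of interrupts that arrive while the port is disabled collapse
into the single `deferred` flag, so the port is queued at most once more
when it is written back.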


-boris


> 
> Neither of them is ideal. Do you have an opinion/better suggestion?
> 
> [  127.192087] Interrupt for port 6, but apparently not enabled; per-user 
> 78d39c7f
> [  127.200333] WARNING: CPU: 0 PID: 2553 at drivers/xen/evtchn.c:167 
> evtchn_interrupt+0xfc/0x120
> [  127.208799] Modules linked in:
> [  127.211939] CPU: 0 PID: 2553 Comm: irq/52-evtchn:x Tainted: GW
>   5.0.0-rc7-00023-g2a3d41623699 #1257
> [  127.222374] Hardware name: ARM Juno development board (r2) (DT)
> [  127.228381] pstate: 4005 (nZcv daif -PAN -UAO)
> [  127.233256] pc : evtchn_interrupt+0xfc/0x120
> [  127.237607] lr : evtchn_interrupt+0xfc/0x120
> [  127.241952] sp : 12d2bd60
> [  127.245347] x29: 12d2bd60 x28: 1015d608
> [  127.250741] x27: 8008b39de400 x26: 1015d330
> [  127.256137] x25: 1015d2d4 x24: 0001
> [  127.261532] x23: 1015d570 x22: 0034
> [  127.266926] x21:  x20: 8008b7f02400
> [  127.272322] x19: 8008b3aba000 x18: 0037
> [  127.277717] x17:  x16: 
> [  127.283112] x15: fff0 x14: 
> [  127.288507] x13: 3030303030207265 x12: 000c
> [  127.293902] x11: 10ea0a88 x10: 
> [  127.299297] x9 : fffb9fff x8 : 
> [  127.304701] x7 : 0001 x6 : 8008bac5c240
> [  127.310087] x5 : 8008bac5c240 x4 : 
> [  127.315483] x3 : 8008bac64708 x2 : 8008b39de400
> [  127.320877] x1 : 893ac9b38837b800 x0 : 
> [  127.326272] Call trace:
> [  127.328802]  evtchn_interrupt+0xfc/0x120
> [  127.332804]  irq_forced_thread_fn+0x38/0x98
> [  127.337066]  irq_thread+0x190/0x238
> [  127.340636]  kthread+0x134/0x138
> [  127.343942]  ret_from_fork+0x10/0x1c
> [  127.347593] ---[ end trace 1d3fa385877cc18b ]---
> 
> 
> Cheers,
> 
> -- 
> Julien Grall