Re: RTOS inmate misses interrupts

2022-12-29 Thread Nir Geller
Hi Ralf,

Do you know of any difference in the way SPIs and PPIs are treated by the
hypervisor?
A timer that is specific to a certain core is wired to the GIC as a PPI. Would
its processing be influenced by heavy load running on another core?

The GIC would face the same load, but handle a PPI instead of an SPI. What
would be the implications for performance?
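
For reference, this is the split I have in mind: the GIC architecture assigns interrupt IDs 0-15 to SGIs, 16-31 to PPIs (private to one core) and 32 and above to SPIs. A small sketch of that classification follows; the enum and function names are made up for illustration and are not taken from Jailhouse.

```c
#include <stdint.h>

/* Interrupt classes defined by the GIC architecture (GICv2 ID ranges). */
enum gic_irq_class { GIC_SGI, GIC_PPI, GIC_SPI, GIC_SPECIAL };

/* Hypothetical helper: classify a GIC interrupt ID.
 * IDs 0-15 are SGIs, 16-31 are PPIs (private to one core),
 * 32-1019 are SPIs (shared, routed by the distributor),
 * 1020-1023 are reserved for special purposes (1023 = spurious). */
static enum gic_irq_class gic_classify(uint32_t irq_id)
{
    if (irq_id < 16)
        return GIC_SGI;
    if (irq_id < 32)
        return GIC_PPI;
    if (irq_id < 1020)
        return GIC_SPI;
    return GIC_SPECIAL;
}
```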

Thanks a lot,

Nir.

On Wed, Dec 14, 2022 at 11:09 AM Rasty Slutsker  wrote:

> Hi,
> Exactly, I am referring to 'jailhouse cell stats'.
> We enabled the TSC in hypervisor init and printed its value to dmesg.
> But *after* Linux starts, it gets disabled again.
> *** Is there a place in the hypervisor that runs in kernel space after
> Linux starts where we can re-enable the TSC?
>
> We would like to measure duration of Hypervisor ISR  with TSC and look for
> anomalies.
> Anomalies like unusual duration of ISR processing and Irregularities in
> ISR entry for particular request number.
>
> Other suggestions would be highly appreciated.
>
> Thanks
> Rasty
>
> On Tuesday, December 13, 2022 at 11:27:50 PM UTC+2 Ralf Ramsauer wrote:
>
>> Hi Rasty,
>>
>> (reply-to-all :) )
>>
>> On 13/12/2022 19:31, Rasty Slutsker wrote:
>> > We learned how you export some statistics from Jailhouse as you guys do
>> > and added 3 variables
>>
>> You reference to 'jailhouse cell stats'?
>>
>> > 1. At the entry  Jailhouse IRQ (if irq==xxx counter ++)
>> > 2. At injection point of the same IRQ to inmate, still in Jailhouse
>> > 3. At the beginning of ISR in inmate (RTOS).
>>
>> Ok, but that just counts the interrupts, and not the occurring delays,
>> right?
>>
>> >
>> > We let system run, introduce some load to linux. We see physical
>> effects
>> > that suggest that we lose interrupts.
>> > We confirm it with difference in performance counters (gaps in quanta
>> of
>> > 30 uSecs) that we sample in inmate ISR.
>> > Then we kill interrupt.
>> > All 3 counters are the same. Amount matches interrupt rate.
>> > My conclusion that Interrupt request is lost.
>>
>> Could be the case. I don't know what happens if the jitter gets too long
>> between interrupts.
>>
>> >
>> > Another question.
>> > We try to read CPU time stamp counter from Jailhouse ISR . We get 0.
>> > mrc p15, #0, r0, c9, c13, #0
>>
>> That's the PMCCNTR, right?
>>
>> >
>> > Any idea why? Do you have some code for that?
>>
>> Uhm, I would have to read the reference manual as well. Does reading the
>> TSC work in Linux? Maybe it has to be activated or enabled for the
>> hypervisor?
>>
>>
>> https://developer.arm.com/documentation/ddi0406/b/Debug-Architecture/Debug-Registers-Reference/Performance-monitor-registers/c9--Count-Enable-Set-Register--PMCNTENSET-?lang=en
>>
>> Thanks,
>> Ralf
>>
>> >
>> > Best regards
>> > Rasty
>> >
>> >
>> >
>> > On Tuesday, December 13, 2022 at 3:47:25 PM UTC+2 Ralf Ramsauer wrote:
>> >
>> > Hi Rasty,
>> >
>> > Please reply-to-all, then your reply will also pop up in my Inbox.
>> >
>> > On 10/12/2022 08:52, Rasty Slutsker wrote:
>> > > Hi,
>> > > We did some performance measurements.
>> > > Added counters in 3 places (per Irq source)
>> > > 1. entry to jailhouse ISR
>> > > 2. dispatch of interrupt to particular vector to particular core
>> > > 3. in RTOS isr.
>> >
>> > Okay. How do you read and dump them? I hope after everything is done.
>> >
>> > Take care that if you dump them immediately to the uart, this will
>> > consume a lot of time and cause significant delay. ('heavy' printk
>> > logic
>> > + busy waiting for the uart)
>> >
>> > >
>> > > *We see that all 3 counters have the same value*, but we measure
>> > time
>> >
>> > huh? What counters do you access? There's something odd if they hold
>> > the
>> > same value at different places.
>> >
>> > > gaps in RTOS in ISR invocation times, sometimes upto 60 uSec.
>> > >
>> > > It means that either
>> > > a) interrupt request is lost. But, according to setup it is
>> > > edge-triggered, it cannot be lost, just delayed.
>> > > b) there is a delay of more than 60 usec in jailhouse ISR.
>> > >
>> > > questions:
>> > > 1. Is it possible that jailhouse interrupt dispatching routine
>> > enters
>> > > some loop that takes considerable amount of time?
>> >
>> > If you use printk during dispatching for debugging - yes. Otherwise, I
>> > guess no. Not 60µs.
>> >
>> > > 2. What would be explanation of interrupt latency of 60 uSecs?
>> > Even if we
>> > > take into account cache line refill we get much lower number,
>> > which do
>> > > not reach tens uSecs.
>> >
>> > Ack, I would definitely not expect 60µs delay for IRQ reinjection.
>> >
>> > Thanks
>> > Ralf
>> >
>> > >
>> > > Best regards
>> > > Rasty
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Tuesday, December 6, 2022 at 6:01:20 PM UTC+2 Ralf Ramsauer
>> > wrote:
>> > >
>> > > Hi,
>> > >
>> > > On 05/12/2022 17:30, Rasty Slutsker wrote:
>> > > > Hi Ralf,
>> > > > Thank you for the answer.
>> > > > We have periodic interrupt each 30 u(!)Sec. Linux cannot deal
>> > > with such
>> > > > rate, so we need 

Re: RTOS inmate misses interrupts

2022-12-14 Thread Rasty Slutsker
Hi,
Exactly, I am referring to 'jailhouse cell stats'.
We enabled the TSC in hypervisor init and printed its value to dmesg.
But *after* Linux starts, it gets disabled again.
*** Is there a place in the hypervisor that runs in kernel space after Linux
starts where we can re-enable the TSC?

We would like to measure the duration of the hypervisor ISR with the TSC and
look for anomalies, such as an unusual duration of ISR processing and
irregularities in ISR entry for a particular request number.
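
To make the idea concrete, here is a rough sketch of the bookkeeping we have in mind. It assumes a free-running 32-bit cycle counter sampled at ISR entry and exit; the struct, the function names and the threshold are hypothetical and not existing Jailhouse code.

```c
#include <stdint.h>

/* Hypothetical per-IRQ statistics kept by the measurement code. */
struct isr_stats {
    uint32_t max_cycles;   /* longest ISR duration seen so far */
    uint32_t anomalies;    /* number of ISR runs above the threshold */
    uint32_t samples;      /* number of ISR runs recorded */
};

/* Unsigned subtraction stays correct across a single 32-bit wrap
 * of the cycle counter. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Record one ISR run; threshold_cycles is an assumed tuning value. */
static void isr_stats_record(struct isr_stats *s, uint32_t start,
                             uint32_t end, uint32_t threshold_cycles)
{
    uint32_t d = cycles_elapsed(start, end);

    s->samples++;
    if (d > s->max_cycles)
        s->max_cycles = d;
    if (d > threshold_cycles)
        s->anomalies++;
}
```

The statistics would be read out only after the run, so that the measurement itself does not add latency.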

Other suggestions would be highly appreciated.

Thanks
Rasty

On Tuesday, December 13, 2022 at 11:27:50 PM UTC+2 Ralf Ramsauer wrote:

> Hi Rasty,
>
> (reply-to-all :) )
>
> On 13/12/2022 19:31, Rasty Slutsker wrote:
> > We learned how you export some statistics from Jailhouse as you guys do 
> > and added 3 variables
>
> You reference to 'jailhouse cell stats'?
>
> > 1. At the entry  Jailhouse IRQ (if irq==xxx counter ++)
> > 2. At injection point of the same IRQ to inmate, still in Jailhouse
> > 3. At the beginning of ISR in inmate (RTOS).
>
> Ok, but that just counts the interrupts, and not the occurring delays,
> right?
>
> > 
> > We let system run, introduce some load to linux. We see physical effects 
> > that suggest that we lose interrupts.
> > We confirm it with difference in performance counters (gaps in quanta of 
> > 30 uSecs) that we sample in inmate ISR.
> > Then we kill interrupt.
> > All 3 counters are the same. Amount matches interrupt rate.
> > My conclusion that Interrupt request is lost.
>
> Could be the case. I don't know what happens if the jitter gets too long 
> between interrupts.
>
> > 
> > Another question.
> > We try to read CPU time stamp counter from Jailhouse ISR . We get 0.
> > mrc p15, #0, r0, c9, c13, #0
>
> That's the PMCCNTR, right?
>
> > 
> > Any idea why? Do you have some code for that?
>
> Uhm, I would have to read the reference manual as well. Does reading the 
> TSC work in Linux? Maybe it has to be activated or enabled for the 
> hypervisor?
>
>
> https://developer.arm.com/documentation/ddi0406/b/Debug-Architecture/Debug-Registers-Reference/Performance-monitor-registers/c9--Count-Enable-Set-Register--PMCNTENSET-?lang=en
>
> Thanks,
> Ralf
>
> > 
> > Best regards
> > Rasty
> > 
> > 
> > 
> > On Tuesday, December 13, 2022 at 3:47:25 PM UTC+2 Ralf Ramsauer wrote:
> > 
> > Hi Rasty,
> > 
> > Please reply-to-all, then your reply will also pop up in my Inbox.
> > 
> > On 10/12/2022 08:52, Rasty Slutsker wrote:
> > > Hi,
> > > We did some performance measurements.
> > > Added counters in 3 places (per Irq source)
> > > 1. entry to jailhouse ISR
> > > 2. dispatch of interrupt to particular vector to particular core
> > > 3. in RTOS isr.
> > 
> > Okay. How do you read and dump them? I hope after everything is done.
> > 
> > Take care that if you dump them immediately to the uart, this will
> > consume a lot of time and cause significant delay. ('heavy' printk
> > logic
> > + busy waiting for the uart)
> > 
> > >
> > > *We see that all 3 counters have the same value*, but we measure
> > time
> > 
> > huh? What counters do you access? There's something odd if they hold
> > the
> > same value at different places.
> > 
> > > gaps in RTOS in ISR invocation times, sometimes upto 60 uSec.
> > >
> > > It means that either
> > > a) interrupt request is lost. But, according to setup it is
> > > edge-triggered, it cannot be lost, just delayed.
> > > b) there is a delay of more than 60 usec in jailhouse ISR.
> > >
> > > questions:
> > > 1. Is it possible that jailhouse interrupt dispatching routine
> > enters
> > > some loop that takes considerable amount of time?
> > 
> > If you use printk during dispatching for debugging - yes. Otherwise, I
> > guess no. Not 60µs.
> > 
> > > 2. What would be explanation of interrupt latency of 60 uSecs?
> > Even if we
> > > take into account cache line refill we get much lower number,
> > which do
> > > not reach tens uSecs.
> > 
> > Ack, I would definitely not expect 60µs delay for IRQ reinjection.
> > 
> > Thanks
> > Ralf
> > 
> > >
> > > Best regards
> > > Rasty
> > >
> > >
> > >
> > >
> > >
> > > On Tuesday, December 6, 2022 at 6:01:20 PM UTC+2 Ralf Ramsauer
> > wrote:
> > >
> > > Hi,
> > >
> > > On 05/12/2022 17:30, Rasty Slutsker wrote:
> > > > Hi Ralf,
> > > > Thank you for the answer.
> > > > We have periodic interrupt each 30 u(!)Sec. Linux cannot deal
> > > with such
> > > > rate, so we need hypervisor/RTOS.
> > >
> > > I understand.
> > >
> > > > We managed to read a code of hypervisor. It appears that all
> > > interrupts
> > > > to all cores are intercepted by hypervisor and then forwarded to
> > > guests
> > > > (per core).
> > >
> > > Yes, exactly, that's the case if you don't have SDEI. If you have a
> > > platform that would come with SDEI, then you have of course less
> > > overhead.
> > >
> > > > If we reduce interrupt priority of mentioned interrupt (as you
> > > suggest)
> > > > we lose even more interrupts, even without stress.
> > > > Interrupt is 

Re: RTOS inmate misses interrupts

2022-12-13 Thread Ralf Ramsauer

Hi Rasty,

(reply-to-all :) )

On 13/12/2022 19:31, Rasty Slutsker wrote:
> We learned how you export some statistics from Jailhouse as you guys do
> and added 3 variables

Are you referring to 'jailhouse cell stats'?

> 1. At the entry  Jailhouse IRQ (if irq==xxx counter ++)
> 2. At injection point of the same IRQ to inmate, still in Jailhouse
> 3. At the beginning of ISR in inmate (RTOS).

Ok, but that just counts the interrupts, and not the occurring delays, right?

> We let system run, introduce some load to linux. We see physical effects
> that suggest that we lose interrupts.
> We confirm it with difference in performance counters (gaps in quanta of
> 30 uSecs) that we sample in inmate ISR.
> Then we kill interrupt.
> All 3 counters are the same. Amount matches interrupt rate.
> My conclusion that Interrupt request is lost.

Could be the case. I don't know what happens if the jitter gets too long
between interrupts.

> Another question.
> We try to read CPU time stamp counter from Jailhouse ISR . We get 0.
> mrc     p15, #0, r0, c9, c13, #0

That's the PMCCNTR, right?

> Any idea why? Do you have some code for that?

Uhm, I would have to read the reference manual as well. Does reading the
TSC work in Linux? Maybe it has to be activated or enabled for the
hypervisor?


https://developer.arm.com/documentation/ddi0406/b/Debug-Architecture/Debug-Registers-Reference/Performance-monitor-registers/c9--Count-Enable-Set-Register--PMCNTENSET-?lang=en
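
For what it's worth, a minimal sketch of enabling and reading the cycle counter on ARMv7-A could look like the following. It assumes the code runs at a privilege level that is allowed to access the PMU registers. The function names are made up for illustration, and the register accesses are compiled only for 32-bit ARM targets. Whether further controls apply in your setup is something the reference manual would have to confirm.

```c
#include <stdint.h>

#define PMCR_E        (1u << 0)    /* PMCR.E: enable all counters */
#define PMCR_C        (1u << 2)    /* PMCR.C: reset the cycle counter */
#define PMCNTENSET_C  (1u << 31)   /* PMCNTENSET.C: enable cycle counter */

/* New PMCR value: keep the old bits, enable counting and reset the
 * cycle counter to zero. */
static inline uint32_t pmcr_enable_value(uint32_t old)
{
    return old | PMCR_E | PMCR_C;
}

#if defined(__arm__)
/* Hypothetical helper: switch on the cycle counter (PMCCNTR). */
static inline void pmu_enable_ccnt(void)
{
    uint32_t v;

    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(v));   /* PMCR */
    v = pmcr_enable_value(v);
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(v));
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 1"               /* PMCNTENSET */
                     : : "r"(PMCNTENSET_C));
    __asm__ volatile("isb" : : : "memory");
}

/* Hypothetical helper: read PMCCNTR, same encoding as the mrc above. */
static inline uint32_t pmu_read_ccnt(void)
{
    uint32_t v;

    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(v));
    return v;
}
#endif
```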

Thanks,
  Ralf



> Best regards
> Rasty



On Tuesday, December 13, 2022 at 3:47:25 PM UTC+2 Ralf Ramsauer wrote:

Hi Rasty,

Please reply-to-all, then your reply will also pop up in my Inbox.

On 10/12/2022 08:52, Rasty Slutsker wrote:
 > Hi,
 > We did some performance measurements.
 > Added counters in 3 places (per Irq source)
 > 1. entry to jailhouse ISR
 > 2. dispatch of interrupt to particular vector to particular core
 > 3. in RTOS isr.

Okay. How do you read and dump them? I hope after everything is done.

Take care that if you dump them immediately to the uart, this will
consume a lot of time and cause significant delay. ('heavy' printk
logic
+ busy waiting for the uart)

 >
 > *We see that all 3 counters have the same value*, but we measure
time

huh? What counters do you access? There's something odd if they hold
the
same value at different places.

 > gaps in RTOS in ISR invocation times, sometimes upto 60 uSec.
 >
 > It means that either
 > a) interrupt request is lost. But, according to setup it is
 > edge-triggered, it cannot be lost, just delayed.
 > b) there is a delay of more than 60 usec in jailhouse ISR.
 >
 > questions:
 > 1. Is it possible that jailhouse interrupt dispatching routine
enters
 > some loop that takes considerable amount of time?

If you use printk during dispatching for debugging - yes. Otherwise, I
guess no. Not 60µs.

 > 2. What would be explanation of interrupt latency of 60 uSecs?
Even if we
 > take into account cache line refill we get much lower number,
which do
 > not reach tens uSecs.

Ack, I would definitely not expect 60µs delay for IRQ reinjection.

Thanks
Ralf

 >
 > Best regards
 > Rasty
 >
 >
 >
 >
 >
 > On Tuesday, December 6, 2022 at 6:01:20 PM UTC+2 Ralf Ramsauer
wrote:
 >
 > Hi,
 >
 > On 05/12/2022 17:30, Rasty Slutsker wrote:
 > > Hi Ralf,
 > > Thank you for the answer.
 > > We have periodic interrupt each 30 u(!)Sec. Linux cannot deal
 > with such
 > > rate, so we need hypervisor/RTOS.
 >
 > I understand.
 >
 > > We managed to read a code of hypervisor. It appears that all
 > interrupts
 > > to all cores are intercepted by hypervisor and then forwarded to
 > guests
 > > (per core).
 >
 > Yes, exactly, that's the case if you don't have SDEI. If you have a
 > platform that would come with SDEI, then you have of course less
 > overhead.
 >
 > > If we reduce interrupt priority of mentioned interrupt (as you
 > suggest)
 > > we lose even more interrupts, even without stress.
 > > Interrupt is defined as edge triggered, I assumed that it is
 > memorized
 > > by gic until serviced.
 > > Is it possible that Hypervisor acknowledges pending interrupt
while
 > > servicing interrupt from another source ? Kind of race - 2
 > interrupts
 > > for 2 cores arrive nearly simultaneously. One is lost.
 >
 > The EOIR and IAR registers of the GIC are core-local registers of
the
 > GIC CPU interface (GICC), so I wonder how this should cause a race,
 > unless there is a hard logical mistake in the code, which I doubt.
 >
 >
 > What you could try to do for debugging purposes:
 >
 > 1. Slow down from 30µs to something slower, which you can handle
 > even under load. Say 

Re: RTOS inmate misses interrupts

2022-12-13 Thread Rasty Slutsker
We learned how Jailhouse exports its statistics and, in the same way,
added 3 counters:
1. At entry to the Jailhouse IRQ handler (if irq==xxx counter ++)
2. At the injection point of the same IRQ into the inmate, still in Jailhouse
3. At the beginning of the ISR in the inmate (RTOS).

We let the system run and introduce some load on Linux. We see physical
effects that suggest that we lose interrupts.
We confirm it with differences in performance counters (gaps in quanta of 30
uSecs) that we sample in the inmate ISR.
Then we kill the interrupt.
All 3 counters are the same. The amount matches the interrupt rate.
My conclusion is that the interrupt request is lost.
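
The comparison can be sketched roughly as follows. It assumes a strictly periodic source and a known run time; the struct and function names are hypothetical and only illustrate the check, they are not our actual code.

```c
#include <stdint.h>

/* Hypothetical snapshot of the three per-IRQ counters described above. */
struct irq_counters {
    uint64_t hyp_entry;    /* 1. entry to the Jailhouse IRQ handler */
    uint64_t hyp_inject;   /* 2. injection into the inmate */
    uint64_t rtos_isr;     /* 3. start of the RTOS ISR */
};

/* Number of interrupts a periodic source should have raised.
 * period_us must be non-zero. */
static uint64_t expected_irqs(uint64_t elapsed_us, uint64_t period_us)
{
    return elapsed_us / period_us;
}

/* Interrupts that never reached the hypervisor at all. */
static uint64_t lost_before_hyp(const struct irq_counters *c,
                                uint64_t elapsed_us, uint64_t period_us)
{
    uint64_t expected = expected_irqs(elapsed_us, period_us);

    return expected > c->hyp_entry ? expected - c->hyp_entry : 0;
}

/* Interrupts seen by the hypervisor but never handled by the RTOS. */
static uint64_t lost_after_hyp(const struct irq_counters *c)
{
    return c->hyp_entry > c->rtos_isr ? c->hyp_entry - c->rtos_isr : 0;
}
```

If all three counters agree but fall short of the expected count, the request was dropped before the hypervisor saw it.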

Another question.
We try to read CPU time stamp counter from Jailhouse ISR . We get 0.
mrc p15, #0, r0, c9, c13, #0

Any idea why? Do you have some code for that?

Best regards
Rasty



On Tuesday, December 13, 2022 at 3:47:25 PM UTC+2 Ralf Ramsauer wrote:

> Hi Rasty,
>
> Please reply-to-all, then your reply will also pop up in my Inbox.
>
> On 10/12/2022 08:52, Rasty Slutsker wrote:
> > Hi,
> > We did some performance measurements.
> > Added counters in 3 places (per Irq source)
> > 1. entry to jailhouse ISR
> > 2. dispatch of interrupt to particular vector to particular core
> > 3. in RTOS isr.
>
> Okay. How do you read and dump them? I hope after everything is done.
>
> Take care that if you dump them immediately to the uart, this will 
> consume a lot of time and cause significant delay. ('heavy' printk logic 
> + busy waiting for the uart)
>
> > 
> > *We see that all 3 counters have the same value*, but we measure time 
>
> huh? What counters do you access? There's something odd if they hold the 
> same value at different places.
>
> > gaps in RTOS in ISR invocation times, sometimes upto 60 uSec.
> > 
> > It means that either
> > a) interrupt request is lost. But, according to setup it is 
> > edge-triggered, it cannot be lost, just delayed.
> > b) there is a delay of more than 60 usec in jailhouse ISR.
> > 
> > questions:
> > 1. Is it possible that jailhouse interrupt dispatching routine enters 
> > some loop that takes considerable amount of time?
>
> If you use printk during dispatching for debugging - yes. Otherwise, I 
> guess no. Not 60µs.
>
> > 2. What would be explanation of interrupt latency of 60 uSecs? Even if we
> > take into account cache line refill we get much lower number, which do 
> > not reach tens uSecs.
>
> Ack, I would definitely not expect 60µs delay for IRQ reinjection.
>
> Thanks
> Ralf
>
> > 
> > Best regards
> > Rasty
> > 
> > 
> > 
> > 
> > 
> > On Tuesday, December 6, 2022 at 6:01:20 PM UTC+2 Ralf Ramsauer wrote:
> > 
> > Hi,
> > 
> > On 05/12/2022 17:30, Rasty Slutsker wrote:
> > > Hi Ralf,
> > > Thank you for the answer.
> > > We have periodic interrupt each 30 u(!)Sec. Linux cannot deal
> > with such
> > > rate, so we need hypervisor/RTOS.
> > 
> > I understand.
> > 
> > > We managed to read a code of hypervisor. It appears that all
> > interrupts
> > > to all cores are intercepted by hypervisor and then forwarded to
> > guests
> > > (per core).
> > 
> > Yes, exactly, that's the case if you don't have SDEI. If you have a
> > platform that would come with SDEI, then you have of course less
> > overhead.
> > 
> > > If we reduce interrupt priority of mentioned interrupt (as you
> > suggest)
> > > we lose even more interrupts, even without stress.
> > > Interrupt is defined as edge triggered, I assumed that it is
> > memorized
> > > by gic until serviced.
> > > Is it possible that Hypervisor acknowledges pending interrupt while
> > > servicing interrupt from another source ? Kind of race - 2
> > interrupts
> > > for 2 cores arrive nearly simultaneously. One is lost.
> > 
> > The EOIR and IAR registers of the GIC are core-local registers of the
> > GIC CPU interface (GICC), so I wonder how this should cause a race,
> > unless there is a hard logical mistake in the code, which I doubt.
> > 
> > 
> > What you could try to do for debugging purposes:
> > 
> > 1. Slow down from 30µs to something slower, which you can handle
> > even under load. Say 100µs, 500µs, something like that.
> > 
> > 2. Measure the jitter x between arrival of the interrupt, and final
> > acknowledgement in your RTOS. You can use performance monitoring
> > registers, or watch CPU cycle counters, whatever. Repeat the
> > measurement, w/ load and w/o load on Linux-side.
> > 
> > 3. If max(x) >= 30µs, then you know where your IRQs go in case of a
> > periodic cycle of 30µs.
> > 
> > 
> > Reason: What I did some while ago, is measuring the Jitter of
> > Linux+Jailhouse on ARM systems with cyclictest. On a Jetson TX1
> > platform, for example, we saw Jitter up 50µs. So there's IRQ
> > reinjection, a full Linux stack and some userspace application
> > involved,
> > so three context switches and lots of code. You have probably two
> > context switches and less code, as you use a RTOS, but I think
> > there's a
> > certain chance to exceed 30µs.
> > 
> > What my gut feeling tells me is that 

Re: RTOS inmate misses interrupts

2022-12-13 Thread Ralf Ramsauer

Hi Rasty,

Please reply-to-all, then your reply will also pop up in my Inbox.

On 10/12/2022 08:52, Rasty Slutsker wrote:

> Hi,
> We did some performance measurements.
> Added counters in 3 places (per Irq source)
> 1. entry to jailhouse ISR
> 2. dispatch of interrupt to particular vector to particular core
> 3. in RTOS isr.


Okay. How do you read and dump them? I hope after everything is done.

Take care that if you dump them immediately to the uart, this will 
consume a lot of time and cause significant delay. ('heavy' printk logic 
+ busy waiting for the uart)
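
One way to avoid that overhead, sketched here with made-up names and an assumed buffer size: store raw timestamps in a small in-memory ring buffer from the ISR and print them only after the measurement run.

```c
#include <stdint.h>

#define TRACE_SLOTS 1024u            /* assumed size, must be a power of two */

/* Hypothetical trace buffer: filled in the ISR, dumped after the run. */
struct trace_buf {
    uint32_t stamp[TRACE_SLOTS];
    uint32_t head;                   /* total number of samples written */
};

/* Cheap enough for an ISR: one store and one increment, no UART access. */
static void trace_record(struct trace_buf *t, uint32_t cycles)
{
    t->stamp[t->head & (TRACE_SLOTS - 1u)] = cycles;
    t->head++;
}

/* Number of valid samples currently held (oldest ones get overwritten). */
static uint32_t trace_count(const struct trace_buf *t)
{
    return t->head < TRACE_SLOTS ? t->head : TRACE_SLOTS;
}
```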




> *We see that all 3 counters have the same value*, but we measure time

huh? What counters do you access? There's something odd if they hold the
same value at different places.

> gaps in RTOS in ISR invocation times, sometimes up to 60 uSec.
>
> It means that either
> a) interrupt request is lost. But, according to setup it is
> edge-triggered, it cannot be lost, just delayed.
> b) there is a delay of more than 60 usec in jailhouse ISR.
>
> questions:
> 1. Is it possible that jailhouse interrupt dispatching routine enters
> some loop that takes considerable amount of time?

If you use printk during dispatching for debugging - yes. Otherwise, I
guess no. Not 60µs.

> 2. What would be explanation of interrupt latency of 60 uSecs? Even if we
> take into account cache line refill we get much lower number, which do
> not reach tens uSecs.

Ack, I would definitely not expect 60µs delay for IRQ reinjection.

Thanks
  Ralf



> Best regards
> Rasty





On Tuesday, December 6, 2022 at 6:01:20 PM UTC+2 Ralf Ramsauer wrote:

Hi,

On 05/12/2022 17:30, Rasty Slutsker wrote:
 > Hi Ralf,
 > Thank you for the answer.
 > We have periodic interrupt each 30 u(!)Sec. Linux cannot deal
with such
 > rate, so we need hypervisor/RTOS.

I understand.

 > We managed to read a code of hypervisor. It appears that all
interrupts
 > to all cores are intercepted by hypervisor and then forwarded to
guests
 > (per core).

Yes, exactly, that's the case if you don't have SDEI. If you have a
platform that would come with SDEI, then you have of course less
overhead.

 > If we reduce interrupt priority of mentioned interrupt (as you
suggest)
 > we lose even more interrupts, even without stress.
 > Interrupt is defined as edge triggered, I assumed that it is
memorized
 > by gic until serviced.
 > Is it possible that Hypervisor acknowledges pending interrupt while
 > servicing interrupt from another source ? Kind of race - 2
interrupts
 > for 2 cores arrive nearly simultaneously. One is lost.

The EOIR and IAR registers of the GIC are core-local registers of the
GIC CPU interface (GICC), so I wonder how this should cause a race,
unless there is a hard logical mistake in the code, which I doubt.


What you could try to do for debugging purposes:

1. Slow down from 30µs to something slower, which you can handle
even under load. Say 100µs, 500µs, something like that.

2. Measure the jitter x between arrival of the interrupt, and final
acknowledgement in your RTOS. You can use performance monitoring
registers, or watch CPU cycle counters, whatever. Repeat the
measurement, w/ load and w/o load on Linux-side.

3. If max(x) >= 30µs, then you know where your IRQs go in case of a
periodic cycle of 30µs.


Reason: What I did some while ago, is measuring the Jitter of
Linux+Jailhouse on ARM systems with cyclictest. On a Jetson TX1
platform, for example, we saw Jitter up to 50µs. So there's IRQ
reinjection, a full Linux stack and some userspace application
involved,
so three context switches and lots of code. You have probably two
context switches and less code, as you use a RTOS, but I think
there's a
certain chance to exceed 30µs.

What my gut feeling tells me is that you manage to hold those 30µs if
Linux is quiet. As soon as there's some stress on the system bus, and
even on shared caches, you exceed your deadline.

Thanks
Ralf

 >
 > Best regards
 > Rasty
 >
 > On Monday, December 5, 2022 at 5:14:37 PM UTC+2 Ralf Ramsauer wrote:
 >
 > Hi Nir,
 >
 > On 29/11/2022 14:21, nirge...@gmail.com wrote:
 > > Hi there,
 > >
 > > Our target is Sitara AM5726 , CortexA15 dual core on which we are
 > > running Linux on A15 core0 and RTOS on core1.
 > >
 > > __
 > >
 > > RTOS gets periodic interrupt from external hardware via nirq1 pin
 > > (dedicated input into ARM gic).
 > >
 > > Under heavy load in Linux (core 0!), RTOS, which runs on core1
 > misses
 > > interrupts.
 >
 > Uhm. Can you reconstruct that issue w/o Jailhouse under Linux?
 >
 > I mean, can you set the SMP affinity of that IRQ to core 1 under
Linux,
 > and then write some test application running on core 1 that just
 > receives the 

Re: RTOS inmate misses interrupts

2022-12-09 Thread Rasty Slutsker
Hi,
We did some performance measurements.
Added counters in 3 places (per Irq source)
1. entry to jailhouse ISR
2. dispatch of interrupt to particular vector to particular core
3. in RTOS isr.

*We see that all 3 counters have the same value*, but we measure time gaps
between ISR invocations in the RTOS, sometimes up to 60 uSec.

It means that either
a) interrupt request is lost. But, according to setup it is edge-triggered, 
it cannot be lost, just delayed.
b) there is a delay of more than 60 usec in jailhouse ISR.

questions:
1. Is it possible that jailhouse interrupt dispatching routine enters some 
loop that takes considerable amount of time?
2. What would explain an interrupt latency of 60 uSecs? Even if we
take cache line refills into account, we get a much lower number, which does
not reach tens of uSecs.

Best regards
Rasty



 

On Tuesday, December 6, 2022 at 6:01:20 PM UTC+2 Ralf Ramsauer wrote:

> Hi,
>
> On 05/12/2022 17:30, Rasty Slutsker wrote:
> > Hi Ralf,
> > Thank you for the answer.
> > We have periodic interrupt each 30 u(!)Sec. Linux cannot deal with such 
> > rate, so we need hypervisor/RTOS.
>
> I understand.
>
> > We managed to read a code of hypervisor. It appears that all interrupts 
> > to all cores are intercepted by hypervisor and then forwarded to guests 
> > (per core).
>
> Yes, exactly, that's the case if you don't have SDEI. If you have a 
> platform that would come with SDEI, then you have of course less overhead.
>
> > If we reduce interrupt priority of mentioned interrupt (as you suggest) 
> > we lose even more interrupts, even without stress.
> > Interrupt is defined as edge triggered, I assumed that it is memorized 
> > by gic until serviced.
> > Is it possible that Hypervisor acknowledges pending interrupt while 
> > servicing interrupt from another source ? Kind of race - 2 interrupts 
> > for 2 cores arrive nearly simultaneously. One is lost.
>
> The EOIR and IAR registers of the GIC are core-local registers of the 
> GIC CPU interface (GICC), so I wonder how this should cause a race, 
> unless there is a hard logical mistake in the code, which I doubt.
>
>
> What you could try to do for debugging purposes:
>
> 1. Slow down from 30µs to something slower, which you can handle
> even under load. Say 100µs, 500µs, something like that.
>
> 2. Measure the jitter x between arrival of the interrupt, and final
> acknowledgement in your RTOS. You can use performance monitoring
> registers, or watch CPU cycle counters, whatever. Repeat the
> measurement, w/ load and w/o load on Linux-side.
>
> 3. If max(x) >= 30µs, then you know where your IRQs go in case of a
> periodic cycle of 30µs.
>
>
> Reason: What I did some while ago, is measuring the Jitter of 
> Linux+Jailhouse on ARM systems with cyclictest. On a Jetson TX1 
> platform, for example, we saw Jitter up to 50µs. So there's IRQ
> reinjection, a full Linux stack and some userspace application involved, 
> so three context switches and lots of code. You have probably two 
> context switches and less code, as you use a RTOS, but I think there's a 
> certain chance to exceed 30µs.
>
> What my gut feeling tells me is that you manage to hold those 30µs if 
> Linux is quiet. As soon as there's some stress on the system bus, and 
> even on shared caches, you exceed your deadline.
>
> Thanks
> Ralf
>
> > 
> > Best regards
> > Rasty
> > 
> > On Monday, December 5, 2022 at 5:14:37 PM UTC+2 Ralf Ramsauer wrote:
> > 
> > Hi Nir,
> > 
> > On 29/11/2022 14:21, nirge...@gmail.com wrote:
> > > Hi there,
> > >
> > > Our target is Sitara AM5726 , CortexA15 dual core on which we are
> > > running Linux on A15 core0 and RTOS on core1.
> > >
> > > __
> > >
> > > RTOS gets periodic interrupt from external hardware via nirq1 pin
> > > (dedicated input into ARM gic).
> > >
> > > Under heavy load in Linux (core 0!), RTOS, which runs on core1
> > misses
> > > interrupts.
> > 
> > Uhm. Can you reconstruct that issue w/o Jailhouse under Linux?
> > 
> > I mean, can you set the SMP affinity of that IRQ to core 1 under Linux,
> > and then write some test application running on core 1 that just
> > receives the IRQ. If that issue happens under Linux as well, then you
> > know that the issue has probably nothing to do with Jailhouse.
> > 
> > 
> > What also might happen: If there's enough pressure on the shared system
> > bus when Linux is under load, then you simply lose those IRQs as the
> > RTOS doesn't have enough time to handle them. You can test this
> > hypothesis if you lower the frequency of the periodic interrupt. If you
> > still lose IRQs, then this should not be the case.
> > 
> > >
> > > Questions
> > >
> > > 1. Does linux/hypervisor participate in interrupt
> > scheduling/forwarding
> > > to cell on Core1
> > 
> > Linux: No, Linux does not participate in anything that is going on on
> > CPU 1. That's the idea behind Jailhouse.
> > 
> > Jailhouse: Maybe. On ARM platforms, Jailhouse needs to reinject the
> > 

Re: RTOS inmate misses interrupts

2022-12-06 Thread Ralf Ramsauer

Hi,

On 05/12/2022 17:30, Rasty Slutsker wrote:

> Hi Ralf,
> Thank you for the answer.
> We have periodic interrupt each 30 u(!)Sec. Linux cannot deal with such
> rate, so we need hypervisor/RTOS.

I understand.

> We managed to read a code of hypervisor. It appears that all interrupts
> to all cores are intercepted by hypervisor and then forwarded to guests
> (per core).

Yes, exactly, that's the case if you don't have SDEI. If you have a
platform that would come with SDEI, then you have of course less overhead.

> If we reduce interrupt priority of mentioned interrupt (as you suggest)
> we lose even more interrupts, even without stress.
> Interrupt is defined as edge triggered, I assumed that it is memorized
> by gic until serviced.
> Is it possible that Hypervisor acknowledges pending interrupt while
> servicing interrupt from another source ? Kind of race - 2 interrupts
> for 2 cores arrive nearly simultaneously. One is lost.

The EOIR and IAR registers of the GIC are core-local registers of the
GIC CPU interface (GICC), so I wonder how this should cause a race,
unless there is a hard logical mistake in the code, which I doubt.



What you could try to do for debugging purposes:

1. Slow down from 30µs to something slower, which you can handle
   even under load. Say 100µs, 500µs, something like that.

2. Measure the jitter x between arrival of the interrupt, and final
   acknowledgement in your RTOS. You can use performance monitoring
   registers, or watch CPU cycle counters, whatever. Repeat the
   measurement, w/ load and w/o load on Linux-side.

3. If max(x) >= 30µs, then you know where your IRQs go in case of a
   periodic cycle of 30µs.
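
Steps 2 and 3 can be sketched as follows. This is only an illustration: the function names are made up, and how you obtain the arrival and acknowledgement timestamps, as well as the counter frequency, depends on your platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: largest latency, in microseconds, among n samples.
 * arrival[i] and ack[i] are cycle-counter readings taken when interrupt i
 * arrived and when the RTOS acknowledged it; cycles_per_us is the counter
 * frequency in MHz (an assumed, platform-specific, non-zero value). */
static uint32_t max_latency_us(const uint32_t *arrival, const uint32_t *ack,
                               size_t n, uint32_t cycles_per_us)
{
    uint32_t max_cycles = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t d = ack[i] - arrival[i];   /* wrap-safe for one wrap */

        if (d > max_cycles)
            max_cycles = d;
    }
    return max_cycles / cycles_per_us;
}

/* Step 3: would a periodic cycle of period_us be exceeded? */
static int exceeds_period(uint32_t max_us, uint32_t period_us)
{
    return max_us >= period_us;
}
```

Run it once with Linux idle and once under load, and compare the two maxima.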


Reason: Some while ago, I measured the jitter of
Linux+Jailhouse on ARM systems with cyclictest. On a Jetson TX1
platform, for example, we saw jitter of up to 50µs. There, IRQ
reinjection, a full Linux stack and some userspace application are involved,
so three context switches and lots of code. You probably have two
context switches and less code, as you use an RTOS, but I think there's a
certain chance to exceed 30µs.


What my gut feeling tells me is that you manage to hold those 30µs if
Linux is quiet. As soon as there's some stress on the system bus, and
even on shared caches, you exceed your deadline.


Thanks
  Ralf



Best regards
Rasty

On Monday, December 5, 2022 at 5:14:37 PM UTC+2 Ralf Ramsauer wrote:

Hi Nir,

On 29/11/2022 14:21, nirge...@gmail.com wrote:
 > Hi there,
 >
 > Our target is Sitara AM5726 , CortexA15 dual core on which we are
 > running Linux on A15 core0 and RTOS on core1.
 >
 > __
 >
 > RTOS gets periodic interrupt from external hardware via nirq1 pin
 > (dedicated input into ARM gic).
 >
 > Under heavy load in Linux (core 0!), RTOS, which runs on core1 misses
 > interrupts.

Uhm. Can you reconstruct that issue w/o Jailhouse under Linux?

I mean, can you set the SMP affinity of that IRQ to core 1 under Linux,
and then write some test application running on core 1 that just
receives the IRQ. If that issue happens under Linux as well, then you
know that the issue has probably nothing to do with Jailhouse.


What also might happen: If there's enough pressure on the shared system
bus when Linux is under load, then you simply lose those IRQs as the
RTOS doesn't have enough time to handle them. You can test this
hypothesis by lowering the frequency of the periodic interrupt. If you
still lose IRQs, then this should not be the case.

 >
 > Questions
 >
 > 1. Does linux/hypervisor participate in interrupt scheduling/forwarding
 > to cell on Core1

Linux: No, Linux does not participate in anything that is going on on
CPU 1. That's the idea behind Jailhouse.

Jailhouse: Maybe. On ARM platforms, Jailhouse needs to reinject the
interrupt from the hypervisor to the guest, if SDEI is not available.
Does the Sitara come with SDEI support?

(You can btw monitor the exits of the hypervisor with 'jailhouse cell
stats')

Ralf

 > 2. Is there a description of interrupt forwarding/virtualization scheme
 > to cores (if exists)? Any pointer to document/source code would be
 > appreciated.
 >
 > Thanks a lot,
 >
 > Nir.
 >

 

Re: RTOS inmate misses interrupts

2022-12-05 Thread Rasty Slutsker
Hi Ralf,
Thank you for the answer.
We have a periodic interrupt every 30 µs (!). Linux cannot deal with such 
a rate, so we need a hypervisor/RTOS.
We managed to read the hypervisor code. It appears that all interrupts to 
all cores are intercepted by the hypervisor and then forwarded to the 
guests (per core).
If we reduce the priority of the mentioned interrupt (as you suggest), we 
lose even more interrupts, even without stress.
The interrupt is defined as edge-triggered; I assumed that it is memorized 
by the GIC until serviced.
Is it possible that the hypervisor acknowledges a pending interrupt while 
servicing an interrupt from another source? A kind of race: 2 interrupts 
for 2 cores arrive nearly simultaneously, and one is lost.

Best regards
Rasty

On Monday, December 5, 2022 at 5:14:37 PM UTC+2 Ralf Ramsauer wrote:

> Hi Nir,
>
> On 29/11/2022 14:21, nirge...@gmail.com wrote:
> > Hi there,
> > 
> > Our target is Sitara AM5726 , CortexA15 dual core on which we are 
> > running Linux on A15 core0 and RTOS on core1.
> > 
> > __
> > 
> > RTOS gets periodic interrupt from external hardware via nirq1 pin 
> > (dedicated input into ARM gic).
> > 
> > Under heavy load in Linux (core 0!), RTOS, which runs on core1 misses 
> > interrupts.
>
> Uhm. Can you reconstruct that issue w/o Jailhouse under Linux?
>
> I mean, can you set the SMP affinity of that IRQ to core 1 under Linux, 
> and then write some test application running on core 1 that just 
> receives the IRQ. If that issue happens under Linux as well, then you 
> know that the issue has probably nothing to do with Jailhouse.
>
>
> What also might happen: If there's enough pressure on the shared system 
> bus when Linux is under load, then you simply lose those IRQs as the 
> RTOS doesn't have enough time to handle them. You can test this hypothesis 
> by lowering the frequency of the periodic interrupt. If you still 
> lose IRQs, then this should not be the case.
>
> > 
> > Questions
> > 
> > 1. Does linux/hypervisor participate in interrupt scheduling/forwarding
> > to cell on Core1
>
> Linux: No, Linux does not participate in anything that is going on on 
> CPU 1. That's the idea behind Jailhouse.
>
> Jailhouse: Maybe. On ARM platforms, Jailhouse needs to reinject the 
> interrupt from the hypervisor to the guest, if SDEI is not available. 
> Does the Sitara come with SDEI support?
>
> (You can btw monitor the exits of the hypervisor with 'jailhouse cell 
> stats')
>
> Ralf
>
> > 2. Is there a description of interrupt forwarding/virtualization scheme
> > to cores (if exists)? Any pointer to document/source code would be
> > appreciated.
> > 
> > Thanks a lot,
> > 
> > Nir.
> > 

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jailhouse-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/9f1616c7-ac5f-49de-bc24-8bd8520f4c07n%40googlegroups.com.


Re: RTOS inmate misses interrupts

2022-12-05 Thread Ralf Ramsauer

Hi Nir,

On 29/11/2022 14:21, nirge...@gmail.com wrote:

Hi there,

Our target is Sitara AM5726 , CortexA15 dual core on which we are 
running Linux on A15 core0 and RTOS on core1.


__

RTOS gets periodic interrupt from external hardware via nirq1 pin 
(dedicated input into ARM gic).


Under heavy load in Linux (core 0!), RTOS, which runs on core1 misses 
interrupts.


Uhm. Can you reconstruct that issue w/o Jailhouse under Linux?

I mean, can you set the SMP affinity of that IRQ to core 1 under Linux, 
and then write some test application running on core 1 that just 
receives the IRQ. If that issue happens under Linux as well, then you 
know that the issue has probably nothing to do with Jailhouse.
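
A minimal sketch of the affinity part of that test, assuming the standard Linux /proc/irq/<n>/smp_affinity interface, which takes a hexadecimal CPU bitmask. The helper names and the IRQ number in the comment are hypothetical; the real number has to be looked up in /proc/interrupts, and writing the file requires root.

```python
def affinity_mask(cpus):
    """Return the hex bitmask string for a list of CPU indices.

    Intended for small CPU counts such as the dual-core system here.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


def set_irq_affinity(irq, cpus, proc_root="/proc"):
    """Write the mask for `cpus` to <proc_root>/irq/<irq>/smp_affinity."""
    path = f"{proc_root}/irq/{irq}/smp_affinity"
    with open(path, "w") as f:
        f.write(affinity_mask(cpus))
    return path


if __name__ == "__main__":
    # Core 1 only -> bit 1 set.
    print("mask for core 1:", affinity_mask([1]))
    # On the target (as root), with a hypothetical IRQ number:
    # set_irq_affinity(42, [1])
```

After setting the mask, the per-CPU columns in /proc/interrupts should show the count for that IRQ increasing on core 1 only.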



What also might happen: If there's enough pressure on the shared system 
bus when Linux is under load, then you simply lose those IRQs as the 
RTOS doesn't have enough time to handle them. You can test this hypothesis 
by lowering the frequency of the periodic interrupt. If you still 
lose IRQs, then this should not be the case.




Questions

 1. Does linux/hypervisor participate in interrupt scheduling/forwarding
to cell on Core1


Linux: No, Linux does not participate in anything that is going on on 
CPU 1. That's the idea behind Jailhouse.


Jailhouse: Maybe. On ARM platforms, Jailhouse needs to reinject the 
interrupt from the hypervisor to the guest, if SDEI is not available. 
Does the Sitara come with SDEI support?


(You can btw monitor the exits of the hypervisor with 'jailhouse cell 
stats')


  Ralf


 2. Is there a description of interrupt forwarding/virtualization scheme
to cores (if exists)? Any pointer to document/source code would be
appreciated.

Thanks a lot,

Nir.





RTOS inmate misses interrupts

2022-11-29 Thread nirge...@gmail.com


Hi there,

Our target is a Sitara AM5726, a dual-core Cortex-A15, on which we are 
running Linux on A15 core0 and an RTOS on core1.

RTOS gets periodic interrupt from external hardware via nirq1 pin 
(dedicated input into ARM gic).

Under heavy load in Linux (core 0!), RTOS, which runs on core1 misses 
interrupts.

Questions

   1. Does Linux/the hypervisor participate in interrupt scheduling/forwarding 
   to the cell on core1?
   2. Is there a description of interrupt forwarding/virtualization scheme 
   to cores (if exists)? Any pointer to document/source code would be 
   appreciated.

Thanks a lot,

Nir.
