how to improve IPI latency?

Alex Züpke Mon, 15 Jun 2015 08:08:24 -0700

On 15.06.2015 at 16:51, Peter Maydell wrote:
> On 15 June 2015 at 15:44, Alex Züpke <alexander.zue...@hs-rm.de> wrote:
>> On 12.06.2015 at 20:03, Peter Maydell wrote:
>>> Probably the best approach would be to have something in
>>> arm_cpu_set_irq() which says "if we are CPU X and we've
>>> just caused an interrupt to be set for CPU Y, then we
>>> should ourselves yield back to the main loop".
>>>
>>> Something like this, maybe, though I have done no more testing
>>> than checking it doesn't actively break kernel booting :-)
>>
>>
>> Thanks! One more check for "level" is needed to get it to work:
> 
> What happens without that? It's reasonable to have it,
> but extra cpu_exit()s shouldn't cause a problem beyond
> being a bit inefficient...


The emulation gets stuck, for a reason I don't understand.
I checked whether something similar is done on other architectures and found
that the level check was what was missing; see, for example, cpu_request_exit()
in hw/ppc/prep.c, which does check the level:
  static void cpu_request_exit(void *opaque, int irq, int level)
  {
      CPUState *cpu = current_cpu;

      if (cpu && level) {
          cpu_exit(cpu);
      }
  }

But probably this is used for something completely unrelated.
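
To make the role of the level check concrete, here is a minimal stand-alone
sketch. It is not QEMU code and not the patch discussed above: ToyCPU,
toy_current_cpu, toy_cpu_exit() and toy_set_irq() are hypothetical stand-ins.
It only models the idea from the thread, namely that the sending CPU is asked
to leave its execution loop when it raises (level != 0) an interrupt line of a
different CPU, and not when a line is lowered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a virtual CPU; this is not QEMU's CPUState. */
typedef struct ToyCPU {
    int index;          /* CPU number */
    bool irq_pending;   /* current state of the CPU's interrupt line */
    int exit_requests;  /* how often this CPU was asked to yield */
} ToyCPU;

/* The CPU whose code is currently being emulated (assumption of the model). */
ToyCPU *toy_current_cpu;

/* Stand-in for "leave the execution loop and return to the main loop". */
void toy_cpu_exit(ToyCPU *cpu)
{
    cpu->exit_requests++;
}

/* Set or clear the interrupt line of `target`. */
void toy_set_irq(ToyCPU *target, int level)
{
    ToyCPU *self = toy_current_cpu;

    target->irq_pending = (level != 0);

    /*
     * Yield only if we are a different CPU than the target and the line
     * was raised; lowering a line does not need a yield.
     */
    if (self != NULL && self != target && level) {
        toy_cpu_exit(self);
    }
}
```

In this model, CPU 0 raising a line on CPU 1 records one exit request for
CPU 0, while lowering that line or raising its own line records none.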

> It would be interesting to know if this helps Linux as well
> as your custom OS. (I don't know whether a "CPU #0 polls"
> approach is bad on hardware too; the other option would be
> to have CPU #1 IPI back in the other direction if 0 needed
> to wait for a response.)
> 
> -- PMM

IIRC, Linux TLB shootdown on x86 once used such a scheme, but I don't know if 
they changed it.

I'd say that an IPI+poll pattern is used quite often in the tricky parts of a 
kernel, like kernel debugging.



Here's a simple IPI tester that sends IPIs from CPU #0 to CPU #1 in an endless
loop.
Without the patch, the IPIs are delayed until the timer interrupt triggers the main loop.
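
The delay can be illustrated with a toy model. This is only a sketch under
simplifying assumptions, not a description of QEMU internals: both CPUs share
one host thread, CPU 0 sends an IPI and then polls, CPU 1 handles a pending
IPI and goes idle again, and a timer event at the end of each period hands
control to CPU 1. The function toy_count_ipis() and its parameters are
hypothetical names.

```c
#include <stdbool.h>

/*
 * Count the IPIs handled by CPU 1 over `ticks` timer periods of
 * `steps_per_tick` emulation steps each. With yield_on_ipi set, CPU 0
 * hands over control right after sending an IPI.
 */
long toy_count_ipis(int ticks, int steps_per_tick, bool yield_on_ipi)
{
    bool ipi_pending = false;
    int running = 0;            /* CPU 0 runs first */
    long handled = 0;

    for (int t = 0; t < ticks; t++) {
        for (int s = 0; s < steps_per_tick; s++) {
            if (running == 0) {
                if (!ipi_pending) {
                    ipi_pending = true;      /* CPU 0 sends an IPI */
                    if (yield_on_ipi) {
                        running = 1;         /* ... and yields at once */
                    }
                }
                /* otherwise CPU 0 just polls and burns the step */
            } else {
                if (ipi_pending) {
                    ipi_pending = false;     /* CPU 1 handles the IPI */
                    handled++;
                }
                running = 0;                 /* CPU 1 idles, CPU 0 resumes */
            }
        }
        running = 1;    /* timer event: CPU 1 gets a turn */
    }
    return handled;
}
```

In this model, a call without the yield handles at most one IPI per timer
period, which corresponds to the "T.T.T." pattern, while a call with the yield
handles an IPI roughly every second step.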

http://www.cs.hs-rm.de/~zuepke/qemu/ipi.elf
3174 bytes, md5sum 8d73890a60cd9b24a4f9139509b580e2

Run testcase:
$ qemu-system-arm -M vexpress-a15 -smp 2 -kernel ipi.elf -nographic

The testcase prints the following on the serial console without the patch:

  +------- CPU 0 came up
  |+------ CPU 0 initialization completed
  || +---- CPU 0 timer interrupt, 1 Hz
  || |
  vv v
  0!1T.T.T.T.T.T.T.
    ^ ^
    | |
    | +-- CPU 1 received an IPI
    +---- CPU 1 came up


Expected testcase output with patch:

  0!1T..............<hundreds of dots>.................T...............

So: more dots == more IPIs handled between two timer interrupts "T" ...
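
For comparing runs, the dots can be counted mechanically. The helper below is
a hypothetical sketch, not part of the testcase: toy_dots_per_interval() scans
a captured console string and reports how many "." characters appear between
each pair of consecutive "T" characters.

```c
#include <stddef.h>

/*
 * Scan `out` and store the number of dots between each pair of
 * consecutive 'T' characters in counts[] (at most `max` entries).
 * Returns the number of entries written. Dots after the last 'T'
 * belong to an unfinished interval and are not reported.
 */
size_t toy_dots_per_interval(const char *out, long *counts, size_t max)
{
    size_t n = 0;
    long dots = 0;
    int seen_t = 0;

    for (const char *p = out; *p != '\0'; p++) {
        if (*p == 'T') {
            if (seen_t && n < max) {
                counts[n++] = dots;   /* close the previous interval */
            }
            seen_t = 1;
            dots = 0;
        } else if (*p == '.' && seen_t) {
            dots++;                   /* one IPI handled on CPU 1 */
        }
    }
    return n;
}
```

Applied to the unpatched output shown above, every complete interval contains
a single dot; with the patch, the per-interval counts should be much larger.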



Best regards
Alex

Re: [Qemu-devel] QEMU ARM SMP: IPI delivery delayed until next main loop event // how to improve IPI latency?
