tl;dr: pcie access = latency

it's interesting to take a look at irq latency for a number of devices on
different machines.  it looks like the experimental device (its interrupt
does nothing) gives us a lower bound for irq latency, which works out to be
36ns (!): 127.78 cycles/call at 3.5GHz.  clearly this doesn't count dispatch,
or the overhead of trap().

m# aux/cpuid -i
Intel(R) Xeon(R) CPU E5-1620 v3 @ 3.50GHz
vector.mach bus irq     count   sum(cycles) cycles/call  type   name
m# /usr/quanstro/bin/rc/irqlatency
66.0             11     24605       3143914      127.78  msi-x  experimental
50.0             50    348487     802964226     2304.14  lapic  APIC timer
50.6             50    324950     841831354     2590.65  lapic  APIC timer
50.4             50    319746     832771958     2604.48  lapic  APIC timer
50.2             50    323061     846873676     2621.40  lapic  APIC timer
50.3             50    323080     867020068     2683.61  lapic  APIC timer
50.5             50    323056     886385231     2743.75  lapic  APIC timer
50.7             50    324165     905699976     2793.95  lapic  APIC timer
50.1             50    323519     997341922     3082.79  lapic  APIC timer
65.0             11      4383      45452712    10370.23  msi-x  ether0
66.5              5     66222     777537556    11741.38  ioapic usbehci
65.7             11         9        247991    27554.56  msi    sdF (ahci)
65.1              4      1084      38400923    35425.21  ioapic COM1
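
to put the cycle counts in wall-clock terms, here is a minimal sketch that converts cycles/call to nanoseconds for a few rows of the xeon output. it assumes the "cycles" are tsc ticks at the nominal 3.50GHz reported by aux/cpuid, which the post doesn't state; the names (XEON_GHZ, cycles_to_ns) are made up for illustration. it also checks that the cycles/call column is just sum(cycles)/count.

```python
# cycles/call -> nanoseconds for three rows of the xeon output.
# assumption: "cycles" are tsc ticks at the nominal 3.50GHz from aux/cpuid.
XEON_GHZ = 3.5  # cycles per nanosecond

def cycles_to_ns(cycles, ghz=XEON_GHZ):
    """convert a cycle count to nanoseconds at the given clock."""
    return cycles / ghz

# name: cycles/call, copied from the output above
per_call = {
    "experimental": 127.78,
    "ether0": 10370.23,
    "COM1": 35425.21,
}

# (count, sum(cycles), cycles/call) for two rows; the last column
# should just be sum/count
checks = [
    (4383, 45452712, 10370.23),   # ether0
    (1084, 38400923, 35425.21),   # COM1
]
for count, total, expected in checks:
    assert abs(total / count - expected) < 0.01

for name, cycles in per_call.items():
    print(f"{name:12s} {cycles:10.2f} cycles {cycles_to_ns(cycles):10.1f} ns")
```

under that assumption the experimental row comes out at about 36.5ns and ether0 at just under 3µs.
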
; aux/cpuid -i
AMD Phenom(tm) II X4 965 Processor
; irqlatency
50.3             50  83106879  109878207918     1322.13  lapic  APIC timer
50.2             50  83070335  109844976915     1322.31  lapic  APIC timer
50.1             50  83228230  110448926024     1327.06  lapic  APIC timer
50.0             50  90772899  169367434384     1865.84  lapic  APIC timer
65.2             10   1179147   11639594649     9871.20  msi-x  ether0
65.0              1     17657     178037258    10083.10  ioapic i8042
68.2             12    312539    3476219928    11122.52  ioapic kbdaux
66.0              4       911      21026234    23080.39  ioapic COM1
71.1             11        58       1780132    30691.93  msi    sdE (ahci)
68.1             10         1       7934988  7934988.00  ioapic usbohci

given the efficiency of locks, and the same ethernet devices (i211), we can
reason that since the latency in cycles is prop. to the clock frequency,

    10370*3.4/3.5 = 10073.71

(against 9871.20 measured on the 3.4GHz machine), the latency must largely
be pcie register reads and writes.  sadly, the msi-x interrupt doesn't handle
single causes, otherwise no register access would be necessary.
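
the scaling argument can be sketched as follows. this is only a restatement of the arithmetic above, assuming both machines count cycles at their nominal clocks (3.5GHz from the cpuid output, 3.4GHz as used in the post's own calculation); the variable names are illustrative.

```python
# if the ether0 handler cost is a fixed wall-clock time (pcie register
# access), the cycle count should scale with the cpu clock.
# assumption: cycles are counted at the nominal clock on both machines.
XEON_GHZ = 3.5
PHENOM_GHZ = 3.4

xeon_ether0 = 10370.23     # cycles/call, msi-x ether0, xeon
phenom_ether0 = 9871.20    # cycles/call, msi-x ether0, phenom

# predicted phenom cycle count from the xeon measurement
predicted = xeon_ether0 * PHENOM_GHZ / XEON_GHZ
rel_error = (predicted - phenom_ether0) / phenom_ether0

# the same comparison in nanoseconds
xeon_ns = xeon_ether0 / XEON_GHZ
phenom_ns = phenom_ether0 / PHENOM_GHZ

print(f"predicted {predicted:.2f} cycles, measured {phenom_ether0:.2f}")
print(f"relative error {rel_error:.1%}")
print(f"xeon {xeon_ns:.0f} ns, phenom {phenom_ns:.0f} ns")
```

the prediction lands within about 2% of the measurement, i.e. both machines spend roughly 2.9 to 3.0µs in the handler regardless of cpu, which is what a fixed-time bus access would look like.
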
- erik

ps.
interesting question: why do the lapic timers take longer on the faster
machine?  that's double the latency.  is this an intel thing?  (note the
machine below has a different ethernet chipset and a different driver, so
the comparison with the other two is bogus.)

lilly; aux/cpuid -i
Intel(R) Atom(TM) CPU D525 @ 1.80GHz
lilly; irqlatency
50.3             50 802180313 1705456340928     2126.03  lapic  APIC timer
50.1             50 803988180 1811997635679     2253.76  lapic  APIC timer
50.2             50 832505288 1914666440067     2299.89  lapic  APIC timer
50.0             50 830861969 2509217805168     3020.02  lapic  APIC timer
70.2             15         2         16587     8293.50  ioapic usbehci
66.2             11  36660721  362749665600     9894.78  msi    ether0
66.1             10    395700    4099132008    10359.19  msi    ether1
65.0              1         1         10404    10404.00  ioapic i8042
68.1             14        35        379197    10834.20  ioapic usbuhci
67.3             10        95       1065114    11211.73  ioapic usbehci
65.3              4       603       8503128    14101.37  ioapic COM1
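
for the ps. question, a rough sketch comparing the best (lowest cycles/call) lapic timer row on each machine, in cycles and in nanoseconds. the clock values are assumptions: nominal clocks from the cpuid output for the xeon and atom, and the 3.4GHz used in the arithmetic above for the phenom.

```python
# best-case lapic timer row per machine: (assumed clock in GHz, cycles/call)
machines = {
    "xeon e5-1620 v3": (3.5, 2304.14),
    "phenom ii x4 965": (3.4, 1322.13),
    "atom d525": (1.8, 2126.03),
}

# wall-clock cost per timer interrupt, assuming nominal clocks
ns = {name: cycles / ghz for name, (ghz, cycles) in machines.items()}

xeon_cycles = machines["xeon e5-1620 v3"][1]
phenom_cycles = machines["phenom ii x4 965"][1]
ratio_cycles = xeon_cycles / phenom_cycles
ratio_ns = ns["xeon e5-1620 v3"] / ns["phenom ii x4 965"]

for name, t in ns.items():
    print(f"{name:18s} {t:8.1f} ns")
print(f"xeon/phenom: {ratio_cycles:.2f}x in cycles, {ratio_ns:.2f}x in ns")
```

with these numbers the xeon's timer interrupt costs about 1.7x the phenom's, in cycles or in time, so the gap is not a units artifact.
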