On Thu, Nov 26, 2009 at 01:21:39PM +0000, Paul Brook wrote:
> >> It's really not that much of a fast path. Unless you're doing something
> >> particularly obscure then even under heavy load you're unlikely to exceed
> >> a few kHz.
> >
> > I think with kvm, a heavy disk-stressing benchmark can get higher.
>
> I'd still expect this to be the least of your problems.
> If nothing else you've at least one host signal delivery and/or thread
> context switch in there.
The iothread which does the signalling might be running in parallel with the
guest CPU.

> Not to mention the overhead of forwarding the interrupt to
> the guest CPU.

This is often mitigated because KVM knows to inject the interrupt on the next
vmexit.

> > > Compared to the average PIC implementation, and the overhead of the
> > > actual CPU interrupt, I find it hard to believe that looping over
> > > precisely 4 entries has any real performance hit.
> >
> > I don't think it is major, but I have definitely seen, in the past, that
> > extra branches and memory accesses have a small but measurable effect
> > when taken in interrupt handler routines in drivers, and the same should
> > apply here.
> >
> > OTOH keeping the sum around is trivial.
>
> Not entirely. You now have two different bits of information that you have
> to keep consistent.

This duplication is inherent in the PCI spec: the interrupt status bit in
config space mirrors the interrupt state.

> Unless you can show that this is performance critical code I strongly
> recommend keeping it as simple as possible.
>
> Paul

I don't see that there is anything left to show: interrupt delivery is
*obviously* performance critical. People run *latency benchmarks* that
measure, in microseconds, how fast a packet can get from an external
interface into the guest. We definitely want to remove obvious waste there.

-- 
MST
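
For illustration, a minimal sketch of the two approaches under discussion,
using hypothetical names rather than the actual QEMU code: recomputing the
effective INTx level by looping over the four pins on demand, versus keeping
a cached sum that must be kept consistent with the per-pin state (Paul's
objection above):

/* Hypothetical sketch, not the actual QEMU implementation. */
#include <assert.h>
#include <stdbool.h>

#define PCI_NUM_PINS 4  /* INTA#..INTD# */

typedef struct PCIDeviceState {
    int irq_level[PCI_NUM_PINS]; /* 0 or 1 per pin */
    int irq_count;               /* cached sum of irq_level[] */
} PCIDeviceState;

/* Approach 1: recompute on demand by looping over all four pins. */
static bool intx_asserted_loop(const PCIDeviceState *d)
{
    for (int i = 0; i < PCI_NUM_PINS; i++) {
        if (d->irq_level[i]) {
            return true;
        }
    }
    return false;
}

/* Approach 2: keep a running sum, updated whenever a pin changes
 * level. This removes the loop from the fast path, but the cached
 * count and the per-pin levels are now two pieces of state that must
 * stay consistent -- the trade-off raised above. */
static void intx_set_level(PCIDeviceState *d, int pin, int level)
{
    assert(pin >= 0 && pin < PCI_NUM_PINS);
    d->irq_count += level - d->irq_level[pin]; /* +1, 0, or -1 */
    d->irq_level[pin] = level;
    assert(d->irq_count >= 0 && d->irq_count <= PCI_NUM_PINS);
}

static bool intx_asserted_cached(const PCIDeviceState *d)
{
    return d->irq_count > 0;
}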