On Fri, Aug 24, 2012 at 07:40:36AM +0200, Jan Kiszka wrote:
> On 2012-08-23 08:24, Matthew Ogilvie wrote:
> > This patch provides a way to optionally suppress spurious interrupts,
[snip]
> > I'm not sure why it only sporadically hits this sequence of events.
> > There doesn't seem to be other IRQs asserted or serviced anywhere
> > in the near past; the last several were all IRQ14's. But I can't
> > help feeling I'm not reading the log output correctly or something,
> > because that doesn't make sense. Maybe there is some kind
> > of a-few-instructions delay before a CPU interrupt is actually
> > delivered after interrupts are enabled, or some delay in raising
> > IRQ14 after a hard drive operation is requested, and such delays
> > need to fall into a narrow window of opportunity left by UNIX?
> >
> > I can get a disassembly of the UNIX kernel using a "coff"-enabled
> > build of GNU objdump, giving function names but not much else.
> > But I haven't studied it in enough detail to actually find the
> > relevant code path that is manipulating imr as described above.
> > However, this old post outlines some of the high level theory
> > of UNIX spl*() functions:
> > http://www.linuxmisc.com/29-unix-internals/4e6c1f6fa2e41670.htm
> >
> > If anyone wants to look into this further, I can provide access to the
> > initial boot install floppy, at least. Email me. (Without the rest
> > of the install disks, it isn't much use for anything except testing
> > virtual machines like qemu against rare corner cases...)
> >
> > ============
> > Alternative Approaches:
> >
> > An alternative to this patch that might work (I haven't tried) would
> > be to have BIOS set the master's elcr register 0x04 bit, making IRQ2
> > level triggered instead of edge triggered. I'm not sure what other
> > effects this might have. Maybe it would actually be a more accurate
> > model (I haven't checked documentation; maybe "slave mode" of an
> > IRQ line into the master is supposed to be level triggered?)
> >
> > Or perhaps find a way to model the minimum timescale that an interrupt
> > request needs to be active to be recognized?
> >
> > Or maybe my analysis isn't correct; I wasn't able to find the
> > relevant code path in the UNIX kernel.
[snip]
>
> Has to mention or even actively warn that it doesn't work with KVM and
> its in-kernel irqchip (as that PIC model lacks your hack).

I'll make an incremental patch to the documentation soon.

>
> However, I strongly suspect you are nastily papering over an issue in
> some device model. So I would prefer to dig deeper before installing
> this in upstream (also due to its dependency on the userspace PIC model).

This is certainly possible. I'm not an expert on the whole interrupt
subsystem design in a PC. But other than the wild speculation above
(making IRQ2 level triggered via elcr, or some kind of timing
preventing the edge triggering from catching a very short blip),
I'm not sure what to look for.

 - Matthew Ogilvie
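
To make the edge-versus-level speculation above concrete, here is a
minimal Python sketch. It is a toy model, not QEMU's PIC code: the
`ToyPic` class, its fields, and the `pulse` helper are all hypothetical
names, and the assumed behavior is only that an edge-triggered line
latches a request on a rising edge, while a level-triggered line (its
elcr bit set) tracks the current line state.

```python
class ToyPic:
    """Toy interrupt controller; names and behavior are illustrative assumptions."""

    def __init__(self, elcr=0x00):
        self.elcr = elcr      # bit n set -> line n is treated as level triggered
        self.irr = 0          # pending-request bits
        self.last_level = 0   # last seen line state, used for edge detection

    def set_irq(self, line, level):
        mask = 1 << line
        if self.elcr & mask:
            # Level triggered (assumed): the request simply follows the line.
            if level:
                self.irr |= mask
            else:
                self.irr &= ~mask
        else:
            # Edge triggered (assumed): latch only on a low-to-high transition,
            # and keep the request pending even after the line drops.
            if level and not (self.last_level & mask):
                self.irr |= mask
        # Remember the line state for the next edge check.
        if level:
            self.last_level |= mask
        else:
            self.last_level &= ~mask


def pulse(pic, line):
    """Raise then immediately lower a line; report whether a request is still pending."""
    pic.set_irq(line, 1)
    pic.set_irq(line, 0)
    return bool(pic.irr & (1 << line))


edge_pending = pulse(ToyPic(elcr=0x00), 2)   # IRQ2 edge triggered
level_pending = pulse(ToyPic(elcr=0x04), 2)  # IRQ2 level triggered via elcr bit 0x04
print(edge_pending, level_pending)
```

In this toy model a very short blip on IRQ2 leaves a request pending in
edge mode, but not once the 0x04 bit of elcr is set. Whether the real
hardware, or the guest's expectations, match either behavior is exactly
the open question.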