Hello!

> I ran this through my test scripts and I'm now quite sure that there's
> some breakage in here.
>
> One of my tests is running two VMs in parallel, each booting up, running
> hackbench, and then doing reboot (from within the guest), and just
> repeating like that.
>
> I've run your patches in the above config 100 times, and every time,
> the rebooting VMs got stuck before 50 reboots.
>
> Without these patches, I could run the above config 100 times, and every
> time, the rebooting VMs passed 200 reboots.

Huh, from the description this looks like a problem with
vgic_retire_disabled_irqs(). By the way, who calls it during reboot? The
only call I see is in vgic_handle_enable_reg(), which obviously just
processes emulated register accesses... The only thing I know is that in
the case of GICv2, userland resets the vGIC manually by setting each
register to its default value (so all ENABLER registers are set to 0). At
least qemu does this; I'm not sure about kvmtool. In the case of vGICv3,
nobody can do this, because there's no API to set registers yet. So, could
we be rebooting with interrupts enabled, or something like that?

So: what kind of container are you running, and which vGIC version? Does
this problem reproduce with both vGICv2 and vGICv3?

In the meantime, I'll make a very minimal version of patch 0001 for you to
test. If the current 0001 has problems that we cannot solve quickly, we
could stick with that version: it would provide the changes needed to plug
in LPIs while touching as little as possible (it would only remove
vgic_irq_lr_map). I guess I should have done it that way in the first
place.

Alternatively, I could respin v5 with the current 0001 split up. This
should make it easier to bisect the problem.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia