[Qemu-devel] [Bug 1307473] Re: guest hang due to missing clock interrupt

2014-07-14 Thread Ondergetekende
We've resolved our issues by disabling KSM on the affected nodes. None of the unaffected nodes had KSM enabled (due to a packaging bug elsewhere). After we disabled KSM, our problems went away gradually over ~3 days. This means we're no longer affected by this issue (and given the other
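
For anyone wanting to check whether KSM is active on their own nodes, here is a minimal sketch. It assumes the standard Linux sysfs control file /sys/kernel/mm/ksm/run (0 = stopped, 1 = running, 2 = stopped with merged pages unmerged). The thread does not say how KSM was disabled, and the function names below are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Standard sysfs control file for KSM on Linux (an assumption here;
# the thread itself does not name the mechanism that was used).
KSM_RUN = Path("/sys/kernel/mm/ksm/run")

# Meaning of each value, per the kernel's KSM documentation.
KSM_STATES = {
    0: "stopped (merged pages kept)",
    1: "running",
    2: "stopped (merged pages unmerged)",
}


def parse_ksm_run(text):
    """Map the contents of the 'run' file to a readable state."""
    value = int(text.strip())
    return KSM_STATES.get(value, "unknown value %d" % value)


def ksm_state(path=KSM_RUN):
    """Read the KSM run flag; return None if the file does not exist."""
    path = Path(path)
    if not path.exists():
        return None
    return parse_ksm_run(path.read_text())


if __name__ == "__main__":
    print("KSM:", ksm_state())
```

Writing 0 to the same file as root stops ksmd; writing 2 also unmerges pages that were already shared.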

[Qemu-devel] [Bug 1307473] Re: guest hang due to missing clock interrupt

2014-07-09 Thread Ondergetekende
We haven't been able to reproduce the issues under lab conditions, and I'm not willing to use our production setup as a guinea pig anymore. These issues have cost me too much credibility already. We believe #1326367 is causing this, as we've bisected this issue to be between 3.13.0-27.50 and

[Qemu-devel] [Bug 1307473] Re: guest hang due to missing clock interrupt

2014-07-08 Thread Ondergetekende
Note that my list of affected nodes also includes migrated VMs, so there are some false positives (VMs that came from an affected node). The affected VMs on node 1-8 all seem to have been migrated from another node.

[Qemu-devel] [Bug 1307473] Re: guest hang due to missing clock interrupt

2014-07-08 Thread Ondergetekende
I'm not confident yet that we're seeing the exact same problem, but it is pretty close. We're running a somewhat wide range of hypervisor kernels; these are our observations so far:
node-1-1 3.13.0-24-generic is affected for 0% of VMs
node-1-3 3.13.0-24-generic is affected for 0% of VMs
node-1-5