Public bug reported:

[Impact]
When using KVM on NUMA machines, both Linux and Windows guests can exhibit very poor performance and may crash. Disabling KSM is a known workaround.

[Fix]
The following patch fixes the issue in our testing:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=64a9a34e22896dad430e21a28ad8cb00a756fefc

This patch is present in v3.14-rc1 and onwards.

[Test Case]
General test case:
1) On a NUMA capable machine, set up the machine as a KVM hypervisor
 - lscpu should show more than 1 NUMA node
2) Install 4 KVM VMs
3) Run the following in another terminal to ensure that pages_shared and pages_sharing are increasing
 - watch 'tail /sys/kernel/mm/ksm/*'
4) In another terminal run a program that continually pings each node and alerts on high latencies

What we've observed is that in Linux guests, the ping latencies can go into the ~2 second range for a few pings, then return to the < 1 ms range. (This is machine dependent.) In addition, when running this test with Windows guests, we occasionally observe BSODs.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Fix Released

** Affects: linux (Ubuntu Trusty)
     Importance: High
     Assignee: Chris J Arges (arges)
         Status: In Progress

** Description changed:

  [Impact]
  When using KVM on NUMA machines, both Linux and Windows guests can exhibit very poor performance and potential crashes. Disabling KSM is a known workaround to fix this issue.

- [ Fix ]
+ [Fix]
  The following patch fixes the issue in our testing:
  http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=64a9a34e22896dad430e21a28ad8cb00a756fefc

  This patch is present in v3.14-rc1 and onwards.

  [Test Case]
  General test case:
  1) On a NUMA capable machine, setup the machine as a KVM hypervisor
-  - lscpu should show more than 1 NUMA node
+  - lscpu should show more than 1 NUMA node
  2) Install 4 KVM VMs
  3) Run the following in another terminal to ensure that pages_shared and pages_sharing is increasing
-  - watch 'tail /sys/kernel/mm/ksm/*'
+  - watch 'tail /sys/kernel/mm/ksm/*'
  4) In another terminal run a program that continually pings each node and alerts on high latencies

  What we've observed is that in Linux guests, the ping latencies can go into the ~2 second range for a few pings, then return back to the < 1ms range. (This is machine dependent.) In addition, using Windows guests, occasionally when running this test we observe that the guests BSOD during this test.

** Also affects: linux (Ubuntu Trusty)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Trusty)
     Assignee: (unassigned) => Chris J Arges (arges)

** Changed in: linux (Ubuntu)
     Assignee: Chris J Arges (arges) => (unassigned)

** Changed in: linux (Ubuntu Trusty)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Trusty)
       Status: New => In Progress

** Changed in: linux (Ubuntu)
       Status: In Progress => Fix Released

** Changed in: linux (Ubuntu)
   Importance: High => Undecided

** Description changed:

  [Impact]
  When using KVM on NUMA machines, both Linux and Windows guests can exhibit very poor performance and potential crashes. Disabling KSM is a known workaround to fix this issue.

  [Fix]
  The following patch fixes the issue in our testing:
  http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=64a9a34e22896dad430e21a28ad8cb00a756fefc

  This patch is present in v3.14-rc1 and onwards.

  [Test Case]
  General test case:
  1) On a NUMA capable machine, setup the machine as a KVM hypervisor
   - lscpu should show more than 1 NUMA node
  2) Install 4 KVM VMs
  3) Run the following in another terminal to ensure that pages_shared and pages_sharing is increasing
   - watch 'tail /sys/kernel/mm/ksm/*'
  4) In another terminal run a program that continually pings each node and alerts on high latencies

  What we've observed is that in Linux guests, the ping latencies can go
  into the ~2 second range for a few pings, then return back to the < 1ms
- range. (This is machine dependent.) In addition, using Windows guests,
- occasionally when running this test we observe that the guests BSOD
- during this test.
+ range. (This is machine dependent.) In addition, occasionally when
+ running this test with Windows guests we observe BSODs during this test.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1346917

Title:
  Using KSM on NUMA capable machines can cause KVM guest performance and stability issues

Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Trusty:
  In Progress

Bug description:
  [Impact]
  When using KVM on NUMA machines, both Linux and Windows guests can exhibit very poor performance and may crash. Disabling KSM is a known workaround.

  [Fix]
  The following patch fixes the issue in our testing:
  http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=64a9a34e22896dad430e21a28ad8cb00a756fefc

  This patch is present in v3.14-rc1 and onwards.

  [Test Case]
  General test case:
  1) On a NUMA capable machine, set up the machine as a KVM hypervisor
   - lscpu should show more than 1 NUMA node
  2) Install 4 KVM VMs
  3) Run the following in another terminal to ensure that pages_shared and pages_sharing are increasing
   - watch 'tail /sys/kernel/mm/ksm/*'
  4) In another terminal run a program that continually pings each node and alerts on high latencies

  What we've observed is that in Linux guests, the ping latencies can go into the ~2 second range for a few pings, then return to the < 1 ms range. (This is machine dependent.) In addition, when running this test with Windows guests, we occasionally observe BSODs.
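
Step 4 of the test case asks for "a program that continually pings each node and alerts on high latencies" but does not say which one was used. The following is only a minimal sketch of such a monitor, not the reporter's tool. It assumes the iputils `ping` binary is available on the hypervisor and that "each node" means each guest. The guest addresses in GUESTS, the THRESHOLD_MS value, and all function names are hypothetical placeholders.

```python
import re
import subprocess
import time

# Hypothetical guest addresses and alert threshold; replace with your own.
GUESTS = ["192.168.122.11", "192.168.122.12", "192.168.122.13", "192.168.122.14"]
THRESHOLD_MS = 100.0

# Matches the "time=0.045 ms" field printed by ping for each reply.
_RTT = re.compile(r"time[=<]([0-9.]+) ?ms")


def parse_rtt_ms(ping_output):
    """Return the round-trip time in ms found in ping output, or None."""
    match = _RTT.search(ping_output)
    return float(match.group(1)) if match else None


def ping_once(host, timeout_s=5):
    """Send one echo request; return the RTT in ms, or None on loss or timeout."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            capture_output=True, text=True, timeout=timeout_s + 1,
        )
    except subprocess.TimeoutExpired:
        return None
    return parse_rtt_ms(result.stdout)


def is_alert(rtt_ms, threshold_ms=THRESHOLD_MS):
    """A lost ping or an RTT above the threshold both count as alerts."""
    return rtt_ms is None or rtt_ms > threshold_ms


def monitor(hosts, interval_s=1.0):
    """Ping every host in a loop and print one line per alert. Runs until interrupted."""
    while True:
        for host in hosts:
            rtt = ping_once(host)
            if is_alert(rtt):
                shown = "lost" if rtt is None else f"{rtt:.1f} ms"
                print(f"{time.strftime('%H:%M:%S')} ALERT {host}: {shown}", flush=True)
        time.sleep(interval_s)
```

Calling `monitor(GUESTS)` in a separate terminal while the KSM counters are being watched would print a line whenever a guest's latency exceeds the threshold. The timeout is set above the ~2 second spikes described in the report so that they are recorded as slow replies and not as losses.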