** Description changed:

- I recently replaced some Xenial servers, and started experiencing "Out
- of memory" problems with the default kernel.
+ After the fix for LP#1647400, a bug that caused freezes under some
+ workloads, some users started seeing regular OOMs. Those OOMs were
+ reported under this bug and were fixed in later releases.
+ 
+ Some of the affected kernels are listed below. To check your
+ particular kernel, read its changelog and look for 1655842 and
+ 1647400. If it has the fix for 1647400 but not the fix for 1655842,
+ then it is affected.
+ 
+ You may still notice regressions compared to kernels that had the
+ fix for neither bug. However, reverting all fixes would bring the
+ freeze bug back, so that is not a viable way forward.
+ 
+ If you see any regressions, mainly in the form of OOMs, please report
+ a new bug. Different workloads may require different solutions, and
+ further fixes may be needed, whether upstream or not. The best way to
+ get such fixes applied is to report the problem under a new bug that
+ can be reproduced, so that any proposed fix can be verified against
+ the identified problem.
+ 
+ Kernels affected:
+ 
+ linux  4.4.0-58, 4.4.0-59, 4.4.0-60, 4.4.0-61, 4.4.0-62.
+ linux-raspi2  4.4.0-1039 to 4.4.0-1042 and 4.4.0-1044 to 4.4.0-1071
+ 
+ 
+ Particular kernels NOT affected by THIS bug:
+ 
+ linux-aws
+ 
+ To reiterate, if you find an OOM with an affected kernel, please
+ upgrade. If you find an OOM with a non-affected kernel, please report
+ a new bug. We want to investigate it and fix it.
+ 
+ 
+ ===================
+ I recently replaced some Xenial servers, and started experiencing "Out
+ of memory" problems with the default kernel.
  
  We bake Amazon AMIs based on an official Ubuntu-provided image
  (ami-e6b58e85, in ap-southeast-2, from
  https://cloud-images.ubuntu.com/locator/ec2/).  Previous versions of
  our AMI included "4.4.0-57-generic", but the latest version picked up
  "4.4.0-59-generic" as part of a "dist-upgrade".
  
  Instances booted using the new AMI have been using more memory, and
  experiencing OOM issues - sometimes during boot, and sometimes a while
  afterwards.  An example from the system log is:
  
  [  130.113411] cloud-init[1560]: Cloud-init v. 0.7.8 running 'modules:final' at Wed, 11 Jan 2017 22:07:53 +0000. Up 29.28 seconds.
  [  130.124219] cloud-init[1560]: Cloud-init v. 0.7.8 finished at Wed, 11 Jan 2017 22:09:35 +0000. Datasource DataSourceEc2.  Up 130.09 seconds
  [29871.137128] Out of memory: Kill process 2920 (ruby) score 107 or sacrifice child
  [29871.140816] Killed process 2920 (ruby) total-vm:675048kB, anon-rss:51184kB, file-rss:2164kB
  [29871.449209] Out of memory: Kill process 3257 (splunkd) score 97 or sacrifice child
  [29871.453282] Killed process 3258 (splunkd) total-vm:66272kB, anon-rss:6676kB, file-rss:0kB
  [29871.677910] Out of memory: Kill process 2647 (fluentd) score 51 or sacrifice child
  [29871.681872] Killed process 2647 (fluentd) total-vm:117944kB, anon-rss:23956kB, file-rss:1356kB
  
  I have a hunch that this may be related to the fix for
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1647400, introduced
  in linux (4.4.0-58.79).
  
  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.4.0-59-generic 4.4.0-59.80
  ProcVersionSignature: User Name 4.4.0-59.80-generic 4.4.35
  Uname: Linux 4.4.0-59-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jan 12 06:29 seq
   crw-rw---- 1 root audio 116, 33 Jan 12 06:29 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.1-0ubuntu2.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Thu Jan 12 06:38:45 2017
  Ec2AMI: ami-0f93966c
  Ec2AMIManifest: (unknown)
  Ec2AvailabilityZone: ap-southeast-2a
  Ec2InstanceType: t2.nano
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  MachineType: Xen HVM domU
  PciMultimedia:
  
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 cirrusdrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-59-generic root=UUID=fb0fef08-f3c5-40bf-9776-f7ba00fe72be ro console=tty1 console=ttyS0
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-59-generic N/A
   linux-backports-modules-4.4.0-59-generic  N/A
   linux-firmware                            1.157.6
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 12/09/2016
  dmi.bios.vendor: Xen
  dmi.bios.version: 4.2.amazon
  dmi.chassis.type: 1
  dmi.chassis.vendor: Xen
  dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd12/09/2016:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
  dmi.product.name: HVM domU
  dmi.product.version: 4.2.amazon
  dmi.sys.vendor: Xen

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1655842

Title:
  "Out of memory" errors after upgrade to 4.4.0-59

Status in linux package in Ubuntu:
  Fix Released
Status in linux-aws package in Ubuntu:
  Confirmed
Status in linux-raspi2 package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Released
Status in linux-aws source package in Xenial:
  Confirmed
Status in linux-raspi2 source package in Xenial:
  Fix Committed

Bug description:
  After the fix for LP#1647400, a bug that caused freezes under some
  workloads, some users started seeing regular OOMs. Those OOMs were
  reported under this bug and were fixed in later releases.

  Some of the affected kernels are listed below. To check your
  particular kernel, read its changelog and look for 1655842 and
  1647400. If it has the fix for 1647400 but not the fix for 1655842,
  then it is affected.
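
  A minimal sketch of that changelog check, in Python. It assumes you
  have already obtained the changelog text yourself (for example with
  "apt-get changelog" for your kernel image package). The function
  names are hypothetical and not part of any Ubuntu tool.

```python
import re

FREEZE_BUG = "1647400"  # LP bug whose fix introduced the OOM regression
OOM_BUG = "1655842"     # this bug, whose fix resolves the regression


def mentions_bug(changelog: str, bug: str) -> bool:
    """Return True if `bug` appears as a standalone number in the text."""
    # The lookarounds stop "1647400" from matching inside a longer number.
    return re.search(rf"(?<!\d){bug}(?!\d)", changelog) is not None


def is_affected(changelog: str) -> bool:
    """Apply the rule above: fix for 1647400 present, fix for 1655842 absent."""
    return mentions_bug(changelog, FREEZE_BUG) and not mentions_bug(changelog, OOM_BUG)
```

  Because package changelogs are cumulative, a kernel that carries
  both fixes mentions both numbers and is reported as not affected.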

  You may still notice regressions compared to kernels that had the
  fix for neither bug. However, reverting all fixes would bring the
  freeze bug back, so that is not a viable way forward.

  If you see any regressions, mainly in the form of OOMs, please report
  a new bug. Different workloads may require different solutions, and
  further fixes may be needed, whether upstream or not. The best way to
  get such fixes applied is to report the problem under a new bug that
  can be reproduced, so that any proposed fix can be verified against
  the identified problem.

  Kernels affected:

  linux  4.4.0-58, 4.4.0-59, 4.4.0-60, 4.4.0-61, 4.4.0-62.
  linux-raspi2  4.4.0-1039 to 4.4.0-1042 and 4.4.0-1044 to 4.4.0-1071
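
  As an illustration only, the list above can be written as a small
  lookup in Python. The table and function names are hypothetical;
  the ranges are copied from the list, and the release string is
  assumed to look like the output of "uname -r" (for example
  "4.4.0-59-generic").

```python
import re

# Inclusive ABI ranges (the N in 4.4.0-N), copied from the list above.
LISTED_AFFECTED = {
    "linux": [(58, 62)],
    "linux-raspi2": [(1039, 1042), (1044, 1071)],
}


def abi_number(release):
    """Extract N from a '4.4.0-N...' release string, or None if it does not match."""
    match = re.match(r"4\.4\.0-(\d+)", release)
    return int(match.group(1)) if match else None


def is_listed_affected(package, release):
    """Return True if the kernel falls in one of the listed ranges for `package`."""
    abi = abi_number(release)
    if abi is None:
        return False
    return any(low <= abi <= high for low, high in LISTED_AFFECTED.get(package, []))
```

  A False result does not prove a kernel is unaffected, since the list
  covers only some of the affected kernels; the changelog remains the
  reference.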

  
  Particular kernels NOT affected by THIS bug:

  linux-aws

  To reiterate, if you find an OOM with an affected kernel, please
  upgrade. If you find an OOM with a non-affected kernel, please report
  a new bug. We want to investigate it and fix it.

  
  ===================
  I recently replaced some Xenial servers, and started experiencing "Out
  of memory" problems with the default kernel.

  We bake Amazon AMIs based on an official Ubuntu-provided image
  (ami-e6b58e85, in ap-southeast-2, from
  https://cloud-images.ubuntu.com/locator/ec2/).  Previous versions of
  our AMI included "4.4.0-57-generic", but the latest version picked up
  "4.4.0-59-generic" as part of a "dist-upgrade".

  Instances booted using the new AMI have been using more memory, and
  experiencing OOM issues - sometimes during boot, and sometimes a while
  afterwards.  An example from the system log is:

  [  130.113411] cloud-init[1560]: Cloud-init v. 0.7.8 running 'modules:final' at Wed, 11 Jan 2017 22:07:53 +0000. Up 29.28 seconds.
  [  130.124219] cloud-init[1560]: Cloud-init v. 0.7.8 finished at Wed, 11 Jan 2017 22:09:35 +0000. Datasource DataSourceEc2.  Up 130.09 seconds
  [29871.137128] Out of memory: Kill process 2920 (ruby) score 107 or sacrifice child
  [29871.140816] Killed process 2920 (ruby) total-vm:675048kB, anon-rss:51184kB, file-rss:2164kB
  [29871.449209] Out of memory: Kill process 3257 (splunkd) score 97 or sacrifice child
  [29871.453282] Killed process 3258 (splunkd) total-vm:66272kB, anon-rss:6676kB, file-rss:0kB
  [29871.677910] Out of memory: Kill process 2647 (fluentd) score 51 or sacrifice child
  [29871.681872] Killed process 2647 (fluentd) total-vm:117944kB, anon-rss:23956kB, file-rss:1356kB
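
  For anyone collecting similar evidence for a new report, the
  "Killed process" lines in a log like the one above can be summarised
  with a short Python sketch. The pattern only targets the line format
  shown in this excerpt and assumes each log entry is on a single
  line; other kernel versions may print different fields. All names
  are hypothetical.

```python
import re

# Matches lines such as:
# [29871.140816] Killed process 2920 (ruby) total-vm:675048kB, anon-rss:51184kB, file-rss:2164kB
KILLED_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_oom_kills(log):
    """Return one dict per 'Killed process' line found in the log text."""
    kills = []
    for line in log.splitlines():
        match = KILLED_LINE.search(line)
        if match is None:
            continue  # not an OOM kill line (e.g. cloud-init output)
        kills.append({
            "timestamp": float(match.group("ts")),  # seconds since boot
            "pid": int(match.group("pid")),
            "name": match.group("name"),
            "total_vm_kb": int(match.group("total_vm")),
            "anon_rss_kb": int(match.group("anon_rss")),
            "file_rss_kb": int(match.group("file_rss")),
        })
    return kills
```

  Summing "anon_rss_kb" over the result gives a rough idea of how
  little resident memory the killed processes actually held.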

  I have a hunch that this may be related to the fix for
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1647400,
  introduced in linux (4.4.0-58.79).

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.4.0-59-generic 4.4.0-59.80
  ProcVersionSignature: User Name 4.4.0-59.80-generic 4.4.35
  Uname: Linux 4.4.0-59-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jan 12 06:29 seq
   crw-rw---- 1 root audio 116, 33 Jan 12 06:29 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.1-0ubuntu2.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Thu Jan 12 06:38:45 2017
  Ec2AMI: ami-0f93966c
  Ec2AMIManifest: (unknown)
  Ec2AvailabilityZone: ap-southeast-2a
  Ec2InstanceType: t2.nano
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  MachineType: Xen HVM domU
  PciMultimedia:

  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 cirrusdrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-59-generic root=UUID=fb0fef08-f3c5-40bf-9776-f7ba00fe72be ro console=tty1 console=ttyS0
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-59-generic N/A
   linux-backports-modules-4.4.0-59-generic  N/A
   linux-firmware                            1.157.6
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 12/09/2016
  dmi.bios.vendor: Xen
  dmi.bios.version: 4.2.amazon
  dmi.chassis.type: 1
  dmi.chassis.vendor: Xen
  dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd12/09/2016:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
  dmi.product.name: HVM domU
  dmi.product.version: 4.2.amazon
  dmi.sys.vendor: Xen

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1655842/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
