Public bug reported:

Ever since the "ubuntu-bionic-18.04-amd64-server-20200729" EC2 Ubuntu
AMI was released, which ships the "5.3.0-1032-aws" kernel, we have been
hitting a 100% reproducible memory leak that causes our app, which runs
under Docker, to be OOM-killed.

The scenario: our app runs in a Docker container and occasionally
catches a crash within itself. When that happens, it spawns another
process that triggers a gdb dump of the parent app. Normally this works
fine, but under these specific kernels memory usage keeps growing until
it hits the container's memory limit, at which point the container is
killed.
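
For anyone trying to observe the growth, a minimal sketch follows. It is
an illustration only, not the reporter's tooling. It assumes cgroup v1,
where the container's memory counters are readable inside the container
as memory.usage_in_bytes and memory.limit_in_bytes under
/sys/fs/cgroup/memory. The function names and the default path are
hypothetical choices for this example.

```python
import time
from pathlib import Path

# Assumption: cgroup v1 memory controller as seen from inside the container.
CGROUP_MEMORY_DIR = Path("/sys/fs/cgroup/memory")


def read_memory_usage(cgroup_dir=CGROUP_MEMORY_DIR):
    """Return (usage_bytes, limit_bytes, fraction_used) for a cgroup v1 directory."""
    cgroup_dir = Path(cgroup_dir)
    usage = int((cgroup_dir / "memory.usage_in_bytes").read_text().strip())
    limit = int((cgroup_dir / "memory.limit_in_bytes").read_text().strip())
    # With no limit configured the kernel reports a very large limit,
    # so the fraction will simply be close to zero.
    return usage, limit, usage / limit


def sample_memory(samples, interval_s=1.0, cgroup_dir=CGROUP_MEMORY_DIR):
    """Take `samples` readings, sleeping `interval_s` seconds between them."""
    readings = []
    for i in range(samples):
        readings.append(read_memory_usage(cgroup_dir))
        if i < samples - 1:
            time.sleep(interval_s)
    return readings
```

Calling sample_memory(60) while the dump runs and printing the readings
should show whether usage climbs steadily toward the limit.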

I have tested several of the latest available Ubuntu AMIs, including
the latest, "ubuntu-bionic-18.04-amd64-server-20210415", which has the
"5.4.0-1045-aws" kernel, and the bug still exists.

I also tested a number of mainline kernels and found that the fix for
this memory leak was introduced in the v5.9-rc4 kernel
(https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.9-rc4/CHANGES).
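
As a rough aid for triaging hosts, the sketch below flags kernel release
strings that predate the 5.9 series. It is a heuristic under stated
assumptions: it relies only on the mainline testing described above,
compares major.minor only (so 5.9-rc1 through rc3 are not
distinguished from rc4), and cannot tell whether a distro kernel
carries a backport. The names FIXED_IN, parse_kernel_release, and
predates_fix are hypothetical.

```python
import platform
import re

# Assumption: mainline kernels from the 5.9 series onward contain the fix,
# based on the v5.9-rc4 finding reported in this bug.
FIXED_IN = (5, 9)


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '5.4.0-1045-aws'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def predates_fix(release):
    """Return True if the release is older than the 5.9 series."""
    return parse_kernel_release(release) < FIXED_IN


def running_kernel_predates_fix():
    """Apply the same check to the running kernel (same value as `uname -r`)."""
    return predates_fix(platform.release())
```

For the kernels named in this report, predates_fix("5.3.0-1032-aws") and
predates_fix("5.4.0-1045-aws") both return True.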

Do you have any idea if or when that set of changes will be backported
to a supported kernel for Ubuntu 18.04 or 20.04?

Release we are running:
root@<redacted>:~# lsb_release -rd
Description:    Ubuntu 18.04.5 LTS
Release:        18.04

Docker / containerd.io versions:
- containerd.io: 1.4.4-1
- docker-ce: 5:20.10.5~3-0~ubuntu-bionic

Latest supported kernel I tried that still shows the memory leak:
root@hostname:~# apt-cache policy linux-aws
linux-aws:
  Installed: 5.4.0.1045.27
  Candidate: 5.4.0.1045.27
  Version table:
 *** 5.4.0.1045.27 500
        500 http://us-east-1.ec2.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages
        100 /var/lib/dpkg/status
     4.15.0.1007.7 500
        500 http://us-east-1.ec2.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

Thanks,
Paul

** Affects: linux-aws (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1925261

Title:
  memory leak on AWS kernels when using docker
