Bug#990263: podman sets oom_score_adj to -1000 for processes inside the container so the system breaks in OOM situations

2021-07-01 Thread Max Bruckner
I just found the bug and the fix! It's not in podman but in conmon!

See https://github.com/containers/conmon/releases/tag/v2.0.29 and
https://github.com/containers/conmon/commit/b033cb5dfde6de05e63408fc839f1bb641cddd85


On Tue, 29 Jun 2021 00:08:48 +0900 Hideki Yamane wrote:
>  Well, I've tested it too with bullseye on KVM and reproduced it; however,
>  it happens only with root privileges. Just run "$ podman run -it --rm debian sh"
>  as a normal user and it returns 0.

Yes, when running as a normal user, podman simply doesn't have permission to
set negative OOM score adjustments; that's why it's 0.

>  I also tested on my daily-driver unstable system and cannot reproduce it there.
>  (But sid on KVM can reproduce it, hmm...)

Probably because it's the conmon version that matters. I can reproduce it on
Archlinux as well by downgrading conmon to 2.0.28.
 
>  It may be better to downgrade the severity to important if it only happens with root privileges, IMO.

I'm new to Debian bug reports and only saw the "breaks the whole system"
criterion in the list that "reportbug" printed.
So feel free to downgrade. I'm not sure whether I have permission to do so as
the bug reporter, and even if I do, I don't know how.



Bug#990263: podman sets oom_score_adj to -1000 for processes inside the container so the system breaks in OOM situations

2021-06-28 Thread Hideki Yamane

Hi,

On Thu, 24 Jun 2021 11:37:35 +0200 Max Bruckner wrote:
> How to reproduce:
> 
> ```
> # podman run -it --rm debian sh
> # cat /proc/$$/oom_score_adj
> -1000
> ```

 Well, I've tested it too with bullseye on KVM and reproduced it; however,
 it happens only with root privileges. Just run "$ podman run -it --rm debian sh"
 as a normal user and it returns 0.

 I also tested on my daily-driver unstable system and cannot reproduce it there.
 (But sid on KVM can reproduce it, hmm...)


 It may be better to downgrade the severity to important if it only happens with root privileges, IMO.


-- 
Regards,

 Hideki Yamane henrich @ debian.org/iijmio-mail.jp



Bug#990263: podman sets oom_score_adj to -1000 for processes inside the container so the system breaks in OOM situations

2021-06-24 Thread Max Bruckner
Package: podman
X-Debbugs-Cc: m...@doo.shop
Version: 3.0.1+dfsg1-2+b2
Severity: critical
Justification: breaks the whole system
Tags: newcomer

Dear Maintainer,

When processes inside a podman container consume all the available
memory, system processes get killed instead of the processes inside
the container. This is because podman in this version seems to set an
oom_score_adj value of -1000 for all processes inside the container,
which exempts them from being selected by the OOM killer.

Marked as critical because what would normally just result in a process
being killed by the OOM killer now affects the entire system to the
point that it isn't accessible via SSH anymore.

This seems to be fixed at least in podman 3.2.1 (tested on Archlinux), but I
haven't found a corresponding entry in the upstream release notes, so I don't
know which version actually contains the fix. I also don't know whether the
problem is in podman itself or in one of its dependencies, or whether it is
present in the upstream version at all.

How to reproduce:

```
# podman run -it --rm debian sh
# cat /proc/$$/oom_score_adj
-1000
```

I would expect this to show 0 for the oom_score_adj value.

I tried to work around this problem by passing --oom-score-adj=0 to the
podman command, but with no effect (this might be the same bug or
a different, related one).

```
# podman run -it --rm --oom-score-adj=0 debian sh
# cat /proc/$$/oom_score_adj
-1000
```

What DOES work, however, is setting a nonzero value:

```
# podman run -it --rm --oom-score-adj=1 debian sh
# cat /proc/$$/oom_score_adj
1
```

This is probably related to a typical Go programming error where a value
of 0 is interpreted as "absence of a value" and a default fallback is
used instead, but this is just a guess.
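
The kind of error meant here can be sketched as follows. This is a hypothetical illustration only, not podman's or conmon's actual code: `resolveBuggy`, `resolveFixed` and `defaultOOMScoreAdj` are made-up names, and the -1000 fallback is assumed purely to mirror the observed behaviour.

```go
package main

import "fmt"

// defaultOOMScoreAdj is an assumed fallback, chosen to mirror the value
// observed in the report. It is not taken from any real code base.
const defaultOOMScoreAdj = -1000

// resolveBuggy treats the zero value of int as "not set", so an explicit
// request for 0 is silently replaced by the fallback.
func resolveBuggy(requested int) int {
	if requested == 0 {
		return defaultOOMScoreAdj
	}
	return requested
}

// resolveFixed takes a pointer, so nil means "not set" and an explicit 0
// is preserved.
func resolveFixed(requested *int) int {
	if requested == nil {
		return defaultOOMScoreAdj
	}
	return *requested
}

func main() {
	zero, one := 0, 1
	fmt.Println(resolveBuggy(zero))  // explicit 0 is lost
	fmt.Println(resolveBuggy(one))   // nonzero values pass through
	fmt.Println(resolveFixed(&zero)) // explicit 0 is kept
	fmt.Println(resolveFixed(nil))   // unset falls back to the default
}
```

With a pattern like `resolveBuggy`, passing 0 and passing nothing are indistinguishable, which would match the observation that --oom-score-adj=1 works while --oom-score-adj=0 does not.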


-- System Information:
Debian Release: 11.0
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-7-amd64 (SMP w/2 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages podman depends on:
ii  conmon                           2.0.25+ds1-1
ii  containernetworking-plugins      0.9.0-1+b5
ii  crun                             0.17+dfsg-1
ii  golang-github-containers-common  0.33.4+ds1-1
ii  init-system-helpers              1.60
ii  iptables                         1.8.7-1
ii  libc6                            2.31-12
ii  libdevmapper1.02.1               2:1.02.175-2.1
ii  libgpgme11                       1.14.0-1+b2
ii  libseccomp2                      2.5.1-1

Versions of packages podman recommends:
pn  buildah   
pn  catatonit | tini | dumb-init  
pn  fuse-overlayfs
pn  golang-github-containernetworking-plugin-dnsname  
pn  slirp4netns   
pn  uidmap

Versions of packages podman suggests:
pn  containers-storage  
pn  docker-compose  

-- no debconf information