Bug#1026793: docker.io: --memory-swap not enforced on systemd cgroup driver

2023-01-11 Thread Saj Goonatilleke
Hi Shengjing,

On 7 Jan 2023, at 6:48, Saj Goonatilleke wrote:
> You are right:  something is missing from the original report.
> Now I am unable to reproduce the problem on the same machine
> that first exhibited the problem.
>
> I have reinstated our production experiment using the same config from before.
>
> Will make a note to follow up here next week.

I was unable to repro.  Our production (bullseye) systems seem fine now.
Please feel free to close this bug.

On a hunch, I also experimented with restart policies
(thinking that the cgroup limits may disappear on an automatic restart),
but even that seemed to work OK after an OOM kill of PID 1.
Very perplexing.

My apologies for the noise.



Bug#1026793: docker.io: --memory-swap not enforced on systemd cgroup driver

2023-01-06 Thread Saj Goonatilleke
Hi Shengjing,

On 2 Jan 2023, at 6:44, Shengjing Zhu wrote:
> Can't reproduce it on bullseye as well.

You are right:  something is missing from the original report.
Now I am unable to reproduce the problem on the same machine
that first exhibited the problem.

Either this is a case of user error on my part,
or the problem only strikes when some other variable -- so far unknown --
is also present.

I have reinstated our production experiment using the same config from before.
I surveyed one machine by hand and it looked OK.  Data from the others
should trickle in over the coming days.  If everything still looks OK,
I suppose we'll just put this one down to PEBKAC.

Will make a note to follow up here next week.

Thank you!



Bug#1026793: docker.io: --memory-swap not enforced on systemd cgroup driver

2022-12-21 Thread Saj Goonatilleke
Package: docker.io
Version: 20.10.5+dfsg1-1+deb11u1
Severity: normal

Hello,

https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory

--memory-swap imparts no effect.
--memory does impart an effect, but is useless without --memory-swap.
Anon pages will begin to overflow into swap once a workload approaches
its --memory limit.  Instead of a quick OOM and workload restart,
the workload will bring down all system perf with swap thrashing.

Docker is using the systemd cgroup driver.
README.Debian includes a note about swapaccount,
but I don't think this caveat applies to cgroups v2 and linux 5.10.
As shown below, swap accounting does appear to work OK.

--- 8< ---
$ docker info
[...]
 Server Version: 20.10.5+dfsg1
[...]
 Cgroup Driver: systemd
 Cgroup Version: 2
--- >8 ---

Here are the relevant bits of the Docker container configuration:

--- 8< ---
$ docker inspect container | jq '.[0].HostConfig.Memory'
25165824
$ docker inspect container | jq '.[0].HostConfig.MemorySwap'
25165824
--- >8 ---

The configuration is faithfully sent to containerd:

--- 8< ---
# ctr --namespace moby c info container-id | jq .Spec.linux.resources.memory
{
  "limit": 25165824,
  "swap": 25165824
}
--- >8 ---

The swap limit goes missing somewhere between containerd and systemd:

--- 8< ---
# systemctl show docker-container-id.scope | awk -F = '$1 ~ /Memory.*Max/'
MemoryMax=25165824
MemorySwapMax=infinity
--- >8 ---

From the cgroup:

--- 8< ---
# cat memory.swap.max
max

# cat memory.current memory.swap.current
22798336
218423296
--- >8 ---
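
A minimal sketch of how the cgroup check above might be scripted.
Assumptions: report_swap_limit is a hypothetical helper name, and the
container's cgroup directory is passed in by the caller (with the systemd
driver on cgroups v2 it is usually something like
/sys/fs/cgroup/system.slice/docker-container-id.scope, but that path is
not confirmed by this report).

```shell
#!/bin/sh
# Hypothetical helper: report whether a cgroup v2 directory has a swap limit.
# $1 is the cgroup directory that contains memory.swap.max.
# Returns 0 if swap is limited, 1 if unlimited, 2 if the file is unreadable.
report_swap_limit() {
    cgroup_dir=$1
    swap_max=$(cat "$cgroup_dir/memory.swap.max") || return 2
    if [ "$swap_max" = "max" ]; then
        # "max" is how cgroup v2 spells "no limit".
        echo "UNLIMITED swap in $cgroup_dir"
        return 1
    fi
    echo "swap limited to $swap_max bytes in $cgroup_dir"
}
```

Calling report_swap_limit with the scope's cgroup directory on an affected
machine would be expected to take the "UNLIMITED" branch, given the
memory.swap.max reading shown above.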

I would expect memory.swap.max to read zero (swap - limit),
and likewise for memory.swap.current.
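
The arithmetic behind that expectation, as a rough sketch: Docker's
--memory-swap is the combined memory + swap ceiling, whereas
memory.swap.max in cgroups v2 counts swap alone.  expected_swap_max is a
hypothetical helper written for illustration; it is not taken from Docker,
containerd, or runc.

```shell
#!/bin/sh
# Hypothetical helper: derive the expected cgroup v2 memory.swap.max from
# Docker's HostConfig.Memory ($1) and HostConfig.MemorySwap ($2), in bytes.
expected_swap_max() {
    memory=$1
    memory_swap=$2
    if [ "$memory_swap" -eq -1 ]; then
        # Docker uses -1 for unlimited swap; cgroup v2 spells that "max".
        echo max
    elif [ "$memory_swap" -lt "$memory" ]; then
        # Values below the memory limit (including 0, which Docker treats
        # as "unset") are rejected here to keep the sketch simple.
        echo "error: memory-swap is smaller than memory" >&2
        return 1
    else
        # Swap-only allowance = combined limit - memory limit.
        echo $((memory_swap - memory))
    fi
}

# Values from the docker inspect output in this report; prints 0.
expected_swap_max 25165824 25165824
```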

I tried to find the missing puzzle piece,
but there are many pieces in this puzzle.
(Is this a runc problem?)

--- 8< ---
$ docker version
Client:
 Version:           20.10.5+dfsg1
 API version:       1.41
 Go version:        go1.15.15
 Git commit:        55c4c88
 Built:             Sat Dec  4 10:53:03 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.5+dfsg1
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.15.15
  Git commit:       363e9a8
  Built:            Sat Dec  4 10:53:03 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.13~ds1
  GitCommit:        1.4.13~ds1-1~deb11u2
 runc:
  Version:          1.0.0~rc93+ds1
  GitCommit:        1.0.0~rc93+ds1-5+deb11u2
 docker-init:
  Version:          0.19.0
  GitCommit:
--- >8 ---

-- System Information:
Debian Release: 11.6
  APT prefers stable-security
  APT policy: (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-14-cloud-amd64 (SMP w/8 CPU threads)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages docker.io depends on:
ii  adduser              3.118
ii  containerd           1.4.13~ds1-1~deb11u2
ii  init-system-helpers  1.60
ii  iptables             1.8.7-1
ii  libc6                2.31-13+deb11u5
ii  libdevmapper1.02.1   2:1.02.175-2.1
ii  libsystemd0          247.3-7+deb11u1
ii  lsb-base             11.1.0
ii  runc                 1.0.0~rc93+ds1-5+deb11u2
ii  tini                 0.19.0-1

Versions of packages docker.io recommends:
pn  apparmor
ii  ca-certificates  20210119
pn  cgroupfs-mount
ii  git              1:2.30.2-1
pn  needrestart
ii  xz-utils         5.2.5-2.1~deb11u1

Versions of packages docker.io suggests:
pn  aufs-tools
pn  btrfs-progs
pn  debootstrap
pn  docker-doc
ii  e2fsprogs                  1.46.2-2
pn  rinse
pn  rootlesskit
pn  xfsprogs
pn  zfs-fuse | zfsutils-linux

-- Configuration Files:
/etc/default/docker changed:
DOCKER_OPTS="--bip 172.17.0.1/16 --log-opt max-size=1m --log-opt max-file=2 --live-restore=true --raw-logs --insecure-registry=REDACTED:5000 --insecure-registry=REDACTED:5000"


-- no debconf information



Bug#640456: snmpd: init script may try to restart daemon before it terminates

2011-09-04 Thread Saj Goonatilleke
Package: snmpd
Version: 5.4.3~dfsg-2

Problem:

`invoke-rc.d snmpd restart' may sometimes fail if the snmpd daemon 
does not terminate by the time the init script's 2-second sleep 
expires:

invoke-rc.d: initscript snmpd, action "restart" failed.

The chance of failure is high enough that, at sites where snmpd is 
restarted across several hundred servers every day, the restart 
reliably fails somewhere.

Expected behaviour:

The init script should wait for the daemon to terminate before 
trying to start it back up again.  Adding --retry to 
start-stop-daemon solved the problem for me.
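
To illustrate the expected behaviour (this is not the proposed patch, 
which relies on --retry instead), here is a rough sketch of waiting for 
a process to exit before starting it again.  wait_for_exit is a 
hypothetical helper, and polling with kill -0 is an assumption made for 
the example.

```shell
#!/bin/sh
# Hypothetical helper: poll until process $1 has gone away, giving up
# after $2 seconds.  Returns 0 once the process is gone, 1 on timeout.
# Note: kill -0 also succeeds for an un-reaped child of this shell, so
# this is only meaningful for processes the shell is not the parent of.
wait_for_exit() {
    pid=$1
    timeout=$2
    elapsed=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

An init script could call this between its stop and start steps and 
only start the daemon once the function returns 0.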

Patch:

-- 8< --
--- a/debian/snmpd.init
+++ b/debian/snmpd.init
@@ -55,18 +55,16 @@
 ;;
   stop)
 echo -n "Stopping network management services:"
-start-stop-daemon --quiet --stop --oknodo --exec /usr/sbin/snmpd
+start-stop-daemon --quiet --stop --oknodo --retry 2 --exec /usr/sbin/snmpd
 echo -n " snmpd"
-start-stop-daemon --quiet --stop --oknodo --exec /usr/sbin/snmptrapd
+start-stop-daemon --quiet --stop --oknodo --retry 2 --exec /usr/sbin/snmptrapd
 echo -n " snmptrapd"
 echo "."
 ;;
   restart)
 echo -n "Restarting network management services:"
-start-stop-daemon --quiet --stop --oknodo --exec /usr/sbin/snmpd
-start-stop-daemon --quiet --stop --oknodo --exec /usr/sbin/snmptrapd
-# Allow the daemons time to exit completely.
-sleep 2
+start-stop-daemon --quiet --stop --oknodo --retry 2 --exec /usr/sbin/snmpd
+start-stop-daemon --quiet --stop --oknodo --retry 2 --exec /usr/sbin/snmptrapd
 if [ "$SNMPDRUN" = "yes" -a -f /etc/snmp/snmpd.conf ]; then
 start-stop-daemon --quiet --start --exec /usr/sbin/snmpd -- $SNMPDOPTS
 echo -n " snmpd"
-- >8 --

Thanks!


