Hi all, for documentation purposes I am logging our IRC conversation here.
[20:24:27] <mbiebl[m]> elbrus et al: just in case you have already seen santiago_'s email: what's your thought on #959022
[20:24:38] [zwiebelbot] Debian#959022: cgroup-tools: does not work in cgroup2 / unified hierarchy - https://bugs.debian.org/959022
[20:26:24] <mbiebl[m]> do you lean towards getting libcgroup updated to support cgroupv2, or documenting in the release notes that rdeps of libcgroup will have to switch to the old cgroupv1 setup if they need this functionality?
[20:29:13] -*- bunk looks at https://tracker.debian.org/news/1236679/accepted-clsync-045-2-source-into-unstable/ and wonders whether the remaining 3 rdeps should/can also drop dependencies in the latter case.
[20:31:24] <elbrus> I think zigo said cinder needs it
[20:32:10] <elbrus> no idea what that is, but I understand it's part of the OpenStack stack?
[20:33:25] -*- bunk wonders whether there is a reviewable change for the "fix by adding/backporting cgroupv2" option, since that would IMHO sound better than documenting manual changes in the release notes
[20:34:50] <mbiebl[m]> elbrus: good question. It sounds more like cinder can use cgroup in certain configurations. Not sure if it's really a hard dependency
[20:35:55] <mbiebl[m]> Then again, I don't really know OpenStack either. zigo_ are you around?
[20:38:27] <elbrus> mbiebl[m]: so any user of any of these rdeps would need to run their full system in a "less supported" way?
[20:38:46] <elbrus> sounds ... not ideal
[20:38:55] <mbiebl[m]> elbrus: yeah, it's an all-or-nothing setting
[20:39:01] <elbrus> boo
[20:39:40] <mbiebl[m]> right, this makes me a bit nervous. I'm not sure where cgroupv1 support will be 3 years down the road
[20:40:03] <mbiebl[m]> I can imagine that systemd upstream won't have any interest in bugs that might result from it
[20:40:04] <elbrus> security risks?
[20:40:14] <elbrus> or just bugs?
[20:41:02] <elbrus> by the way, I'll probably copy/paste this discussion into the bug unless anybody objects
[20:41:11] <mbiebl[m]> hm, not sure if it has security implications
[20:41:46] <elbrus> and what's the amount of updates systemd normally gets in a release?
[20:41:47] <mbiebl[m]> I can tell you in 3 years :-)
[20:41:56] <elbrus> obviously ;)
[20:42:26] <elbrus> but I mean, do you consider this a risky surface, or is it most likely OK?
[20:46:21] <bunk> cinder using libcgroup to configure I/O bandwidth looks like optional functionality already present upstream in buster but only enabled in bullseye.
[20:46:51] <ansgar> elbrus: AFAIU the cgroup stuff is mostly resource control, so not security critical. But by bookworm people should probably finish migrating to cgroupv2...
[20:46:57] <mbiebl[m]> booting with hybrid/cgroupv1 probably works ok (still).
[20:46:57] <mbiebl[m]> But a/ you first need to know that you have to fiddle with grub / kernel command line settings to make it work
[20:46:57] <mbiebl[m]> I'm not sure if the error messages in cinder/OpenStack give any clue what the problem is
[20:47:30] <mbiebl[m]> and b/ I'd prefer if all users would just use cgroupv2, as this makes it easier for me as systemd maintainer
[20:48:10] <elbrus> ansgar: I don't think there's a question about bookworm :)
[20:48:13] <ansgar> From #d-systemd yesterday: "cgcreate: libcgroup initialization failed: Cgroup is not mounted" seems to be the error message from cgcreate (which cinder calls)
[20:49:24] <ansgar> Maybe it's easy to patch relevant places in cgroup-tools to include some Debian-specific note pointing to the kernel command-line arguments or some README file?
[20:49:29] <elbrus> then we should ask zigo to disable that new option?
[20:49:54] <mbiebl[m]> thanks ansgar
[20:50:16] <ansgar> (I have no idea how discoverable these messages are to users if it's some other tool calling cgcreate.)
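(Editor's note, added while logging: the "Cgroup is not mounted" failure discussed above can be checked for quickly. On a unified-hierarchy system, /sys/fs/cgroup is a single cgroup2 mount and the per-controller v1 mounts that cgcreate expects are absent. A minimal check, assuming a Linux system with /sys mounted:)

```shell
# Report which filesystem is mounted at /sys/fs/cgroup.
# "cgroup2fs" => unified (cgroupv2-only) hierarchy; v1 tools such as
#               cgcreate will fail with "Cgroup is not mounted".
# "tmpfs"    => legacy/hybrid layout with per-controller cgroupv1 mounts.
stat -fc %T /sys/fs/cgroup
```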
[20:50:51] <mbiebl[m]> I guess the only really relevant rdep of libcgroup is cinder, i.e. the OpenStack suite
[20:50:53] <zigo> elbrus: Cinder uses "cgcreate" from cgroupv1 to do block device QoS, so yeah, we need to document the kernel command line parameters and keep cgroups v1 around if possible.
[20:51:12] <zigo> elbrus: I did test the command line arguments that I documented in the bug, and it does work perfectly.
[20:51:19] <elbrus> zigo: I read the bug
[20:51:47] <elbrus> FSVO perfectly
[20:52:06] <elbrus> the maintainer of systemd disagrees about it being perfectly supported
[20:52:19] <bunk> nova looks similar to cinder?
[20:52:29] <bunk> then there's mininet
[20:52:52] <mbiebl[m]> mininet is a leaf package
[20:53:47] <elbrus> cinder looks like that too (at least on popcon)
[20:54:21] <elbrus> (sorry, not leaf, didn't check that)
[20:54:26] <mbiebl[m]> as said, I'm not really familiar with the OpenStack suite of software
[20:55:03] <mbiebl[m]> so I don't know if cinder is actually entirely optional (my understanding was, it isn't)
[20:56:05] <bunk> IMHO plan A would be if santiago would have a reasonable package to update cgroup-tools (there is some preliminary package mentioned in the bug)
[20:56:32] <zigo> The thing to write on the kernel command line is:
[20:56:32] <zigo> systemd.unified_cgroup_hierarchy=false systemd.legacy_systemd_cgroup_controller=false
[20:56:49] <zigo> With it, cgroups v1 is mounted somewhere in /sys.
[20:57:04] <bunk> "new upstream version" sounds better than "keep the old version that requires manual kernel commandline changes"
[20:57:06] <mbiebl[m]> right, this turns your whole system to cgroupv1
[20:57:50] <zigo> bunk: Nova has the dependency because it can also do disk I/O scheduling.
[20:58:03] <zigo> Though Nova doesn't call cgcreate, as far as I can tell.
[20:59:05] <mbiebl[m]> zigo: would nova/cinder still work if the cgcreate* calls fail?
[20:59:17] <mbiebl[m]> I.e. would the services just run "unrestricted"
[20:59:21] <mbiebl[m]> IO wise
[21:01:45] <mbiebl[m]> Apologies, I'm not well prepared. Wasn't really around the last couple of days, so I couldn't properly research the situation.
[21:05:31] <elbrus> no worries, please try to align with zigo and make sure the bug severity matches the outcome; if you *need* our call to judge you're welcome to come back
[21:05:54] <mbiebl[m]> yeah, it probably makes sense to table this discussion
[21:06:12] <elbrus> release-notes are there to take warnings/instructions if that's required
[21:06:42] <elbrus> I'll copy/paste this into the bug for documentation.
[21:06:45] <mbiebl[m]> my main motivation for my initial question was to find out if updating libcgroup was still an option from the RT POV
[21:07:33] <mbiebl[m]> if not, this would change the options we have
[21:07:56] <elbrus> we would rather not see a new upstream release, especially with half-baked code, but if that's by far the best solution and it's reviewable...
[21:08:35] <zigo> mbiebl[m]: They would, but if one configures QoS, they will fail.
[21:08:42] <elbrus> removal is also an option, but zigo will object ;)
[21:09:09] <zigo> I object to a package which is completely working, and just needs a bit of configuration, getting removed, indeed... :)
[21:09:17] <mbiebl[m]> right, at the very least nova/cinder would have to drop the hard depends
[21:09:23] <elbrus> zigo: it's about support from systemd
[21:09:46] <zigo> elbrus: Does systemd need to get this dropped, somehow?
[21:09:49] <mbiebl[m]> question is if a hard depends is actually the correct dependency, if it's "only" optional functionality
[21:10:07] <elbrus> I understand cgroups v1 isn't really supported anymore
[21:10:17] <mbiebl[m]> zigo: the question is more that cgroupv1 is no longer actively used anymore
[21:10:22] <zigo> mbiebl[m]: It's not how I see things. In a public cloud, if you don't set I/O restrictions, it quickly becomes a mess.
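(Editor's note, added while logging: for reference, the kernel command-line options zigo quoted above would typically be made persistent via GRUB. A sketch only, assuming a standard Debian /etc/default/grub whose existing value is just "quiet"; adapt to your actual configuration:)

```shell
# /etc/default/grub fragment (sketch; requires root, takes effect on reboot).
# Append the two options to the existing GRUB_CMDLINE_LINUX_DEFAULT value:
GRUB_CMDLINE_LINUX_DEFAULT="quiet systemd.unified_cgroup_hierarchy=false systemd.legacy_systemd_cgroup_controller=false"
# then regenerate the boot configuration:
#   update-grub && reboot
```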
[21:10:30] <mbiebl[m]> so it might have effects on other parts of your system
[21:10:49] <zigo> One user will abuse it and the Ceph cluster becomes unusable...
[21:11:06] <zigo> Or a volume live migration will take all of the available bandwidth.
[21:11:53] <zigo> So yeah, we *CAN* make a cloud deployment work without it, but that's absolutely against all the recommendations.
[21:12:23] <zigo> Also, as you wrote, our users will not understand what's going on...
[21:12:36] <mbiebl[m]> zigo: does a standard/default cinder/nova setup use IO cgroup limits?
[21:12:48] <zigo> It's only going to be a "cgcreate command failed" in the logs.
[21:13:30] <zigo> mbiebl[m]: Usually, you set that up in the flavors. In some cases, even users can configure it themselves, if you give them enough rights to create VM flavors.
[21:14:34] <mbiebl[m]> Do you expect that most configurations/setups will use cgroups?
[21:14:59] <zigo> mbiebl[m]: I do expect that any reasonable deployment will need I/O QoS, yeah.
[21:15:17] <zigo> For your single all-in-one server in your garage, you probably don't need it.
[21:15:22] <mbiebl[m]> ok
[21:15:42] <zigo> But with some reasonable workload (let's say 1000 VMs), as resources are shared ...
[21:16:09] <zigo> Also, just picture a public cloud with some abusers doing crypto mining and other disk-I/O-intensive stuff ...
[21:16:27] <zigo> Without QoS it becomes an operator hell.
[21:16:37] <bunk> How does that work in buster?
[21:16:42] <mbiebl[m]> I guess we should then take this discussion back to #debian-systemd and figure out if we can make this use-case work with the patched libcgroup from santiago ootb
[21:16:52] <mbiebl[m]> that would be my preferred outcome
[21:17:18] <mbiebl[m]> i.e. users not needing to fiddle with kernel command line parameters
[21:17:30] <zigo> I've asked Cinder upstream to look into it and work on cgroups v2, but it's not reasonable to expect a solution from them before Bullseye is out.
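(Editor's note, added while logging: for context, the cgroupv1 I/O QoS mechanism under discussion comes down to cgroup-tools calls along these lines. This is an illustrative sketch only: the cgroup name, device numbers, and rate are made up here, not what cinder actually uses, and running it requires root plus a mounted v1 blkio hierarchy:)

```shell
# Create a cgroupv1 blkio cgroup; this is the kind of cgcreate call that
# fails with "Cgroup is not mounted" on a cgroupv2-only system.
cgcreate -g blkio:/io-throttle-demo

# Cap reads from device 8:0 (typically /dev/sda) at 10485760 bytes/s (10 MiB/s).
cgset -r blkio.throttle.read_bps_device="8:0 10485760" io-throttle-demo

# Run a process inside the cgroup so the limit applies to its I/O.
# (/var/tmp/testfile is a hypothetical file used for illustration.)
cgexec -g blkio:/io-throttle-demo dd if=/var/tmp/testfile of=/dev/null bs=1M

# Clean up the demo cgroup afterwards.
cgdelete -g blkio:/io-throttle-demo
```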
[21:17:43] <zigo> bunk: In buster, cgcreate just works by default ...
[21:18:06] <zigo> mbiebl[m]: What does the patch from Santiago do?
[21:18:27] <bunk> zigo: In buster, there are no dependencies from openstack on libcgroup
[21:18:44] <mbiebl[m]> it updates libcgroup to add support for cgroupv2, but rdeps of libcgroup probably will need changes too
[21:18:55] <mbiebl[m]> libcgroup/cgroup-tools
[21:19:10] <zigo> bunk: Though in the packages I maintain in my extrepo backports, there's such a dependency, even though that's not official...
[21:19:45] <zigo> bunk: I probably only discovered how to use disk QoS over the last 2 years; if I had known earlier, the buster OpenStack packages would have it.
[21:19:47] <mbiebl[m]> zigo: stupid question: do you have any idea how Fedora handles this?
[21:19:52] <elbrus> zigo: but how does that work with official (Debian) packages in buster then?
[21:20:09] <mbiebl[m]> or is OpenStack on Fedora uncommon enough to not be an issue
[21:20:37] <elbrus> zigo: so if we don't have it in buster, it's not a regression, is it?
[21:20:43] <elbrus> to not have it
[21:21:08] <zigo> Well, in Buster, cinder has a dependency on cgroup-bin, which is the old name for cgroup-tools.
[21:21:52] <zigo> My bad.
[21:22:17] <zigo> I have it in my git repo for cinder 2:13.0.7-1+deb10u1 (and in extrepo), but this was never published in official Buster.
[21:22:25] <zigo> Though it should have been ... :/
[21:22:44] <bunk> mbiebl[m]: On a semi-related note (especially if libcgroup gets updated or removed), should systemd have a Breaks on everything v1-only in buster, to avoid situations where you reboot into the new systemd with v1-using software? And as a follow-up question, how much does the switch to v2 break in buster-backports?
[21:25:59] <mbiebl[m]> bunk: hm, in most cases the cgroup bits are optional functionality. So kicking out the complete package because of it is probably a bit extreme.
[21:26:20] <mbiebl[m]> kicking out == having a (versioned) Breaks
[21:28:48] <zigo> mbiebl[m]: I'm sorry, but I think I still don't get it: what's the actual cost of keeping cgroups v1 for you as a systemd maintainer?
[21:29:34] <mbiebl[m]> bunk: as for buster-backports: I guess what's mostly relevant here is debci/lxc
[21:30:41] <elbrus> we don't have debci on buster-backports
[21:30:51] <ansgar> Do we expect any of the libcgroup* tools to be installed on general-purpose systems? I would expect OpenStack block storage services to just be installed on dedicated systems anyway.
[21:30:56] <mbiebl[m]> zigo: I won't remove any cgroupv1 code from systemd in Debian for the bullseye lifetime. The point is that running a cgroupv1 setup is not something that I do or systemd upstream does
[21:31:40] <ansgar> That should keep problems with the cgroupv1 fallback in systemd limited to not too many systems, running not too diverse software.
[21:32:39] <zigo> ansgar: I agree, and that's why I'm saying it's fine to just document the fact that users must configure it on the kernel command line on those dedicated hosts.
[21:34:30] <zigo> Also: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=985789
[21:34:39] [zwiebelbot] Debian#985789: unblock: openstack-debian-images/1.58 - https://bugs.debian.org/985789
[21:35:16] <zigo> (that's where I hard-wire the kernel command line options in my hardware provisioning software, so that cinder+nova nodes get set up correctly)
[21:37:15] <mbiebl[m]> zigo: I think it kinda sucks that you'd have to reconfigure any "reasonably big" OpenStack setup to use cgroupv1.
[21:38:14] <zigo> mbiebl[m]: It does, especially when people upgrade from Buster (where it's not needed) to Bullseye (where it becomes mandatory); however, it's still better than having no QoS support at all.
[21:38:20] <mbiebl[m]> but you mentioned earlier that updating cinder to support cgroupv2 is not something you'd expect to happen RSN. So I guess this is our only option
[21:38:33] <mbiebl[m]> then
[21:38:38] <zigo> Yeah.
[21:38:49] <zigo> And I can write stuff in my cluster upgrade script...
[21:39:10] <zigo> (which I still didn't have time to finish writing... I guess this can go in a point release update...)
[21:40:06] <mbiebl[m]> OK, so I guess it's indeed best if we keep libcgroup as-is.