Hi,

that's also a +1 from me — we also use containers heavily for scientific 
workflows, and know their benefits well.
But they are not the "best", or rather the most fitting, tool in every 
situation.
You have provided a great summary and I agree with all points; thank you a 
lot for this very competent and concise write-up.


Since static linking, and containers as a way to solve the issue of many 
inter-dependencies for production services, have both been mentioned in this 
lengthy thread as solutions,
I'd like to add another point to your list of complexities:
* Keeping production systems secure may be a lot more of a hassle.

Even though the following article is long and many may regard it as 
controversial, I'd like to link to this write-up by a packager 
discussing the topic in quite a generic way:
 
https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare/
While the article discusses the issues of static linking and of package 
management performed in language-specific ecosystems, it applies all the same 
to containers.

If I operate services in containers built by the developers, this of course 
ensures the setup works, the dependencies are well tested, and even upgrades 
go smoothly. But it also means that,
at the end of the day, if I run 50 services in 50 different containers from 50 
different upstreams, I'll have up to 50 different versions of OpenSSL floating 
around my production servers.
If a security issue is found in any of the packages used in all these container 
images, I now need to trust the security teams of all the 50 developer groups 
building these containers
(and most FOSS projects won't have the resources for that, understandably...),
instead of the one security team of the distro I use. And then, I also have to 
re-pull all these containers after finding out that a security fix has become 
available.
Or I need to build all these containers myself, effectively taking over the 
complete job and running my own security team.

This may scale somewhat well if you have a team of 50 people and every person 
takes care of one service. Containers are often your friend in this case[1],
since they allow you to isolate the different responsibilities along with the 
services.

But this is rarely the case outside of industry, and especially not in 
academia.
So the approach we chose for ourselves is to have one common OS everywhere and 
automate all of our deployment and configuration management with Puppet.
Of course, that puts us in one of the many corners out there, but it scales 
extremely well to all the services we operate,
and I can still trust the distro maintainers to keep the base OS safe on all 
our servers, automate reboots, etc.
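As an illustration only, a setup like this can keep the Ceph installation on 
distro packages in a few lines of Puppet. This is a hypothetical minimal 
sketch, not our actual manifests; the class name is made up:

```puppet
# Hypothetical sketch: keep Ceph on distro packages, managed by Puppet.
class profile::ceph_base {
  # Install the distro-packaged Ceph; the distro security team then
  # covers CVE fixes for it and for all shared libraries it links.
  package { 'ceph':
    ensure => installed,
  }

  # Let the base OS pull in security updates automatically.
  package { 'unattended-upgrades':
    ensure => installed,
  }
}
```

The point being: one code path, one trusted security team, for every service 
host.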

For Ceph, we've actually seen questions about security issues already on the 
list[0] (never answered AFAICT).


To conclude, I strongly believe there's no one-size-fits-all here.

That was why I was hopeful when I first heard about the Ceph orchestrator idea, 
back when it was planned to be modular,
with the different tasks implementable by several backends, so one could 
imagine them being implemented with containers, with classic SSH on bare metal 
(i.e. ceph-deploy-like), with Ansible, with Rook, or maybe others.
Sadly, it seems to have ended up being "container-only".
Containers certainly have many uses, and we run thousands of them daily, but 
they neither fit each and every existing requirement,
nor are they a magic bullet that solves all issues.

Cheers,
        Oliver


[0] 
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/PPLJIHT6WKYPDJ45HVJ3Z37375WIGKDW/
[1] But you may also just have a very well structured configuration management 
system fitting your organizational structure.


On 02.06.21 at 11:36, Matthew Vernon wrote:
Hi,

In the discussion after the Ceph Month talks yesterday, there was a bit of chat 
about cephadm / containers / packages. IIRC, Sage observed that a common reason 
in the recent user survey for not using cephadm was that it only worked on 
containerised deployments. I think he then went on to say that he hadn't heard 
any compelling reasons why not to use containers, and suggested that resistance 
was essentially a user education question[0].

I'd like to suggest, briefly, that:

* containerised deployments are more complex to manage, and this is not simply 
a matter of familiarity
* reducing the complexity of systems makes admins' lives easier
* the trade-off of the pros and cons of containers vs packages is not obvious, 
and will depend on deployment needs
* Ceph users will benefit from both approaches being supported into the future

We make extensive use of containers at Sanger, particularly for scientific workflows, and 
also for bundling some web apps (e.g. Grafana). We've also looked at a number of 
container runtimes (Docker, Singularity, Charliecloud). They do have advantages - it's 
easy to distribute a complex userland in a way that will run on (almost) any target 
distribution; rapid "cloud" deployment; some separation (via namespaces) of 
network/users/processes.

For what I think of as a 'boring' Ceph deploy (i.e. install on a set of 
dedicated hardware and then run for a long time), I'm not sure any of these 
benefits are particularly relevant and/or compelling - Ceph upstream produce 
Ubuntu .debs and Canonical (via their Ubuntu Cloud Archive) provide .debs of a 
couple of different Ceph releases per Ubuntu LTS - meaning we can easily 
separate out OS upgrade from Ceph upgrade. And upgrading the Ceph packages 
_doesn't_ restart the daemons[1], meaning that we maintain control over restart 
order during an upgrade. And while we might briefly install packages from a PPA 
or similar to test a bugfix, we roll those (test-)cluster-wide, rather than 
trying to run a mixed set of versions on a single cluster - and I understand 
this single-version approach is best practice.
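To make the "control over restart order" point concrete, an upgrade on a host 
could look roughly like the following. This is an illustrative sketch only 
(package selection and waiting between steps simplified), following the 
commonly recommended daemon order:

```sh
# Upgrade the Ceph packages; the running daemons are NOT restarted by this.
apt-get install --only-upgrade 'ceph*'

# Then restart daemons in the order we choose, checking health in between:
systemctl restart ceph-mon.target      # monitors first; wait for quorum (ceph -s)
systemctl restart ceph-mgr.target
systemctl restart ceph-osd.target      # ideally one OSD / host at a time
systemctl restart ceph-mds.target
systemctl restart ceph-radosgw.target
```

With containers, the image swap and the daemon restart are typically coupled, 
so this separation is exactly what you give up.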

Deployment via containers does bring complexity; some examples we've found at 
Sanger (not all Ceph-related, which we run from packages):

* you now have 2 process supervision points - dockerd and systemd
* docker updates (via distribution unattended-upgrades) have an unfortunate 
habit of rudely restarting everything
* docker squats on a chunk of RFC 1918 space (and telling it not to can be a 
bore), which coincides with our internal network...
* there is more friction if you need to look inside containers (particularly if 
you have a lot running on a host and are trying to find out what's going on)
* you typically need to be root to build docker containers (unlike packages)
* we already have package deployment infrastructure (which we'll need 
regardless of deployment choice)
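On the RFC 1918 point: Docker's address ranges can be overridden in 
/etc/docker/daemon.json, though you have to know to do it before the bridge 
networks exist. The ranges below are arbitrary examples, not a recommendation:

```json
{
  "bip": "192.168.200.1/24",
  "default-address-pools": [
    { "base": "192.168.201.0/24", "size": 28 }
  ]
}
```

("bip" sets the default bridge's address; "default-address-pools" controls 
what later user-defined networks are carved from.)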

We also currently use systemd overrides to tweak some of the Ceph units (e.g. 
to do some network sanity checks before bringing up an OSD), and have some 
tools to pair OSD / journal / LVM / disk device up; I think these would be more 
fiddly in a containerised deployment. I'd accept that fixing these might just 
be a SMOP[2] on our part.
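A systemd drop-in for that kind of tweak might look like the following sketch 
(e.g. created via `systemctl edit ceph-osd@.service`; the check script path is 
hypothetical):

```ini
# /etc/systemd/system/ceph-osd@.service.d/override.conf
[Service]
# Run a local network sanity check before the OSD starts;
# a non-zero exit keeps the OSD from coming up.
ExecStartPre=/usr/local/sbin/check-cluster-network.sh
```

Replicating this per-unit hook inside a container-managed deployment is where 
the fiddliness comes in.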

Now none of this is show-stopping, and I am most definitely not saying "don't ship 
containers". But I think there is added complexity to your deployment from going the 
containers route, and that is not simply a "learn how to use containers" learning curve. 
I do think it is reasonable for an admin to want to reduce the complexity of what they're dealing 
with - after all, much of my job is trying to automate or simplify the management of complex 
systems!

I can see from a software maintainer's point of view that just building one container and shipping 
it everywhere is easier than building packages for a number of different distributions (one of my 
other hats is a Debian developer, and I have a bunch of machinery for doing this sort of thing). 
But it would be a bit unfortunate if the general thrust of "let's make Ceph easier to set up 
and manage" was somewhat derailed with "you must use containers, even if they make your 
life harder".

I'm not going to criticise anyone who decides to use a container-based 
deployment (and I'm sure there are plenty of setups where it's an obvious win), 
but if I were advising someone who wanted to set up and use a 'boring' Ceph 
cluster for the medium term, I'd still advise on using packages. I don't think 
this makes me a luddite :)

Regards, and apologies for the wall of text,

Matthew

[0] I think that's a fair summary!
[1] This hasn't always been true...
[2] Simple (sic.) Matter of Programming




--
Oliver Freyermuth
Universität Bonn
Physikalisches Institut, Raum 1.047
Nußallee 12
53115 Bonn
--
Tel.: +49 228 73 2367
Fax:  +49 228 73 7869
--


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
