Re: [systemd-devel] Smooth upgrades for socket activated services
Ah, to clarify, I'm talking about app-specific servers, not Linux system services, so dbus isn't really relevant (what would it be used for?). I mean the sort of programs that tend to be packaged with Docker today, or deployed using AWS Lambda, or just copied up to the server. For example, a typical business-specific Ruby on Rails or Spring Boot app. Such programs don't have much use for dbus, will have complex but short-lived per-request state and will often be written on other platforms, only deployed to Linux. You don't want to just cut a connection whilst it's live, because that'd break things like file downloads and users would see 500 errors. At the same time, trying to serialize the full state of the connection to a buffer is impractical because the app is highly complex and changing regularly (e.g. daily).
Re: [systemd-devel] Smooth upgrades for socket activated services
On Fri, 03.03.23 10:16, Mike Hearn (mike@hydraulic.software) wrote:

> Sorry, by "apps" I meant anything not supplied by OS developers. In
> this context, servers e.g. custom web app servers. I do currently run
> some of those with DynamicUser=1 and similar.
>
> > As long as the tool updating the disk image creates the new one under
> > a temporary name, and then replaces the old one with it via renaming,
> > upgrading portable services is as easy as restarting them
>
> Great.
>
> > > > But of course such an approach requires that services are written in a
> > > > way this is possible
> > >
> > > Right. I think that'd be quite hard to do especially with servers
> > > written in portable languages that don't expose stuff unavailable on
> > > Windows e.g. the JVM.
> >
> > Why would that be? portable services are just regular services that
> > happen to come with their own disk images, that's all.
>
> Sorry I meant the serialization and transmission of FDs to the fd
> store to support user-transparent restart. For example the Java API
> has no way to send fds over a UNIX domain socket because Windows
> doesn't support that, so you need third party libraries. And then it
> would appear to turn into a general problem of serializing the entire
> state of the app which is quite hard. Easier to assume that one
> connection should stick with one server version for the lifetime of
> that connection and then just phase in new servers as new connections
> roll in.

Right, writing system services in Java is indeed a headache I am sure. No ready notifications, no socket activation, no fdstore, no signals, no dbus, no watchdog logic, … it's a race to the bottom if you never want to make use of the *good* stuff. But then you shouldn't be surprised if you can't do certain things...

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] Smooth upgrades for socket activated services
On Fri, 3 Mar 2023 at 09:17, Mike Hearn wrote:

> > > > But of course such an approach requires that services are written in a
> > > > way this is possible
> > >
> > > Right. I think that'd be quite hard to do especially with servers
> > > written in portable languages that don't expose stuff unavailable on
> > > Windows e.g. the JVM.
> >
> > Why would that be? portable services are just regular services that
> > happen to come with their own disk images, that's all.
>
> Sorry I meant the serialization and transmission of FDs to the fd
> store to support user-transparent restart. For example the Java API
> has no way to send fds over a UNIX domain socket because Windows
> doesn't support that, so you need third party libraries. And then it
> would appear to turn into a general problem of serializing the entire
> state of the app which is quite hard. Easier to assume that one
> connection should stick with one server version for the lifetime of
> that connection and then just phase in new servers as new connections
> roll in.

It only sounds easier, because it postpones the difficult part for later. It requires every service to behave perfectly well and according to the specification, and delegates process management down to them. Except services cannot be relied upon, and will get it wrong, and that will cause multiple versions of the same service to exist at the same time and conflict with each other, and require manual intervention to fix. On a "pet" machine (i.e. your laptop) it's fixable busywork; on a system with tens of thousands of headless nodes, not so much. It is not a reliable and trustworthy pattern.

The advantage of moving state across via FD is not only speed and memory (double the number of services means double the memory/CPU consumed and double the hard memory cap needed on the system), but mainly reliability, by not having to delegate process management to clients. I.e.: when systemd tells you to stop, you stop, end of story.
Re: [systemd-devel] Smooth upgrades for socket activated services
Sorry, by "apps" I meant anything not supplied by OS developers. In this context, servers e.g. custom web app servers. I do currently run some of those with DynamicUser=1 and similar.

> As long as the tool updating the disk image creates the new one under
> a temporary name, and then replaces the old one with it via renaming,
> upgrading portable services is as easy as restarting them

Great.

> > > But of course such an approach requires that services are written in a
> > > way this is possible
> >
> > Right. I think that'd be quite hard to do especially with servers
> > written in portable languages that don't expose stuff unavailable on
> > Windows e.g. the JVM.
>
> Why would that be? portable services are just regular services that
> happen to come with their own disk images, that's all.

Sorry I meant the serialization and transmission of FDs to the fd store to support user-transparent restart. For example the Java API has no way to send fds over a UNIX domain socket because Windows doesn't support that, so you need third party libraries. And then it would appear to turn into a general problem of serializing the entire state of the app which is quite hard. Easier to assume that one connection should stick with one server version for the lifetime of that connection and then just phase in new servers as new connections roll in.
Re: [systemd-devel] Smooth upgrades for socket activated services
On Thu, 02.03.23 23:05, Mike Hearn (mike@hydraulic.software) wrote:

> > There's currently no mechanism for that. File an RFE issue.
>
> https://github.com/systemd/systemd/issues/26647
>
> > In the "Portable Services" concept we currently assume you update the
> > disk image ("DDI") the service is on, and then simply restart the
> > service while leaving the socket around.
>
> I've always wanted to understand portable services better. I never
> quite grokked if portable services were meant for apps or operating
> system level stuff, or if it didn't matter.

Not sure what you mean by "apps"? desktop apps? They are conceptually suitable for that, but not realistically, since we currently require privs to mount disk images, and thus the whole concept is simply not available for unpriv code. So the focus is system-level services or system-level "apps", i.e. stuff that might or might not have privs, stuff that could use DynamicUser=1 (though this is not a requirement) and similar.

> It also wasn't quite clear
> to me how upgrades worked for them either - presumably if you stick
> them inside a deb or rpm you have the same problem, or if you rsync up
> a new image, etc. It'd be great to have some blog posts that tackle
> portable services end-to-end from the perspective of running
> servers.

As long as the tool updating the disk image creates the new one under a temporary name, and then replaces the old one with it via renaming, upgrading portable services is as easy as restarting them (well, unless you make changes to the service definitions, in that case you need to issue "portablectl reattach"). If tools update files like that then the old version of the portable services can use the old image as long as it wants, and only once the last reference to it is dropped it disappears from memory on disk. At the same time the new invocation will only use the new disk image.

> > But of course such an approach requires that services are written in a
> > way this is possible
>
> Right. I think that'd be quite hard to do especially with servers
> written in portable languages that don't expose stuff unavailable on
> Windows e.g. the JVM.

Why would that be? portable services are just regular services that happen to come with their own disk images, that's all.

Lennart

--
Lennart Poettering, Berlin
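The "temporary name, then rename" update described above can be sketched as follows. This is a minimal illustration, not taken from the thread: the helper name `replace_atomically` and the file name used in the usage note are hypothetical, and a real image updater would stream data rather than hold it in memory.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write `data` under a temporary name, then rename it over `path`.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created in the same directory so that the
    # final rename stays on one filesystem, which is what makes it atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the new content is on disk first
        os.replace(tmp, path)     # readers see either the old or the new file
    except BaseException:
        os.unlink(tmp)
        raise
```

A process that opened the old file before the rename keeps reading the old content through its existing descriptor, while anything opening `path` afterwards gets the new content. That matches the behaviour described in the message: the running service keeps the old image, the next invocation sees the new one.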
Re: [systemd-devel] Smooth upgrades for socket activated services
> There's currently no mechanism for that. File an RFE issue.

https://github.com/systemd/systemd/issues/26647

> In the "Portable Services" concept we currently assume you update the
> disk image ("DDI") the service is on, and then simply restart the
> service while leaving the socket around.

I've always wanted to understand portable services better. I never quite grokked if portable services were meant for apps or operating system level stuff, or if it didn't matter. It also wasn't quite clear to me how upgrades worked for them either - presumably if you stick them inside a deb or rpm you have the same problem, or if you rsync up a new image, etc. It'd be great to have some blog posts that tackle portable services end-to-end from the perspective of running servers.

> But of course such an approach requires that services are written in a
> way this is possible

Right. I think that'd be quite hard to do especially with servers written in portable languages that don't expose stuff unavailable on Windows e.g. the JVM. Also, from the perspective of a packaging tool/docker alternative, asking users to add major new features to their servers is a non-starter. You don't need to do that stuff with containers.
Re: [systemd-devel] Smooth upgrades for socket activated services
On Mon, 20.02.23 11:05, Mike Hearn (mike@hydraulic.software) wrote:

> Hi,
>
> I'm exploring socket activation as part of work on a tool that makes
> systemd-controlled servers easier to deploy and use. Given a config
> file the tool builds a package that contains the app and systemd
> units, uploads it, installs it with dependency resolution, the
> postinst scripts start the service etc. It's sort of a Docker
> alternative that's more classically Linux-y, designed for a world
> where really big machines are really cheap and thus many apps don't
> need to be cattle-ized. Pets are sometimes OK.
>
> As part of this I'm looking at how to make upgrades smooth. Socket
> activation already allows you to shut down, upgrade and restart a
> service without dropping connections because systemd will hold the
> connections until the service comes back but there are a couple of
> aspects that weren't really clear to me from reading the excellent
> "pid eins" blog post series. Could we maybe get a new blog post
> exploring these issues?
>
> 1. How exactly should you stop a service that's socket activated so it
> won't be re-activated during the upgrade but new connections won't be
> lost, e.g. in package scripts that are executed across upgrades.
> Currently the scripts stop the service before the upgrade happens,
> then restart afterwards.

There's currently no mechanism for that. File an RFE issue.

In the "Portable Services" concept we currently assume you update the disk image ("DDI") the service is on, and then simply restart the service while leaving the socket around. I can see though that if you operate without disk images, then you might want an explicit synchronization step.

Currently we implement a "freeze" concept for services (which uses the cgroup freezer underneath), maybe we should extend this for socket units to mean that we keep the sockets open but don't act anymore. You'd then issue "systemctl freeze foobar.socket" before you do your upgrade and "systemctl thaw" afterwards.

> 2. Is it possible to run two versions of a service unit at once such
> that the old version finishes handling connections and then shuts
> down, whilst new connections are being handled by the new version?

Currently not. We have been discussing this scenario many times, and we could certainly add something for this, but this kinda conflicts with the goal to provide a pristine execution context for services: if we'd restart a service like this and leave old processes around then the cgroup of the service would of course still contain "legacy" processes, which contradicts the rule that we always start with a pristine execution environment. So, there are two conflicting goals: the goal of guaranteeing clean invocation and the goal of allowing old stuff to "passivate".

Inside of Microsoft we mostly settled on a different approach: instead of leaving processes around during such restarts, let's instead serialize all state of ongoing connections and upload their sockets to the fdstore (i.e. see FileDescriptorStore= docs), along with a memfd of the serialized state. Benefit of this approach: you solve the problem properly and fully: after the restart only new code is in place, and all old code is flushed out.

But of course such an approach requires that services are written in a way this is possible, i.e. are capable of serializing their full state for all ongoing connections along with the socket fds to the fdstore, and then deserialize all that when initializing again. This is not hard but also not exactly trivial.

Lennart

--
Lennart Poettering, Berlin
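For readers unfamiliar with the fdstore, the upload step mentioned above might look roughly like this from a service's point of view. It is a sketch under assumptions, not code from the thread: the helper name `store_fds` and the `notify_path` parameter (added only so the function can be exercised without systemd) are hypothetical. The wire format it relies on is the documented sd_notify one: a datagram sent to the socket named in `$NOTIFY_SOCKET`, carrying `FDSTORE=1` and optionally `FDNAME=...`, with the descriptors attached as ancillary data. The unit also needs `FileDescriptorStoreMax=` set to a non-zero value for systemd to keep the descriptors.

```python
import os
import socket


def store_fds(name, fds, notify_path=None):
    """Ask the service manager to keep `fds` across a restart.

    Hypothetical helper; returns False when no notify socket is configured.
    Requires Python 3.9+ for socket.send_fds().
    """
    path = notify_path or os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by a service manager
    if path.startswith("@"):
        path = "\0" + path[1:]  # "@" marks an abstract-namespace socket
    message = "FDSTORE=1\nFDNAME=" + name
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        # The descriptors travel as SCM_RIGHTS ancillary data next to the text.
        socket.send_fds(sock, [message.encode()], list(fds))
    return True
```

A service could call this once per live connection just before exiting, passing the connection's socket together with a second descriptor (for example a memfd) that holds the serialized per-connection state. Reading them back after the restart and deserializing the state is the application-specific part this sketch does not cover.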
Re: [systemd-devel] Smooth upgrades for socket activated services
Hello Mike.

On Mon, Feb 20, 2023 at 11:05:41AM +0100, Mike Hearn wrote:

> 2. Is it possible to run two versions of a service unit at once such
> that the old version finishes handling connections and then shuts
> down, whilst new connections are being handled by the new version?

This is a recurring topic, tracked in [1]. I hope to make some progress there soon.

Feel free to add your ideas there,
Michal

[1] https://github.com/systemd/systemd/issues/10228
Re: [systemd-devel] Smooth upgrades for socket activated services
On Mon, 2023-02-20 at 12:22 +0100, Mike Hearn wrote:

> I see. So basically you have to keep the service running across the
> upgrade and then wait for it to shut down due to inactivity, then be
> restarted by systemd to make the update apply. Or alternatively you
> could make the app detect that it's been updated, stop accepting new
> connections, finish servicing the old connections, and then shut
> itself down once all existing connections are finished. On restart
> it'd then be using the new code, re-accept the socket from systemd
> and start accepting again.

Instead of "detect that it's been updated", I believe a more common and recommendable approach would be to make it part of the daemon's normal clean shutdown (for daemons where this behavior is appropriate). That is, stop accepting new connections from the listening socket, but finish serving already accepted connections. Then the "restart" part alone is enough to switch to a new version without losing connections (at least if things don't take so long that connections time out).
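The clean-shutdown behaviour described here (stop accepting, finish what is in flight, then exit) can be sketched as below. The function names, the thread-per-connection model and the 0.1 s polling interval are assumptions made for illustration; a real daemon would pick whatever concurrency model it already uses.

```python
import signal
import socket
import threading


def serve_until_stopped(listener, stop, handle):
    """Accept connections until `stop` is set, then drain in-flight handlers."""
    listener.settimeout(0.1)  # wake up regularly to check the stop flag
    workers = []
    while not stop.is_set():
        try:
            conn, _ = listener.accept()
        except socket.timeout:
            continue
        worker = threading.Thread(target=handle, args=(conn,))
        worker.start()
        workers.append(worker)
    # No more accept() calls from here on. The listening socket is left
    # open: under socket activation systemd holds its own reference, so
    # clients arriving now wait in the queue for the next service instance.
    for worker in workers:
        worker.join()


def install_sigterm_handler(stop):
    """Turn SIGTERM (what 'systemctl stop/restart' sends by default) into a drain."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop.set())
```

The drain has to finish within the unit's stop timeout, otherwise systemd escalates to SIGKILL, which is the "if things don't take so long" caveat above seen from the service manager's side.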
Re: [systemd-devel] Smooth upgrades for socket activated services
I see. So basically you have to keep the service running across the upgrade and then wait for it to shut down due to inactivity, then be restarted by systemd to make the update apply. Or alternatively you could make the app detect that it's been updated, stop accepting new connections, finish servicing the old connections, and then shut itself down once all existing connections are finished. On restart it'd then be using the new code, re-accept the socket from systemd and start accepting again. I guess this can work for quiet services that are safe to change on disk because they open everything at startup and never close or re-open the fds, or if there's a snapshotting layer on top.
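The "re-accept the socket from systemd" step relies on the socket-activation environment protocol: systemd sets `LISTEN_PID` and `LISTEN_FDS` (and, when names are known, `LISTEN_FDNAMES`, colon-separated), and the inherited descriptors start at number 3. A minimal sketch of the receiving side, with a hypothetical helper name and with the `environ`/`pid` parameters added only to make it testable:

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited descriptor, as in sd_listen_fds(3)


def inherited_fds(environ=None, pid=None):
    """Return [(fd, name_or_None), ...] for descriptors passed by systemd."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The variables are only meant for the process named in LISTEN_PID.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    raw_names = environ.get("LISTEN_FDNAMES")
    names = raw_names.split(":") if raw_names else []
    if len(names) != count:
        names = [None] * count  # names missing or inconsistent: ignore them
    fds = range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count)
    return list(zip(fds, names))
```

Each returned number can be wrapped with `socket.socket(fileno=fd)` and used for `accept()` directly, since the socket is already bound and listening.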
Re: [systemd-devel] Smooth upgrades for socket activated services
On Mon, 20 Feb 2023 at 11:06, Mike Hearn wrote:

>
> Hi,
>
> I'm exploring socket activation as part of work on a tool that makes
> systemd-controlled servers easier to deploy and use. Given a config
> file the tool builds a package that contains the app and systemd
> units, uploads it, installs it with dependency resolution, the
> postinst scripts start the service etc. It's sort of a Docker
> alternative that's more classically Linux-y, designed for a world
> where really big machines are really cheap and thus many apps don't
> need to be cattle-ized. Pets are sometimes OK.
>
> As part of this I'm looking at how to make upgrades smooth. Socket
> activation already allows you to shut down, upgrade and restart a
> service without dropping connections because systemd will hold the
> connections until the service comes back but there are a couple of
> aspects that weren't really clear to me from reading the excellent
> "pid eins" blog post series. Could we maybe get a new blog post
> exploring these issues?
>
> 1. How exactly should you stop a service that's socket activated so it
> won't be re-activated during the upgrade but new connections won't be
> lost, e.g. in package scripts that are executed across upgrades.
> Currently the scripts stop the service before the upgrade happens,
> then restart afterwards.

Currently, there is no way to "freeze" the execution of a socket activated service. A feature I'm missing as well, fwiw.
[systemd-devel] Smooth upgrades for socket activated services
Hi,

I'm exploring socket activation as part of work on a tool that makes systemd-controlled servers easier to deploy and use. Given a config file the tool builds a package that contains the app and systemd units, uploads it, installs it with dependency resolution, the postinst scripts start the service etc. It's sort of a Docker alternative that's more classically Linux-y, designed for a world where really big machines are really cheap and thus many apps don't need to be cattle-ized. Pets are sometimes OK.

As part of this I'm looking at how to make upgrades smooth. Socket activation already allows you to shut down, upgrade and restart a service without dropping connections because systemd will hold the connections until the service comes back but there are a couple of aspects that weren't really clear to me from reading the excellent "pid eins" blog post series. Could we maybe get a new blog post exploring these issues?

1. How exactly should you stop a service that's socket activated so it won't be re-activated during the upgrade but new connections won't be lost, e.g. in package scripts that are executed across upgrades. Currently the scripts stop the service before the upgrade happens, then restart afterwards.

2. Is it possible to run two versions of a service unit at once such that the old version finishes handling connections and then shuts down, whilst new connections are being handled by the new version? I feel intuitively that this should be possible for services like ssh, but you'd need it for anything that serves downloads. Obviously services would have to opt in to this, as they'd have to be able to handle two versions running at once in terms of shared state/config/caches etc, but for servers that can handle this it would make upgrades entirely transparent.

thanks,
-mike
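The reason a socket-activated service can be down for a moment without dropping clients is that the listening socket stays open in another process (systemd), so the kernel keeps queueing incoming connections and their data until someone calls accept() again. The effect can be reproduced in a single process; in this sketch the function name, the socket path and the payload are made up, and the listener merely stands in for the socket systemd would hold.

```python
import os
import socket
import tempfile


def demo_queued_connection():
    """Connect and send while nobody accepts, then accept and answer."""
    path = os.path.join(tempfile.mkdtemp(), "demo.sock")
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(16)  # plays the role of the socket unit

    # "Service is down": no accept() is running, yet connect() and send succeed.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    client.sendall(b"GET /")

    # "Service is back": the queued connection and its data are still there.
    conn, _ = listener.accept()
    request = conn.recv(64)
    conn.sendall(b"served: " + request)
    reply = client.recv(64)

    for sock in (conn, client, listener):
        sock.close()
    return reply
```

The queue is bounded by the listen backlog and by client-side timeouts, so this covers a short restart rather than an arbitrarily long upgrade window.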