Good morning Mark,

> Hi,
>
> raid5atemyhomework raid5atemyhomew...@protonmail.com writes:
>
> > GNU Shepherd is the `init` system used by GNU Guix. It features:
> >
> > - A rich full Scheme language to describe actions.
> > - A simple core that is easy to maintain.
> >
> > However, in this critique, I contend that these features are bugs.
> > The Shepherd language for describing actions on Shepherd daemons is a
> > Turing-complete Guile language. Turing completeness runs afoul of the
> > Principle of Least Power. In principle, all that actions have to do
> > is invoke `exec`, `fork`, `kill`, and `waitpid` syscalls.
>
> These 4 calls are already enough to run "sleep 100000000000" and wait
> for it to finish, or to rebuild your Guix system with an extra patch
> added to glibc.

I agree. But this mechanism is intended to prevent stupid mistakes like the one I committed, not to protect against an attacker who is capable of invoking `guix system reconfigure` on arbitrary Scheme code (and who can easily wrap anything nefarious in an `unsafe-turing-complete` or `without-static-analysis` escape mechanism). Seatbelts, not steel walls.

>
> > Yet the language is a full Turing-complete language, including the
> > major weakness of Turing-completeness: the inability to solve the
> > halting problem.
> > The fact that the halting problem is unsolved in the language means it
> > is possible to trivially write an infinite loop in the language. In
> > the context of an `init` system, the possibility of an infinite loop
> > is dangerous, as it means the system may never complete bootup.
>
> Limiting ourselves to strictly total functions wouldn't help much here,
> because for all practical purposes, computing 10^100 digits of Pi is
> just as bad as an infinite loop.

Indeed. Again, seatbelts, not steel walls. It's fairly difficult to accidentally write a program that computes 10^100 digits of pi. It's not so difficult to have a brain fart and use `(- count 1)` instead of `(+ count 1)` because you were idly wondering whether an increment or a decrement loop would be more Schemey, or whether both are equally Schemey.

What I propose would protect against the latter, much more likely, mistake: the recursive loop would be flagged, because the recursive call is a call to a function that is not on the whitelist. Hopefully getting recursive loops flagged would make the sysad writing `configuration.scm` look for the "proper" way to wait for an event to become true, and hopefully lead them to the (hopefully extant) documentation on whatever domain-specific language we have for waiting on events, instead of rolling their own.
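
A minimal sketch of the whitelist check described above, assuming action bodies are available as plain S-expressions. The whitelist contents, the name `flagged-calls`, and the `wait-loop` example are hypothetical and only illustrate the idea; nothing here is existing Shepherd code, and special forms such as `let` would need extra handling in a real checker.

```scheme
(use-modules (srfi srfi-1))               ; for append-map

;; Hypothetical whitelist: the only operators an action body may use.
(define %whitelist '(if begin + - = exec fork kill waitpid))

(define (flagged-calls form)
  "Return the operators called in FORM that are not in %whitelist."
  (cond
   ((not (pair? form)) '())               ; atoms contain no calls
   ((eq? (car form) 'quote) '())          ; quoted data is not code
   (else
    (let ((head (car form))
          (rest (append-map flagged-calls (cdr form))))
      (if (and (symbol? head) (not (memq head %whitelist)))
          (cons head rest)                ; unknown operator: flag it
          (append (flagged-calls head) rest))))))

;; Body of a hand-rolled recursive wait loop (hypothetical example).
(define wait-body
  '(if (= count 0)
       (exec "daemon")
       (wait-loop (- count 1))))

(display (flagged-calls wait-body))       ; prints (wait-loop)
(newline)
```

The recursive call is reported simply because `wait-loop` is not on the whitelist; the check never has to reason about whether the loop terminates.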

> That said, I certainly agree that Shepherd could use improvement, and
> I'm glad that you've started this discussion.
>
> At a glance, your idea of having Shepherd do more within subprocesses
> looks promising to me, although this is not my area of expertise.

An issue here is that we sometimes pass data across Shepherd actions using environment variables, and changes to those do not cross process boundaries. See the `set-http-proxy` action of `guix-daemon`: the environment variable is used as a global namespace that is accessible from both the `set-http-proxy` and `start` actions.

On the other hand, arguably the environment variable table is a global resource shared amongst multiple shepherd daemons. This technique in general may not scale well to large numbers of daemons; environment variable name conflicts may cause subtle problems later.

I think it would be better if, in addition to the "value" (typically the PID), each Shepherd service also had a `settings` field that can be read and modified by each action. It could contain anything that satisfies `(lambda (x) (equal? x (read (print x))))`, so that it can be easily serialized across each subprocess launched by each action; by convention it could be an association list. The `set-http-proxy` action would then update this `settings` field for the shepherd service and queue up a `restart` action. This would also persist the `http_proxy` setting, BTW: currently if you `herd set-http-proxy guix-daemon <whatever>` and then `herd restart guix-daemon` later, the HTTP proxy is lost (since the environment variable is cleared after `set-http-proxy` restarts the `guix-daemon`).

In short, this `set-http-proxy` example looks like a fairly brittle hack anyway, and is maybe worth avoiding as a pattern.

Then there are actions that invoke other actions. From a cursory glance at the Guix code, it looks like only Ganeti and Guix-Daemon have actions that invoke actions, and they only invoke actions on their own Shepherd services.
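
Going back to the `settings` idea for a moment, here is a rough sketch of the round-trip condition and of an alist-based settings update. It assumes Guile; the `settings` field does not exist in Shepherd today, and `serializable?`, `settings-set`, and the proxy URL are hypothetical names and values chosen for illustration.

```scheme
(use-modules (srfi srfi-1))               ; for alist-delete

(define (serializable? x)
  "True when X survives a write/read round trip unchanged."
  ;; A read error (e.g. from an unreadable printed form) counts as
  ;; "not serializable" instead of propagating.
  (false-if-exception
   (equal? x
           (call-with-input-string
            (call-with-output-string (lambda (port) (write x port)))
            read))))

(define (settings-set settings key value)
  "Return a new alist with KEY bound to VALUE, dropping any older binding."
  (acons key value (alist-delete key settings)))

;; What a `set-http-proxy`-style action might store (hypothetical value).
(define settings
  (settings-set '() "http_proxy" "http://localhost:8118"))

(display (serializable? settings))        ; prints #t
(newline)
(display (assoc-ref settings "http_proxy")) ; prints http://localhost:8118
(newline)
```

Because the alist is plain data, it can be written to a subprocess and read back, and it stays attached to the service across a later `restart`.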

It seems to me safe for an action invoked from another action of the same service to *not* spawn a new process, but to execute in the same process. I'm not sure how safe it would be to allow one shepherd service to invoke an action on another shepherd service. Then again, the `start` action of any service may cause other services it requires to be started as well, so we still need to figure out which subprocesses to launch or not launch.

Or maybe each Shepherd service has its own subprocess running its own mainloop, and the "main" Shepherd process mainloop "just" serves as a switching center that forwards commands to each service's mainloop-subprocess. It would also incidentally monitor per-service mainloop-subprocesses that are not responding fast enough (and possibly decide to kill those mainloops and all their children, then disable the service). This would make each service's environment variables a persistent but local store that is specific to that service, which makes their use in `guix-daemon` safe; `set-http-proxy` would simply not clear the env vars, so the setting persists.

This allows Shepherd to remain responsive at all times, even if some action of some Shepherd service enters an infinite loop or a 10^100-digits-of-pi condition. It could even have `herd status` report the number of pending unhandled commands for each service, to inform the sysad about possible problems with specific services.

Thanks
raid5atemyhomework