August 24, 2023 3:57 PM, "Martin Baulig" <mar...@baulig.is> wrote:
> Hello,
>
> About 2–3 months ago, I got an initial prototype of Bacula working on GNU
> Guix. I had the Bacula Director, two separate Storage Daemons and the
> Baculum web interface running in a GNU Guix VM on my Synology NAS.

I had to look it up... Apparently Bacula is a way to back up computers on a
network. Sounds cool!

https://en.wikipedia.org/wiki/Bacula

> At some point, I would really love to upstream these changes, but it's
> quite a complex configuration - and I also had to do quite a few
> refactorings and clean-ups for this to pass my personal quality standards.
>
> One issue I had to deal with is that Bacula heavily relies upon clear-text
> passwords in its various configuration files. To communicate between its
> different components, it uses TLS with Client Certificates in addition to
> passwords. So in addition to writing clear-text passwords into various
> configuration files, the X509 private keys, DH parameters, etc. also need
> to be installed into appropriate directories.
>
> I came up with quite an elegant solution for this problem - and introduced
> three new services and an extension.
>
> * My "guix secrets" tool provides a command-line interface to maintain a
> "secrets database" (/etc/guix/secrets.db) that's only accessible to root.
> It can contain simple passwords, arbitrary text (like for instance X509
> certificates in PEM format) and binary data.

I know Guix has been wanting to figure out how to have services that need
passwords in the configuration file. This sounds like it could work!

> * The problem with the standard activation service is that it runs early
> in the boot process and all activation actions are run in a seemingly
> random order; there isn't a way to declare any real dependencies. Any
> failures could possibly prevent the system from fully booting up.
>
> I created a new "activation-tree-service-type" - currently experimental
> and a bit in a refactoring stage.
> It creates a separate one-shot Shepherd service for each activation
> action, and you can declare dependencies between them.
>
> Since it's using normal Shepherd services under the hood, you could for
> instance depend on user-homes and the network being up, so you could SSH
> in and use GNU Emacs to fix any issues.
>
> And any arbitrary Shepherd service could also depend on some of these
> actions - such as for instance the various Bacula services.
>
> * Then I created "service-accounts-service-type" that extends the standard
> account creation with the ability to also create home directories, run and
> PID directories and the log-file. It's mostly used under the hood.
>
> * Finally, "secrets-service-type" depends on all of the above to do its
> work.
>
> It takes a template file - which is typically interned in the store -
> containing special "tokens" that tell it which keys to look up from the
> secrets database.
>
> It uses the above-mentioned service-accounts-service-type to specify where
> the substituted configuration file should be installed, ensuring that the
> directory has been set up with appropriate permissions.
>
> And then it substitutes the special tokens from the template file with the
> actual secrets. For instance "@password:foo@" would be substituted with a
> password entry called "foo". For arbitrary text or binary data, the
> template would contain something like "@blob:data@" - this will be
> substituted with the full path name of a file where the actual data will
> be written to.
>
> * * * *
>
> All of the above was mostly working in early August; just one problem
> remained:
>
> I do not want to store any of the actual data inside the VM, but rather
> use a folder on the NAS itself. Even the PostgreSQL database lives on an
> NFS-mounted volume. The problem is quite simply that Synology's Virtual
> Machine Manager software does not provide any way of exporting or
> importing volumes.
> You cannot even move them between VMs. And I really don't want to tie my
> data to the lifecycle of the VM.
>
> Using traditional NFS (either version 2 or 3) worked perfectly fine and
> since this is a very locked-down environment, encrypting the NFS traffic
> really isn't needed. Like, an attacker that got access to either the NAS
> or the VM running inside it would already have all the data anyway.
>
> However, I wanted to give it a try regardless and see whether I could get
> SSSD working with GNU Guix.
>
> And this is where the nightmares began!
>
> Firstly, I had to make a few changes to GNU Guix itself, most of which I'd
> like to upstream. The code is in my public GitLab repo, but it's a bit of
> a mess right now, and I'll need at least a day or two to clean it up. But
> I also ran across a couple of questions and issues.
>
> * GNU Guix is currently using nfs-utils 2.4.3, whereas 2.6.3 is currently
> the latest version. We don't need to upgrade, but I would like to backport
> one change, affecting a single function. This is needed for idmap-daemon
> to work with arbitrary plugins.
>
> Back in nfs-utils 2.4.3, the plugin search path was hard-coded - and since
> that hard-coded path will be inside the store, other packages can't add
> anything to it.
>
> In later versions, this was changed to attempt to load the plugin from the
> library search path first, prior to falling back to the hard-coded
> default.
>
> * Once nfs-utils is patched, rpc.idmapd then needs to be started with
> LD_LIBRARY_PATH set to the plugin directories - similar to how it's done
> with nscd.
>
> I added a few new fields to idmap-service-type and nfs-service-type for
> this.
>
> It also looks like you can't instantiate idmap-service-type without
> nfs-service-type due to what seems to be a bug.
>
> It's currently using
>
>     (extend (lambda (config values) (first values)))
>
> which fails if there isn't any previous value.
> Replacing that with last-extension-or-cfg (from "(gnu home services xdg)")
> fixes that issue.
>
> * For the sssd package, this is currently built without nfsidmap support
> and has its sysconfdir set to /etc.
>
> Was there a particular reason for this? I suppose nfsidmap support was
> disabled because it previously did not work?
>
> As for its sysconfdir - there isn't really anything confidential in the
> sssd.conf file, so I would rather have that interned in the store if
> possible. This requires a little patch to sssd, though, to disable its
> permission checks on the config file.
>
> * For the realmd package - it currently does not compile on GNU Guix
> master. All that's needed is a small fix to the configure script. GNU Guix
> master uses a newer version of glibc - there is no "__res_querydomain" in
> -lresolv anymore; that's now called "res_querydomain" and is in glibc.
>
> * To make realmd actually work, it needs a configuration file.
>
> Could we possibly either move it from (gnu packages admin) into
> (gnu packages sssd), or add a "realmd-sssd" package with a standard
> configuration file? A very simple config file will work fine, but it needs
> to contain the store paths of adcli, sssd and sss_cache.
>
> These are the parts that I got working so far. You can join the domain,
> acquire Kerberos tickets, mount the network share - and access is handled
> by the server according to the current user's Kerberos credentials. You
> also don't need to copy around any keytabs or anything for that, as would
> be required with Samba. This is just really cool.
>
> However, here's where the problems start:
>
> * I couldn't figure out how to use gssproxy - setting that environment
> variable doesn't seem to be doing anything. I ran the various daemons with
> strace and nothing was ever attempting to use the proxy.
> Then, I looked at the mit-krb5 source code as well as the nfs-utils and
> gss-daemon source code and couldn't find any reference to that environment
> variable either.
>
> Is it possible that Fedora / Red Hat is using some custom patches in their
> distribution?
>
> * I finally worked around that by installing client keytabs for my service
> principals, using my secrets service.
>
> Works great for local accounts, but using domain accounts gave me quite a
> bit of a headache!
>
> Let's say "storage" is a domain account. I can do "getent passwd storage"
> and it works. I can do "chown storage foo" on a local file system as root
> and then "ls -l storage" shows me the correct owner.
>
> On the mounted network share, root is mapped to the machine credential, so
> I have to create and chown things on the server. After a bit of starting /
> restarting nscd, sssd and gss-daemon, file permissions will also show up
> correctly in "ls -l".
>
> I can also do "su storage" as root and that works (after I create the home
> directory); "su -s /bin/sh storage -c id" works fine.
>
> * In Guile, I can also do (getent "storage") and that works.
>
> However, it fails when I put that inside a G-Exp - to run it as part of a
> one-shot Shepherd service. I can open a pipe to
> "su -s /bin/sh storage -c /gnu/store/...-coreutils-../bin/id" and that
> works.
>
> One would assume that (getent) won't work inside a G-Exp because it
> doesn't have access to NSCD / SSSD.
>
> But why can I (invoke) "su" inside that same G-Exp and it works fine?
>
> My gut feeling tells me that this "su pipe" thing might not be the most
> reliable thing to depend on.
>
> The reason I need the domain account's UID is to put the Kerberos client
> keytab into "/var/krb5/user/<UID>/client.keytab". Maybe there's a way to
> use the username instead? I ran an "strace" on the gss-daemon and it
> currently only looks in that <UID> directory.
>
> * PostgreSQL - ... yeah, here it is getting interesting!
>
> The first question here is which user account to use - and whether to
> create a local or domain account.
>
> It seems like using a local "postgres" account might be the most robust
> thing to do. Any access to the mounted network share will be mapped to
> whichever Kerberos principal I place in the "client.keytab".
>
> Either way, the local "root" user will not have any access to the data
> directory - and the local "postgres" user will only have access to it once
> SSSD is up and running and it's mounted.
>
> I have an "activation-tree-service-type" action to mount the share once
> SSSD is ready and that seems to be working fine on system boot.
>
> However, for PostgreSQL, I'd probably have to provide my own service that
> uses the same activation logic - not create the data directory at all,
> create the local state and pid directory and log-file once we have the
> user's UID (which is trivial for a local "postgres" account, but more
> complicated for domain accounts).
>
> * Finally, each of Bacula's service accounts then also needs client
> keytabs installed and started in the correct order.
>
> * * * *
>
> Here, I start to wonder whether it's even worth the hassle. To summarize,
> to use Kerberized NFSv4, all of the following is needed:
>
> * Some patches to GNU Guix (most of which can probably be upstreamed
>   regardless).
> * Complicated activation actions, to put client keytabs in the correct
>   places, with the correct permissions.
> * Strict, particular order in which services need to be started up on
>   system boot.
> * Manually creating directories on the server with the right owner and
>   permissions.
> * Manually running
>   "samba-tool domain exportkeytab --principal=<service-user>" for each
>   service user, copying them over and adding to "guix secrets".
> * There will be quite a few as I have set up Bacula with strict privilege
>   separation, even using different Storage Daemons for different backups,
>   each running as a distinct user account.
> * Custom PostgreSQL service.
>
> Whereas with just using unencrypted NFSv3, I can:
>
> * Use GNU Guix master as-is.
> * Have my activation-tree-service-type create all the service accounts,
>   their directories and everything with appropriate permissions.
> * Only run "guix secrets" locally, without the need to SSH into the server
>   and run stuff as root there.
> * Have a much simpler activation logic.
>
> Bacula is something that I would really like to get running and most of my
> work so far has been to make that happen in a clean and stable manner.
>
> However, I am strongly leaning towards declaring the entire SSSD endeavor
> a failed experiment and not pursuing it any further.
>
> In case there is any interest on your part, I'd gladly polish up my Guix
> changes and submit them as a series of patches. I was actually planning to
> have that done by the end of this week, but then SSSD took far more time
> than I had anticipated.
>
> Has anybody else had similar experiences, or what are your
> recommendations?
>
> I'm about to head out for a longer weekend, going on a bit of a road trip
> to visit some friends, so this is a great point for me to take a break and
> then come back fresh next week.
>
> Looking forward to hearing back from you and have a wonderful weekend,
>
> Martin Baulig

Congrats Martin! This whole email looks awesome!
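
For anyone following along, the token substitution described for
secrets-service-type could look roughly like the sketch below. It is written
in Python purely for illustration and is not the actual Guile code, which has
not been posted yet. The function name `render_template`, the in-memory dict
standing in for /etc/guix/secrets.db, the characters allowed in a key name and
the 0600 file mode are all assumptions; only the "@password:foo@" and
"@blob:data@" token shapes come from the description above.

```python
import re
from pathlib import Path
from typing import Dict, Union

# Matches the two token shapes quoted in the email: @password:NAME@ and
# @blob:NAME@. The characters allowed in NAME are an assumption.
TOKEN = re.compile(r"@(password|blob):([A-Za-z0-9_.-]+)@")

Secret = Union[str, bytes]


def render_template(template: str, secrets: Dict[str, Secret], blob_dir: Path) -> str:
    """Return `template` with every token replaced (hypothetical helper)."""

    def substitute(match: "re.Match[str]") -> str:
        kind, key = match.group(1), match.group(2)
        if key not in secrets:
            raise KeyError(f"no secret named {key!r}")
        value = secrets[key]
        if kind == "password":
            # Passwords are inlined directly into the configuration text.
            return value.decode() if isinstance(value, bytes) else value
        # Blobs are written to their own file; the token becomes that path.
        target = blob_dir / key
        target.write_bytes(value if isinstance(value, bytes) else value.encode())
        target.chmod(0o600)  # owner read/write only (assumed policy)
        return str(target)

    return TOKEN.sub(substitute, template)
```

Keeping blobs in separate files, as the email describes, means binary data
such as keys never has to be escaped into the configuration file's own
syntax; the config only ever sees a path.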