Hi list,

TL;DR: passwordless sudo is the same as making $USER equal to root at all times. Requiring a password is a royal PITA when running one command on many, many hosts. Scripting interactive password input sucks. Other methods are non-portable. Practical ideas?
Long version: I've been working on Judo[1], a less-sucky alternative to Ansible[2]. The problem it solves is basically that of "for host in ...; do scp script ${host}: && ssh ${host} ./script; done", except with some seatbelts and goodies (parallel execution, logging, host inventory, cleanup, a master connection for speed, etc).

[1]: https://github.com/rollcat/judo
[2]: https://www.ansible.com/

I have a working PoC, written in Go, which I've been using to manage my personal machines, and occasionally even for stuff I was actually paid to do. So Judo is already proving productive. But before I can finalize the design and set some things in stone, I feel that I need to solve one last outstanding design problem: privilege escalation at scale. I want managing 5 hosts to be as simple and practical as managing a fleet of 5000, while not compromising too much in other areas.

Currently, Judo assumes absolutely no interaction from its scripts on any host. On the remote side, stdin is closed as soon as possible, no PTY is allocated, etc. This is on purpose, by design, and won't change: interactivity is one of the more frustrating things about Ansible, and there's no sane way to handle it when talking to N hosts in parallel.

This doesn't mesh well with interactive sudo (or doas, or su, or whatever other interactive privilege escalation tool you'd use). The "get things done" solution is to use the 0-factor authentication variant with NOPASSWD, which IMHO is just a more complex and elaborate way of aliasing $USER to UID 0 in /etc/passwd, and will leave many sysops unhappy.

Ansible solves this with a script that interacts with sudo, su, doas, etc. This means the core product has a specific kludge in place for every possible PrivEsc method, and every new tool to support needs another kludge. (Probably why it's at 600k SLOC vs Judo's 1k.)
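For illustration, here is a minimal sketch of the naive loop quoted above, with the "no interaction" rules made explicit. It is not Judo's actual code: the function name run_on_hosts is hypothetical, it runs hosts one at a time rather than in parallel, and it assumes key-based ssh access is already set up. The only ssh options used are -T (do not allocate a PTY) and -n (redirect stdin from /dev/null).

```shell
# run_on_hosts SCRIPT HOST...
# Hypothetical helper: copy SCRIPT to each host and run it there with
# no PTY and no stdin, so nothing on the remote side can prompt us.
run_on_hosts() {
    script=$1
    shift
    failed=0
    for host in "$@"; do
        # -q: quiet copy; -T: no PTY; -n: stdin from /dev/null
        if scp -q "$script" "${host}:" &&
           ssh -T -n "$host" "./$(basename "$script")"; then
            echo "ok: $host"
        else
            echo "failed: $host" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every host succeeded.
    [ "$failed" -eq 0 ]
}
```

Usage would be something like "run_on_hosts ./setup.sh web1 web2 db1". Any sudo call inside setup.sh that wants a password has no terminal to ask on, which is exactly where the trouble starts.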
On the other hand, Judo is not even aware of sudo / doas / sup / su / etc, as privilege escalation is simply not considered in scope at this time - the problem is currently delegated to the script running on the target machine. Most of my scripts have this preamble:

    if [ "$(id -u)" != 0 ]; then exec sudo -n -- "$0" "$@"; fi

If we wanted to require some other authentication factor, say via PAM, that'd make matters even worse. Judo currently seems to work fine with whatever you throw at it: I've tested it with Debian, CentOS, OpenBSD, FreeBSD, RouterOS, on amd64, i386, arm... And it just works, because the core assumes very, very little about the remote machine. PAM sucks and doesn't even exist on OpenBSD, Slackware, many embedded systems, etc.

I was thinking about writing a tiny helper tool to run on the remote end, that would:

  1. read a one-time token from a file/fd,
  2. delete that file, and
  3. if ok, execute the given script with escalated privileges;

then integrate this with Judo, so that the controlling host would generate these one-time tokens and send them off to remote hosts. But the tool would have to be a statically-linked executable, one for each OS+release+arch combo. Sounds like another PITA to manage, especially since Judo assumes almost nothing about the target host (and again, I'd like to keep it that way).

Until a while ago, I had been telling myself this is not a real problem, and password-less sudo is fine, because if someone can put something like 'sudo() { ... }' in my ~/.profile, I'm toast anyway. Somehow I don't feel at ease with this; there must be a way to add an authentication factor to privilege escalation that doesn't suck.

And lastly, I don't want to assume that everyone's remote hosts will always allow unauthenticated sudo, because that's a silly assumption. I also don't want anyone to have to put an 'if not root then sudo myself' preamble in every script - boilerplate is evil.
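To make the one-time-token helper idea more concrete, here is a rough shell sketch of its three steps (read the token, delete the file, run the command only if the token checks out). Everything here is an assumption for illustration: the names verify_once and run_with_token are made up, the expected token is simply passed as an argument (a real helper would have to check it against something provisioned out-of-band), and the sketch does not escalate privileges by itself, so it would have to be started by something that already has them.

```shell
# verify_once TOKEN_FILE EXPECTED_TOKEN
# Hypothetical: read the token, delete the file so it cannot be
# replayed, then compare against the expected value.
verify_once() {
    token_file=$1
    expected=$2
    [ -f "$token_file" ] || return 1
    token=$(cat "$token_file") || return 1
    # Delete before comparing: a token is burned even if it is wrong.
    rm -f "$token_file" || return 1
    [ -n "$token" ] && [ "$token" = "$expected" ]
}

# run_with_token TOKEN_FILE EXPECTED_TOKEN COMMAND [ARGS...]
# Hypothetical: run COMMAND only if the one-time token is accepted.
run_with_token() {
    token_file=$1
    expected=$2
    shift 2
    if ! verify_once "$token_file" "$expected"; then
        echo "token rejected" >&2
        return 1
    fi
    "$@"
}
```

A shell version like this sidesteps the one-binary-per-OS+arch problem, but only by pushing the hard parts (where the privileges come from, how the expected token gets there) somewhere else, so it is the same compromise in a different place.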
Looks like whichever way I go, there's always a compromise to make: adding complexity, dropping authentication, sacrificing portability, hardcoding assumptions, proliferating boilerplate... I'd love to hear ideas & opinions.

<3,
K.