On 16 Oct 2020, at 21:02, Jann Horn wrote:
On Sat, Oct 17, 2020 at 5:36 AM Willy Tarreau <w...@1wt.eu> wrote: But in userspace, we just need a simple counter. There's no need for us to worry about anything else, like timestamps or whatever. If we repeatedly fork a paused VM, the forked VMs will see the same counter value, but that's totally fine, because the only thing that matters to userspace is that the counter changes when the VM is forked.
For user-space, even a single bit would do. We added MADV_WIPEONFORK so that userspace libraries can detect fork()/clone() robustly, for the same reasons. It just wipes a page as the indicator, which is effectively a single-bit signal, and it works well. On the user-space side of this, I’m keen to find a similar solution that we can use fairly easily inside of portable libraries and applications. The “have I forked” checks do end up in hot paths, so it’s nice if they can be CPU cache friendly. Comparing a whole 128-bit value wouldn’t be my favorite.
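For reference, a minimal sketch of that kind of single-byte check, assuming Linux 4.14 or later (where MADV_WIPEONFORK is available). The helper names fork_detect_init() and fork_detect_check() are hypothetical and not taken from any existing library; a real library would also need a fallback for kernels without the flag.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helpers for illustration only. */

/* First byte of a page that the kernel zeroes in the child on fork(). */
static volatile unsigned char *fork_sentinel;

/* Map one anonymous private page, mark it wipe-on-fork, and arm it.
 * Returns 0 on success, -1 on failure (e.g. the kernel lacks the flag). */
static int fork_detect_init(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (madvise(p, (size_t)page, MADV_WIPEONFORK) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    fork_sentinel = p;
    fork_sentinel[0] = 1;   /* armed: nonzero means "no fork seen" */
    return 0;
}

/* Hot-path check: a single byte load. Returns 1 if this process is a
 * fork child that has not yet noticed the fork, and re-arms the
 * sentinel so later calls return 0 until the next fork. */
static int fork_detect_check(void)
{
    if (fork_sentinel[0] != 0)
        return 0;
    fork_sentinel[0] = 1;
    return 1;
}
```

A caller would run fork_detect_init() once at startup and then call fork_detect_check() before using any state that must not be shared across a fork, reseeding when it returns 1. The check touches one byte on one page, which is what keeps it cache friendly.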
And actually, since the value is a cryptographically random 128-bit value, I think that we should definitely use it to help reseed the kernel's RNG, and keep it secret from userspace. That way, even if the VM image is public, we can ensure that going forward, the kernel RNG will return securely random data.
If the image is public, you need some extra new raw entropy from somewhere. The gen-id could be mixed in; that can’t do any harm as long as rigorous cryptographic mixing with the prior state is used, but if that’s all you do then the final state is still deterministic and non-secret. The kernel would need to use the change as a trigger to measure some entropy (e.g. interrupts and RDRAND, or whatever). Or just define the machine contract as “this has to be unique random data and if it’s not unique, or if it’s public, you’re toast”.
- Colm