Hi Ludo, l...@gnu.org (Ludovic Courtès) writes:
> Hello, > > The ‘readTempRoots’ function in gc.cc has this: > > /* Try to acquire a write lock without blocking. This can > only succeed if the owning process has died. In that case > we don't care about its temporary roots. */ > if (lockFile(*fd, ltWrite, false)) { > printMsg(lvlError, format("removing stale temporary roots file > `%1%'") % path); > unlink(path.c_str()); > > There’s a thinko here: locking the file also succeeds when the lock is > already held by the calling process. > > In that case, this code ends up removing the temporary root file of > calling process, which is bad. Here’s a sample session: > > scheme@(guile-user)> ,use(guix) > scheme@(guile-user)> (define s (open-connection)) > scheme@(guile-user)> (current-build-output-port (current-error-port)) > $2 = #<output: file /dev/pts/9> > scheme@(guile-user)> (set-build-options s #:verbosity 10) > $3 = #t > scheme@(guile-user)> (add-text-to-store s "foo" "bar!") > acquiring global GC lock `/var/guix/gc.lock' > acquiring read lock on `/var/guix/temproots/4259' > acquiring write lock on `/var/guix/temproots/4259' > downgrading to read lock on `/var/guix/temproots/4259' > locking path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' > lock acquired on `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo.lock' > `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' has hash > `c756ef12a70bad10c9ac276ecd1213ea7cc3a2e6c462ba47e4f9a88756a055d0' > lock released on `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo.lock' > $4 = "/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo" > scheme@(guile-user)> (delete-paths s (list $4)) > acquiring global GC lock `/var/guix/gc.lock' > finding garbage collector roots... > executing > `/gnu/store/l99rkv2713nl53kr3gn4akinvifsx19h-guix-0.11.0-3.7ca3/libexec/guix/list-runtime-roots' > to find additional roots > […] > reading temporary root file `/var/guix/temproots/4259' > removing stale temporary roots file `/var/guix/temproots/4259' > […] > considering whether to delete > `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' > | invalidating path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' > | deleting `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' > | recursively deleting path > `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' > | | /gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo > deleting `/gnu/store/trash' > recursively deleting path `/gnu/store/trash' > | /gnu/store/trash > deleting unused links... > deleting unused link > `/gnu/store/.links/1l2ml1b8ga7rwi3vlqn4wsic6z7a2c9csvi7mk4i1b8blw9fymn7' > note: currently hard linking saves 6699.22 MiB > $5 = ("/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo") > $6 = 4096 > > Notice the “removing stale temporary roots file” message. > > Eelco: shouldn’t it be changed along the lines of the attached path? > > > Thanks, > Ludo’. > > diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc > index 72eff52..d92388f 100644 > --- a/nix/libstore/gc.cc > +++ b/nix/libstore/gc.cc > @@ -2,6 +2,7 @@ > #include "misc.hh" > #include "local-store.hh" > > +#include <string> > #include <functional> > #include <queue> > #include <algorithm> > @@ -225,10 +226,10 @@ static void readTempRoots(PathSet & tempRoots, FDs & > fds) > //FDPtr fd(new AutoCloseFD(openLockFile(path, false))); > //if (*fd == -1) continue; > > - /* Try to acquire a write lock without blocking. This can > - only succeed if the owning process has died. In that case > - we don't care about its temporary roots. */ > - if (lockFile(*fd, ltWrite, false)) { > + /* Try to acquire a write lock without blocking. This can only > + succeed if the owning process has died, in which case we don't > care > + about its temporary roots, or if we are the owning process. */ > + if (i.name != std::to_string(getpid()) && lockFile(*fd, ltWrite, > false)) { > printMsg(lvlError, format("removing stale temporary roots file > `%1%'") % path); > unlink(path.c_str()); > writeFull(*fd, "d"); > I'm not Eelco, but your change LGTM. Note that the upstream version still uses the original code [0]. I've installed the change, tested that it had the expected result: --8<---------------cut here---------------start------------->8--- reading temporary root file `/var/guix/temproots/8386' waiting for read lock on `/var/guix/temproots/8386' got temporary root `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' considering whether to delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' | cannot delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' because it's a root | cannot delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' because it's still reachable ice-9/boot-9.scm:1685:16: In procedure raise-exception: ERROR: 1. &store-protocol-error: message: "cannot delete path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' since it is still alive" status: 1 --8<---------------cut here---------------end--------------->8--- and pushed! Closing. [0] https://github.com/NixOS/nix/blob/master/src/libstore/gc.cc#L194 -- Thanks, Maxim