Re: Bug: git-upload-pack will return successfully even when it can't read all references
On Mon, Sep 07, 2015 at 02:11:15PM +0200, Ævar Arnfjörð Bjarmason wrote: > This turned out to be a pretty trivial filesystem error. > refs/heads/master wasn't readable by the backup process, but some > other stuff in refs/heads and objects/* was. > > [...] > > I wanted to check if this was a regression and got as far back as > v1.4.3 with the same behavior before the commands wouldn't work > anymore due to changes in the git config parsing code. Right, it has basically always been this way. for_each_ref() silently eats oddities or errors while reading refs. Calling for_each_rawref() will include them, but we don't do it in most places; it would make non-critical operations on a corrupted repo barf. And it is difficult to know what is "critical" inside the code. You might be calling "upload-pack" to salvage what you can from a corrupted repo, or to make a backup where you want to know what is corrupted and what is not. Commit 49672f2 introduced a "ref paranoia" environment variable to let you specify this (and robust backups was definitely one of the use cases I had in mind). It's a little tricky to use with upload-pack because you may be crossing an ssh boundary, but: git clone -u 'GIT_REF_PARANOIA=1 git-upload-pack' ... should work. With your case: $ git clone --no-local -u 'GIT_REF_PARANOIA=1 git-upload-pack' repo repo-checkout Cloning into 'repo-checkout'... fatal: git upload-pack: not our ref fatal: The remote end hung up unexpectedly Without "--no-local" it behaves weirdly, but I would not recommend local clones in general if you are trying to be careful. They optimize out a lot of the safety checks, and we do things like copy the packed-refs file wholesale. And certainly the error message is not the greatest. upload-pack is not checking for the REF_ISBROKEN flag, so it just dumps: refs/heads/master in the advertisement, and the client happily requests that object. REF_PARANOIA is really just a band-aid to feed the broken refs to the normal code paths, which typically barf on their own. :) Something like this: diff --git a/upload-pack.c b/upload-pack.c index 89e832b..3c621a5 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -731,6 +731,9 @@ static int send_ref(const char *refname, const struct object_id *oid, if (mark_our_ref(refname, oid)) return 0; + if (flag & REF_ISBROKEN) + warning("remote ref '%s' is broken", refname); + if (capabilities) { struct strbuf symref_info = STRBUF_INIT; kind of helps, but the advertisement is too early for us to send sideband messages. So it makes it to the user if the transport is local or ssh, but not over git:// or http. That's something we could do better with protocol v2 (we'll negotiate capabilities before the advertisement). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug: git-upload-pack will return successfully even when it can't read all references
On Tue, Sep 8, 2015 at 8:53 AM, Jeff Kingwrote: > On Mon, Sep 07, 2015 at 02:11:15PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> This turned out to be a pretty trivial filesystem error. >> refs/heads/master wasn't readable by the backup process, but some >> other stuff in refs/heads and objects/* was. >> >> [...] >> >> I wanted to check if this was a regression and got as far back as >> v1.4.3 with the same behavior before the commands wouldn't work >> anymore due to changes in the git config parsing code. > > Right, it has basically always been this way. for_each_ref() silently > eats oddities or errors while reading refs. Calling for_each_rawref() > will include them, but we don't do it in most places; it would make > non-critical operations on a corrupted repo barf. And it is difficult > to know what is "critical" inside the code. You might be calling > "upload-pack" to salvage what you can from a corrupted repo, or to make > a backup where you want to know what is corrupted and what is not. > > Commit 49672f2 introduced a "ref paranoia" environment variable to let > you specify this (and robust backups was definitely one of the use cases > I had in mind). It's a little tricky to use with upload-pack because you > may be crossing an ssh boundary, but: > > git clone -u 'GIT_REF_PARANOIA=1 git-upload-pack' ... > > should work. > > With your case: > > $ git clone --no-local -u 'GIT_REF_PARANOIA=1 git-upload-pack' repo > repo-checkout > Cloning into 'repo-checkout'... > fatal: git upload-pack: not our ref > fatal: The remote end hung up unexpectedly > > Without "--no-local" it behaves weirdly, but I would not recommend local > clones in general if you are trying to be careful. They optimize out a > lot of the safety checks, and we do things like copy the packed-refs > file wholesale. > > And certainly the error message is not the greatest. upload-pack is not > checking for the REF_ISBROKEN flag, so it just dumps: > > refs/heads/master > > in the advertisement, and the client happily requests that object. > REF_PARANOIA is really just a band-aid to feed the broken refs to the > normal code paths, which typically barf on their own. :) > > Something like this: > > diff --git a/upload-pack.c b/upload-pack.c > index 89e832b..3c621a5 100644 > --- a/upload-pack.c > +++ b/upload-pack.c > @@ -731,6 +731,9 @@ static int send_ref(const char *refname, const struct > object_id *oid, > if (mark_our_ref(refname, oid)) > return 0; > > + if (flag & REF_ISBROKEN) > + warning("remote ref '%s' is broken", refname); > + > if (capabilities) { > struct strbuf symref_info = STRBUF_INIT; > > kind of helps, but the advertisement is too early for us to send > sideband messages. So it makes it to the user if the transport is local > or ssh, but not over git:// or http. > > That's something we could do better with protocol v2 (we'll negotiate > capabilities before the advertisement). Fantastic. REF_PARANOIA does exactly what I need, i.e. stall the fetch process so permissions can be manually repaired. I think it makes sense to keep the default at "let's try to copy over what we can", for salvage purposes. I think the bug is that we still return success in that case, and should return non-zero, but as you point out this is easier said than done due to needing to deal with the case where the remote transport sends us the ... ref. I wonder if --upload-pack="GIT_REF_PARANOIA=1 git-upload-pack" should be the default when running fetch if you have --prune enabled. There's a particularly bad edge case now where if you have permission errors on the master repository and run --prune on your backup along with a --mirror clone to mirror the refs, then when you have permission issues you'll prune everything from the backup. But yeah, a proper fix needs protocol v2. Because among other things that --upload-pack hack will only work for ssh, not http. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug: git-upload-pack will return successfully even when it can't read all references
On Tue, Sep 08, 2015 at 02:16:03PM +0200, Ævar Arnfjörð Bjarmason wrote: > I wonder if --upload-pack="GIT_REF_PARANOIA=1 git-upload-pack" should > be the default when running fetch if you have --prune enabled. There's > a particularly bad edge case now where if you have permission errors > on the master repository and run --prune on your backup along with a > --mirror clone to mirror the refs, then when you have permission > issues you'll prune everything from the backup. > > But yeah, a proper fix needs protocol v2. Because among other things > that --upload-pack hack will only work for ssh, not http. Right. There is no real way to flip the flag from the client side, because we cannot reliably communicate it. Once we have such a mechanism, it might actually make sense to _always_ flip on paranoia. That is, we would rather an operation fail and be retried with in a "loose" mode explicitly. That's safer. It's less convenient when you have a broken repo, but surely that's the exception, right? -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Bug: git-upload-pack will return successfully even when it can't read all references
We have a process to back up our Git repositories at work, this started alerting because it wasn't getting the same refs as the remote. This turned out to be a pretty trivial filesystem error. refs/heads/master wasn't readable by the backup process, but some other stuff in refs/heads and objects/* was. But I think it's a bug that if we ssh to the remote end, and git-upload-pack can't read certain refs in refs/heads/ that we don't return an error. This simple shellscript reproduces the issue: rm -rf /tmp/repo /tmp/repo-checkout git init /tmp/repo cd /tmp/repo touch foo git add foo git commit -m"foo" git checkout -b branch git checkout master git show-ref chmod 000 .git/refs/heads/master git show-ref cd /tmp git clone repo repo-checkout echo "Status code of clone: $?" cd repo-checkout git show-ref After running this you get: $ (cd /tmp/repo-checkout && echo -n | strace /tmp/avar/bin/git-upload-pack /tmp/repo 2>&1 | grep -e EACCES) open("refs/heads/master", O_RDONLY) = -1 EACCES (Permission denied) open("refs/heads/master", O_RDONLY) = -1 EACCES (Permission denied) open("refs/heads/master", O_RDONLY) = -1 EACCES (Permission denied) And "git fetch" will return 0. We fail to call get refs/heads/master in head_ref_namespaced() called by upload_pack(). I was going to see if I could patch it to return an error, but that code seems very far removed from any error checking. This isn't only an issue with git-upload-pack, e.g. show-ref itself has the same issue: $ chmod 600 .git/refs/heads/master $ git show-ref; echo $? e7255c8fcabc6e15f57cd984f9f117870052c1a0 refs/heads/branch e7255c8fcabc6e15f57cd984f9f117870052c1a0 refs/heads/master 0 $ chmod 000 .git/refs/heads/master $ git show-ref; echo $? e7255c8fcabc6e15f57cd984f9f117870052c1a0 refs/heads/branch 0 I wanted to check if this was a regression and got as far back as v1.4.3 with the same behavior before the commands wouldn't work anymore due to changes in the git config parsing code. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html