Hi,

Jonathan Tan wrote:

> connect.c, when processing packfiles, treats a zero ID (with
> `capabilities^{}` in place of the refname) as an actual ref instead of a
> placeholder for a capability declaration, contrary to the specification
> in Reference Discovery in Documentation/technical/pack-protocol.txt.
> This is an issue when interacting with Git implementations that follow
> this specification. For example, `ls-remote` (against a git://
> repository served by such a Git implementation) will report a ref when
> there are none, and `clone` (against something similar) will fail its
> checkout.

For example, this means I get the following confusing message when
cloning an empty repository served by "jgit daemon":

 $ git clone https://googlers.googlesource.com/jrn/empty
 Cloning into 'empty'...
 Checking connectivity... done.
 warning: remote HEAD refers to nonexistent ref, unable to checkout.

 $

It also means that standard "git daemon" is not able to advertise
capabilities on a fetch from a repository without advertising some
refs, so I cannot fetch-by-sha1 from a C git server unless some refs
are advertised.

This fix should solve the former problem and paves the way for the
latter to be solved once a year or two passes and people's clients are
up to date (or earlier if a server operator chooses).

> Make connect.c follow the specification with respect to this, while
> maintaining compatibility with existing implementations that do not
> serve the zero ID when a repository has no refs.

This description focuses on the code.  When deciding whether to apply
the patch (or whether to revert it if it comes up while trying to
investigate another problem), I am likely to wonder "what effect will
the patch have on me?" instead of "what standards does it follow?"

Flipping the emphasis, we could say something like

        connect: treat ref advertisement with capabilities^{} line as one with 
no refs

        When cloning an empty repository served by standard git, "git clone"
        produces the following reassuring message:

                $ git clone git://localhost/tmp/empty
                Cloning into 'empty'...
                warning: You appear to have cloned an empty repository.
                Checking connectivity... done.
        
        Meanwhile when cloning an empty repository served by JGit, the output
        is more haphazard:

                $ git clone git://localhost/tmp/empty
                Cloning into 'empty'...
                Checking connectivity... done.
                warning: remote HEAD refers to nonexistent ref, unable to 
checkout.

        This is a common command to run immediately after creating a remote
        repository as preparation for adding content to populate it and pushing.
        The warning is confusing and needlessly worrying.

        The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise 
capabilities
        with no refs in upload service., 2013-08-08), JGit's ref advertisement
        includes a ref named capabilities^{} to advertise its refs on, while 
git's
        ref advertisement is empty in this case.  This allows the client to 
learn
        about the server's capabilities and is need, for example, for 
fetch-by-sha1
        to work when no refs are advertised.

        Git advertises the same capabilities^{} ref in its ref advertisement for
        push but since it never remembered to do so for fetch, the client forgot
        to handle this case. Handle it.

        So now you can see the same friendly message whether the server runs C
        git or JGit.

        The pack-protocol.txt documentation gives some guidance about the 
expected
        server behavior: what JGit does is currently required, since a 
list-of-refs
        is required to be non-empty.

                  advertised-refs  =  (no-refs / list-of-refs)
                                      *shallow
                                      flush-pkt

                  no-refs          =  PKT-LINE(zero-id SP "capabilities^{}"
                                      NUL capability-list)

                  list-of-refs     =  first-ref *other-ref

        Because git client versions without this fix are expected to exist in 
the
        wild for a while, we should not change the server to always send the
        capabilities^{} line when there are no refs to advertise yet.  A 
transition
        will take multiple steps:

         1. This patch, which updates the client

         2. Update pack-protocol to clarify that both server behaviors must be
            tolerated.

         3. Add a configuration variable to allow git upload-pack to advertise
            capabilities when there are no refs to advertise.  Leave it disabled
            by default since git clients can't be counted on to have this patch 
(1)
            yet.

         4. After a year or so, flip the default for that server configuration
            variable to true.

[...]
> --- a/connect.c
> +++ b/connect.c
> @@ -165,6 +165,13 @@ struct ref **get_remote_heads(int in, char *src_buf, 
> size_t src_len,
>                       continue;
>               }
>  
> +             if (is_null_oid(&old_oid)) {
> +                     if (strcmp(name, "capabilities^{}"))
> +                             warning("zero object ID received that is not 
> accompanied by a "
> +                                     "capability declaration, ignoring and 
> continuing anyway");
> +                     continue;
> +             }

I'd rather this be stricter.  Though it's not your fault: the code
before it is very lax in letting each line specify capabilities again.

Could this behave more like the ".have" case?  That is, just

                if (!strcmp(name, "capabilities^{}"))
                        continue;

Or if we want to sanity-check the line further,

                if (!strcmp(name, "capabilities^{}")) {
                        if (len == name_len + GIT_SHA1_HEX_SZ + 1)
                                die("protocol error: expected capabilities 
after nul, got none");
                        if (!is_null_oid(&old_oid))
                                die("protocol error: expected zero-id, got %s", 
oid_to_hex(&old_oid));
                        continue;
                }

The protocol definition and other clients already handle capabilities^{} 
differently
from other cases of zero-id.

[...]
> --- a/t/t5512-ls-remote.sh
> +++ b/t/t5512-ls-remote.sh
> @@ -207,5 +207,27 @@ test_expect_success 'ls-remote --symref omits 
> filtered-out matches' '
[...]
> +test_expect_success JGIT,GIT_DAEMON 'indicate no refs in standards-compliant 
> empty remote' '
> +     JGIT_DAEMON_PID= &&
> +     git init --bare empty.git &&
> +     touch empty.git/git-daemon-export-ok &&
> +     {
> +             jgit daemon --port="$JGIT_DAEMON_PORT" . &
> +             JGIT_DAEMON_PID=$!
> +     } &&
> +     test_when_finished kill "$JGIT_DAEMON_PID" &&
> +     sleep 1 && # allow jgit daemon some time to set up

Can this sleep be avoided?  E.g. start_git_daemon in lib-git-daemon.sh
uses output from the daemon to tell when it is serving.

Thanks for writing this.  I'll be happy when this protocol blip is gone.

Sincerely,
Jonathan

Reply via email to