Jonathan Tan <jonathanta...@google.com> writes:

> Refactor, into a common function, the version and capability negotiation
> done when invoking a long-running process as a clean or smudge filter.
> This will be useful for other Git code that needs to interact similarly
> with a long-running process.
>
> As you can see in the change to t0021, this commit changes the error
> message reported when the long-running process does not introduce itself
> with the expected "server"-terminated line. Originally, the error
> message reports that the filter "does not support filter protocol
> version 2", differentiating between the old single-file filter protocol
> and the new multi-file filter protocol - I have updated it to something
> more generic and useful.
>
> Signed-off-by: Jonathan Tan <jonathanta...@google.com>

Overall I like the direction, even though the abstraction that the
resulting code ends up with seems to me a bit too tightly defined; in
other words, I cannot be sure that this will be useful enough in a
more general context, and it may make some potential applications
feel a bit too constrained.

> +     static int versions[] = {2, 0};
> +     static struct subprocess_capability capabilities[] = {
> +             {"clean", CAP_CLEAN}, {"smudge", CAP_SMUDGE}, {NULL, 0}
> +     };
>       struct cmd2process *entry = (struct cmd2process *)subprocess;
> ...
> +     return subprocess_handshake(subprocess, "git-filter-", versions, NULL,
> +                                 capabilities,
> +                                 &entry->supported_capabilities);
>  }

I would have defined the welcome prefix without the final dash,
i.e. forcing the hardcoded suffixes for clients and servers in any
protocol that uses this API to be "-client" and "-server", dash
included.
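
To illustrate what I mean, here is a minimal standalone sketch. The
helper names format_welcome() and is_welcome() are made up for this
example and are not part of the patch; the point is only that the
caller passes "git-filter" and the dash lives in one place.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the greeting from a dash-less prefix
 * (e.g. "git-filter") and a role ("client" or "server").
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int format_welcome(char *buf, size_t len,
			  const char *prefix, const char *role)
{
	int n = snprintf(buf, len, "%s-%s", prefix, role);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: return 1 if "line" is exactly
 * "<prefix>-<role>", 0 otherwise.
 */
static int is_welcome(const char *line, const char *prefix,
		      const char *role)
{
	size_t plen = strlen(prefix);

	return !strncmp(line, prefix, plen) &&
	       line[plen] == '-' &&
	       !strcmp(line + plen + 1, role);
}
```

With this shape, a caller cannot accidentally define a protocol whose
greeting is "foo_client" or "fooclient".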

> diff --git a/sub-process.c b/sub-process.c
> index a3cfab1a9..1a3f39bdf 100644
> --- a/sub-process.c
> +++ b/sub-process.c
> @@ -105,3 +105,97 @@ int subprocess_start(struct hashmap *hashmap, struct subprocess_entry *entry, co
>       hashmap_add(hashmap, entry);
>       return 0;
>  }
> +
> +int subprocess_handshake(struct subprocess_entry *entry,
> +                      const char *welcome_prefix,
> +                      int *versions,
> +                      int *chosen_version,
> +                      struct subprocess_capability *capabilities,
> +                      unsigned int *supported_capabilities) {
> +     int version_scratch;
> +     unsigned int capabilities_scratch;
> +     struct child_process *process = &entry->process;
> +     int i;
> +     char *line;
> +     const char *p;
> +
> +     if (!chosen_version)
> +             chosen_version = &version_scratch;
> +     if (!supported_capabilities)
> +             supported_capabilities = &capabilities_scratch;
> +
> +     sigchain_push(SIGPIPE, SIG_IGN);
> +
> +     if (packet_write_fmt_gently(process->in, "%sclient\n",
> +                                 welcome_prefix)) {
> +             error("Could not write client identification");
> +             goto error;
> +     }
> +     for (i = 0; versions[i]; i++) {
> +             if (packet_write_fmt_gently(process->in, "version=%d\n",
> +                                         versions[i])) {
> +                     error("Could not write requested version");
> +                     goto error;
> +             }
> +     }

This forces version numbers to be positive integers, which is OK, as
I do not see it as a downside that any potential application cannot
use "version=0".

> +     if (packet_flush_gently(process->in))
> +             goto error;
> +
> +     if (!(line = packet_read_line(process->out, NULL)) ||
> +         !skip_prefix(line, welcome_prefix, &p) ||
> +         strcmp(p, "server")) {
> +             error("Unexpected line '%s', expected %sserver",
> +                   line ? line : "<flush packet>", welcome_prefix);
> +             goto error;
> +     }
> +     if (!(line = packet_read_line(process->out, NULL)) ||
> +         !skip_prefix(line, "version=", &p) ||
> +         strtol_i(p, 10, chosen_version)) {
> +             error("Unexpected line '%s', expected version",
> +                   line ? line : "<flush packet>");
> +             goto error;
> +     }
> +     for (i = 0; versions[i]; i++) {
> +             if (versions[i] == *chosen_version)
> +                     goto version_found;
> +     }
> +     error("Version %d not supported", *chosen_version);
> +     goto error;
> +version_found:

It would have been more natural to do

        for (i = 0; versions[i]; i++)
                if (versions[i] == *chosen_version)
                        break;
        if (!versions[i]) {
                error("...");
                goto error;
        }

without "version_found:" label.  In general, I'd prefer to avoid
jumping to a label in the normal/expected case and reserve "goto"
for error handling.
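
As a standalone sketch of the label-free lookup (the function name
version_is_supported() is hypothetical, used only for illustration):
the loop stops either on a match or on the terminating zero, so
looking at versions[i] afterwards tells the two cases apart.

```c
/*
 * Return 1 if "chosen" appears in the zero-terminated list
 * "versions", 0 otherwise.  Because zero is the terminator, a
 * chosen version of 0 is never considered supported.
 */
static int version_is_supported(const int *versions, int chosen)
{
	int i;

	for (i = 0; versions[i]; i++)
		if (versions[i] == chosen)
			break;
	/* non-zero here means we left the loop via "break" */
	return versions[i] != 0;
}
```

The caller would then report the error and "goto error" only when
this returns 0, keeping "goto" on the failure path.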

> +     if ((line = packet_read_line(process->out, NULL))) {
> +             error("Unexpected line '%s', expected flush", line);
> +             goto error;
> +     }
> +
> +     for (i = 0; capabilities[i].name; i++) {
> +             if (packet_write_fmt_gently(process->in, "capability=%s\n",
> +                                         capabilities[i].name)) {
> +                     error("Could not write requested capability");
> +                     goto error;
> +             }
> +     }
> +     if (packet_flush_gently(process->in))
> +             goto error;
> +
> +     while ((line = packet_read_line(process->out, NULL))) {
> +             if (!skip_prefix(line, "capability=", &p))
> +                     continue;
> +
> +             for (i = 0; capabilities[i].name; i++) {
> +                     if (!strcmp(p, capabilities[i].name)) {
> +                             *supported_capabilities |= capabilities[i].flag;
> +                             goto capability_found;
> +                     }
> +             }
> +             warning("external filter requested unsupported filter capability '%s'",
> +                     p);
> +capability_found:
> +             ;

Likewise.

Also, this is the reason why I said this might make future
applications feel a bit too constrained; is the set of fields in the
subprocess_capability struct general enough?  It can only say "a
capability with this name was found" with a single bit, so you can
have only 32 (or 64) capabilities that are all yes/no.  I am not
saying that is definitely insufficient (not yet anyway); I am
wondering if future applications may need to have something like:

        capability=buffer-size=64k

where "=64k" part is not known at this layer but is known by the
user of the API.
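
One possible shape, purely as a sketch and not a proposal for the
actual API: let each table entry carry an optional callback that
receives whatever follows "name=". The struct cap_entry, the
handle_value member, parse_capability() and store_value() are all
invented for this example.

```c
#include <string.h>

/* Hypothetical table entry: a yes/no flag plus an optional value handler. */
struct cap_entry {
	const char *name;
	unsigned int flag;
	/* receives the text after "name=", or NULL when there is none */
	int (*handle_value)(const char *value, void *cb_data);
};

/*
 * Match "cap" (the text after "capability=") against the table.
 * Returns 0 if the capability is known and its value, if any, was
 * accepted; -1 otherwise.
 */
static int parse_capability(const char *cap, const struct cap_entry *table,
			    unsigned int *supported, void *cb_data)
{
	const char *eq = strchr(cap, '=');
	size_t namelen = eq ? (size_t)(eq - cap) : strlen(cap);
	int i;

	for (i = 0; table[i].name; i++) {
		if (strlen(table[i].name) != namelen ||
		    strncmp(cap, table[i].name, namelen))
			continue;
		if (!table[i].handle_value) {
			if (eq)
				return -1; /* plain flag given a value */
		} else if (table[i].handle_value(eq ? eq + 1 : NULL,
						 cb_data)) {
			return -1; /* handler rejected the value */
		}
		*supported |= table[i].flag;
		return 0;
	}
	return -1; /* unknown capability */
}

/* Example handler: remember the value string; a value is required. */
static int store_value(const char *value, void *cb_data)
{
	const char **out = cb_data;

	if (!value)
		return -1;
	*out = value;
	return 0;
}
```

Here the generic layer still knows nothing about what "64k" means; it
only splits at the first "=" and hands the rest to the API user.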

> +     }
> +
> +     sigchain_pop(SIGPIPE);
> +     return 0;
> +error:
> +     sigchain_pop(SIGPIPE);
> +     return 1;

I would prepare at the beginning of the function:

        int retval = -1; /* assume failure */

and rewrite the above to

                retval = 0;
        error:
                sigchain_pop(SIGPIPE);
                return retval;

if I were writing this code.
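
A self-contained sketch of that pattern follows. The "depth" counter
merely stands in for the sigchain_push()/sigchain_pop() pair, and
do_handshake() with its two flags is a hypothetical stand-in for the
real steps.

```c
/* Stand-in for paired setup/teardown such as sigchain push/pop. */
static int depth;

/*
 * Hypothetical two-step operation.  Returns 0 on success and -1 on
 * failure; the teardown runs exactly once on every path.
 */
static int do_handshake(int step1_ok, int step2_ok)
{
	int retval = -1; /* assume failure */

	depth++; /* setup */
	if (!step1_ok)
		goto error;
	if (!step2_ok)
		goto error;
	retval = 0;
error:
	depth--; /* teardown, shared by success and failure */
	return retval;
}
```

The success path simply falls through the label, so there is a single
copy of the cleanup code to keep correct.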
