Am 05.04.2019 um 01:27 schrieb Jeff King:
> We can use skip_prefix() and parse_oid_hex() to continuously increment
> our pointer, rather than dealing with magic numbers. This also fixes a
> few small shortcomings:
>
> - if we see a 'P' line that does not match our expectations, we'll
> leave our "i" counter in the middle of the line. So we'll parse:
> "P P P pack-1234.pack" as if there was just one "P" which was not
> intentional (though probably not that harmful).
How so? The default case, which we'd fall through to, skips the rest
of such a line, doesn't it?
>
> - if we see a line with the right prefix, suffix, and length, i.e.
> matching /P pack-.{40}.pack\n/, we'll interpret the middle part as
> hex without checking if it could be parsed. This could lead to us
> looking at uninitialized garbage in the hash array. In practice this
> means we'll just make a garbage request to the server which will
> fail, though it's interesting that a malicious server could convince
> us to leak 40 bytes of uninitialized stack to them.
>
> - the current code is picky about seeing a newline at the end of file,
> but we can easily be more liberal
>
> Signed-off-by: Jeff King <[email protected]>
> ---
> http.c | 35 ++++++++++++++---------------------
> 1 file changed, 14 insertions(+), 21 deletions(-)
>
> diff --git a/http.c b/http.c
> index a32ad36ddf..2ef47bc779 100644
> --- a/http.c
> +++ b/http.c
> @@ -2147,11 +2147,11 @@ static int fetch_and_setup_pack_index(struct
> packed_git **packs_head,
> int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
> {
> struct http_get_options options = {0};
> - int ret = 0, i = 0;
> - char *url, *data;
> + int ret = 0;
> + char *url;
> + const char *data;
> struct strbuf buf = STRBUF_INIT;
> - unsigned char hash[GIT_MAX_RAWSZ];
> - const unsigned hexsz = the_hash_algo->hexsz;
> + struct object_id oid;
>
> end_url_with_slash(&buf, base_url);
> strbuf_addstr(&buf, "objects/info/packs");
> @@ -2163,24 +2163,17 @@ int http_get_info_packs(const char *base_url, struct
> packed_git **packs_head)
> goto cleanup;
>
> data = buf.buf;
> - while (i < buf.len) {
> - switch (data[i]) {
> - case 'P':
> - i++;
> - if (i + hexsz + 12 <= buf.len &&
> - starts_with(data + i, " pack-") &&
> - starts_with(data + i + hexsz + 6, ".pack\n")) {
> - get_sha1_hex(data + i + 6, hash);
> - fetch_and_setup_pack_index(packs_head, hash,
> - base_url);
> - i += hexsz + 11;
> - break;
> - }
> - default:
> - while (i < buf.len && data[i] != '\n')
> - i++;
> + while (*data) {
> + if (skip_prefix(data, "P pack-", &data) &&
> + !parse_oid_hex(data, &oid, &data) &&
> + skip_prefix(data, ".pack", &data) &&
> + (*data == '\n' || *data == '\0')) {
> + fetch_and_setup_pack_index(packs_head, oid.hash,
> base_url);
> + } else {
> + data = strchrnul(data, '\n');
> }
> - i++;
> + if (*data)
> + data++; /* skip past newline */
So much simpler, *and* converted to object_id -- I like it!
Parsing "P" and "pack-" together crosses logical token boundaries,
but that I don't mind it here.
René