Derrick Stolee <sto...@gmail.com> writes:

> +char* get_commit_graph_filename_hash(const char *pack_dir,

Asterisk sticks to the identifier, not the type, in our codebase,
i.e. "char *get_commit_graph_filename_hash(...)".

> +                                  struct object_id *hash)
> +{
> +     size_t len;
> +     struct strbuf path = STRBUF_INIT;
> +     strbuf_addstr(&path, pack_dir);
> +     strbuf_addstr(&path, "/graph-");
> +     strbuf_addstr(&path, oid_to_hex(hash));
> +     strbuf_addstr(&path, ".graph");

Use of strbuf_addf() would make it easier to read and maintain, no?
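
Something like this, perhaps (untested):

	strbuf_addf(&path, "%s/graph-%s.graph",
		    pack_dir, oid_to_hex(hash));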

> +
> +     return strbuf_detach(&path, &len);
> +}
> +
> +static void write_graph_chunk_fanout(struct sha1file *f,
> +                                  struct commit **commits,
> +                                  int nr_commits)
> +{
> +     uint32_t i, count = 0;
> +     struct commit **list = commits;
> +     struct commit **last = commits + nr_commits;
> +
> +     /*
> +      * Write the first-level table (the list is sorted,
> +      * but we use a 256-entry lookup to be able to avoid
> +      * having to do eight extra binary search iterations).
> +      */
> +     for (i = 0; i < 256; i++) {
> +             while (list < last) {
> +                     if ((*list)->object.oid.hash[0] != i)
> +                             break;
> +                     count++;
> +                     list++;
> +             }

If count and list are always incremented in unison, perhaps you do
not need an extra variable "last".  If typeof(nr_commits) is wider
than typeof(count), this loop and the next write-be32 are screwed
anyway ;-)

This comment probably applies equally to some other uses of the same
"compute last pointer to compare with running pointer for
termination" pattern in this patch.

> +             sha1write_be32(f, count);
> +     }
> +}

> +static int if_packed_commit_add_to_list(const struct object_id *oid,

That is a strange name.  "collect packed commits", perhaps?

> +                                     struct packed_git *pack,
> +                                     uint32_t pos,
> +                                     void *data)
> +{
> +     struct packed_oid_list *list = (struct packed_oid_list*)data;
> +     enum object_type type;
> +     unsigned long size;
> +     void *inner_data;
> +     off_t offset = nth_packed_object_offset(pack, pos);
> +     inner_data = unpack_entry(pack, offset, &type, &size);
> +
> +     if (inner_data)
> +             free(inner_data);
> +
> +     if (type != OBJ_COMMIT)
> +             return 0;
> +
> +     ALLOC_GROW(list->list, list->nr + 1, list->alloc);

This probably will become inefficient in large repositories.  You
know you'll be walking all the pack files, and the total number of
objects in a packfile can be read cheaply, so it may make sense to
make a rough guesstimate of the number of commits (e.g. 15-25% of
the total number of objects) in the repository and allocate the
list upfront, instead of growing from a hardcoded 1024.
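
For illustration, the caller that sets up the packed_oid_list could
do something along these lines, assuming the global packed_git list
and open_pack_index()/num_objects from packfile.c (untested, and
the 20% ratio is a wild guess, not a measured figure):

	uint32_t estimate = 0;
	struct packed_git *p;

	prepare_packed_git();
	for (p = packed_git; p; p = p->next) {
		if (open_pack_index(p))
			continue;
		/* assume roughly one in five objects is a commit */
		estimate += p->num_objects / 5;
	}
	list.alloc = estimate ? estimate : 1024;
	list.nr = 0;
	ALLOC_ARRAY(list.list, list.alloc);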
