On 05.04.2015 at 03:06, Jeff King wrote:
> As I've mentioned before, I have some repositories with rather large
> numbers of refs. The worst one has ~13 million refs, for a 1.6GB
> packed-refs file. So I was saddened by this:
> 
>    $ time git.v2.0.0 rev-parse refs/heads/foo >/dev/null 2>&1
>    real    0m6.840s
>    user    0m6.404s
>    sys     0m0.440s
> 
>    $ time git.v2.4.0-rc1 rev-parse refs/heads/foo >/dev/null 2>&1
>    real    0m19.432s
>    user    0m18.996s
>    sys     0m0.456s
> 
> The command isn't important; what I'm really measuring is loading the
> packed-refs file. And yes, of course this repository is absolutely
> ridiculous. But the slowdowns here are linear with the number of refs.
> So _every_ git command got a little bit slower, even in less crazy
> repositories. We just didn't notice it as much.
> 
> Here are the numbers after this series:
> 
>    real    0m8.539s
>    user    0m8.052s
>    sys     0m0.496s
> 
> Much better, but I'm frustrated that they are still 20% slower than the
> original.
> 
> The main culprits seem to be d0f810f (which introduced some extra
> expensive code for each ref) and my 10c497a, which switched from fgets()
> to strbuf_getwholeline. It turns out that strbuf_getwholeline is really
> slow.

10c497a changed read_packed_refs(), which reads *all* packed refs.
Each is checked for validity.  That sounds expensive if the goal is
just to look up a single (non-existing) ref.

Would it help to defer any checks until a ref is actually accessed?
Can a binary search be used instead of reading the whole file?
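
To make the binary search question concrete, here is an untested sketch.
It assumes the whole file is available as one buffer (e.g. mapped), that
every entry line looks like "<40-hex sha1> <refname>\n", optionally
followed by a "^<sha1>" peeled line, and that entries are sorted bytewise
by refname.  The sortedness is an assumption the reader would have to be
able to rely on.  find_packed_ref(), line_start() and next_line() are
made-up names for illustration, not existing functions in refs.c.

```c
#include <stddef.h>
#include <string.h>

/* Walk back from p to the start of its line; never goes below "start". */
static const char *line_start(const char *start, const char *p)
{
	while (p > start && p[-1] != '\n')
		p--;
	return p;
}

/* Return the start of the line after the one containing p, or "end". */
static const char *next_line(const char *p, const char *end)
{
	const char *nl = memchr(p, '\n', end - p);
	return nl ? nl + 1 : end;
}

/*
 * Hypothetical helper: return a pointer to the start of the entry line
 * for "refname", or NULL if there is none.  Assumes sorted entries.
 */
static const char *find_packed_ref(const char *buf, size_t len,
				   const char *refname)
{
	const char *end = buf + len;
	const char *lo = buf, *hi = end;
	size_t namelen = strlen(refname);

	/* Skip header lines such as "# pack-refs with: ...". */
	while (lo < end && *lo == '#')
		lo = next_line(lo, end);

	while (lo < hi) {
		const char *mid = line_start(lo, lo + (hi - lo) / 2);
		const char *eol, *name;
		size_t n;
		int cmp;

		/* A "^<sha1>" line belongs to the entry before it. */
		while (mid > lo && *mid == '^')
			mid = line_start(lo, mid - 1);

		eol = memchr(mid, '\n', end - mid);
		if (!eol)
			eol = end;
		if (eol - mid < 41)
			return NULL;	/* malformed line; give up */
		name = mid + 41;	/* skip 40 hex digits and the space */
		n = eol - name;

		/* Same ordering as strcmp() on the two refnames. */
		cmp = memcmp(refname, name, namelen < n ? namelen : n);
		if (!cmp)
			cmp = (namelen > n) - (namelen < n);
		if (!cmp)
			return mid;
		if (cmp < 0) {
			hi = mid;
		} else {
			lo = next_line(mid, end);
			while (lo < hi && *lo == '^')
				lo = next_line(lo, end);
		}
	}
	return NULL;
}
```

A lookup like this touches only O(log n) lines and parses or validates
nothing else, so a miss on refs/heads/foo would not have to pay for the
other 13 million entries.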

I wonder if pluggable reference backends could help here.  Storing refs
in a database table indexed by refname should simplify things.

Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping
the packed refs file?  What numbers do you get with the following patch?

---
 refs.c | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/refs.c b/refs.c
index 47e4e53..144255f 100644
--- a/refs.c
+++ b/refs.c
@@ -1153,16 +1153,35 @@ static const char *parse_ref_line(struct strbuf *line, unsigned char *sha1)
  *      compatibility with older clients, but we do not require it
  *      (i.e., "peeled" is a no-op if "fully-peeled" is set).
  */
-static void read_packed_refs(FILE *f, struct ref_dir *dir)
+static void read_packed_refs(int fd, struct ref_dir *dir)
 {
        struct ref_entry *last = NULL;
        struct strbuf line = STRBUF_INIT;
        enum { PEELED_NONE, PEELED_TAGS, PEELED_FULLY } peeled = PEELED_NONE;
+       struct stat st;
+       void *map;
+       size_t mapsz, len;
+       const char *p;
+
+       fstat(fd, &st);
+       mapsz = xsize_t(st.st_size);
+       if (!mapsz)
+               return;
+       map = xmmap(NULL, mapsz, PROT_READ, MAP_PRIVATE, fd, 0);
 
-       while (strbuf_getwholeline(&line, f, '\n') != EOF) {
+       for (p = map, len = mapsz; len; ) {
                unsigned char sha1[20];
                const char *refname;
                const char *traits;
+               const char *nl;
+               size_t linelen;
+
+               nl = memchr(p, '\n', len);
+               linelen = nl ? nl - p + 1 : len;
+               strbuf_reset(&line);
+               strbuf_add(&line, p, linelen);
+               p += linelen;
+               len -= linelen;
 
                if (skip_prefix(line.buf, "# pack-refs with:", &traits)) {
                        if (strstr(traits, " fully-peeled "))
@@ -1204,6 +1223,7 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir)
        }
 
        strbuf_release(&line);
+       munmap(map, mapsz);
 }
 
 /*
@@ -1224,16 +1244,16 @@ static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs)
                clear_packed_ref_cache(refs);
 
        if (!refs->packed) {
-               FILE *f;
+               int fd;
 
                refs->packed = xcalloc(1, sizeof(*refs->packed));
                acquire_packed_ref_cache(refs->packed);
                refs->packed->root = create_dir_entry(refs, "", 0, 0);
-               f = fopen(packed_refs_file, "r");
-               if (f) {
-                       stat_validity_update(&refs->packed->validity, fileno(f));
-                       read_packed_refs(f, get_ref_dir(refs->packed->root));
-                       fclose(f);
+               fd = open(packed_refs_file, O_RDONLY);
+               if (fd >= 0) {
+                       stat_validity_update(&refs->packed->validity, fd);
+                       read_packed_refs(fd, get_ref_dir(refs->packed->root));
+                       close(fd);
                }
        }
        return refs->packed;
-- 
2.3.5
