Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-05 Thread Jeff King
On Wed, Dec 05, 2018 at 07:41:44PM +0100, René Scharfe wrote:

> > If we start to convert those, there's a
> > little bit of a rabbit hole, but it's actually not too bad.
> 
> You don't need to crawl in just for quick_has_loose(), but eventually
> everything has to be converted.  It seems a bit much for one patch, but
> perhaps that's just my ever-decreasing attention span speaking.

Yeah, my normal process here is to dig to the bottom of the rabbit hole,
and then break it into actual patches. I just shared the middle state
here. ;)

I suspect the http bits could be split off into their own thing. The
bits in sha1-file.c I'd plan to mostly do all together, as they are
a family of related functions.

Mostly I wasn't sure how to wrap this up with the other changes. It's
obviously pretty invasive, and I don't want it to get in the way of
actual functional changes we've been discussing.

> > @@ -879,15 +879,15 @@ int git_open_cloexec(const char *name, int flags)
> >   * Note that it may point to static storage and is only valid until another
> >   * call to stat_sha1_file().
> >   */
> > -static int stat_sha1_file(struct repository *r, const unsigned char *sha1,
> > - struct stat *st, const char **path)
> > +static int stat_loose_object(struct repository *r, const struct object_id *oid,
> > +struct stat *st, const char **path)
> 
> Hmm, read_sha1_file() was renamed to read_object_file() earlier this
> year, and I'd have expected this to become stat_object_file().  Names..

read_object_file() is about reading an object from _any_ source. These
are specifically about loose objects, and I think that distinction is
important (both here and for map_loose_object, etc).

I'd actually argue that read_object_file() should just be read_object(),
but that already exists. Sigh. (I think it's fixable, but obviously
orthogonal to this topic).

> Anyway, the comment above and one a few lines below should be updated
> as well.

Thanks, fixed.

> >  static int open_sha1_file(struct repository *r,
> > - const unsigned char *sha1, const char **path)
> > + const struct object_id *oid, const char **path)
> 
> That function should lose the "sha1" in its name as well.

Yep, fixed.

> > -static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long size, const unsigned char *sha1)
> > +static void *unpack_loose_rest(git_zstream *stream,
> > +  void *buffer, unsigned long size,
> > +  const struct object_id *oid)
> 
> Hmm, both old and new name here look weird to me at this point.

It makes more sense in the pairing of unpack_sha1_header() and
unpack_sha1_rest(). Maybe "body" would be better than "rest".
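
(For reference, a heavily simplified sketch of how the pair is used on the
loose read path -- names here are a mix of old and new, since the series
renames some of them, and error handling is elided:

	unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr));
	type = parse_sha1_header(hdr, &size);
	/* the "rest"/"body" is everything after the "<type> <size>\0" header */
	buf = unpack_loose_rest(&stream, hdr, size, oid);

so whatever we call the second one probably wants to stay parallel with the
header function.)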

At any rate, it probably makes sense to rename them together (but I
didn't touch the "header" one here). Maybe the name changes should come
as a separate patch. I was mostly changing them here because I was
changing the signatures anyway, and had to touch all of the callers.

-Peff


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-05 Thread René Scharfe
On 05.12.2018 at 09:15, Jeff King wrote:
> On Wed, Dec 05, 2018 at 01:51:36AM -0500, Jeff King wrote:
> 
>>> This
>>> function is easily converted to struct object_id, though, as its single
>>> caller can pass one on -- this makes the copy unnecessary.
>>
>> If you mean modifying sha1_loose_object_info() to take an oid, then
>> sure, I agree that is a good step forward (and that is exactly the "punt
>> until" moment I meant).
> 
> So the simple thing is to do that, and then have it pass oid->hash to
> the other functions it uses.

Yes.

> If we start to convert those, there's a
> little bit of a rabbit hole, but it's actually not too bad.

You don't need to crawl in just for quick_has_loose(), but eventually
everything has to be converted.  It seems a bit much for one patch, but
perhaps that's just my ever-decreasing attention span speaking.

Converting one function prototype or struct member at a time seems
about the right amount of change per patch to me.  That's not always
possible due to entanglement, of course.

> Most of the spill-over is into the dumb-http code. Note that it actually
> uses sha1 itself! That probably needs to be the_hash_algo (though I'm
> not even sure how we'd negotiate the algorithm across a dumb fetch). At
> any rate, I don't think this patch makes anything _worse_ in that
> respect.

Right.

> diff --git a/sha1-file.c b/sha1-file.c
> index 3ddf4c9426..0705709036 100644
> --- a/sha1-file.c
> +++ b/sha1-file.c
> @@ -333,12 +333,12 @@ int raceproof_create_file(const char *path, 
> create_file_fn fn, void *cb)
>   return ret;
>  }
>  
> -static void fill_sha1_path(struct strbuf *buf, const unsigned char *sha1)
> +static void fill_loose_path(struct strbuf *buf, const struct object_id *oid)

The new name fits.

> @@ -879,15 +879,15 @@ int git_open_cloexec(const char *name, int flags)
>   * Note that it may point to static storage and is only valid until another
>   * call to stat_sha1_file().
>   */
> -static int stat_sha1_file(struct repository *r, const unsigned char *sha1,
> -   struct stat *st, const char **path)
> +static int stat_loose_object(struct repository *r, const struct object_id *oid,
> +  struct stat *st, const char **path)

Hmm, read_sha1_file() was renamed to read_object_file() earlier this
year, and I'd have expected this to become stat_object_file().  Names..

Anyway, the comment above and one a few lines below should be updated
as well.

>  {
>   struct object_directory *odb;
>   static struct strbuf buf = STRBUF_INIT;
>  
>   prepare_alt_odb(r);
>   for (odb = r->objects->odb; odb; odb = odb->next) {
> - *path = odb_loose_path(odb, &buf, sha1);
> + *path = odb_loose_path(odb, &buf, oid);
>   if (!lstat(*path, st))
>   return 0;
>   }
> @@ -900,7 +900,7 @@ static int stat_sha1_file(struct repository *r, const 
> unsigned char *sha1,
>   * descriptor. See the caveats on the "path" parameter above.
>   */
>  static int open_sha1_file(struct repository *r,
> -   const unsigned char *sha1, const char **path)
> +   const struct object_id *oid, const char **path)

That function should lose the "sha1" in its name as well.

> -static void *map_sha1_file_1(struct repository *r, const char *path,
> -  const unsigned char *sha1, unsigned long *size)
> +static void *map_loose_object_1(struct repository *r, const char *path,
> +  const struct object_id *oid, unsigned long *size)

Similarly, map_object_file_1()?

> -void *map_sha1_file(struct repository *r,
> - const unsigned char *sha1, unsigned long *size)
> +void *map_loose_object(struct repository *r,
> +const struct object_id *oid,
> +unsigned long *size)

Similar.

> @@ -1045,7 +1043,9 @@ static int unpack_sha1_header_to_strbuf(git_zstream 
> *stream, unsigned char *map,
>   return -1;
>  }
>  
> -static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long size, const unsigned char *sha1)
> +static void *unpack_loose_rest(git_zstream *stream,
> +void *buffer, unsigned long size,
> +const struct object_id *oid)

Hmm, both old and new name here look weird to me at this point.

> -static int sha1_loose_object_info(struct repository *r,
> -   const unsigned char *sha1,
> -   struct object_info *oi, int flags)
> +static int loose_object_info(struct repository *r,
> +  const struct object_id *oid,
> +  struct object_info *oi, int flags)

And nothing of value was lost. :)

René


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-05 Thread Jeff King
On Wed, Dec 05, 2018 at 01:51:36AM -0500, Jeff King wrote:

> > This
> > function is easily converted to struct object_id, though, as its single
> > caller can pass one on -- this makes the copy unnecessary.
> 
> If you mean modifying sha1_loose_object_info() to take an oid, then
> sure, I agree that is a good step forward (and that is exactly the "punt
> until" moment I meant).

So the simple thing is to do that, and then have it pass oid->hash to
the other functions it uses. If we start to convert those, there's a
little bit of a rabbit hole, but it's actually not too bad.

Most of the spill-over is into the dumb-http code. Note that it actually
uses sha1 itself! That probably needs to be the_hash_algo (though I'm
not even sure how we'd negotiate the algorithm across a dumb fetch). At
any rate, I don't think this patch makes anything _worse_ in that
respect.
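
(If someone does pick that up, I'd expect it to look roughly like using the
hash-algo abstraction instead of the hard-coded SHA-1 calls -- just a sketch,
not part of this series, and it would also mean switching the context field
from git_SHA_CTX to git_hash_ctx:

	the_hash_algo->init_fn(&freq->c);
	/* ... the_hash_algo->update_fn() on each inflated chunk ... */
	the_hash_algo->final_fn(freq->real_oid.hash, &freq->c);

plus whatever protocol work is needed to actually agree on the algorithm.)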

diff --git a/http-push.c b/http-push.c
index cd48590912..0141b0ad53 100644
--- a/http-push.c
+++ b/http-push.c
@@ -255,7 +255,7 @@ static void start_fetch_loose(struct transfer_request 
*request)
struct active_request_slot *slot;
struct http_object_request *obj_req;
 
-   obj_req = new_http_object_request(repo->url, request->obj->oid.hash);
+   obj_req = new_http_object_request(repo->url, &request->obj->oid);
if (obj_req == NULL) {
request->state = ABORTED;
return;
diff --git a/http-walker.c b/http-walker.c
index 0a392c85b6..29b59e2fe0 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -58,7 +58,7 @@ static void start_object_request(struct walker *walker,
struct active_request_slot *slot;
struct http_object_request *req;
 
-   req = new_http_object_request(obj_req->repo->base, obj_req->oid.hash);
+   req = new_http_object_request(obj_req->repo->base, &obj_req->oid);
if (req == NULL) {
obj_req->state = ABORTED;
return;
@@ -543,11 +543,11 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
} else if (req->zret != Z_STREAM_END) {
walker->corrupt_object_found++;
ret = error("File %s (%s) corrupt", hex, req->url);
-   } else if (!hasheq(obj_req->oid.hash, req->real_sha1)) {
+   } else if (!oideq(&obj_req->oid, &req->real_oid)) {
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;
-   loose_object_path(the_repository, &buf, req->sha1);
+   loose_object_path(the_repository, &buf, &req->oid);
ret = error("unable to write sha1 filename %s", buf.buf);
strbuf_release(&buf);
}
diff --git a/http.c b/http.c
index 7cfa7a16e0..e95b5b9be0 100644
--- a/http.c
+++ b/http.c
@@ -2298,9 +2298,9 @@ static size_t fwrite_sha1_file(char *ptr, size_t eltsize, 
size_t nmemb,
 }
 
 struct http_object_request *new_http_object_request(const char *base_url,
-   unsigned char *sha1)
+   const struct object_id *oid)
 {
-   char *hex = sha1_to_hex(sha1);
+   char *hex = oid_to_hex(oid);
struct strbuf filename = STRBUF_INIT;
struct strbuf prevfile = STRBUF_INIT;
int prevlocal;
@@ -2311,10 +2311,10 @@ struct http_object_request 
*new_http_object_request(const char *base_url,
 
freq = xcalloc(1, sizeof(*freq));
strbuf_init(&freq->tmpfile, 0);
-   hashcpy(freq->sha1, sha1);
+   oidcpy(&freq->oid, oid);
freq->localfile = -1;
 
-   loose_object_path(the_repository, &filename, sha1);
+   loose_object_path(the_repository, &filename, oid);
strbuf_addf(&freq->tmpfile, "%s.temp", filename.buf);

strbuf_addf(&prevfile, "%s.prev", filename.buf);
@@ -2456,16 +2456,16 @@ int finish_http_object_request(struct 
http_object_request *freq)
}
 
git_inflate_end(&freq->stream);
-   git_SHA1_Final(freq->real_sha1, &freq->c);
+   git_SHA1_Final(freq->real_oid.hash, &freq->c);
if (freq->zret != Z_STREAM_END) {
unlink_or_warn(freq->tmpfile.buf);
return -1;
}
-   if (!hasheq(freq->sha1, freq->real_sha1)) {
+   if (!oideq(&freq->oid, &freq->real_oid)) {
unlink_or_warn(freq->tmpfile.buf);
return -1;
}
-   loose_object_path(the_repository, &filename, freq->sha1);
+   loose_object_path(the_repository, &filename, &freq->oid);
freq->rename = finalize_object_file(freq->tmpfile.buf, filename.buf);
strbuf_release(&filename);
 
diff --git a/http.h b/http.h
index d305ca1dc7..66c52b2e1e 100644
--- a/http.h
+++ b/http.h
@@ -224,8 +224,8 @@ struct http_object_request {
CURLcode curl_result;
char errorstr[CURL_ERROR_SIZE];
long http_code;
-   unsigned char sha1[20];
-   unsigned char real_sha1[20];
+   struct object_id oid;
+   struct object_id real_oid;
git_SHA_CTX c;
git_zstream stream;
int zret;
@@ -234,7 +234,7 @@ struct http_object_request {
 };
 
 extern struct 

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-04 Thread Jeff King
On Wed, Dec 05, 2018 at 07:02:17AM +0100, René Scharfe wrote:

> > I actually wrote it that way initially, but doing the hashcpy() in the
> > caller is a bit more awkward. My thought was to punt on that until the
> > rest of the surrounding code starts handling oids.
> 
> The level of awkwardness is the same to me, but sha1_loose_object_info()
> is much longer already, so it might feel worse to add it there. 

Right, what I meant was that in sha1_loose_object_info():

  if (special_case)
handle_special_case();

is easier to follow than a block setting up the special case and then
calling the function.

> This
> function is easily converted to struct object_id, though, as its single
> caller can pass one on -- this makes the copy unnecessary.

If you mean modifying sha1_loose_object_info() to take an oid, then
sure, I agree that is a good step forward (and that is exactly the "punt
until" moment I meant).

> > This patch looks sane. How do you want to handle it? I'd assumed your
> > earlier one would be for applying on top, but this one doesn't have a
> > commit message. Did you want me to squash down the individual hunks?
> 
> I'm waiting for the first one (object-store: use one oid_array per
> subdirectory for loose cache) to be accepted, as it aims to solve a
> user-visible performance regression, i.e. that's where the meat is.
> I'm particularly interested in performance numbers from Ævar for it.
> 
> I can send the last one properly later, and add patches for converting
> quick_has_loose() to struct object_id.  Those are just cosmetic..

Great, thanks. I just wanted to make sure these improvements weren't
going to slip through the cracks.

-Peff


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-04 Thread René Scharfe
On 05.12.2018 at 05:46, Jeff King wrote:
> On Tue, Dec 04, 2018 at 10:45:13PM +0100, René Scharfe wrote:
> 
>>> The comment in the context there is warning callers to remember to load
>>> the cache first. Now that we have individual caches, might it make sense
>>> to change the interface a bit, and make these members private. I.e.,
>>> something like:
>>>
>>>   struct oid_array *odb_loose_cache(struct object_directory *odb,
>>> int subdir_nr) 
>>>   {
>>> if (!loose_objects_subdir_seen[subdir_nr])
>>> odb_load_loose_cache(odb, subdir_nr); /* or just inline it here */
>>>
>>> return &odb->loose_objects_cache[subdir_nr];
>>>   }
>>
>> Sure.  And it should take an object_id pointer instead of a subdir_nr --
>> less duplication, nicer interface (patch below).
> 
> I had considered that initially, but it's a little less flexible if a
> caller doesn't actually have an oid. Though both of the proposed callers
> do, so it's probably OK to worry about that if and when it ever comes up
> (the most plausible thing in my mind is doing some abbrev-like analysis
> without looking to abbreviate a _particular_ oid).

Right, let's focus on concrete requirements of current callers.  YAGNI..
:)

>> And quick_has_loose() should be converted to object_id as well -- adding
>> a function that takes a SHA-1 is a regression. :D
> 
> I actually wrote it that way initially, but doing the hashcpy() in the
> caller is a bit more awkward. My thought was to punt on that until the
> rest of the surrounding code starts handling oids.

The level of awkwardness is the same to me, but sha1_loose_object_info()
is much longer already, so it might feel worse to add it there.  This
function is easily converted to struct object_id, though, as its single
caller can pass one on -- this makes the copy unnecessary.

> This patch looks sane. How do you want to handle it? I'd assumed your
> earlier one would be for applying on top, but this one doesn't have a
> commit message. Did you want me to squash down the individual hunks?

I'm waiting for the first one (object-store: use one oid_array per
subdirectory for loose cache) to be accepted, as it aims to solve a
user-visible performance regression, i.e. that's where the meat is.
I'm particularly interested in performance numbers from Ævar for it.

I can send the last one properly later, and add patches for converting
quick_has_loose() to struct object_id.  Those are just cosmetic..

René


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-04 Thread Jeff King
On Tue, Dec 04, 2018 at 10:45:13PM +0100, René Scharfe wrote:

> > The comment in the context there is warning callers to remember to load
> > the cache first. Now that we have individual caches, might it make sense
> > to change the interface a bit, and make these members private. I.e.,
> > something like:
> > 
> >   struct oid_array *odb_loose_cache(struct object_directory *odb,
> > int subdir_nr) 
> >   {
> > if (!loose_objects_subdir_seen[subdir_nr])
> > odb_load_loose_cache(odb, subdir_nr); /* or just inline it here */
> > 
> > return &odb->loose_objects_cache[subdir_nr];
> >   }
> 
> Sure.  And it should take an object_id pointer instead of a subdir_nr --
> less duplication, nicer interface (patch below).

I had considered that initially, but it's a little less flexible if a
caller doesn't actually have an oid. Though both of the proposed callers
do, so it's probably OK to worry about that if and when it ever comes up
(the most plausible thing in my mind is doing some abbrev-like analysis
without looking to abbreviate a _particular_ oid).

> It would be nice if it could return a const pointer to discourage
> messing up the cache, but that's not compatible with oid_array_lookup().

Yeah.

> And quick_has_loose() should be converted to object_id as well -- adding
> a function that takes a SHA-1 is a regression. :D

I actually wrote it that way initially, but doing the hashcpy() in the
caller is a bit more awkward. My thought was to punt on that until the
rest of the surrounding code starts handling oids.

> ---
>  object-store.h |  8 
>  sha1-file.c| 19 ---
>  sha1-name.c|  4 +---
>  3 files changed, 13 insertions(+), 18 deletions(-)

This patch looks sane. How do you want to handle it? I'd assumed your
earlier one would be for applying on top, but this one doesn't have a
commit message. Did you want me to squash down the individual hunks?

-Peff


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-04 Thread René Scharfe
On 03.12.2018 at 23:04, Jeff King wrote:
> On Sun, Dec 02, 2018 at 11:52:50AM +0100, René Scharfe wrote:
> 
>>> And for mu.git, a ~20k object repo:
>>>
>>> Test origin/master 
>>> peff/jk/loose-cache   avar/check-collisions-config
>>> 
>>> -
>>> 0008.2: index-pack with 256*1 loose objects  0.59(0.91+0.06)   
>>> 0.58(0.93+0.03) -1.7% 0.57(0.89+0.04) -3.4%
>>> 0008.3: index-pack with 256*10 loose objects 0.59(0.91+0.07)   
>>> 0.59(0.92+0.03) +0.0% 0.57(0.89+0.03) -3.4%
>>> 0008.4: index-pack with 256*100 loose objects0.59(0.91+0.05)   
>>> 0.81(1.13+0.04) +37.3%0.58(0.91+0.04) -1.7%
>>> 0008.5: index-pack with 256*250 loose objects0.59(0.91+0.05)   
>>> 1.23(1.51+0.08) +108.5%   0.58(0.91+0.04) -1.7%
>>> 0008.6: index-pack with 256*500 loose objects0.59(0.90+0.06)   
>>> 1.96(2.20+0.12) +232.2%   0.58(0.91+0.04) -1.7%
>>> 0008.7: index-pack with 256*750 loose objects0.59(0.92+0.05)   
>>> 2.72(2.92+0.17) +361.0%   0.58(0.90+0.04) -1.7%
>>> 0008.8: index-pack with 256*1000 loose objects   0.59(0.90+0.06)   
>>> 3.50(3.67+0.21) +493.2%   0.57(0.90+0.04) -3.4%
>>
>> OK, here's another theory: The cache scales badly with increasing
>> numbers of loose objects because it sorts the array 256 times as it is
>> filled.  Loading it fully and sorting once would help, as would using
>> one array per subdirectory.
> 
> Yeah, that makes sense. This was actually how I had planned to do it
> originally, but then I ended up just reusing the existing single-array
> approach from the abbrev code.
> 
> I hadn't actually thought about the repeated sortings (but that
> definitely makes sense that they would hurt in these pathological
> cases), but more just that we get a 256x reduction in N for our binary
> search (in fact we already do this first-byte lookup-table trick for
> pack index lookups).

Skipping eight steps in a binary search is something, but it's faster
even without that.

Just realized that the demo code can use "lookup" instead of the much
more expensive "for_each_unique" to sort.  D'oh!  With that change:

  for command in '
	foreach (0..255) {
		$subdir = sprintf("%02x", $_);
		foreach (1..$ARGV[0]) {
			printf("append %s%038d\n", $subdir, $_);
		}
		# intermediate sort
		print "lookup " . "0" x 40 . "\n";
	}
' '
	foreach (0..255) {
		$subdir = sprintf("%02x", $_);
		foreach (1..$ARGV[0]) {
			printf("append %s%038d\n", $subdir, $_);
		}
	}
	# sort once at the end
	print "lookup " . "0" x 40 . "\n";
' '
	foreach (0..255) {
		$subdir = sprintf("%02x", $_);
		foreach (1..$ARGV[0]) {
			printf("append %s%038d\n", $subdir, $_);
		}
		# sort each subdirectory separately
		print "lookup " . "0" x 40 . "\n";
		print "clear\n";
	}
'
  do
	time perl -e "$command" 1000 | t/helper/test-tool sha1-array | wc -l
  done

And the results make the scale of the improvement more obvious:

  256

  real    0m3.476s
  user    0m3.466s
  sys     0m0.099s
  1

  real    0m0.157s
  user    0m0.148s
  sys     0m0.046s
  256

  real    0m0.117s
  user    0m0.116s
  sys     0m0.051s

> Your patch looks good to me. We may want to do one thing on top:
> 
>> diff --git a/object-store.h b/object-store.h
>> index 8dceed0f31..ee67a50980 100644
>> --- a/object-store.h
>> +++ b/object-store.h
>> @@ -20,7 +20,7 @@ struct object_directory {
>>   * Be sure to call odb_load_loose_cache() before using.
>>   */
>>  char loose_objects_subdir_seen[256];
>> -struct oid_array loose_objects_cache;
>> +struct oid_array loose_objects_cache[256];
> 
> The comment in the context there is warning callers to remember to load
> the cache first. Now that we have individual caches, might it make sense
> to change the interface a bit, and make these members private. I.e.,
> something like:
> 
>   struct oid_array *odb_loose_cache(struct object_directory *odb,
> int subdir_nr) 
>   {
>   if (!loose_objects_subdir_seen[subdir_nr])
>   odb_load_loose_cache(odb, subdir_nr); /* or just inline it here */
> 
>   return &odb->loose_objects_cache[subdir_nr];
>   }

Sure.  And it should take an object_id pointer instead of a subdir_nr --
less duplication, nicer interface (patch below).

It would be nice if it could return a const pointer to discourage
messing up the cache, but that's not compatible with oid_array_lookup().
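
(The lookup may have to sort the lazily-appended array in place before it can
do its binary search, which is why it takes a non-const pointer -- its
signature in sha1-array.h is:

	int oid_array_lookup(struct oid_array *array, const struct object_id *oid);

so a const accessor would force a cast somewhere.)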

And quick_has_loose() should be converted to object_id as well -- adding
a function that takes a SHA-1 is a regression. :D

René

---
 object-store.h |  8 
 sha1-file.c| 19 ---
 sha1-name.c|  4 +---
 3 files changed, 13 insertions(+), 18 deletions(-)


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-03 Thread Jeff King
On Sun, Dec 02, 2018 at 11:52:50AM +0100, René Scharfe wrote:

> > And for mu.git, a ~20k object repo:
> > 
> > Test origin/master 
> > peff/jk/loose-cache   avar/check-collisions-config
> > 
> > -
> > 0008.2: index-pack with 256*1 loose objects  0.59(0.91+0.06)   
> > 0.58(0.93+0.03) -1.7% 0.57(0.89+0.04) -3.4%
> > 0008.3: index-pack with 256*10 loose objects 0.59(0.91+0.07)   
> > 0.59(0.92+0.03) +0.0% 0.57(0.89+0.03) -3.4%
> > 0008.4: index-pack with 256*100 loose objects0.59(0.91+0.05)   
> > 0.81(1.13+0.04) +37.3%0.58(0.91+0.04) -1.7%
> > 0008.5: index-pack with 256*250 loose objects0.59(0.91+0.05)   
> > 1.23(1.51+0.08) +108.5%   0.58(0.91+0.04) -1.7%
> > 0008.6: index-pack with 256*500 loose objects0.59(0.90+0.06)   
> > 1.96(2.20+0.12) +232.2%   0.58(0.91+0.04) -1.7%
> > 0008.7: index-pack with 256*750 loose objects0.59(0.92+0.05)   
> > 2.72(2.92+0.17) +361.0%   0.58(0.90+0.04) -1.7%
> > 0008.8: index-pack with 256*1000 loose objects   0.59(0.90+0.06)   
> > 3.50(3.67+0.21) +493.2%   0.57(0.90+0.04) -3.4%
> 
> OK, here's another theory: The cache scales badly with increasing
> numbers of loose objects because it sorts the array 256 times as it is
> filled.  Loading it fully and sorting once would help, as would using
> one array per subdirectory.

Yeah, that makes sense. This was actually how I had planned to do it
originally, but then I ended up just reusing the existing single-array
approach from the abbrev code.

I hadn't actually thought about the repeated sortings (but that
definitely makes sense that they would hurt in these pathological
cases), but more just that we get a 256x reduction in N for our binary
search (in fact we already do this first-byte lookup-table trick for
pack index lookups).
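
(Back-of-the-envelope: with n entries per subdirectory, the k-th fill of a
single shared array means re-sorting k*n entries on the next lookup, so the
total sorting work grows roughly quadratically in the number of subdirectories
loaded; 256 per-subdirectory arrays cost only 256 independent sorts of n
entries each, on top of the 256x smaller binary searches.)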

Your patch looks good to me. We may want to do one thing on top:

> diff --git a/object-store.h b/object-store.h
> index 8dceed0f31..ee67a50980 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -20,7 +20,7 @@ struct object_directory {
>* Be sure to call odb_load_loose_cache() before using.
>*/
>   char loose_objects_subdir_seen[256];
> - struct oid_array loose_objects_cache;
> + struct oid_array loose_objects_cache[256];

The comment in the context there is warning callers to remember to load
the cache first. Now that we have individual caches, might it make sense
to change the interface a bit, and make these members private. I.e.,
something like:

  struct oid_array *odb_loose_cache(struct object_directory *odb,
int subdir_nr) 
  {
if (!loose_objects_subdir_seen[subdir_nr])
odb_load_loose_cache(odb, subdir_nr); /* or just inline it here */

return &odb->loose_objects_cache[subdir_nr];
  }

That's harder to get wrong, and this:

> diff --git a/sha1-file.c b/sha1-file.c
> index 05f63dfd4e..d2f5e65865 100644
> --- a/sha1-file.c
> +++ b/sha1-file.c
> @@ -933,7 +933,8 @@ static int quick_has_loose(struct repository *r,
>   prepare_alt_odb(r);
>   for (odb = r->objects->odb; odb; odb = odb->next) {
>   odb_load_loose_cache(odb, subdir_nr);
> - if (oid_array_lookup(&odb->loose_objects_cache, &oid) >= 0)
> + if (oid_array_lookup(&odb->loose_objects_cache[subdir_nr],
> +  &oid) >= 0)
>   return 1;
>   }

becomes:

  struct oid_array *cache = odb_loose_cache(odb, subdir_nr);
  if (oid_array_lookup(cache, &oid) >= 0)
return 1;

(An even simpler interface would be a single function that computes
subdir_nr and does the lookup itself, but that would not be enough for
find_short_object_filename()).
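
(I.e., something like this, purely for illustration -- the name and shape here
are made up, not something the series adds:

	static int odb_loose_cache_contains(struct object_directory *odb,
					    const struct object_id *oid)
	{
		struct oid_array *cache = odb_loose_cache(odb, oid->hash[0]);
		return oid_array_lookup(cache, oid) >= 0;
	}

but the abbreviation code needs the array itself for prefix matching, not just
a yes/no answer.)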

-Peff


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-02 Thread René Scharfe
On 13.11.2018 at 11:02, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Nov 12 2018, Ævar Arnfjörð Bjarmason wrote:
> 
>> On Mon, Nov 12 2018, Ævar Arnfjörð Bjarmason wrote:
>>
>>> I get:
>>>
>>> Test origin/master 
>>> peff/jk/loose-cache  avar/check-collisions-config
>>> 
>>> 
>>> 0008.2: index-pack with 256*1 loose objects  4.31(0.55+0.18)   
>>> 0.41(0.40+0.02) -90.5%   0.23(0.36+0.01) -94.7%
>>> 0008.3: index-pack with 256*10 loose objects 4.37(0.45+0.21)   
>>> 0.45(0.40+0.02) -89.7%   0.25(0.38+0.01) -94.3%
>>> 0008.4: index-pack with 256*100 loose objects4.47(0.53+0.23)   
>>> 0.67(0.63+0.02) -85.0%   0.24(0.38+0.01) -94.6%
>>> 0008.5: index-pack with 256*250 loose objects5.01(0.67+0.30)   
>>> 1.04(0.98+0.06) -79.2%   0.24(0.37+0.01) -95.2%
>>> 0008.6: index-pack with 256*500 loose objects5.11(0.57+0.21)   
>>> 1.81(1.70+0.09) -64.6%   0.25(0.38+0.01) -95.1%
>>> 0008.7: index-pack with 256*750 loose objects5.12(0.60+0.22)   
>>> 2.54(2.38+0.14) -50.4%   0.24(0.38+0.01) -95.3%
>>> 0008.8: index-pack with 256*1000 loose objects   4.52(0.52+0.21)   
>>> 3.36(3.17+0.17) -25.7%   0.23(0.36+0.01) -94.9%
>>>
>>> I then hacked it to test against git.git, but skipped origin/master for
>>> that one because it takes *ages*. So just mine v.s. yours:
>>>
>>> $ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run 
>>> peff/jk/loose-cache avar/check-collisions-config p0008-index-pack.sh
>>> [...]
>>> Test peff/jk/loose-cache   
>>> avar/check-collisions-config
>>> 
>>> ---
>>> 0008.2: index-pack with 256*1 loose objects  12.57(28.72+0.61) 
>>> 12.68(29.36+0.62) +0.9%
>>> 0008.3: index-pack with 256*10 loose objects 12.77(28.75+0.61) 
>>> 12.50(28.88+0.56) -2.1%
>>> 0008.4: index-pack with 256*100 loose objects13.20(29.49+0.66) 
>>> 12.38(28.58+0.60) -6.2%
>>> 0008.5: index-pack with 256*250 loose objects14.10(30.59+0.64) 
>>> 12.54(28.22+0.57) -11.1%
>>> 0008.6: index-pack with 256*500 loose objects14.48(31.06+0.74) 
>>> 12.43(28.59+0.60) -14.2%
>>> 0008.7: index-pack with 256*750 loose objects15.31(31.91+0.74) 
>>> 12.67(29.23+0.64) -17.2%
>>> 0008.8: index-pack with 256*1000 loose objects   16.34(32.84+0.76) 
>>> 13.11(30.19+0.68) -19.8%
>>>
>>> So not much of a practical difference perhaps. But then again this isn't
>>> a very realistic test case of anything. Rarely are you going to push a
>>> history of something the size of git.git into a repo with this many
>>> loose objects.
>>>
>>> Using sha1collisiondetection.git is I think the most realistic scenario,
>>> i.e. you'll often end up fetching/pushing something roughly the size of
>>> its entire history on a big repo, and with it:
>>>
>>> Test peff/jk/loose-cache   
>>> avar/check-collisions-config
>>> 
>>> ---
>>> 0008.2: index-pack with 256*1 loose objects  0.16(0.04+0.01)   
>>> 0.05(0.03+0.00) -68.8%
>>> 0008.3: index-pack with 256*10 loose objects 0.19(0.04+0.02)   
>>> 0.05(0.02+0.00) -73.7%
>>> 0008.4: index-pack with 256*100 loose objects0.32(0.17+0.02)   
>>> 0.04(0.02+0.00) -87.5%
>>> 0008.5: index-pack with 256*250 loose objects0.57(0.41+0.03)   
>>> 0.04(0.02+0.00) -93.0%
>>> 0008.6: index-pack with 256*500 loose objects1.02(0.83+0.06)   
>>> 0.04(0.03+0.00) -96.1%
>>> 0008.7: index-pack with 256*750 loose objects1.47(1.24+0.10)   
>>> 0.04(0.02+0.00) -97.3%
>>> 0008.8: index-pack with 256*1000 loose objects   1.94(1.70+0.10)   
>>> 0.04(0.02+0.00) -97.9%
>>>
>>> As noted in previous threads I have an in-house monorepo where (due to
>>> expiry policies) loose objects hover around the 256*250 mark.
>>>
>>> The script, which is hacky as hell and takes shortcuts not to re-create
>>> the huge fake loose object collection every time (takes ages). Perhaps
>>> you're interested in incorporating some version of this into a v2. To be
>>> useful it should take some target path as an env variable.
>>
>> I forgot perhaps the most useful metric. Testing against origin/master
>> too on the sha1collisiondetection.git repo, which as noted above I think
>> is a good stand-in for making a medium sized push to a big repo. This
>> shows when the loose cache becomes counterproductive:
>>
>> Test origin/master 
>> peff/jk/loose-cache   avar/check-collisions-config
>> 
>> 

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-12-01 Thread Jeff King
On Tue, Nov 27, 2018 at 09:48:57PM +0100, René Scharfe wrote:

> > +static int quick_has_loose(struct repository *r,
> > +  const unsigned char *sha1)
> > +{
> > +   int subdir_nr = sha1[0];
> > +   struct object_id oid;
> > +   struct object_directory *odb;
> > +
> > +   hashcpy(oid.hash, sha1);
> > +
> > +   prepare_alt_odb(r);
> > +   for (odb = r->objects->odb; odb; odb = odb->next) {
> > +   odb_load_loose_cache(odb, subdir_nr);
> 
> Is this thread-safe?  What happens if e.g. one index-pack thread resizes
> the array while another one sorts it?

No, but neither is any of the object_info / has_object_file path, which
may use static function-local buffers, or (before my series) alt scratch
bufs, or even call reprepare_packed_git().

In the long run, I think the solution is probably going to be pushing
some mutexes into the right places, and putting one around the cache
fill is an obvious place.
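
(A minimal sketch of what I mean, only guarding the cache fill and leaving the
rest of the lookup path as-is; real git code would spell this with its usual
NO_PTHREADS fallbacks rather than raw pthread calls:

	static pthread_mutex_t loose_cache_mutex = PTHREAD_MUTEX_INITIALIZER;

	...
	pthread_mutex_lock(&loose_cache_mutex);
	odb_load_loose_cache(odb, subdir_nr);
	pthread_mutex_unlock(&loose_cache_mutex);
	...

The readers would still need some care, since the lookup can sort the array,
but serializing the fill is the obvious first step.)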

> Loading the cache explicitly up-front would avoid that, and improves
> performance a bit in my (very limited) tests on an SSD.  Demo patch for
> next at the bottom.  How does it do against your test cases?

It's going to do badly on corner cases where we don't need to load every
object subdirectory, and one or more of them are big. I.e., if I look up
"1234abcd", the current code only needs to fault in $GIT_DIR/objects/12.
Pre-loading means we'd hit them all. Even without a lot of objects, on
NFS that's 256 latencies instead of 1.

-Peff


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-27 Thread René Scharfe
On 12.11.2018 at 15:54, Jeff King wrote:
> diff --git a/sha1-file.c b/sha1-file.c
> index 4aae716a37..e53da0b701 100644
> --- a/sha1-file.c
> +++ b/sha1-file.c
> @@ -921,6 +921,24 @@ static int open_sha1_file(struct repository *r,
>   return -1;
>  }
>  
> +static int quick_has_loose(struct repository *r,
> +const unsigned char *sha1)
> +{
> + int subdir_nr = sha1[0];
> + struct object_id oid;
> + struct object_directory *odb;
> +
> + hashcpy(oid.hash, sha1);
> +
> + prepare_alt_odb(r);
> + for (odb = r->objects->odb; odb; odb = odb->next) {
> + odb_load_loose_cache(odb, subdir_nr);

Is this thread-safe?  What happens if e.g. one index-pack thread resizes
the array while another one sorts it?

Loading the cache explicitly up-front would avoid that, and improves
performance a bit in my (very limited) tests on an SSD.  Demo patch for
next at the bottom.  How does it do against your test cases?

> + if (oid_array_lookup(&odb->loose_objects_cache, &oid) >= 0)
> + return 1;
> + }
> + return 0;
> +}
> +
>  /*
>   * Map the loose object at "path" if it is not NULL, or the path found by
>   * searching for a loose object named "sha1".
> @@ -1171,6 +1189,8 @@ static int sha1_loose_object_info(struct repository *r,
>   if (!oi->typep && !oi->type_name && !oi->sizep && !oi->contentp) {
>   const char *path;
>   struct stat st;
> + if (!oi->disk_sizep && (flags & OBJECT_INFO_QUICK))
> + return quick_has_loose(r, sha1) ? 0 : -1;
>   if (stat_sha1_file(r, sha1, &st, &path) < 0)
>   return -1;
>   if (oi->disk_sizep)
> 

 builtin/fetch.c  |  2 ++
 builtin/index-pack.c |  2 ++
 fetch-pack.c |  2 ++
 object-store.h   |  1 +
 sha1-file.c  | 30 +++---
 5 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index e0140327aa..4b031f5da5 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -301,6 +301,8 @@ static void find_non_local_tags(const struct ref *refs,
refname_hash_init(&existing_refs);
refname_hash_init(&remote_refs);
 
+   repo_load_loose_cache(the_repository);
+
for_each_ref(add_one_refname, _refs);
for (ref = refs; ref; ref = ref->next) {
if (!starts_with(ref->name, "refs/tags/"))
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ac1f4ea9a7..7fc6321c77 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1772,6 +1772,8 @@ int cmd_index_pack(int argc, const char **argv, const 
char *prefix)
if (show_stat)
obj_stat = xcalloc(st_add(nr_objects, 1), sizeof(struct 
object_stat));
ofs_deltas = xcalloc(nr_objects, sizeof(struct ofs_delta_entry));
+   if (startup_info->have_repository)
+   repo_load_loose_cache(the_repository);
parse_pack_objects(pack_hash);
if (report_end_of_input)
write_in_full(2, "\0", 1);
diff --git a/fetch-pack.c b/fetch-pack.c
index dd6700bda9..96c4624d9e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -656,6 +656,8 @@ static void mark_complete_and_common_ref(struct 
fetch_negotiator *negotiator,
 
save_commit_buffer = 0;
 
+   repo_load_loose_cache(the_repository);
+
for (ref = *refs; ref; ref = ref->next) {
struct object *o;
 
diff --git a/object-store.h b/object-store.h
index 8dceed0f31..f98dd3c857 100644
--- a/object-store.h
+++ b/object-store.h
@@ -53,6 +53,7 @@ void add_to_alternates_memory(const char *dir);
  * from 0 to 255 inclusive).
  */
 void odb_load_loose_cache(struct object_directory *odb, int subdir_nr);
+void repo_load_loose_cache(struct repository *r);
 
 struct packed_git {
struct packed_git *next;
diff --git a/sha1-file.c b/sha1-file.c
index 05f63dfd4e..ae12f0a198 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -921,10 +921,19 @@ static int open_sha1_file(struct repository *r,
return -1;
 }
 
+static int quick_has_loose_odb(struct object_directory *odb,
+  const struct object_id *oid)
+{
+   int subdir_nr = oid->hash[0];
+
+   if (odb->loose_objects_subdir_seen[subdir_nr])
+   return oid_array_lookup(&odb->loose_objects_cache, oid) >= 0;
+   return check_and_freshen_odb(odb, oid, 0);
+}
+
 static int quick_has_loose(struct repository *r,
   const unsigned char *sha1)
 {
-   int subdir_nr = sha1[0];
struct object_id oid;
struct object_directory *odb;
 
@@ -932,8 +941,7 @@ static int quick_has_loose(struct repository *r,
 
prepare_alt_odb(r);
for (odb = r->objects->odb; odb; odb = odb->next) {
-   odb_load_loose_cache(odb, subdir_nr);
-   if (oid_array_lookup(&odb->loose_objects_cache, &oid) >= 0)
+   if (quick_has_loose_odb(odb, &oid))
return 1;
}

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-14 Thread René Scharfe
On 13.11.2018 at 11:02, Ævar Arnfjörð Bjarmason wrote:
> So here's the same test not against NFS, but the local ext4 fs (CO7;
> Linux 3.10) for sha1collisiondetection.git:
> 
> Test                                             origin/master     peff/jk/loose-cache        avar/check-collisions-config
> ---------------------------------------------------------------------------------------------------------------------------
> 0008.2: index-pack with 256*1 loose objects      0.02(0.02+0.00)   0.02(0.02+0.01) +0.0%      0.02(0.02+0.00) +0.0%
> 0008.3: index-pack with 256*10 loose objects     0.02(0.02+0.00)   0.03(0.03+0.00) +50.0%     0.02(0.02+0.00) +0.0%
> 0008.4: index-pack with 256*100 loose objects    0.02(0.02+0.00)   0.17(0.16+0.01) +750.0%    0.02(0.02+0.00) +0.0%
> 0008.5: index-pack with 256*250 loose objects    0.02(0.02+0.00)   0.43(0.40+0.03) +2050.0%   0.02(0.02+0.00) +0.0%
> 0008.6: index-pack with 256*500 loose objects    0.02(0.02+0.00)   0.88(0.80+0.09) +4300.0%   0.02(0.02+0.00) +0.0%
> 0008.7: index-pack with 256*750 loose objects    0.02(0.02+0.00)   1.35(1.27+0.09) +6650.0%   0.02(0.02+0.00) +0.0%
> 0008.8: index-pack with 256*1000 loose objects   0.02(0.02+0.00)   1.83(1.70+0.14) +9050.0%   0.02(0.02+0.00) +0.0%

Ouch.

> And for mu.git, a ~20k object repo:
> 
> Test                                             origin/master     peff/jk/loose-cache       avar/check-collisions-config
> --------------------------------------------------------------------------------------------------------------------------
> 0008.2: index-pack with 256*1 loose objects      0.59(0.91+0.06)   0.58(0.93+0.03) -1.7%     0.57(0.89+0.04) -3.4%
> 0008.3: index-pack with 256*10 loose objects     0.59(0.91+0.07)   0.59(0.92+0.03) +0.0%     0.57(0.89+0.03) -3.4%
> 0008.4: index-pack with 256*100 loose objects    0.59(0.91+0.05)   0.81(1.13+0.04) +37.3%    0.58(0.91+0.04) -1.7%
> 0008.5: index-pack with 256*250 loose objects    0.59(0.91+0.05)   1.23(1.51+0.08) +108.5%   0.58(0.91+0.04) -1.7%
> 0008.6: index-pack with 256*500 loose objects    0.59(0.90+0.06)   1.96(2.20+0.12) +232.2%   0.58(0.91+0.04) -1.7%
> 0008.7: index-pack with 256*750 loose objects    0.59(0.92+0.05)   2.72(2.92+0.17) +361.0%   0.58(0.90+0.04) -1.7%
> 0008.8: index-pack with 256*1000 loose objects   0.59(0.90+0.06)   3.50(3.67+0.21) +493.2%   0.57(0.90+0.04) -3.4%
> 
> All of which is to say that I think it definitely makes sense to re-roll
> this with a perf test, and a switch to toggle it + docs explaining the
> caveats & pointing to the perf test. It's a clear win in some scenarios,
> but a big loss in others.

Right, but can we perhaps find a way to toggle it automatically, like
the special fetch-pack cache tried to do?

So the code needs to decide between using lstat() on individual loose
objects files or opendir()+readdir() to load the names in a whole
fan-out directory.  Intuitively I'd try to solve it using red tape, by
measuring the duration of both kinds of calls, and then try to find a
heuristic based on those numbers.  Is the overhead worth it?
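
(Purely hypothetical illustration of that red tape, using getnanotime();
sample_path, expected_lookups and use_cache are made-up names:

	struct stat st;
	uint64_t t0 = getnanotime();
	lstat(sample_path, &st);
	uint64_t lstat_ns = getnanotime() - t0;

	t0 = getnanotime();
	odb_load_loose_cache(odb, subdir_nr);	/* one opendir()+readdir() pass */
	uint64_t readdir_ns = getnanotime() - t0;

	int use_cache = readdir_ns < expected_lookups * lstat_ns;

The catch is that we rarely know expected_lookups up front, which is exactly
the overhead question.)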

But first, may I interest you in some further complication?  We can
also use access(2) to check for the existence of files.  It doesn't
need to fill in struct stat, so may have a slight advantage if we
don't need any of that information.  The following patch is a
replacement for patch 8 and improves performance by ca. 3% with
git.git on an SSD for me; I'm curious to see how it does on NFS:

---
 sha1-file.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/sha1-file.c b/sha1-file.c
index b77dacedc7..5315c37cbc 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -888,8 +888,13 @@ static int stat_sha1_file(struct repository *r, const 
unsigned char *sha1,
prepare_alt_odb(r);
for (odb = r->objects->odb; odb; odb = odb->next) {
*path = odb_loose_path(odb, &buf, sha1);
-   if (!lstat(*path, st))
-   return 0;
+   if (st) {
+   if (!lstat(*path, st))
+   return 0;
+   } else {
+   if (!access(*path, F_OK))
+   return 0;
+   }
}
 
return -1;
@@ -1171,7 +1176,8 @@ static int sha1_loose_object_info(struct repository *r,
if (!oi->typep && !oi->type_name && !oi->sizep && !oi->contentp) {
const char *path;
struct stat st;
-   if (stat_sha1_file(r, sha1, &st, &path) < 0)
+   struct stat *stp = oi->disk_sizep ? &st : NULL;
+   if (stat_sha1_file(r, sha1, stp, &path) < 0)
return -1;
if (oi->disk_sizep)
*oi->disk_sizep = st.st_size;
@@ -1382,7 +1388,6 @@ void *read_object_file_extended(const struct object_id 
*oid,
void 

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-13 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Nov 12 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> On Mon, Nov 12 2018, Jeff King wrote:
>>
>>> On Mon, Nov 12, 2018 at 05:01:02PM +0100, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>> > There's some obvious hand-waving in the paragraphs above. I would love
>>>> > it if somebody with an NFS system could do some before/after timings
>>>> > with various numbers of loose objects, to get a sense of where the
>>>> > breakeven point is.
>>>> >
>>>> > My gut is that we do not need the complexity of a cache-size limit, nor
>>>> > of a config option to disable this. But it would be nice to have a real
>>>> > number where "reasonable" ends and "pathological" begins. :)
>>>>
>>>> I'm happy to test this on some of the NFS we have locally, and started
>>>> out with a plan to write some for-loop using the low-level API (so it
>>>> would look up all 256), fake populate .git/objects/?? with N number of
>>>> objects etc, but ran out of time.
>>>>
>>>> Do you have something ready that you think would be representative and I
>>>> could just run? If not I'll try to pick this up again...
>>>
>>> No, but they don't even really need to be actual objects. So I suspect
>>> something like:
>>>
>>>   git init
>>>   for i in $(seq 256); do
>>> i=$(printf %02x $i)
>>> mkdir -p .git/objects/$i
>>> for j in $(seq --format=%038g 1000); do
>>>   echo foo >.git/objects/$i/$j
>>> done
>>>   done
>>>   git index-pack -v --stdin >>
>>> might work (for various values of 1000). The shell loop would probably
>>> be faster as perl, too. :)
>>>
>>> Make sure you clear the object directory between runs, though (otherwise
>>> the subsequent index-pack's really do find collisions and spend time
>>> accessing the objects).
>>>
>>> If you want real objects, you could probably just dump a bunch of
>>> sequential blobs to fast-import, and then pipe the result to
>>> unpack-objects.
>>>
>>> -Peff
>>
>> I did a very ad-hoc test against a NetApp filer using the test script
>> quoted at the end of this E-Mail. The test compared origin/master, this
>> branch of yours, and my core.checkCollisions=false branch.
>>
>> When run with DBD-mysql.git (just some random ~1k commit repo I had):
>>
>> $ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run 
>> origin/master peff/jk/loose-cache avar/check-collisions-config 
>> p0008-index-pack.sh
>>
>> I get:
>>
>> Test origin/master 
>> peff/jk/loose-cache  avar/check-collisions-config
>> 
>> 
>> 0008.2: index-pack with 256*1 loose objects  4.31(0.55+0.18)   
>> 0.41(0.40+0.02) -90.5%   0.23(0.36+0.01) -94.7%
>> 0008.3: index-pack with 256*10 loose objects 4.37(0.45+0.21)   
>> 0.45(0.40+0.02) -89.7%   0.25(0.38+0.01) -94.3%
>> 0008.4: index-pack with 256*100 loose objects4.47(0.53+0.23)   
>> 0.67(0.63+0.02) -85.0%   0.24(0.38+0.01) -94.6%
>> 0008.5: index-pack with 256*250 loose objects5.01(0.67+0.30)   
>> 1.04(0.98+0.06) -79.2%   0.24(0.37+0.01) -95.2%
>> 0008.6: index-pack with 256*500 loose objects5.11(0.57+0.21)   
>> 1.81(1.70+0.09) -64.6%   0.25(0.38+0.01) -95.1%
>> 0008.7: index-pack with 256*750 loose objects5.12(0.60+0.22)   
>> 2.54(2.38+0.14) -50.4%   0.24(0.38+0.01) -95.3%
>> 0008.8: index-pack with 256*1000 loose objects   4.52(0.52+0.21)   
>> 3.36(3.17+0.17) -25.7%   0.23(0.36+0.01) -94.9%
>>
>> I then hacked it to test against git.git, but skipped origin/master for
>> that one because it takes *ages*. So just mine v.s. yours:
>>
>> $ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run 
>> peff/jk/loose-cache avar/check-collisions-config p0008-index-pack.sh
>> [...]
>> Test peff/jk/loose-cache   
>> avar/check-collisions-config
>> 
>> ---
>> 0008.2: index-pack with 256*1 loose objects  12.57(28.72+0.61) 
>> 12.68(29.36+0.62) +0.9%
>> 0008.3: index-pack with 256*10 loose objects 12.77(28.75+0.61) 
>> 12.50(28.88+0.56) -2.1%
>> 0008.4: index-pack with 256*100 loose objects13.20(29.49+0.66) 
>> 12.38(28.58+0.60) -6.2%
>> 0008.5: index-pack with 256*250 loose objects14.10(30.59+0.64) 
>> 12.54(28.22+0.57) -11.1%
>> 0008.6: index-pack with 256*500 loose objects14.48(31.06+0.74) 
>> 12.43(28.59+0.60) -14.2%
>> 0008.7: index-pack with 256*750 loose objects15.31(31.91+0.74) 
>> 12.67(29.23+0.64) -17.2%
>> 0008.8: index-pack with 256*1000 loose objects   16.34(32.84+0.76) 
>> 13.11(30.19+0.68) -19.8%
>>
>> So not much of a practical difference perhaps. But then again this isn't
>> a very realistic test case of anything. Rarely are you going 

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Geert Jansen
On Mon, Nov 12, 2018 at 11:21:51AM -0500, Jeff King wrote:

> No, but they don't even really need to be actual objects. So I suspect
> something like:
> 
>   git init
>   for i in $(seq 256); do
> i=$(printf %02x $i)
> mkdir -p .git/objects/$i
> for j in $(seq --format=%038g 1000); do
>   echo foo >.git/objects/$i/$j
> done
>   done
>   git index-pack -v --stdin  
> might work (for various values of 1000). The shell loop would probably
> be faster as perl, too. :)
> 
> Make sure you clear the object directory between runs, though (otherwise
> the subsequent index-pack's really do find collisions and spend time
> accessing the objects).

Below are my results. They are not as comprehensive as Ævar's tests. Similarly, I
kept the loose objects between tests and removed the packs instead. And I also
used the "echo 3 | sudo tee /proc/sys/vm/drop_caches" trick :)

This is with git.git:

                        origin/master    jk/loose-object-cache

    256*100 objects     520s             13.5s (-97%)
    256*1000 objects    826s             59s (-93%)

I've started a 256*10K setup but that's still creating the 2.5M loose objects.
I'll post the results when it's done. I would expect that jk/loose-object-cache
is still marginally faster than origin/master based on a simple linear
extrapolation.


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Nov 12 2018, Jeff King wrote:
>
>> On Mon, Nov 12, 2018 at 05:01:02PM +0100, Ævar Arnfjörð Bjarmason wrote:
>>
>>> > There's some obvious hand-waving in the paragraphs above. I would love
>>> > it if somebody with an NFS system could do some before/after timings
>>> > with various numbers of loose objects, to get a sense of where the
>>> > breakeven point is.
>>> >
>>> > My gut is that we do not need the complexity of a cache-size limit, nor
>>> > of a config option to disable this. But it would be nice to have a real
>>> > number where "reasonable" ends and "pathological" begins. :)
>>>
>>> I'm happy to test this on some of the NFS we have locally, and started
>>> out with a plan to write some for-loop using the low-level API (so it
>>> would look up all 256), fake populate .git/objects/?? with N number of
>>> objects etc, but ran out of time.
>>>
>>> Do you have something ready that you think would be representative and I
>>> could just run? If not I'll try to pick this up again...
>>
>> No, but they don't even really need to be actual objects. So I suspect
>> something like:
>>
>>   git init
>>   for i in $(seq 256); do
>> i=$(printf %02x $i)
>> mkdir -p .git/objects/$i
>> for j in $(seq --format=%038g 1000); do
>>   echo foo >.git/objects/$i/$j
>> done
>>   done
>>   git index-pack -v --stdin >
>> might work (for various values of 1000). The shell loop would probably
>> be faster as perl, too. :)
>>
>> Make sure you clear the object directory between runs, though (otherwise
>> the subsequent index-pack's really do find collisions and spend time
>> accessing the objects).
>>
>> If you want real objects, you could probably just dump a bunch of
>> sequential blobs to fast-import, and then pipe the result to
>> unpack-objects.
>>
>> -Peff
>
> I did a very ad-hoc test against a NetApp filer using the test script
> quoted at the end of this E-Mail. The test compared origin/master, this
> branch of yours, and my core.checkCollisions=false branch.
>
> When run with DBD-mysql.git (just some random ~1k commit repo I had):
>
> $ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run 
> origin/master peff/jk/loose-cache avar/check-collisions-config 
> p0008-index-pack.sh
>
> I get:
>
> Test origin/master 
> peff/jk/loose-cache  avar/check-collisions-config
> 
> 
> 0008.2: index-pack with 256*1 loose objects  4.31(0.55+0.18)   
> 0.41(0.40+0.02) -90.5%   0.23(0.36+0.01) -94.7%
> 0008.3: index-pack with 256*10 loose objects 4.37(0.45+0.21)   
> 0.45(0.40+0.02) -89.7%   0.25(0.38+0.01) -94.3%
> 0008.4: index-pack with 256*100 loose objects4.47(0.53+0.23)   
> 0.67(0.63+0.02) -85.0%   0.24(0.38+0.01) -94.6%
> 0008.5: index-pack with 256*250 loose objects5.01(0.67+0.30)   
> 1.04(0.98+0.06) -79.2%   0.24(0.37+0.01) -95.2%
> 0008.6: index-pack with 256*500 loose objects5.11(0.57+0.21)   
> 1.81(1.70+0.09) -64.6%   0.25(0.38+0.01) -95.1%
> 0008.7: index-pack with 256*750 loose objects5.12(0.60+0.22)   
> 2.54(2.38+0.14) -50.4%   0.24(0.38+0.01) -95.3%
> 0008.8: index-pack with 256*1000 loose objects   4.52(0.52+0.21)   
> 3.36(3.17+0.17) -25.7%   0.23(0.36+0.01) -94.9%
>
> I then hacked it to test against git.git, but skipped origin/master for
> that one because it takes *ages*. So just mine v.s. yours:
>
> $ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run 
> peff/jk/loose-cache avar/check-collisions-config p0008-index-pack.sh
> [...]
> Test peff/jk/loose-cache   
> avar/check-collisions-config
> 
> ---
> 0008.2: index-pack with 256*1 loose objects  12.57(28.72+0.61) 
> 12.68(29.36+0.62) +0.9%
> 0008.3: index-pack with 256*10 loose objects 12.77(28.75+0.61) 
> 12.50(28.88+0.56) -2.1%
> 0008.4: index-pack with 256*100 loose objects13.20(29.49+0.66) 
> 12.38(28.58+0.60) -6.2%
> 0008.5: index-pack with 256*250 loose objects14.10(30.59+0.64) 
> 12.54(28.22+0.57) -11.1%
> 0008.6: index-pack with 256*500 loose objects14.48(31.06+0.74) 
> 12.43(28.59+0.60) -14.2%
> 0008.7: index-pack with 256*750 loose objects15.31(31.91+0.74) 
> 12.67(29.23+0.64) -17.2%
> 0008.8: index-pack with 256*1000 loose objects   16.34(32.84+0.76) 
> 13.11(30.19+0.68) -19.8%
>
> So not much of a practical difference perhaps. But then again this isn't
> a very realistic test case of anything. Rarely are you going to push a
> history of something the size of git.git into a repo with this many
> loose objects.
>
> Using sha1collisiondetection.git is I think the most 

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Jeff King wrote:

> On Mon, Nov 12, 2018 at 05:01:02PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> > There's some obvious hand-waving in the paragraphs above. I would love
>> > it if somebody with an NFS system could do some before/after timings
>> > with various numbers of loose objects, to get a sense of where the
>> > breakeven point is.
>> >
>> > My gut is that we do not need the complexity of a cache-size limit, nor
>> > of a config option to disable this. But it would be nice to have a real
>> > number where "reasonable" ends and "pathological" begins. :)
>>
>> I'm happy to test this on some of the NFS we have locally, and started
>> out with a plan to write some for-loop using the low-level API (so it
>> would look up all 256), fake populate .git/objects/?? with N number of
>> objects etc, but ran out of time.
>>
>> Do you have something ready that you think would be representative and I
>> could just run? If not I'll try to pick this up again...
>
> No, but they don't even really need to be actual objects. So I suspect
> something like:
>
>   git init
>   for i in $(seq 256); do
> i=$(printf %02x $i)
> mkdir -p .git/objects/$i
> for j in $(seq --format=%038g 1000); do
>   echo foo >.git/objects/$i/$j
> done
>   done
>   git index-pack -v --stdin 
> might work (for various values of 1000). The shell loop would probably
> be faster as perl, too. :)
>
> Make sure you clear the object directory between runs, though (otherwise
> the subsequent index-pack's really do find collisions and spend time
> accessing the objects).
>
> If you want real objects, you could probably just dump a bunch of
> sequential blobs to fast-import, and then pipe the result to
> unpack-objects.
>
> -Peff

I did a very ad-hoc test against a NetApp filer using the test script
quoted at the end of this E-Mail. The test compared origin/master, this
branch of yours, and my core.checkCollisions=false branch.

When run with DBD-mysql.git (just some random ~1k commit repo I had):

$ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run 
origin/master peff/jk/loose-cache avar/check-collisions-config 
p0008-index-pack.sh

I get:

Test                                             origin/master     peff/jk/loose-cache      avar/check-collisions-config
-------------------------------------------------------------------------------------------------------------------------
0008.2: index-pack with 256*1 loose objects      4.31(0.55+0.18)   0.41(0.40+0.02) -90.5%   0.23(0.36+0.01) -94.7%
0008.3: index-pack with 256*10 loose objects     4.37(0.45+0.21)   0.45(0.40+0.02) -89.7%   0.25(0.38+0.01) -94.3%
0008.4: index-pack with 256*100 loose objects    4.47(0.53+0.23)   0.67(0.63+0.02) -85.0%   0.24(0.38+0.01) -94.6%
0008.5: index-pack with 256*250 loose objects    5.01(0.67+0.30)   1.04(0.98+0.06) -79.2%   0.24(0.37+0.01) -95.2%
0008.6: index-pack with 256*500 loose objects    5.11(0.57+0.21)   1.81(1.70+0.09) -64.6%   0.25(0.38+0.01) -95.1%
0008.7: index-pack with 256*750 loose objects    5.12(0.60+0.22)   2.54(2.38+0.14) -50.4%   0.24(0.38+0.01) -95.3%
0008.8: index-pack with 256*1000 loose objects   4.52(0.52+0.21)   3.36(3.17+0.17) -25.7%   0.23(0.36+0.01) -94.9%

I then hacked it to test against git.git, but skipped origin/master for
that one because it takes *ages*. So just mine vs. yours:

$ GIT_PERF_REPEAT_COUNT=3 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run 
peff/jk/loose-cache avar/check-collisions-config p0008-index-pack.sh
[...]
Test                                             peff/jk/loose-cache   avar/check-collisions-config
----------------------------------------------------------------------------------------------------
0008.2: index-pack with 256*1 loose objects      12.57(28.72+0.61)     12.68(29.36+0.62) +0.9%
0008.3: index-pack with 256*10 loose objects     12.77(28.75+0.61)     12.50(28.88+0.56) -2.1%
0008.4: index-pack with 256*100 loose objects    13.20(29.49+0.66)     12.38(28.58+0.60) -6.2%
0008.5: index-pack with 256*250 loose objects    14.10(30.59+0.64)     12.54(28.22+0.57) -11.1%
0008.6: index-pack with 256*500 loose objects    14.48(31.06+0.74)     12.43(28.59+0.60) -14.2%
0008.7: index-pack with 256*750 loose objects    15.31(31.91+0.74)     12.67(29.23+0.64) -17.2%
0008.8: index-pack with 256*1000 loose objects   16.34(32.84+0.76)     13.11(30.19+0.68) -19.8%

So not much of a practical difference perhaps. But then again this isn't
a very realistic test case of anything. Rarely are you going to push a
history of something the size of git.git into a repo with this many
loose objects.

Using sha1collisiondetection.git is I think the most realistic scenario,
i.e. you'll often end up fetching/pushing something roughly the size of
its entire history on a big repo, and with it:

Test peff/jk/loose-cache   

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Jeff King
On Mon, Nov 12, 2018 at 05:01:02PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > There's some obvious hand-waving in the paragraphs above. I would love
> > it if somebody with an NFS system could do some before/after timings
> > with various numbers of loose objects, to get a sense of where the
> > breakeven point is.
> >
> > My gut is that we do not need the complexity of a cache-size limit, nor
> > of a config option to disable this. But it would be nice to have a real
> > number where "reasonable" ends and "pathological" begins. :)
> 
> I'm happy to test this on some of the NFS we have locally, and started
> out with a plan to write some for-loop using the low-level API (so it
> would look up all 256), fake populate .git/objects/?? with N number of
> objects etc, but ran out of time.
> 
> Do you have something ready that you think would be representative and I
> could just run? If not I'll try to pick this up again...

No, but they don't even really need to be actual objects. So I suspect
something like:

  git init
  # 00..ff fan-out directories, each holding 1000 fake 38-character
  # "loose object" files; they don't need to be real objects
  for i in $(seq 0 255); do
    i=$(printf %02x $i)
    mkdir -p .git/objects/$i
    for j in $(seq --format=%038g 1000); do
      echo foo >.git/objects/$i/$j
    done
  done
  git index-pack -v --stdin 
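
To turn that into a before/after number, one option (just a suggestion,
not something from the patch) is to count the relevant syscalls while
indexing a pack into the padded repository; the pack path below is only
a placeholder, and the exact syscall names vary by libc and kernel:

  strace -f -c -e trace=stat,lstat,newfstatat,getdents,getdents64 \
  	git index-pack -v --stdin <../some.pack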

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Jeff King wrote:

> In cases where we expect to ask has_sha1_file() about a lot of objects
> that we are not likely to have (e.g., during fetch negotiation), we
> already use OBJECT_INFO_QUICK to sacrifice accuracy (due to racing with
> a simultaneous write or repack) for speed (we avoid re-scanning the pack
> directory).
>
> However, even checking for loose objects can be expensive, as we will
> stat() each one. On many systems this cost isn't too noticeable, but
> stat() can be particularly slow on some operating systems, or due to
> network filesystems.
>
> Since the QUICK flag already tells us that we're OK with a slightly
> stale answer, we can use that as a cue to look in our in-memory cache of
> each object directory. That basically trades an in-memory binary search
> for a stat() call.
>
> Note that it is possible for this to actually be _slower_. We'll do a
> full readdir() to fill the cache, so if you have a very large number of
> loose objects and a very small number of lookups, that readdir() may end
> up more expensive.
>
> This shouldn't be a big deal in practice. If you have a large number of
> reachable loose objects, you'll already run into performance problems
> (which you should remedy by repacking). You may have unreachable objects
> which wouldn't otherwise impact performance. Usually these would go away
> with the prune step of "git gc", but they may be held for up to 2 weeks
> in the default configuration.
>
> So it comes down to how many such objects you might reasonably expect to
> have, how much slower is readdir() on N entries versus M stat() calls
> (and here we really care about the syscall backing readdir(), like
> getdents() on Linux, but I'll just call this readdir() below).
>
> If N is much smaller than M (a typical packed repo), we know this is a
> big win (few readdirs() followed by many uses of the resulting cache).
> When N and M are similar in size, it's also a win. We care about the
> latency of making a syscall, and readdir() should be giving us many
> values in a single call. How many?
>
> On Linux, running "strace -e getdents ls" shows a 32k buffer getting 512
> entries per call (which is 64 bytes per entry; the name itself is 38
> bytes, plus there are some other fields). So we can imagine that this is
> always a win as long as the number of loose objects in the repository is
> a factor of 500 less than the number of lookups you make. It's hard to
> auto-tune this because we don't generally know up front how many lookups
> we're going to do. But it's unlikely for this to perform significantly
> worse.
>
> Signed-off-by: Jeff King 
> ---
> There's some obvious hand-waving in the paragraphs above. I would love
> it if somebody with an NFS system could do some before/after timings
> with various numbers of loose objects, to get a sense of where the
> breakeven point is.
>
> My gut is that we do not need the complexity of a cache-size limit, nor
> of a config option to disable this. But it would be nice to have a real
> number where "reasonable" ends and "pathological" begins. :)

I'm happy to test this on some of the NFS we have locally, and started
out with a plan to write some for-loop using the low-level API (so it
would look up all 256), fake populate .git/objects/?? with N number of
objects etc, but ran out of time.

Do you have something ready that you think would be representative and I
could just run? If not I'll try to pick this up again...


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Derrick Stolee

On 11/12/2018 9:54 AM, Jeff King wrote:

In cases where we expect to ask has_sha1_file() about a lot of objects
that we are not likely to have (e.g., during fetch negotiation), we
already use OBJECT_INFO_QUICK to sacrifice accuracy (due to racing with
a simultaneous write or repack) for speed (we avoid re-scanning the pack
directory).

However, even checking for loose objects can be expensive, as we will
stat() each one. On many systems this cost isn't too noticeable, but
stat() can be particularly slow on some operating systems, or due to
network filesystems.

Since the QUICK flag already tells us that we're OK with a slightly
stale answer, we can use that as a cue to look in our in-memory cache of
each object directory. That basically trades an in-memory binary search
for a stat() call.

Note that it is possible for this to actually be _slower_. We'll do a
full readdir() to fill the cache, so if you have a very large number of
loose objects and a very small number of lookups, that readdir() may end
up more expensive.

This shouldn't be a big deal in practice. If you have a large number of
reachable loose objects, you'll already run into performance problems
(which you should remedy by repacking). You may have unreachable objects
which wouldn't otherwise impact performance. Usually these would go away
with the prune step of "git gc", but they may be held for up to 2 weeks
in the default configuration.

So it comes down to how many such objects you might reasonably expect to
have, how much slower is readdir() on N entries versus M stat() calls
(and here we really care about the syscall backing readdir(), like
getdents() on Linux, but I'll just call this readdir() below).

If N is much smaller than M (a typical packed repo), we know this is a
big win (few readdirs() followed by many uses of the resulting cache).
When N and M are similar in size, it's also a win. We care about the
latency of making a syscall, and readdir() should be giving us many
values in a single call. How many?

On Linux, running "strace -e getdents ls" shows a 32k buffer getting 512
entries per call (which is 64 bytes per entry; the name itself is 38
bytes, plus there are some other fields). So we can imagine that this is
always a win as long as the number of loose objects in the repository is
a factor of 500 less than the number of lookups you make. It's hard to
auto-tune this because we don't generally know up front how many lookups
we're going to do. But it's unlikely for this to perform significantly
worse.

Signed-off-by: Jeff King 
---
There's some obvious hand-waving in the paragraphs above. I would love
it if somebody with an NFS system could do some before/after timings
with various numbers of loose objects, to get a sense of where the
breakeven point is.

My gut is that we do not need the complexity of a cache-size limit, nor
of a config option to disable this. But it would be nice to have a real
number where "reasonable" ends and "pathological" begins. :)


I'm interested in such numbers, but do not have the appropriate setup to 
test.


I think the tradeoffs you mention above are reasonable. There's also 
some chance that this isn't "extra" work but is just "earlier" work, as 
the abbreviation code would load these loose object directories.
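
One hedged way to see that overlap from the outside: strace's -y option
annotates descriptors with paths, so you can count readdir() traffic on
the fan-out directories while abbreviating hashes; the counts will vary,
and in a fully packed repo there may be none:

  strace -f -y -e trace=getdents64 -o abbrev.log git log --oneline -1 >/dev/null
  grep -c 'objects/' abbrev.log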




  object-store.h |  1 +
  sha1-file.c    | 20 ++++++++++++++++++++
  2 files changed, 21 insertions(+)

diff --git a/object-store.h b/object-store.h
index bf1e0cb761..60758efad8 100644
--- a/object-store.h
+++ b/object-store.h
@@ -13,6 +13,7 @@ struct object_directory {
/*
 * Used to store the results of readdir(3) calls when we are OK
 * sacrificing accuracy due to races for speed. That includes
+* object existence with OBJECT_INFO_QUICK, as well as
 * our search for unique abbreviated hashes. Don't use it for tasks
 * requiring greater accuracy!
 *
diff --git a/sha1-file.c b/sha1-file.c
index 4aae716a37..e53da0b701 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -921,6 +921,24 @@ static int open_sha1_file(struct repository *r,
return -1;
  }
  
+static int quick_has_loose(struct repository *r,
+                           const unsigned char *sha1)
+{
+       int subdir_nr = sha1[0];
+       struct object_id oid;
+       struct object_directory *odb;
+
+       hashcpy(oid.hash, sha1);
+
+       prepare_alt_odb(r);
+       for (odb = r->objects->odb; odb; odb = odb->next) {
+               odb_load_loose_cache(odb, subdir_nr);
+               if (oid_array_lookup(&odb->loose_objects_cache, &oid) >= 0)
+                       return 1;
+       }
+       return 0;
+}
+
  /*
   * Map the loose object at "path" if it is not NULL, or the path found by
   * searching for a loose object named "sha1".
@@ -1171,6 +1189,8 @@ static int sha1_loose_object_info(struct repository *r,
        if (!oi->typep && !oi->type_name && !oi->sizep && !oi->contentp) {
                const char *path;
                struct stat st;
+               if (!oi->disk_sizep && (flags & OBJECT_INFO_QUICK))
+                       return quick_has_loose(r, sha1) ? 0 : -1;

[PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Jeff King
In cases where we expect to ask has_sha1_file() about a lot of objects
that we are not likely to have (e.g., during fetch negotiation), we
already use OBJECT_INFO_QUICK to sacrifice accuracy (due to racing with
a simultaneous write or repack) for speed (we avoid re-scanning the pack
directory).

However, even checking for loose objects can be expensive, as we will
stat() each one. On many systems this cost isn't too noticeable, but
stat() can be particularly slow on some operating systems, or due to
network filesystems.

Since the QUICK flag already tells us that we're OK with a slightly
stale answer, we can use that as a cue to look in our in-memory cache of
each object directory. That basically trades an in-memory binary search
for a stat() call.

Note that it is possible for this to actually be _slower_. We'll do a
full readdir() to fill the cache, so if you have a very large number of
loose objects and a very small number of lookups, that readdir() may end
up more expensive.

This shouldn't be a big deal in practice. If you have a large number of
reachable loose objects, you'll already run into performance problems
(which you should remedy by repacking). You may have unreachable objects
which wouldn't otherwise impact performance. Usually these would go away
with the prune step of "git gc", but they may be held for up to 2 weeks
in the default configuration.

So it comes down to how many such objects you might reasonably expect to
have, how much slower is readdir() on N entries versus M stat() calls
(and here we really care about the syscall backing readdir(), like
getdents() on Linux, but I'll just call this readdir() below).

If N is much smaller than M (a typical packed repo), we know this is a
big win (few readdirs() followed by many uses of the resulting cache).
When N and M are similar in size, it's also a win. We care about the
latency of making a syscall, and readdir() should be giving us many
values in a single call. How many?

On Linux, running "strace -e getdents ls" shows a 32k buffer getting 512
entries per call (which is 64 bytes per entry; the name itself is 38
bytes, plus there are some other fields). So we can imagine that this is
always a win as long as the number of loose objects in the repository is
a factor of 500 less than the number of lookups you make. It's hard to
auto-tune this because we don't generally know up front how many lookups
we're going to do. But it's unlikely for this to perform significantly
worse.

Signed-off-by: Jeff King 
---
There's some obvious hand-waving in the paragraphs above. I would love
it if somebody with an NFS system could do some before/after timings
with various numbers of loose objects, to get a sense of where the
breakeven point is.

My gut is that we do not need the complexity of a cache-size limit, nor
of a config option to disable this. But it would be nice to have a real
number where "reasonable" ends and "pathological" begins. :)

 object-store.h |  1 +
 sha1-file.c    | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/object-store.h b/object-store.h
index bf1e0cb761..60758efad8 100644
--- a/object-store.h
+++ b/object-store.h
@@ -13,6 +13,7 @@ struct object_directory {
/*
 * Used to store the results of readdir(3) calls when we are OK
 * sacrificing accuracy due to races for speed. That includes
+* object existence with OBJECT_INFO_QUICK, as well as
 * our search for unique abbreviated hashes. Don't use it for tasks
 * requiring greater accuracy!
 *
diff --git a/sha1-file.c b/sha1-file.c
index 4aae716a37..e53da0b701 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -921,6 +921,24 @@ static int open_sha1_file(struct repository *r,
return -1;
 }
 
+static int quick_has_loose(struct repository *r,
+                           const unsigned char *sha1)
+{
+       int subdir_nr = sha1[0];
+       struct object_id oid;
+       struct object_directory *odb;
+
+       hashcpy(oid.hash, sha1);
+
+       prepare_alt_odb(r);
+       for (odb = r->objects->odb; odb; odb = odb->next) {
+               odb_load_loose_cache(odb, subdir_nr);
+               if (oid_array_lookup(&odb->loose_objects_cache, &oid) >= 0)
+                       return 1;
+       }
+       return 0;
+}
+
 /*
  * Map the loose object at "path" if it is not NULL, or the path found by
  * searching for a loose object named "sha1".
@@ -1171,6 +1189,8 @@ static int sha1_loose_object_info(struct repository *r,
        if (!oi->typep && !oi->type_name && !oi->sizep && !oi->contentp) {
                const char *path;
                struct stat st;
+               if (!oi->disk_sizep && (flags & OBJECT_INFO_QUICK))
+                       return quick_has_loose(r, sha1) ? 0 : -1;
                if (stat_sha1_file(r, sha1, &st, &path) < 0)
                        return -1;
                if (oi->disk_sizep)
-- 
2.19.1.1577.g2c5b293d4f
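
As a rough sanity check of the entries-per-getdents() figure quoted
above (a sketch, not from the thread; modern strace reports getdents64,
and the buffer size depends on the libc):

  mkdir /tmp/fanout && cd /tmp/fanout
  # 1000 empty files with loose-object-sized (38-character) names
  for i in $(seq 1000); do : >$(printf %038d $i); done
  strace -e trace=getdents64 -o getdents.log ls >/dev/null
  grep -c getdents64 getdents.log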