On Tue, Nov 27, 2018 at 09:48:57PM +0100, René Scharfe wrote:

> > +static int quick_has_loose(struct repository *r,
> > +                      const unsigned char *sha1)
> > +{
> > +   int subdir_nr = sha1[0];
> > +   struct object_id oid;
> > +   struct object_directory *odb;
> > +
> > +   hashcpy(oid.hash, sha1);
> > +
> > +   prepare_alt_odb(r);
> > +   for (odb = r->objects->odb; odb; odb = odb->next) {
> > +           odb_load_loose_cache(odb, subdir_nr);
> 
> Is this thread-safe?  What happens if e.g. one index-pack thread resizes
> the array while another one sorts it?

No, but neither is any of the object_info / has_object_file path, which
may use static function-local buffers, or (before my series) alt scratch
bufs, or even call reprepare_packed_git().

In the long run, I think the solution is probably going to be pushing
some mutexes into the right places, and the cache fill is an obvious
place for one.
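
As a rough illustration of what a mutex around the cache fill might look
like, here is a minimal sketch. It is not Git's code: struct loose_cache,
fill_subdir() and load_loose_cache_locked() are hypothetical names, and the
readdir-and-sort work is reduced to a counter.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for one object directory's loose-object cache. */
struct loose_cache {
	pthread_mutex_t lock;       /* guards every field below */
	unsigned char loaded[256];  /* 1 once objects/xx has been read */
	int fill_count;             /* number of fills performed (demo only) */
};

#define LOOSE_CACHE_INIT { PTHREAD_MUTEX_INITIALIZER, { 0 }, 0 }

/* Hypothetical placeholder for the readdir() + append + sort work. */
static void fill_subdir(struct loose_cache *c, int subdir_nr)
{
	(void)subdir_nr;
	c->fill_count++;
}

/*
 * Fill the cache for one subdirectory at most once. Holding the lock
 * for the whole check-and-fill means no thread can observe the array
 * while another thread is resizing or sorting it.
 */
static void load_loose_cache_locked(struct loose_cache *c, int subdir_nr)
{
	assert(subdir_nr >= 0 && subdir_nr < 256);

	pthread_mutex_lock(&c->lock);
	if (!c->loaded[subdir_nr]) {
		fill_subdir(c, subdir_nr);
		c->loaded[subdir_nr] = 1;
	}
	pthread_mutex_unlock(&c->lock);
}
```

In a real version the lookup into the sorted array would presumably need
to take the same lock, since a fill for a different subdirectory could
still move the array underneath a reader.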

> Loading the cache explicitly up-front would avoid that, and improves
> performance a bit in my (very limited) tests on an SSD.  Demo patch for
> next at the bottom.  How does it do against your test cases?

It's going to do badly on corner cases where we don't need to load every
object subdirectory, and one or more of them are big. I.e., if I look up
"1234abcd", the current code only needs to fault in $GIT_DIR/objects/12.
Pre-loading means we'd hit them all. Even without a lot of objects, on
NFS that's 256 latencies instead of 1.
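
To make the 256-versus-1 difference concrete, here is a toy model that
only counts directory reads. It is a sketch, not Git code: struct
subdir_cache, load_subdir(), lookup_lazy() and lookup_preloaded() are all
made-up names, and each counted read stands in for one readdir() round
trip.

```c
#include <assert.h>

/* Hypothetical model: tracks which objects/xx directories were read. */
struct subdir_cache {
	unsigned char loaded[256];
	int dirs_read;  /* one per directory actually read */
};

/* Read one subdirectory, unless it has been read already. */
static void load_subdir(struct subdir_cache *c, int subdir_nr)
{
	assert(subdir_nr >= 0 && subdir_nr < 256);
	if (c->loaded[subdir_nr])
		return;
	c->loaded[subdir_nr] = 1;
	c->dirs_read++;
}

/* Lazy: touch only the subdirectory named by the first hash byte. */
static void lookup_lazy(struct subdir_cache *c, const unsigned char *hash)
{
	load_subdir(c, hash[0]);
}

/* Eager: read all 256 subdirectories before the first lookup. */
static void lookup_preloaded(struct subdir_cache *c, const unsigned char *hash)
{
	int i;

	(void)hash;
	for (i = 0; i < 256; i++)
		load_subdir(c, i);
}
```

Looking up a hash that starts with 0x12 costs one directory read in the
lazy variant and 256 in the preloaded one, regardless of how many objects
each directory holds.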

-Peff
