On Sat, Oct 19, 2019 at 04:25:19PM +0100, Philip Oakley wrote:

> > +int bitmap_walk_contains(struct bitmap_index *bitmap_git,
> > +                    struct bitmap *bitmap, const struct object_id *oid)
> > +{
> > +   int idx;
> Excuse my ignorance here...
> 
> For the case on Windows (int/long 32 bit), is this return value guaranteed
> to be less than 2GiB, i.e. not a memory offset?
> 
> I'm just thinking ahead to the resolution of the 4GiB file limit issue on
> Git-for-Windows (https://github.com/git-for-windows/git/pull/2179)

Yes, it's not a memory offset.

This "idx" here (and the return value of bitmap_position) represents a
position within an array of objects. This isn't strictly limited to the
objects in a single pack (because a traversal might extend to objects
outside the bitmapped pack), but we can use that as a general ballpark.
And it's limited to a 4-byte object count already.

So the "best" type here would be a uint32_t (which is used elsewhere
in the pack code), but we use signedness to indicate that the object
wasn't found.

That's probably OK. The biggest repos I've seen have on the order of
10-100M objects. That still gives us a factor of 20 before we hit 2^31.
If we imagine those repos took 10 years or so to accrue that many
objects, then we probably still have 200 years of growth left. Of course
growth accelerates over time, but I suspect repos with 2B objects will
run into other scaling problems first. So I don't think it's worth
worrying about too much for now.

-Peff

Reply via email to