On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote:

> > I hate to be a wet blanket, but am I the only one who is wondering
> > whether the tradeoff is worth it? 8% memory reduction doesn't seem
> > mind-bogglingly good,
> 
> Ævar measured RSS. If we count the objects[] array alone, the saving is
> 40% (136 bytes per entry down to 80). Some of the difference is
> probably eaten up by mmap in the RSS numbers.

Measuring actual heap usage with massif, I get before/after peak heaps
of 1728MB and 1346MB respectively when repacking linux.git. So that's
~22% savings overall.

Of the used heap after your patches:

 - ~40% of that is from packlist_alloc()
 - ~17% goes to "struct object"
 - ~10% for the object.c hash table to store all the "struct object"
 - ~7% goes to the delta cache
 - ~7% goes to the pack revindex (actually, there's a duplicate 7%
       there, too; I think our peak is when we're sorting the revindex
       and have to keep two copies in memory at once)
 - ~5% goes to the packlist_find() hash table
 - ~3.5% for the get_object_details() sorting list (this is only held
         for a minute, but again, our peak comes during this sort, which
         in turn loads the revindex)

So 27% of the total heap goes away if you switch to a separate rev-list.
Though it's mostly just going to a different process, it does help the
peak, because that process would have exited by the time we get to the
revindex bits.

I suspect you could get the same effect by just teaching pack-objects to
clear obj_hash and all of the allocated object structs. I think that
should be safe to do as long as we clear _all_ of the objects, so there
are no dangling pointers.

> About the 16k limit (and some other limits as well), I'm making these
> patches with the assumption that large scale deployment probably will
> go with custom builds anyway. Adjusting the limits back should be
> quite easy while we can still provide reasonable defaults for most
> people.

I think this 16k limit is the thing I _most_ dislike about the series.
If we could tweak that case such that we always made forward progress, I
think I'd be a lot less nervous. I responded elsewhere in the thread
(before seeing that both Junio and you seemed aware of the issues ;) ),
but I think it would be acceptable to have git-repack enforce the limit.

That would still mean you could get into a broken state for serving
fetches, but you could at least get out of it by running "git repack".

-Peff