On Fri, Mar 16, 2018 at 10:29 PM, Jeff King <p...@peff.net> wrote:
> On Fri, Mar 16, 2018 at 09:01:42PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> One thing that can make repositories very pathological is if the ratio
>> of trees to commits is too low.
>>
>> I was dealing with a repo the other day that had several thousand files
>> all in the same root directory, and no subdirectories.
>
> We've definitely run into this problem before (CocoaPods/Specs, for
> example). The metric that would hopefully show this off is "what is the
> tree object with the most entries". Or possibly "what is the average
> number of entries in a tree object".

I find that the best metric for determining this sort of problem is
"Overall repository size -> Trees -> Total tree entries". If you have
a big directory that is being changed frequently, the *real* problem
is that every commit has to rewrite the whole tree, with all of its
many entries. So "Total tree entries" (or equivalently, the total tree
size) skyrockets. And this means that a history traversal has to
*expand* all of those trees again. So a repository that is problematic
for this reason will have a very large number of tree entries.

If you want to detect a bad repository layout like this *before* it
becomes a problem, then probably something like "average tree entries
per commit" might be a good leading indicator of a problem.

Michael

Reply via email to