On Thu, Mar 20, 2014 at 5:11 AM, Junio C Hamano <gits...@pobox.com> wrote:
> David Kastrup <d...@gnu.org> writes:
>
>> Junio C Hamano <gits...@pobox.com> writes:
>>
>>> David Kastrup <d...@gnu.org> writes:
>>>
>>>> The default of 16MiB causes serious thrashing for large delta chains
>>>> combined with large files.
>>>>
>>>> Signed-off-by: David Kastrup <d...@gnu.org>
>>>> ---
>>>
>>> Is that a good argument?  Wouldn't the default of 128MiB burden
>>> smaller machines with bloated processes?
>>
>> The default file size before Git forgets about delta compression is
>> 512MiB.  Unpacking 500MiB files with 16MiB of delta storage is going to
>> be uglier.
>>
>> ...
>>
>> Documentation/config.txt states:
>>
>>     core.deltaBaseCacheLimit::
>>             Maximum number of bytes to reserve for caching base objects
>>             that may be referenced by multiple deltified objects.  By
>>             storing the entire decompressed base objects in a cache Git
>>             is able
>>             to avoid unpacking and decompressing frequently used base
>>             objects multiple times.
>>     +
>>     Default is 16 MiB on all platforms.  This should be reasonable
>>     for all users/operating systems, except on the largest projects.
>>     You probably do not need to adjust this value.
>>
>> I've seen this seriously screwing performance in several projects of
>> mine that don't really count as "largest projects".
>>
>> So the description in combination with the current setting is clearly wrong.
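
For anyone wanting to experiment while this is being discussed, both
knobs can already be overridden per repository; a minimal sketch, where
128m is simply the value the patch proposes rather than a tested
recommendation (git config accepts k/m/g size suffixes for these):

    # raise the delta base cache for this repository
    # (the default under discussion is 16 MiB)
    git config core.deltaBaseCacheLimit 128m

    # the companion limit: blobs larger than this are not
    # delta-compressed at all; shown here at its current 512 MiB default
    git config core.bigFileThreshold 512m
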
>
> That is good material for a proposed log message, and I think you
> are onto something here.
>
> I know that the 512MiB default for the bigFileThreshold (aka
> "forget about delta compression") came out of thin air.  It was just
> "1GB is always too huge for anybody, so let's cut it in half and
> declare that value the initial version of a sane threshold",
> nothing more.
>
> So it might be that the problem is that 512MiB is still too big,
> relative to the 16MiB of delta base cache, and the former may be what
> needs to be tweaked.  If a blob close to but below 512MiB is a problem
> for a 16MiB delta base cache, it would still cause the same problem
> for a 128MiB delta base cache: it would evict all the other objects
> and then end up not fitting within the limit itself, busting the
> limit immediately, no?
>
> I would understand if the change were to update the definition of
> deltaBaseCacheLimit and link it to the value of bigFileThreshold,
> for example.  With the presented discussion, I am still not sure if
> we can say that bumping deltaBaseCacheLimit is the right solution to
> the "description with the current setting is clearly wrong" (which
> is a real issue).
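
Read from the user side, "link it to the value of bigFileThreshold"
would roughly mean keeping the cache at least as large as the biggest
blob that can still be deltified; a sketch, with values that are purely
illustrative (git does not tie the two together automatically today):

    # in a pack built with these settings, deltified base objects stay
    # below bigFileThreshold, so a cache of the same size can hold any
    # single one of them
    git config core.bigFileThreshold 512m
    git config core.deltaBaseCacheLimit 512m

which of course brings back the memory concern for smaller machines.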

I vote we make big_file_threshold smaller. 512MB is already unfriendly
to many smaller machines. I'm thinking somewhere around 32MB-64MB
(and maybe increasing the delta base cache limit to match). The only
downside I see is that large blobs will be packed undeltified, which
could increase pack size if you have lots of them. But maybe we could
improve pack-objects/repack/gc to deltify large blobs anyway if
they're old enough.
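
Spelled out as config, that direction would look roughly like this (the
numbers are just the rough range above, nothing settled):

    # store big blobs undeltified well before they can thrash the cache...
    git config core.bigFileThreshold 64m
    # ...and size the delta base cache to cover whatever is still deltified
    git config core.deltaBaseCacheLimit 64m
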
-- 
Duy