David Kastrup <[email protected]> writes:
> When a parent blob already has chunks queued up for blaming, dropping
> the blob at the end of one blame step will cause it to get reloaded
> right away, doubling the amount of I/O and unpacking when processing a
> linear history.
>
> Keeping such parent blobs in memory seems like a reasonable optimization
> that should incur additional memory pressure mostly when processing the
> merges from old branches.

Thanks for finding an age-old inefficiency that dates back to 7c3c7962
("blame: drop blob data after passing blame to the parent",
2007-12-11).

Interestingly, that commit claims:

    When passing blame from a parent to its parent (i.e. the
    grandparent), the blob data for the parent may need to be read
    again, but it should be relatively cheap, thanks to delta-base
    cache.

but perhaps you found a case where the delta-base cache is not all
that effective in the benchmark?

Will queue.  Thanks.

>
> Signed-off-by: David Kastrup <[email protected]>
> ---
> blame.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/blame.c b/blame.c
> index 5c07dec190..c11c516921 100644
> --- a/blame.c
> +++ b/blame.c
> @@ -1562,7 +1562,8 @@ static void pass_blame(struct blame_scoreboard *sb, struct blame_origin *origin,
>  	}
>  	for (i = 0; i < num_sg; i++) {
>  		if (sg_origin[i]) {
> -			drop_origin_blob(sg_origin[i]);
> +			if (!sg_origin[i]->suspects)
> +				drop_origin_blob(sg_origin[i]);
>  			blame_origin_decref(sg_origin[i]);
>  		}
>  	}
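
To illustrate the reload pattern the log message describes, here is a
rough toy model, not the real code.  Everything named toy_* is
hypothetical and only mimics the shape of the loop above: it assumes a
strictly linear history in which each blame step passes all blame to
the single parent, and it counts how often blob data has to be read.

```c
#include <assert.h>

#define TOY_MAX 64

/* Hypothetical stand-in for an origin; NOT git's struct blame_origin. */
struct toy_origin {
	int loaded;   /* is the blob data currently held in memory? */
	int suspects; /* are chunks still queued for blaming on this origin? */
};

static int toy_loads; /* how many times blob data was read */

static void toy_fill_blob(struct toy_origin *o)
{
	if (!o->loaded) {
		o->loaded = 1;
		toy_loads++;
	}
}

static void toy_drop_blob(struct toy_origin *o)
{
	o->loaded = 0;
}

/*
 * Walk a linear chain of n commits, one blame step per commit.
 * With keep_if_suspects == 0 the parent blob is always dropped at
 * the end of a step; with keep_if_suspects != 0 it is dropped only
 * when nothing is queued on it, as in the patch above.
 */
static int toy_walk(int n, int keep_if_suspects)
{
	struct toy_origin chain[TOY_MAX] = {{0}};
	int i;

	assert(n >= 1 && n <= TOY_MAX);
	toy_loads = 0;
	chain[0].suspects = 1;

	for (i = 0; i + 1 < n; i++) {
		toy_fill_blob(&chain[i]);     /* target of this step */
		toy_fill_blob(&chain[i + 1]); /* its only parent */

		/* assumption: all blame moves to the parent */
		chain[i].suspects = 0;
		chain[i + 1].suspects = 1;

		if (!keep_if_suspects || !chain[i + 1].suspects)
			toy_drop_blob(&chain[i + 1]);
		toy_drop_blob(&chain[i]); /* the target is finished */
	}
	return toy_loads;
}
```

In this model toy_walk(5, 0) reads blob data 8 times while
toy_walk(5, 1) reads it 5 times, i.e. roughly 2(n-1) versus n reads
for a chain of n commits.  It says nothing about how much of the
extra cost the delta-base cache absorbs in practice, which is the
part I am curious about.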