On Sat, Aug 26, 2017 at 12:16 PM, Martin Ågren <martin.ag...@gmail.com> wrote: > On 25 August 2017 at 23:00, Junio C Hamano <gits...@pobox.com> wrote: >> Martin Ågren <martin.ag...@gmail.com> writes: >> >>> files_transaction_prepare() and the functions it calls add strings to a >>> string list without duplicating them, i.e., we keep the original raw >>> pointers we were given. That is "ok", since we keep them only for a >>> short-enough time, but we end up leaking some of them. >> >> [...]
Good find, Martin. First of all, you are right that we don't want any memory leaks here. Nobody promises that the program will end soon if a reference update fails. (In fact, there are invocations of `git` that trigger multiple reference updates.) This is a small leak, but we should fix it. The problem (if I may take a stab at explaining it in my own words) is that `files_transaction_prepare()` uses a `string_list` called `affected_refnames` to ensure that the same reference name is not modified more than once in a single reference transaction. Currently, `affected_refnames` is configured *not* to duplicate the refnames that are fed to it, which also means that it doesn't free the refnames when it is cleared. This is correct for most refnames, which are owned by entries in the `ref_transaction`, and therefore have a longer lifetime than `affected_refnames`. But there is one code path that can add a refname to `affected_refnames` without making a provision for the refname ever to be freed: * files_transaction_prepare() * lock_ref_for_update() * split_symref_update() * item = string_list_insert(affected_refnames, referent) The `referent` in the last statement comes from a `strbuf` that is created in `lock_ref_for_update()` then passed to `lock_raw_ref()`, which fills it. It would be a serious bug if `lock_ref_for_update()` would dispose of `referent`, because the pointer in `affected_refnames` would point at freed memory. But in fact we have the opposite problem; `lock_ref_for_update()` never frees the memory (it doesn't even `strbuf_detach()` it). So that memory is leaked. Your proposed solution is to change `affected_refnames` to duplicate the strings that are stored in it, in which case `lock_ref_for_update()` can properly dispose of `referent`, fixing the leak. This works, but at the price of having to allocate memory for *all* references in the update, even though most of them are already fine. But note that `split_symref_update()` *also* passes a copy of `referent` to `ref_transaction_add_update()`, which *already* makes a copy of the reference name and adds an entry containing the name to the `ref_transaction`. If we would store *that* copy to `affected_refnames`, then it would get freed when the `ref_transaction` is cleaned up, and we could fix the memory leak without allocating any new memory. This requires a little reorg of `split_symref_update()` but it's not too bad: * Do the initial check using `string_list_has_string()` rather than calling `string_list_insert()` already. * After `new_update` has been created, call `string_list_insert()`, passing it `new_update->refname` as the string. If this is done in place of your first commit, then your second commit could be left unchanged. Michael