Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
On Wed, Sep 06, 2017 at 08:12:24PM +0200, Martin Ågren wrote: > On 5 September 2017 at 23:26, Junio C Hamanowrote: > > Jeff King writes: > > > >> I noticed the HEAD funniness, too, when looking at this earlier. I agree > >> with Junio that it's not quite consistent with the general rule of > >> "string list items point to their refnames", but I don't think it > >> matters in practice. > > > > Yup, we are on the same page; the "fix" I was alluding to would look > > exactly like what you wrote below, but I agree the distinction does > > not matter in practice. IOW, I do not think the code after Martin's > > fix is wrong per-se. > > Well, "not wrong per-se" tells me you'd feel slightly more comfortable > about the state of things if we did this. ;) > > I'll take Peff's hint, tweak/add comments for correctness and symmetry > with the previous patch and add an if-BUG for symmetry. Peff: Do I have > your sign-off? (Do I need it?) Yes, you have my sign-off. Probably it is not necessary for such a trivial patch, but it never hurts to be sure. > If we re-roll, would you prefer Peff's much smaller take on patch 2 > (strbuf_release where it matters, instead of sprinkling "goto out" all > over)? I think me and him agreed that we'd be fine either way. I'd reuse > my commit message, if I get his sign-off and "From:". You are welcome to forge my sign-off there (but I really am OK with either approach). -Peff
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
Martin Ågrenwrites: > Junio: The topic is in pu as ma/split-symref-update-fix. Re-roll or new > topic or as a separate "patch 4/3" for you to place on top, any > preference? Until a topic hits 'next', you solely own it and it is mostly up to you how to go about improving it. My preference would be much less relevant than what the end result would require, i.e > If we re-roll, would you prefer Peff's much smaller take on patch 2 > (strbuf_release where it matters, instead of sprinkling "goto out" all > over)? I think me and him agreed that we'd be fine either way. I'd reuse > my commit message, if I get his sign-off and "From:". ... if we take Peff's approach, then rerolling the whole thing would be the only approach that makes sense---incremental update would make the resulting history less readable. Between two approaches I do not have a strong preference either way---it's a choice that can be made between you and Peff. > Obviously, if Michael or anyone else has any input, I'm open to that as > well.. Sure.
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
On 5 September 2017 at 23:26, Junio C Hamanowrote: > Jeff King writes: > >> I noticed the HEAD funniness, too, when looking at this earlier. I agree >> with Junio that it's not quite consistent with the general rule of >> "string list items point to their refnames", but I don't think it >> matters in practice. > > Yup, we are on the same page; the "fix" I was alluding to would look > exactly like what you wrote below, but I agree the distinction does > not matter in practice. IOW, I do not think the code after Martin's > fix is wrong per-se. Well, "not wrong per-se" tells me you'd feel slightly more comfortable about the state of things if we did this. ;) I'll take Peff's hint, tweak/add comments for correctness and symmetry with the previous patch and add an if-BUG for symmetry. Peff: Do I have your sign-off? (Do I need it?) Junio: The topic is in pu as ma/split-symref-update-fix. Re-roll or new topic or as a separate "patch 4/3" for you to place on top, any preference? If we re-roll, would you prefer Peff's much smaller take on patch 2 (strbuf_release where it matters, instead of sprinkling "goto out" all over)? I think me and him agreed that we'd be fine either way. I'd reuse my commit message, if I get his sign-off and "From:". Obviously, if Michael or anyone else has any input, I'm open to that as well.. Martin
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
Jeff Kingwrites: > I noticed the HEAD funniness, too, when looking at this earlier. I agree > with Junio that it's not quite consistent with the general rule of > "string list items point to their refnames", but I don't think it > matters in practice. Yup, we are on the same page; the "fix" I was alluding to would look exactly like what you wrote below, but I agree the distinction does not matter in practice. IOW, I do not think the code after Martin's fix is wrong per-se. Thanks. > I think the fix, if we wanted to do one, would be similar to what you > did in split_symref_update(). Like: > > diff --git a/refs/files-backend.c b/refs/files-backend.c > index f3455609d6..3f9deff902 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -2095,8 +2095,7 @@ static int split_head_update(struct ref_update *update, >* transaction. This insertion is O(N) in the transaction >* size, but it happens at most once per transaction. >*/ > - item = string_list_insert(affected_refnames, "HEAD"); > - if (item->util) { > + if (string_list_has_string(affected_refnames, "HEAD")) { > /* An entry already existed */ > strbuf_addf(err, > "multiple updates for 'HEAD' (including one " > @@ -2111,6 +2110,7 @@ static int split_head_update(struct ref_update *update, > update->new_oid.hash, update->old_oid.hash, > update->msg); > > + item = string_list_insert(affected_refnames, new_update->refname); > item->util = new_update; > > return 0; > > -Peff
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
On Tue, Sep 05, 2017 at 07:24:18PM +0200, Martin Ågren wrote: > > And from that point of view, doesn't split_head_update() wants a > > similar fix? It attempts to insert "HEAD", makes sure it hasn't > > been inserted and then hangs a new update transaction as its util. > > It is not wrong per-se from purely leak-prevention point of view, > > as that "HEAD" is a literal string we woudn't even want to free, > > but from logical/"what each data means" point of view, it still > > feels wrong. > > There is a "Special hack" comment related to this, and I don't feel > particularly confident that I could make any meaningful contribution in > this area. To be honest, I don't immediately see in which direction your > suggestion/idea/thought is going, which tells me I should not be making > a mess out of it. :-) I noticed the HEAD funniness, too, when looking at this earlier. I agree with Junio that it's not quite consistent with the general rule of "string list items point to their refnames", but I don't think it matters in practice. I think the fix, if we wanted to do one, would be similar to what you did in split_symref_update(). Like: diff --git a/refs/files-backend.c b/refs/files-backend.c index f3455609d6..3f9deff902 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2095,8 +2095,7 @@ static int split_head_update(struct ref_update *update, * transaction. This insertion is O(N) in the transaction * size, but it happens at most once per transaction. */ - item = string_list_insert(affected_refnames, "HEAD"); - if (item->util) { + if (string_list_has_string(affected_refnames, "HEAD")) { /* An entry already existed */ strbuf_addf(err, "multiple updates for 'HEAD' (including one " @@ -2111,6 +2110,7 @@ static int split_head_update(struct ref_update *update, update->new_oid.hash, update->old_oid.hash, update->msg); + item = string_list_insert(affected_refnames, new_update->refname); item->util = new_update; return 0; -Peff
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
On 5 September 2017 at 12:02, Junio C Hamanowrote: > Martin Ågren writes: > >> On 30 August 2017 at 04:52, Michael Haggerty wrote: >>> v3 looks good to me. Thanks! >>> >>> Reviewed-by: Michael Haggerty >> >> Thank _you_ for very helpful feedback on the earlier versions. >> >> Martin > > Yes, the earlier attempt was sort-of barking up a wrong tree. > > Once we step back and observe other users of affected_refnames and > realize that the list is meant to to point at a refname field of an > existing instance of another structure by borrowing, the blame > shifts from files_transaction_prepare() to the real culprit. > Michael's review gave us a very good "let's step back a bit" that > made the huge improvement between v1 and v2/v3. > > I wonder if we should be tightening the use of affected_refnames > even further, though. > > It is may be sufficient to make sure that we do not throw anything > that we would need to free as part of destroying affected_refnames > into the string list, purely from the "leak prevention" perspective. > > But stepping back a bit, the reason why the string list exists in > the first place is to make sure we do not touch the same ref twice > in a single transaction, the set of all possible updates in the > single transaction exists somewhere, and each of these updates know > what ref it wants to update. > > And that is recorded in transaction->updates[]->refname no? > > So it seems to me that logically any and all string that is > registered in affected_refnames list must be the refname field of > a ref_update structure in the transaction. I'm with you up to here. > And from that point of view, doesn't split_head_update() wants a > similar fix? It attempts to insert "HEAD", makes sure it hasn't > been inserted and then hangs a new update transaction as its util. > It is not wrong per-se from purely leak-prevention point of view, > as that "HEAD" is a literal string we woudn't even want to free, > but from logical/"what each data means" point of view, it still > feels wrong. There is a "Special hack" comment related to this, and I don't feel particularly confident that I could make any meaningful contribution in this area. To be honest, I don't immediately see in which direction your suggestion/idea/thought is going, which tells me I should not be making a mess out of it. :-) Martin
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
Martin Ågrenwrites: > On 30 August 2017 at 04:52, Michael Haggerty wrote: >> v3 looks good to me. Thanks! >> >> Reviewed-by: Michael Haggerty > > Thank _you_ for very helpful feedback on the earlier versions. > > Martin Yes, the earlier attempt was sort-of barking up a wrong tree. Once we step back and observe other users of affected_refnames and realize that the list is meant to to point at a refname field of an existing instance of another structure by borrowing, the blame shifts from files_transaction_prepare() to the real culprit. Michael's review gave us a very good "let's step back a bit" that made the huge improvement between v1 and v2/v3. I wonder if we should be tightening the use of affected_refnames even further, though. It is may be sufficient to make sure that we do not throw anything that we would need to free as part of destroying affected_refnames into the string list, purely from the "leak prevention" perspective. But stepping back a bit, the reason why the string list exists in the first place is to make sure we do not touch the same ref twice in a single transaction, the set of all possible updates in the single transaction exists somewhere, and each of these updates know what ref it wants to update. And that is recorded in transaction->updates[]->refname no? So it seems to me that logically any and all string that is registered in affected_refnames list must be the refname field of a ref_update structure in the transaction. And from that point of view, doesn't split_head_update() wants a similar fix? It attempts to insert "HEAD", makes sure it hasn't been inserted and then hangs a new update transaction as its util. It is not wrong per-se from purely leak-prevention point of view, as that "HEAD" is a literal string we woudn't even want to free, but from logical/"what each data means" point of view, it still feels wrong.
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
On Tue, Sep 05, 2017 at 11:03:36AM +0200, Michael Haggerty wrote: > > It feels pretty dirty, though. It would certainly be a bug if we ever > > decided to switch affected_refnames to duplicate its strings. > > > > So given that your solution is only a constant-time factor worse in > > efficiency, we should probably prefer it as the more maintainable > > option. > > This is clever, but I don't like that it requires outside code to > change internal `string_list` structures in a way that is not > documented to be OK. > > If we cared about getting rid of the extra `O(lg N)` search (and I > agree with you that it doesn't matter in this case), I think the clean > way to do it would be for `string_list` to expose a method like > > struct string_list_item *string_list_insert_at_index(struct > string_list *list, size_t index, const char *string); > > and to use it, together with `string_list_find_insert_index()`, to > avoid having to search twice. Yes, agreed on all counts. -Peff
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
On Tue, Sep 5, 2017 at 10:45 AM, Jeff Kingwrote: > On Tue, Aug 29, 2017 at 07:18:22PM +0200, Martin Ågren wrote: > >> [...] >> Rearrange the handling of `referent`, so that we don't add it directly >> to `affected_refnames`. Instead, first just check whether `referent` >> exists in the string list, and later add `new_update->refname`. > > Coincidentally[1] I came across this same leak, and my solution ended up > slightly different. I'll share it here in case it's of interest. > > In your solution we end up searching the string list twice: once to see > if we have the item, and then again to insert it. Whereas in the > original we did both with a single search. > > But we can observe that either: > > 1. The item already existed, in which case our insert was a noop, and > we're good. > > or > > 2. We inserted it, in which case we proceed with creating new_update. > > We can then in O(1) replace the pointer in the string list item > with the storage in new_update. We know we're not violating any > string_list invariants because the strings contain the same bytes. > > I.e.: > > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 9266f5ab9d..1d16c1b33e 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -2178,6 +2178,11 @@ static int split_symref_update(struct files_ref_store > *refs, > update->flags |= REF_LOG_ONLY | REF_NODEREF; > update->flags &= ~REF_HAVE_OLD; > > + /* > +* Re-point at the storage provided by our ref_update, which we know > +* will last as long as the affected_refnames list. > +*/ > + item->string = new_update->refname; > item->util = new_update; > > return 0; > > It feels pretty dirty, though. It would certainly be a bug if we ever > decided to switch affected_refnames to duplicate its strings. > > So given that your solution is only a constant-time factor worse in > efficiency, we should probably prefer it as the more maintainable > option. This is clever, but I don't like that it requires outside code to change internal `string_list` structures in a way that is not documented to be OK. If we cared about getting rid of the extra `O(lg N)` search (and I agree with you that it doesn't matter in this case), I think the clean way to do it would be for `string_list` to expose a method like struct string_list_item *string_list_insert_at_index(struct string_list *list, size_t index, const char *string); and to use it, together with `string_list_find_insert_index()`, to avoid having to search twice. Michael
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
On Tue, Aug 29, 2017 at 07:18:22PM +0200, Martin Ågren wrote: > Observe that split_symref_update() creates a `new_update`-object through > ref_transaction_add_update(), after which `new_update->refname` is a > copy of `referent`. The difference is, this copy will be freed, and it > will be freed *after* `affected_refnames` has been cleared. > > Rearrange the handling of `referent`, so that we don't add it directly > to `affected_refnames`. Instead, first just check whether `referent` > exists in the string list, and later add `new_update->refname`. Coincidentally[1] I came across this same leak, and my solution ended up slightly different. I'll share it here in case it's of interest. In your solution we end up searching the string list twice: once to see if we have the item, and then again to insert it. Whereas in the original we did both with a single search. But we can observe that either: 1. The item already existed, in which case our insert was a noop, and we're good. or 2. We inserted it, in which case we proceed with creating new_update. We can then in O(1) replace the pointer in the string list item with the storage in new_update. We know we're not violating any string_list invariants because the strings contain the same bytes. I.e.: diff --git a/refs/files-backend.c b/refs/files-backend.c index 9266f5ab9d..1d16c1b33e 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2178,6 +2178,11 @@ static int split_symref_update(struct files_ref_store *refs, update->flags |= REF_LOG_ONLY | REF_NODEREF; update->flags &= ~REF_HAVE_OLD; + /* +* Re-point at the storage provided by our ref_update, which we know +* will last as long as the affected_refnames list. +*/ + item->string = new_update->refname; item->util = new_update; return 0; It feels pretty dirty, though. It would certainly be a bug if we ever decided to switch affected_refnames to duplicate its strings. So given that your solution is only a constant-time factor worse in efficiency, we should probably prefer it as the more maintainable option. -Peff [1] It's not really a coincidence, of course. All the recent leak discussion has got both of us prodding at Git with various tools. :)
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
On 30 August 2017 at 04:52, Michael Haggertywrote: > v3 looks good to me. Thanks! > > Reviewed-by: Michael Haggerty Thank _you_ for very helpful feedback on the earlier versions. Martin
Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
v3 looks good to me. Thanks! Reviewed-by: Michael HaggertyMichael