Re: Implicit keep-alive after reintegrate merge

Julian Foad Mon, 16 Jan 2012 10:36:25 -0800

Hi Paul.

Here's an updated patch, which is both simpler (now that I learnt more about 
how the existing code works) and contains tests that demonstrate the 
functionality (merge_reintegrate_tests.py 18 and 19).


I think merge_reintegrate_tests.py 19 demonstrates it better than 1000 words so 
I'll leave it at that for today.

I'll respond to your last response (below) maybe tomorrow.


- Julian



----- Original Message -----
> From: Paul Burba <[email protected]>
> To: Julian Foad <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Sent: Wednesday, 11 January 2012, 19:57
> Subject: Re: Implicit keep-alive after reintegrate merge
> 
> On Wed, Jan 11, 2012 at 7:43 AM, Julian Foad <[email protected]> 
> wrote:
>>  Hi Paul.  Thanks for your thoughts.
>> 
>>  Paul Burba wrote:
>> 
>>>  Julian Foad wrote:
>>>>   We can say for sure that when we reintegrated B to A (in A:40), 
> that
>>>>  will have added new mergeinfo on A describing merges from B.  
> However,
>>>>  if change A:40 had instead been a different merge into A, let's 
> say
>>>>  from C, it is still possible that merge might have brought along 
> some
>>>>  new mergeinfo describing merges from B, because of the way 
> mergeinfo
>>>>  is propagated from branch to branch.  Therefore, if we find that
>>>>  change A:40 adds new mergeinfo about merges from B, we cannot 
> simply
>>>>  say that A:40 describes a reintegrate merge from B.
>>> 
>>>  Absolutely!  There's also the similar case where 'B' is 
> reintegrated
>>>  to 'A', additional edits are made to 'A', and then the 
> whole thing is
>>>  committed as r40. Now r40 represents both the reintegrate change and
>>>  unrelated edits -- but of course the smallest unit mergetracking works
>>>  with is a revision, so how does it classify r40?
>> 
>>  Actually, I don't think we need to be concerned about the "mixture 
> of merges and new changes" scenario, and here's why.  As we know, 
> merging is not a deterministic operation, and requires user input in general 
> [1].  When the user does a merge and commits the resulting working copy, the 
> result is tracked as a merge.  There's no conceptual way Subversion could 
> distinguish the two kinds of change without the user telling it.  The way to 
> tell Subversion about two separate changes is by putting them in two separate 
> revisions [2].
> 
> Hi Julian,
> 
> I agree that users making changes in addition to a merge in a single
> commit are asking for problems.  But the underlying problem of
> detection doesn't seem any different from what you described as:
> 
> "However, if change A:40 had instead been a different merge into A,
> let's say from C, it is still possible that merge might have brought
> along some new mergeinfo describing merges from B, because of the way
> mergeinfo is propagated from branch to branch.  Therefore, if we find
> that change A:40 adds new mergeinfo about merges from B, we cannot
> simply say that A:40 describes a reintegrate merge from B.  We need to
> look more closely.  That's what I'm currently working on."
> 
> Maybe I need to wait to see what your current work yields before
> commenting much more :-)  Nah, more comments below...
> 
>>  So, "Don't commit new changes with a merge" is a rule we have 
> to embrace if we want to make progress in defining more useful merge 
> behaviour.  
> Or rather the rule is a bit softer: "You shouldn't commit new changes 
> with a merge, because Subversion won't track the new changes separately but 
> will assume they are part of the merge conflict resolution."
>> 
>>  In fact the only cases in which Subversion's behaviour will actually be 
> affected by this rule are multi-path or cyclic merges, that is the cases 
> where 
> we want to avoid merging a change to a branch that already has "it".
>> 
>>  The only form of cyclic merge we've supported up till now has been the 
> reintegrate merge, and that is a special case because it is limited to cases 
> where it doesn't need to track individual changes, it simply compares at the 
> entire states of the two branches.  The effect is that it doesn't matter 
> whether the user included new changes along with any merges into the source 
> (child) branch.
>> 
>>  In a multi-path merge, such as the third step of A->B, A->C, B->C, 
> we don't yet support automatically skipping any changes coming via B that 
> have already been merged directly from A to C.  Subversion will still try to 
> merge that whole change from B to C, and the already-merged part of it will 
> conflict (logically if not physically).  The user's options are to avoid 
> merging that specific revision from B to C (perhaps by record-only merging 
> it), 
> or to  deal with that conflict.  If a new change had been committed along 
> with 
> that A->B merge, neither option successfully merges that new change to C.
>> 
>>  So the reintegrate merge doesn't care if you have mixed a merge with a 
> new change, and for the multi-path scenario the user already needs to keep 
> merges separate from new changes.
>> 
>>  Now, what about the case I'm addressing?  That is, the sync merge into 
> a branch that has already been reintegrated.  If the user wants to carry on 
> using the branch in this way, up till now it has been the user's 
> responsibility to tell Subversion to ignore the reintegrate merge commit, by 
> the 
> record-only merge known as the "keep-alive dance".  If there had been 
> a new change mixed up with that reintegration, the user would be telling 
> Subversion to ignore it.  So here again we already had the rule that you have 
> to 
> keep new changes separate.
>> 
>>  [1] Of course we have automatic text merge tools which give the user a good 
> guess at a likely
>>  answer, which often turns out to need no further editing in simple
>>  cases.
>> 
>>  [2] I don't believe it would be productive to invent a mechanism for
>>  recording two separate changes within one revision, and if we did the
>>  user would still have to tell Subversion about them.
>> 
>> 
>>>>   We need to look more closely.  That's what I'm currently 
> working on.
>>> 
>>>  This runs up against issue #2897 'Reflective merges are 
> faulty'.  You
>>>  seem to be acknowledging this with your comments in the patch itself:
>>> 
>>>    "Note that with all these improvements, the result might 
> approach a more
>>>    flexible merge system in which any reflected or cyclic changes are 
> skipped,
>>>    *BUT WOULD NOT COPE* (emphasis mine) with the case where a revision
>>>    in the source branch contains a mixture of changes merged from the 
> target
>>>    branch and other changes."
>>> 
>>>  But if we can't cope with that case how can we do anything 
> that's an
>>>  improvement over what we do today (i.e. just merge r40)?  We can't
>>>  skip r40 based on mergeinfo alone because there might be other
>>>  legitimate changes.
>> 
>>  There are two potential kinds of "other" changes.  New changes 
> mixed in with the merge are against the rules, as explained above.  Other 
> merges 
> mixed in with that merge are, I think, legitimate.
> 
> Again, I agree that the first case is really one where the user is
> asking for trouble, so I can live without solving that.  Thing is, we
> both agree the second type of change should be detected.  What puzzles
> me is that if we can reliably detect the second type it seems we could
> detect the first type too...
> 
>>  If we can calculate that r40 contains nothing but changes merged directly 
> from branch B, then the Right Thing is to skip it.  If it contains other 
> changes 
> (or if we can't determine for sure), then we can tell the user what's 
> happening (that it would be wrong to try to merge this change because it 
> contains "...") and we can either skip it or stop the merge.
>> 
>>  My contention is that that would be better than what we do today.  Note 
> that what we do today is unconditionally *attempt* to merge r40 even though 
> (with my new code) we could know for sure that it will conflict logically 
> (and 
> will probably but not necessarily conflict physically).
> 
> I agree, sorry if I came across as opposed to improvement in this
> space, I'm not.
> 
>>  Another way of looking at it is that when we merge a revision that will 
> logically conflict with the target, today the default action is to attempt to 
> merge it anyway, and my proposed action is to tell the user what's happening 
> and allow them to take corrective action in advance.  The user could choose a 
> different merge source, or perhaps realize that they were trying to merge 
> into 
> the wrong target branch, or something else.
>> 
>>  Does that make sense?
> 
> A random thought about your patch: The skip/stop logic kicks in when
> performing a cherry pick merge.  It might be better if it only applies
> when doing sync merges.  Reasoning: If a user explicitly states the
> revisions she wants to merge, I think we should assume she knows what
> she wants.  Also, if our detection method isn't foolproof we need a
> way for the user to merge a revision we want to skip (or we error out
> on), but we still want mergetracking active so using --ignore-ancestry
> isn't an option.  If the user is relying on Subversion to select the
> appropriate revisions (i.e. a sync merge) then your patch's behavior
> makes more sense.
> 
> Paul
>

merge-avoid-reflective-3.patch
Description: Binary data

Re: Implicit keep-alive after reintegrate merge

Reply via email to