On Tue, Jul 11, 2006 at 06:40:54PM -0700, Zack Weinberg wrote:
> I was thinking about using commit date as a further heuristic, i.e.
> when we have two LCAs neither of which is an ancestor of the other,
> merge the newest one first; furthermore, when we have three or more
> heads with the same LCA, merge the newest two first.
Absent other clearly obvious better choices (such as conflicting vs. non-conflicting), I like the simple predictability of merging in alpha-sorted revision-id order.

In particular, this is important to avoid a lot of merge fan-out. On a busy project with many developers syncing and merging at the same time, all with slightly different, mostly-overlapping sets of revisions, we want to minimise the number of additional intermediate merge nodes that will be created because different users merge subsets of nodes in different orders. Those merge nodes are only going to have to be merged again, creating a mostly pointless tangle.

As a simple dumb example, consider a minor modification of the present algorithm, one that doesn't use any of your additional smarts:

1. make a sorted list of heads
2. attempt to merge the first pair
3. if successful, start again with a new list (that's pretty much what we do now)
4. if not successful, move one slot down the list and try the next pair, rather than failing

I'm not advocating this change. It is by no means going to produce the best chance of a successful or least-manual-assistance merge, which is what you're trying to do. To illustrate my point, however, it will do a pretty good job of producing convergent sets of merge nodes amongst multiple people attempting to merge, while allowing more progress than we currently make.

Both objectives are important, and you need to consider the tradeoffs between them. I don't have a clear picture of what those tradeoffs might be, but I'm nervous that, between developers with partial views of each other's work, the nodes that diverged recently from their LCAs are perhaps the *least* likely nodes for them to have in common. A counterargument against any more eager merging algorithm is that if such merges are stopped at the first failure, there's another opportunity to sync and learn of more nodes before attempting again.
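For concreteness, the four steps above can be sketched roughly as follows. This is just an illustration, not monotone's actual code: merge_clean() here is a hypothetical stand-in for the real merge machinery, returning the merged revision-id on a clean merge and None when manual assistance would be needed.

```python
def merge_heads(heads, merge_clean):
    """Repeatedly merge adjacent pairs of heads in alpha-sorted order.

    Returns the surviving heads (a single element if fully merged).
    merge_clean(a, b) is assumed to return the new revision-id on a
    clean merge, or None if the merge would need manual help.
    """
    heads = set(heads)
    while len(heads) > 1:
        ordered = sorted(heads)              # step 1: sorted list of heads
        for i in range(len(ordered) - 1):
            a, b = ordered[i], ordered[i + 1]
            merged = merge_clean(a, b)       # step 2: try this pair
            if merged is not None:
                heads -= {a, b}              # step 3: restart with a new list
                heads.add(merged)
                break
            # step 4: otherwise move one slot down and try the next pair
        else:
            # no pair merged cleanly; stop here rather than looping forever
            break
    return sorted(heads)
```

Because every participant sorts the same revision-ids the same way, two people holding the same set of heads will attempt the same pairs in the same order, which is where the convergence comes from.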
If we merge eagerly in such cases, we're going to produce a more complex set of internal merge nodes and multiple re-merges before finally getting to a single head.

I'm not trying to discourage you, nor to suggest that having extra merge nodes is really something to be frightened of: just that we need to consider this too. For all I know, we can come up with a selection-order algorithm that will actually improve this situation. One strawman example that comes immediately to mind is to merge nodes with common author certs first, on the assumption that an author is more likely to know about their own revisions than about those of others.

> However, it
> seems like a huge pain to get from a revision_id to its commit date,
> and in fact I'm not sure the date cert is guaranteed to exist.

It's not certain to exist. It's not certain to be correct. It's not certain to be unique. It's not certain to represent time-of-commit; I have at least one case where I set the date according to the time the content was current, rather than the time I was later committing that record.

The most pertinent example here, though, is common merge nodes that have been created by different people; they'll have different dates on them and will be sorted differently by different viewers until the date certs meet up. I don't think it's a good idea.

--
Dan.
_______________________________________________
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel