Re: [darcs-users] darcs conflicts/dependencies -- is patch theory the place to start?

Stephen J. Turnbull Thu, 13 Sep 2012 01:41:50 -0700

AntC writes:
 > Stephen J. Turnbull <stephen <at> xemacs.org> writes:


 > Thank you Stephen, you've given me a lot to think about, so here's some
 > initial reactions ...

You're welcome.  I'm mostly going to indicate where I agree, not that
that makes it *right*, but you might like to know. :-)

 > >  >   If a repo were really just a set of patches,
 > >  >      managing duplicates would be easy(?).
[...]
 > What I meant by duplicates here is as discussed earlier in the
 > thread by Owen -- that is, pulling the same patch twice from a source repo.
 > 
 > What I meant by "... would be easy" is:
 > If the repo I'm pulling in to is just a set, and I pull a patch
 > twice, the VCS should be able to say: I've already got that, I'll
 > ignore it.

Yes, that should be the case.
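
A minimal sketch of that behaviour, assuming a repo really is nothing
but a set of patch identifiers.  The names `pull`, `repo` and the "p1"
style identifiers are hypothetical; this is not how Darcs stores or
pulls patches.

```python
def pull(repo, incoming):
    """Add patches from `incoming` to the set `repo`; return the new ones."""
    new = []
    for patch_id in incoming:
        if patch_id not in repo:
            # Only a patch we have never seen changes the repo.
            repo.add(patch_id)
            new.append(patch_id)
    return new


repo = {"p1", "p2"}
first = pull(repo, ["p2", "p3"])   # p2 is already present, p3 is new
second = pull(repo, ["p3"])        # pulling p3 again is a no-op
```

Under this model the second pull has nothing to do, which is the "I've
already got that, I'll ignore it" case.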

 > But the algorithm Owen (and Kevin) described seemed to be a lot
 > more difficult, and got messed up where there were conflicts.

If you're talking about the Create File F example, first, I can
summarize by saying that the algorithm described is related to what I
called branch management.

Second, this is a case where there's a lot of potential for patches
with the same textual effect but different semantics (at least for some
definitions of semantics).  For example, in XEmacs we have a primitive package
system, whose distribution we manage via Makefiles.  Most "upstream"
packages didn't have Makefiles at that time so this caused no problem
at first.  However we nowadays occasionally run into a case where an
upstream author adds a Makefile to perform activities completely
independent of XEmacs packaging.  I would intuitively consider those
to be "two different files which happen to have the same name", not
"different versions of the same Makefile", and thus argue that they
should get a Create/Create conflict.  The same thing would often be
true for NEWS or ChangeLog (viz. "debian-changelog" in the doc
hierarchy for each package).  On the other hand, suppose both sides
Create a ChangeLog with the same semantics (changes to the content of
the package, not to the packaging).  They might add different content
though, which should be a mergeable Change/Change conflict.
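
The distinction above can be sketched as a small classifier.  The
`purpose` tag is purely an assumption for illustration (no VCS I know of
records it); `Create` and `classify` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Create:
    path: str
    purpose: str   # hypothetical tag, e.g. "xemacs-packaging" vs "upstream-build"
    content: str


def classify(a, b):
    """Classify two Create operations arriving from different repos."""
    if a.path != b.path:
        return "independent"
    if a.purpose != b.purpose:
        # Same name, different files: should not be merged.
        return "create/create conflict"
    if a.content == b.content:
        return "duplicate"
    # Same file in spirit, different content: a candidate for merging.
    return "change/change conflict (mergeable)"


ours = Create("Makefile", "xemacs-packaging", "all:\n\tpackage\n")
theirs = Create("Makefile", "upstream-build", "all:\n\tcc foo.c\n")
log_a = Create("ChangeLog", "package-changes", "fix A\n")
log_b = Create("ChangeLog", "package-changes", "fix B\n")
```

Here `classify(ours, theirs)` reports a Create/Create conflict, while
`classify(log_a, log_b)` reports a mergeable Change/Change conflict.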

 > What I'm aiming for, though, is supporting 'cherry picking' --
 > which I see as the attractive feature of darcs over monotone, git,
 > etc. (Also supporting 'undoing' some patch back in history -- which
 > is just another variety of cherry-picking.)

Well, the DAG-oriented DVCSes already support cherry-picking.  What
they don't support is automatically grabbing the dependencies as well.
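
"Grabbing the dependencies as well" amounts to a transitive closure
over a dependency relation.  A rough sketch, assuming the dependencies
are already known and given as a plain dict (`depends_on` and
`with_dependencies` are hypothetical names, not Darcs or git APIs):

```python
def with_dependencies(patch, depends_on):
    """Return `patch` plus everything it transitively depends on."""
    needed, stack = set(), [patch]
    while stack:
        p = stack.pop()
        if p not in needed:
            needed.add(p)
            # Queue the direct dependencies of p for the same treatment.
            stack.extend(depends_on.get(p, ()))
    return needed


# C depends on B, B depends on A; D also depends on A but is unrelated to C.
depends_on = {"C": ["B"], "B": ["A"], "D": ["A"]}
picked = with_dependencies("C", depends_on)
```

Picking C pulls in B and A, but not D.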

However, once you define cherry-picking that way, undoing is *not* the
inverse of cherry-picking.  What the programmer wants with an undo is
that a patch and some subset of its dependencies be undone.  That
subset is not necessarily empty, and not unique.
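
To illustrate the non-uniqueness, here is a sketch that enumerates the
possible undo sets for one patch.  It assumes a simple model in which a
set of patches may be removed only if nothing left in the repo depends
on a removed patch; `closure` and `undo_candidates` are hypothetical
helpers, not anything Darcs provides.

```python
from itertools import combinations


def closure(patch, depends_on):
    """Return `patch` plus everything it transitively depends on."""
    seen, stack = set(), [patch]
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(depends_on.get(p, ()))
    return seen


def undo_candidates(patch, repo, depends_on):
    """List every set {patch} + some of its dependencies that can be removed."""
    deps = sorted(closure(patch, depends_on) - {patch})
    candidates = []
    for size in range(len(deps) + 1):
        for extra in combinations(deps, size):
            removed = {patch, *extra}
            # Valid only if nothing left behind depends on a removed patch.
            if all(removed.isdisjoint(depends_on.get(q, ()))
                   for q in repo - removed):
                candidates.append(removed)
    return candidates


repo = {"A", "B", "C"}
depends_on = {"B": ["A"], "C": ["B"]}
candidates = undo_candidates("C", repo, depends_on)
```

For this chain there are three valid answers to "undo C": C alone, C
with B, or all three.  Which one the programmer means is not something
the tool can decide.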

 > Arguably there could be different data models for different
 > document semantics. Perhaps we should be capturing the Abstract
 > Syntax Tree ? ;-)

I've argued in the past that that is the only way to resolve many of
the problematic conflicts, but it would get messy.

 > >  > This bites darcs with file moves (renames):
 > >  >    you can't detect a move vs. remove-then-add-then-paste-content.
 > > 
 > > (I still don't understand why anyone cares to detect this difference.
 > > The two are observationally equivalent in a single Darcs patch.  But
 > > that's just me.)
 > 
 > I was thinking about the AddAdd Makefile example that Owen pointed
 > to. If in the target repo the file has been moved/renamed, then any
 > hunk operation should apply to the moved file, not to whatever is
 > currently in the directory named "Makefile".

Sure, but the thing is that git can (in theory, I've not tried it and
I tend to doubt it's actually implemented) already do this by
detecting that before some historical change a particular blob of
content was named one thing and after that change it's named something
else, and apply the hunk operation to the relevant content.  If the
particular blob of content is too changed to be identified as a
rename, the apply will probably fail anyway.
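
A rough sketch of the idea of following content rather than names.
This uses Python's difflib as a stand-in similarity measure and is not
git's actual rename detection; `find_renamed`, the 0.6 threshold and
the file names are assumptions for illustration.

```python
from difflib import SequenceMatcher


def find_renamed(old_content, tree, threshold=0.6):
    """Return the path in `tree` most similar to `old_content`, or None."""
    best_path, best_score = None, threshold
    for path, content in tree.items():
        score = SequenceMatcher(None, old_content, content).ratio()
        if score >= best_score:
            best_path, best_score = path, score
    return best_path


old_makefile = "all:\n\tbuild-package foo\ninstall:\n\tcopy foo\n"
tree = {
    # A new, unrelated file that now occupies the old name.
    "Makefile": "check:\n\tpytest\n",
    # The original content, renamed and slightly edited.
    "Makefile.xemacs": "all:\n\tbuild-package foo\ninstall:\n\tcopy foo --verbose\n",
}
target = find_renamed(old_makefile, tree)
```

A hunk written against the old Makefile would then be applied to
`target`, and if nothing is similar enough the function returns None,
matching the "apply will probably fail anyway" case.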

Note that git could easily accommodate an extension to your
hunk-tracking idea by defining a file as a sequence of blobs rather
than a single blob.  I've always wanted to try to extend git (I don't
grok Haskell, unfortunately) to do that, with the "local blobs" being
high-level syntactic entities such as classes and functions.
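
As a toy illustration of "a file as a sequence of blobs", here is a
sketch that splits Python source into top-level functions and classes
and hashes each one separately.  It is not git's object format;
`syntactic_blobs` is a hypothetical name, SHA-256 is an arbitrary
choice, and top-level statements other than definitions are ignored.

```python
import ast
import hashlib


def syntactic_blobs(source):
    """Return (name, digest) pairs, one per top-level def or class."""
    blobs = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            text = ast.get_source_segment(source, node)
            digest = hashlib.sha256(text.encode()).hexdigest()
            blobs.append((node.name, digest))
    return blobs


v1 = "def f():\n    return 1\n\ndef g():\n    return 2\n"
v2 = "def g():\n    return 2\n\ndef f():\n    return 1\n"   # same defs, reordered
blobs1 = syntactic_blobs(v1)
blobs2 = syntactic_blobs(v2)
```

Reordering the functions changes the sequence but not the individual
digests, so a move inside the file is visible as a move rather than as
a delete plus an add.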

 > Perhaps the minimum to aim for is to detect when something's not
 > right, and help the programmer work through a resolution. I think
 > this means getting the preconditions tight, but not too tight(!)
 > -- topic for a research project?

I think investigating preconditions is one way forward.
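
For concreteness, the simplest precondition I can think of for a hunk
is "the lines it replaces are really there".  A minimal sketch, with
`apply_hunk` as a hypothetical helper rather than Darcs's actual hunk
code:

```python
def apply_hunk(lines, start, old, new):
    """Replace `old` with `new` at index `start`, but only if `old` is there."""
    if lines[start:start + len(old)] != old:
        # Precondition violated: report it instead of guessing.
        raise ValueError(f"precondition failed at line {start + 1}")
    return lines[:start] + new + lines[start + len(old):]


doc = ["a", "b", "c"]
patched = apply_hunk(doc, 1, ["b"], ["B"])

try:
    apply_hunk(doc, 1, ["x"], ["B"])
    failed = False
except ValueError:
    failed = True
```

Exact matching at a fixed position is on the tight end; a looser
precondition might search nearby offsets for `old`, at the cost of
occasionally applying the hunk in the wrong place.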

 > [Merging is] conceptually trivial maybe. But practically very
 > complex and intractable when it comes to conflicts and performance.

Well, I think one way forward is your suggestion to "export" the
analysis that Darcs currently does internally (to resolve as many
potential conflicts as possible) as a status report for the user to
consider and perhaps to resolve.  But this is going to be hard.

_______________________________________________
darcs-users mailing list
http://lists.osuosl.org/mailman/listinfo/darcs-users