Re: RFC: new git-splice subcommand for non-interactive branch splicing

2016-05-28 Thread Adam Spiers
On Sat, May 28, 2016 at 09:06:59AM +0200, Johannes Schindelin wrote:
> Hi Adam,
> 
> please reply-to-all on this list.

Sorry, I forgot that was the policy here.  Every list and individual
has different preferences on whether to Cc: on list mail, so I find it
almost impossible to keep track of who prefers what :-/

> On Fri, 27 May 2016, Adam Spiers wrote:
> > My feeling is that rebase -i provides something tremendously
> > important, which the vast majority of users use on a regular basis,
> > but that git is currently missing a convenient way to
> > *non-interactively* perform the same magic which rebase -i
> > facilitates.
> 
> Would it not make sense to enhance `rebase -i`, then?

You mean enhance it to support non-interactive usage?  That wouldn't
make much sense to me, given that -i is short for --interactive.  Even
if we added a new non-interactive rebase mode which let you edit the
commits prior to rebasing them, I can't imagine how it would need to
be any different to how non-interactive rebase -i currently works,
i.e. setting GIT_SEQUENCE_EDITOR to a non-interactive command which
modifies the rebase todo file passed via $1.
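
A minimal sketch of that mechanism, assuming the todo file uses the default "pick <abbreviated-hash> <subject>" lines. The helper name drop_from_todo and the hash abc1234 are made up for illustration; only GIT_SEQUENCE_EDITOR and the todo file format come from git itself:

```shell
#!/bin/sh
# Hypothetical helper (not part of git): delete the todo line for one commit.
#   $1 = abbreviated commit hash as it appears in the todo file
#   $2 = path to the rebase todo file
drop_from_todo() {
    # Keep every line except the "pick <hash>..." line for the given hash.
    sed "/^pick $1/d" "$2" > "$2.tmp" && mv "$2.tmp" "$2"
}

# Inside a real repository the same edit can be passed straight to rebase,
# which appends the todo file path to the editor command (not run here):
#   GIT_SEQUENCE_EDITOR="sed -i.bak -e '/^pick abc1234/d'" git rebase -i abc1234^
```

The function form only exists so that the todo edit can be tried on a plain text file without touching any repository.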

Or maybe you are suggesting to enhance it to perform operations on the
rebase todo list, such as removing a commit from the todo list, or
moving a commit to a different position?  But that sounds like scope
creep to me; IMHO it would be cleaner for rebase -i to remain an
unopinionated platform for history editing, rather than to make
assumptions about common history editing workflows.  I think those
assumptions belong in higher-level porcelain tools.

Or if you have some other enhancement in mind, please share details!

> I, for one, plan to port my Git garden shears (at least partially) once I
> managed to get my rebase--helper work in. The shears script is kind of a
> "--preserve-merges done right":
> 
> https://github.com/git-for-windows/build-extra/blob/master/shears.sh
> 
> It allows you to (re-)create a branch structure like this:
> 
> bud
> pick cafebabe XYZ
> pick 01234567 Another patch
> mark the-first-branch
> 
> bud
> pick 98765432 This patch was unrelated
> mark the-second-branch
> 
> bud
> merge the-first-branch
> merge the-second-branch
> 
> Of course, this is interactive.

Interesting approach; thanks for sharing.  At a first glance, this
does sound similar to what topgit and gitwork are trying to achieve.
I don't entirely understand it yet, however; it's hard to without
knowing more about the structure of Git for Windows' integration
branch and seeing a concrete example and/or comprehensive
documentation.  For example, it's not clear to me how
generate_script() works, or how it lets you modify just one of the
topic "sub-branches" and then automatically update all other
sub-branches which depend on it.

> But quite frankly, you want to be able to
> perform quite complicated stuff

Why do you say that?  Splice and transplant operations are
conceptually very straightforward, and not even particularly hard to
implement.  git-splice only took me a day or so, and I expect
git-transplant to be quicker.

The only complicated thing I want to implement is git-explode, but if
I have splice and transplant "primitives" available, then it should be
quite easy to implement git-explode as a series of splice/transplant
operations.

> and I think the command-line offers only an inadequate interface for
> this.

Please can you give an example where it would be inadequate?

> > I suspect the most popular use-case in the short term would be the
> > infamous "oops, I only just noticed that I put that commit on the
> > wrong branch, and now there's already a whole bunch of other commits
> > on top of it".
> 
> I have two workflows for that. The simpler one:
> 
> git checkout other-branch
> git commit
> git checkout @{-1}
> 
> Sometimes I need to call `git stash -p` before, and `git stash apply`
> after those calls.
> 
> The more complicated one comes in handy when a complete rebuild takes a
> long time, and branch switching would trigger a rebuild:
> 
> # Here, I stash what I *want* on the other branch
> git stash -p
> git worktree add throwaway other-branch
> cd throwaway
> git stash apply
> git commit

Neither of these workflows works with the scenario I described.  In my
scenario, the commit is already buried beneath other commits.
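
For reference, the manual steps for the buried-commit scenario can be sketched roughly as follows. This is only an illustration built from stock cherry-pick and rebase --onto, not the git-splice or git-transplant implementation; the function name move_buried_commit is hypothetical, and it copies the commit to the tip of the destination branch rather than into its middle:

```shell
#!/bin/sh
# Hypothetical helper, not part of git: move one buried commit from the
# current branch onto the tip of another branch.
#   $1 = commit to move, $2 = destination branch
move_buried_commit() {
    commit=$(git rev-parse --verify "$1^{commit}") || return 1
    src=$(git symbolic-ref --short HEAD) || return 1
    # Copy the commit onto the destination branch first ...
    git checkout -q "$2" &&
    git cherry-pick "$commit" >/dev/null &&
    # ... and only then remove it from the source branch, by replaying
    # everything after it onto its parent.
    git checkout -q "$src" &&
    git rebase -q --onto "$commit^" "$commit" >/dev/null
}
```

If either step hits a conflict, the function stops and leaves the repository in the usual cherry-pick or rebase conflict state, which is exactly the orchestration burden a single command would take over.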

> I did use the approach you proposed a couple of times: just commit in the
> middle, and sort things out later. The problem: I frequently forgot, and
> if I did not, reordering the commits resulted in stupid and avoidable
> merge conflicts.

I think you are misunderstanding me - I am not proposing this
workflow; in fact that would be stupid because it's already in common
usage.  But I'm not even advocating it.  I'm saying:

  - I do it regularly by accident (since when I am in the hacking
zone, I am usually focused on the code, not on branch maintenance).

  - When I eventually realise I've done it, I need to go back afterwards
and 

Re: RFC: new git-splice subcommand for non-interactive branch splicing

2016-05-28 Thread Johannes Schindelin
Hi Adam,

please reply-to-all on this list.

On Fri, 27 May 2016, Adam Spiers wrote:

> My feeling is that rebase -i provides something tremendously
> important, which the vast majority of users use on a regular basis,
> but that git is currently missing a convenient way to
> *non-interactively* perform the same magic which rebase -i
> facilitates.

Would it not make sense to enhance `rebase -i`, then?

I, for one, plan to port my Git garden shears (at least partially) once I
managed to get my rebase--helper work in. The shears script is kind of a
"--preserve-merges done right":

https://github.com/git-for-windows/build-extra/blob/master/shears.sh

It allows you to (re-)create a branch structure like this:

bud
pick cafebabe XYZ
pick 01234567 Another patch
mark the-first-branch

bud
pick 98765432 This patch was unrelated
mark the-second-branch

bud
merge the-first-branch
merge the-second-branch

Of course, this is interactive. But quite frankly, you want to be able to
perform quite complicated stuff, and I think the command-line offers only
an inadequate interface for this.

> I suspect the most popular use-case in the short term would be the
> infamous "oops, I only just noticed that I put that commit on the
> wrong branch, and now there's already a whole bunch of other commits
> on top of it".

I have two workflows for that. The simpler one:

git checkout other-branch
git commit
git checkout @{-1}

Sometimes I need to call `git stash -p` before, and `git stash apply`
after those calls.

The more complicated one comes in handy when a complete rebuild takes a
long time, and branch switching would trigger a rebuild:

# Here, I stash what I *want* on the other branch
git stash -p
git worktree add throwaway other-branch
cd throwaway
git stash apply
git commit

I did use the approach you proposed a couple of times: just commit in the
middle, and sort things out later. The problem: I frequently forgot, and
if I did not, reordering the commits resulted in stupid and avoidable
merge conflicts.

> > > In the longer term however, I'd like to write two more subcommands:
> > > 
> > >   - git-transplant(1) which wraps around git-splice(1) and enables
> > > easy non-interactive transplanting of a range of commits from
> > > one branch to another.  This should be pretty straightforward
> > > to implement.
> > 
> > This is just cherry-pick with a range...
> 
> No it's not:
> 
>   - git-transplant would be able to splice commits from one branch
> *into* (i.e. inside, *not* onto) another branch.

Okay, but in case of merge conflicts, you still have to switch to the
other branch, right?

>   - git-transplant would also take care of removing the commits from
> the source branch, but not before they were safely inside the
> destination branch.

That assumes a workflow where you develop on one big messy branch and
later sort it out into the appropriate, separate branches, right? I admit
that I used to do that, too, but ever since worktrees arrived, I do not do
that anymore: it resulted in too much clean-up work. Better to put the
commits into the correct branch right away. Of course, that is just *my*
preference.

>   - git-transplant would orchestrate the whole workflow with a single
> command, complete with --abort and --continue.

cherry-pick also sports --abort and --continue.

> > >   - git-explode(1) which wraps around git-transplant(1) and
> > > git-deps(1), and automatically breaks a linear sequence of commits
> > > into multiple smaller sequences, forming a commit graph where
> > > ancestry mirrors commit dependency, as mentioned above.  I expect
> > > this to be more difficult, and would probably write it in Python.
> > 
> > You mean something like Darcs on top of Git. Essentially, you want to end
> > up with an octopus merge of branches whose commits would conflict if
> > exchanged.
> 
> Something like that, yes, but it's not as simple as a single octopus
> merge.  It would support arbitrarily deep DAGs of topic branches.

Yes, of course. Because

A - B - C - D

might need to resolve into

A - M1 - C - M3
  X/
B - M2 - D

> > I implemented the logic for this in a shell script somewhere, so it is not
> > *all* that hard (Python not required). But I ended up never quite using it
> > because it turns out that in practice, the commit "dependency" (as defined
> > by the commit diffs) does not really reflect the true dependency.
> >
> > For example,
> 
> [snipped examples]
> 
> Sure - I already covered this concern in footnote [0] of my previous
> mail; maybe you missed that?

I think it deserves more prominent a place than a footnote.

> > So I think that this is a nice exercise, but in practice it will
> > require a human to determine which commits really depend on each
> > other.
> 
> Of course - this is 

Re: RFC: new git-splice subcommand for non-interactive branch splicing

2016-05-27 Thread Adam Spiers
Hi Johannes,

Thanks for the quick reply!  Responses inline below:

On Fri, May 27, 2016 at 05:27:14PM +0200, Johannes Schindelin wrote:
> On Fri, 27 May 2016, Adam Spiers wrote:
> 
> > Description
> > -----------
> > 
> > git-splice(1) non-interactively splices the current branch by removing
> > a range of commits from within it and/or cherry-picking a range of
> > commits into it.  It's essentially just a glorified wrapper around
> > cherry-pick and rebase -i.
> 
> It sounds as if you could accomplish the same with
> 
>   git checkout -b former-commits 
>   git checkout -b latter-commits 
>   git cherry-pick ..HEAD@{2}

Not really - that is missing several features which git-splice
provides, e.g.

  - The ability to remove a non-consecutive list of commits
from the branch.

  - The ability to insert commits at the same time as removing
(granted, that's just an extra cherry-pick in your method, but again
that's another thing to orchestrate).

  - The ability to specify commits to remove / insert using
arguments understood by git-rev-list.

  - The patch-id magic which is built into git-rebase.  This
would kick in if any of the commits to insert are already
in ..HEAD@{2} (using your reference terminology).

  - A single command to orchestrate the whole workflow, including
cleanup, and --abort and --continue when manual conflict
resolution is required.  This modularity should help a lot when
building further tools which wrap around it in order to perform
more complex tasks.

This last point is perhaps the most important.  Of course it's
possible to do this manually already.  But the whole point of
git-splice is to automate it in a convenient and reliable manner.
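
As an illustration of the first and third points above, here is a rough sketch of how a list of commits resolved by git-rev-list could be removed from a rebase todo file in one pass. It is an assumption about one possible approach, not the actual git-splice code; the name filter_todo is hypothetical. The todo file carries abbreviated hashes while rev-list prints full ones, so the match is by prefix:

```shell
#!/bin/sh
# Hypothetical filter, not part of git-splice.
#   $1       = path to a rebase todo file
#   $2 ...   = full commit hashes to drop, e.g. $(git rev-list A..B)
# Prints the todo file without the "pick" lines for those commits.
filter_todo() {
    todo=$1
    shift
    awk -v drop="$*" '
        BEGIN { n = split(drop, full, " ") }
        $1 == "pick" {
            for (i = 1; i <= n; i++)
                if (index(full[i], $2) == 1) next   # todo hash is a prefix
        }
        { print }
    ' "$todo"
}
```

Because the hashes to drop are looked up individually, they do not need to be consecutive in the branch.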

> > Next steps, and the future
> > --------------------------
> > 
> > Obviously, I'd welcome thoughts on whether it would make sense to
> > include this in the git distribution.
> 
> Far be it from me to discourage you from working on these scripts, but I
> think that a really good place for such subcommands is a separate
> repository, as you
> have it already. There are already some rarely used subcommands in
> libexec/git-core/ cluttering up the space and I would be reluctant to add
> even more subcommands to the default Git installation delivered to every
> user.

Sure, I appreciate the difficulty in deciding where to draw the line.
My feeling is that rebase -i provides something tremendously
important, which the vast majority of users use on a regular basis,
but that git is currently missing a convenient way to
*non-interactively* perform the same magic which rebase -i
facilitates.  And removing / reordering commits is surely one of the
most common use cases of rebase -i, so I think a lot of people could
benefit from some porcelain to automate that and allow building
higher-level tools on top of it.

I suspect the most popular use-case in the short term would be the
infamous "oops, I only just noticed that I put that commit on the
wrong branch, and now there's already a whole bunch of other commits
on top of it".  I would expect that reducing this solution to a single
git-transplant(1) command would be pretty attractive for a lot of
people.  And of course GUIs / IDEs could incorporate it into their
more beautiful front-ends.  However, if it's not in git core, that's
unlikely to happen.

> You can *always* just extend the PATH so that git-splice can be found;
> then `git splice ...` will do exactly what you want. That is e.g. how
> git-flow works.

Sure, I've been using that trick since at least 2009 ;-) [0]

> (Of course I hope that you will maintain your scripts
> much, much better than git-flow, i.e. not abandon all users).

I hope so too ;-)

> > In the longer term however, I'd like to write two more subcommands:
> > 
> >   - git-transplant(1) which wraps around git-splice(1) and enables
> > easy non-interactive transplanting of a range of commits from
> > one branch to another.  This should be pretty straightforward
> > to implement.
> 
> This is just cherry-pick with a range...

No it's not:

  - git-transplant would be able to splice commits from one branch
*into* (i.e. inside, *not* onto) another branch.

  - git-transplant would also take care of removing the commits from
the source branch, but not before they were safely inside the
destination branch.

  - git-transplant would orchestrate the whole workflow with a single
command, complete with --abort and --continue.
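
To make the "into, not onto" distinction concrete, the following sketch inserts a range of commits after a given point inside a branch using only existing commands. It is an assumption about how this can be done by hand, not the planned git-transplant; the name insert_after is hypothetical, and it does none of the source-branch cleanup or --abort/--continue handling described above:

```shell
#!/bin/sh
# Hypothetical helper, not part of git: insert the commits in range $2 into
# branch $3 immediately after commit $1 (which must be an ancestor of $3).
insert_after() {
    point=$(git rev-parse --verify "$1^{commit}") || return 1
    # Build the new middle section on a detached HEAD at the insertion point.
    git checkout -q --detach "$point" &&
    git cherry-pick "$2" >/dev/null &&
    new_base=$(git rev-parse HEAD) &&
    # Replay the rest of the destination branch on top of the inserted commits.
    git rebase -q --onto "$new_base" "$point" "$3" >/dev/null
}
```

A plain cherry-pick of the same range would instead append the commits at the tip of the destination branch.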

> >   - git-explode(1) which wraps around git-transplant(1) and
> > git-deps(1), and automatically breaks a linear sequence of commits
> > into multiple smaller sequences, forming a commit graph where
> > ancestry mirrors commit dependency, as mentioned above.  I expect
> > this to be more difficult, and would probably write it in Python.
> 
> You mean something like Darcs on top of Git. Essentially, you want to end
> up with an octopus merge of branches whose commits would conflict if
> 

Re: RFC: new git-splice subcommand for non-interactive branch splicing

2016-05-27 Thread Johannes Schindelin
Hi Adam,

On Fri, 27 May 2016, Adam Spiers wrote:

> Description
> -----------
> 
> git-splice(1) non-interactively splices the current branch by removing
> a range of commits from within it and/or cherry-picking a range of
> commits into it.  It's essentially just a glorified wrapper around
> cherry-pick and rebase -i.

It sounds as if you could accomplish the same with

git checkout -b former-commits 
git checkout -b latter-commits 
git cherry-pick ..HEAD@{2}

> Next steps, and the future
> --------------------------
> 
> Obviously, I'd welcome thoughts on whether it would make sense to
> include this in the git distribution.

Far be it from me to discourage you from working on these scripts, but I
think that a really good place for such subcommands is a separate
repository, as you
have it already. There are already some rarely used subcommands in
libexec/git-core/ cluttering up the space and I would be reluctant to add
even more subcommands to the default Git installation delivered to every
user.

You can *always* just extend the PATH so that git-splice can be found;
then `git splice ...` will do exactly what you want. That is e.g. how
git-flow works. (Of course I hope that you will maintain your scripts
much, much better than git-flow, i.e. not abandon all users).
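
A small sketch of that PATH mechanism, using a throwaway subcommand. The name git-hello and the temporary directory are made up for illustration; the only thing relied on is that git runs an executable called git-<name> found on the PATH when asked for `git <name>`:

```shell
#!/bin/sh
# Create a private directory holding one custom subcommand.
bindir=$(mktemp -d)
cat > "$bindir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a custom subcommand"
EOF
chmod +x "$bindir/git-hello"

# Put the directory on the PATH; from now on `git hello` runs the script.
PATH="$bindir:$PATH"
export PATH
```

Installing git-splice the same way only needs the script to be executable and its directory to be listed in PATH.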

> In the longer term however, I'd like to write two more subcommands:
> 
>   - git-transplant(1) which wraps around git-splice(1) and enables
> easy non-interactive transplanting of a range of commits from
> one branch to another.  This should be pretty straightforward
> to implement.

This is just cherry-pick with a range...

>   - git-explode(1) which wraps around git-transplant(1) and
> git-deps(1), and automatically breaks a linear sequence of commits
> into multiple smaller sequences, forming a commit graph where
> ancestry mirrors commit dependency, as mentioned above.  I expect
> this to be more difficult, and would probably write it in Python.

You mean something like Darcs on top of Git. Essentially, you want to end
up with an octopus merge of branches whose commits would conflict if
exchanged.

I implemented the logic for this in a shell script somewhere, so it is not
*all* that hard (Python not required). But I ended up never quite using it
because it turns out that in practice, the commit "dependency" (as defined
by the commit diffs) does not really reflect the true dependency.

For example, in my work to move large parts of rebase -i into a builtin, I
have an entire series of commits that do nothing else but prepare the
sequencer for rebase -i's functionality. Most of these commits touch
completely separate parts of the code, so they would make independent
branches in your git-explode command. Yet, that would destroy the story
that the patch series tells, as the natural flow would get lost.

Another major complication is that sometimes the "dependency as per the
diff" is totally bogus. Take for example Git for Windows' patches on top
of Git: there are a couple "sub branches" that add global variables to
environment.c. By the logic of the overlapping (or touching) hunks, these
sub branches should build on top of each other, right? But they are
logically completely independent.

So I think that this is a nice exercise, but in practice it will require a
human to determine which commits really depend on each other.

> Eventually, the utopia I'm dreaming about would become a reality and
> look something like this:
> 
> git checkout -b new-feature
> 
> while in_long_frenzied_period_of_hacking; do
> # don't worry too much about branch maintenance here, just hack
> git add ...
> git commit ...
> done
> 
> # Break lots of commits from new-feature into new topic branches:
> git explode
> 
> # List topic branches
> git work list

You would render me *really* impressed if you could come up with an
automated way to determine logical dependencies between patches.

Ciao,
Johannes


RFC: new git-splice subcommand for non-interactive branch splicing

2016-05-27 Thread Adam Spiers
Hi all,

I finally got around to implementing a new git subcommand which I've
wanted for quite a while.  I've called it git-splice.

Description
-----------

git-splice(1) non-interactively splices the current branch by removing
a range of commits from within it and/or cherry-picking a range of
commits into it.  It's essentially just a glorified wrapper around
cherry-pick and rebase -i.

Usage
-----

Examples:

# Remove commit A from the current branch
git splice A^!

# Remove commits A..B from the current branch
git splice A..B

# Remove commits A..B from the current branch, and cherry-pick
# commits C..D at the same point
git splice A..B C..D

# Cherry-pick commits C..D, splicing them in just after commit A
git splice A C..D

# Remove first commit mentioning 'foo', and insert all commits
# in the 'elsewhere' branch which mention 'bar'
git splice --grep=foo -n1 HEAD -- --grep=bar HEAD..elsewhere

# Abort a splice which failed during cherry-pick or rebase
git splice --abort

# Resume a splice after manually fixing conflicts caused by
# cherry-pick or rebase
git splice --continue

N.B. Obviously this command rewrites history!  As with git rebase,
you should be aware of all the implications of history rewriting
before using it.

Code
----

Currently this is in alpha state:

  https://github.com/git/git/compare/master...aspiers:splice

and I reserve the right to rewrite the history of that branch in the
near future ;-)

I realise that the code does not yet conform to the coding standards
of the git project.  For example, it relies on non-POSIX bash
features, like arrays.  I would be happy to fix this if there is a
chance git-splice might be accepted for inclusion within the git
distribution.  (Presumably contrib/ is another possibility.)
Also, I haven't yet written a proper man page for it.
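
For what it is worth, one common way to avoid bash arrays in POSIX sh is to reuse the positional parameters as the list. The sketch below is a generic illustration of that idiom, not code taken from git-splice; the function name collect_args and the sample items are hypothetical:

```shell
#!/bin/sh
# Collect items without bash arrays by reusing the positional parameters.
collect_args() {
    set --                      # start with an empty list
    for item in "A..B" "C..D" "--grep=foo bar"; do
        set -- "$@" "$item"     # append, preserving embedded spaces
    done
    printf '%s\n' "$#"          # number of collected items
    printf '<%s>\n' "$@"        # one item per line
}
```

The limitation is that a function has only one such list at a time, so code needing several independent lists has to be restructured around that.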

Motivation
----------

I wrote git-splice as the next step in the journey towards being able
to implement a tool which automatically (or at least
semi-automatically) splits a linear sequence of commits into a commit
graph where ancestry exactly mirrors commit "dependency".[0]  In other
words, in this commit graph, a commit B would have commit A as an
ancestor if and *only* if commit B cannot cleanly apply without A
already being present in the branch.  As a corollary, if commit F
depends on D and E, but D and E are mutually independent, F would
need to depend on a merge commit which contains D and E.
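
The "cannot cleanly apply without A" criterion can be approximated by hand with a trial cherry-pick. The sketch below is a deliberately crude stand-in for proper dependency detection (it is not how git-deps works); the name applies_without is hypothetical, and it assumes a clean working tree because it discards the trial result with a hard reset:

```shell
#!/bin/sh
# Hypothetical check, not part of git: does commit $2 apply cleanly on top
# of the parent of $1, i.e. without $1 being present?
# Returns 0 if it applies, 1 if it conflicts, 2 if the checkout failed.
applies_without() {
    orig=$(git symbolic-ref -q --short HEAD || git rev-parse HEAD)
    git checkout -q --detach "$1^" || return 2
    if git cherry-pick --no-commit "$2" >/dev/null 2>&1; then
        rc=0
    else
        rc=1
    fi
    git reset -q --hard         # discard the trial result
    git checkout -q "$orig"     # go back to where we started
    return $rc
}
```

A textual conflict is only a proxy for dependency: two commits can apply independently and still depend on each other logically, and vice versa.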

Such a tool could be useful for a few reasons.  Firstly, large patch
series are much harder to review than single commits or small patch
series, but typical development workflows often lead to large patch
series.

For example, if I work privately on a new feature for some hours /
days / weeks, I will typically amass a bunch of commits which are not
all directly related to the new feature: there are often refactorings,
fixes for bugs discovered during development of the new feature, etc.

I doubt I'm the only git user not disciplined enough to maintain neat
branch organization for the whole of a long period of hacking,
i.e. religiously maintaining one branch per bugfix, one branch per
refactoring, and one branch for the new feature.[1]  Typically, tidying
up the branches comes a bit later, when I want to start feeding stuff
upstream for review.

Therefore being able to reduce the effort involved with breaking a
large patch series into smaller related chunks seems potentially very
useful.

As well as making reviews smaller and easier, this allows both the reviews
and any corresponding CI to proceed in a more parallelized fashion.

Some review systems can implicitly discourage reviews of large patch
series, by treating each commit as a review in its own right and/or not
providing sophisticated support for patch series.  Gerrit is one
example; GitLab and GitHub are counter-examples.

I'm sure there are other use cases which I didn't think of yet.

Next steps, and the future
--------------------------

Obviously, I'd welcome thoughts on whether it would make sense to
include this in the git distribution.

In the longer term however, I'd like to write two more subcommands:

  - git-transplant(1) which wraps around git-splice(1) and enables
easy non-interactive transplanting of a range of commits from
one branch to another.  This should be pretty straightforward
to implement.

  - git-explode(1) which wraps around git-transplant(1) and
git-deps(1), and automatically breaks a linear sequence of commits
into multiple smaller sequences, forming a commit graph where
ancestry mirrors commit dependency, as mentioned above.  I expect
this to be more difficult, and would probably write it in Python.

Ideally, this tool would also be able to integrate with other
workflow management tools[1] in order to effectively create /
manage topic branches and track dependencies between them.

Eventually, the utopia I'm dreaming about would become a