Richard, All -

It occurs to me that it would be nice and simple if we could say that, for each new repo, the history is:

* the relevant subset of the incubator repo history
* followed by a `git mv <subdir>/* .` with a message saying "migration from incubator to new multi-repo structure complete"

From an audit perspective it is easy to guarantee that nothing spurious has been added, and we guarantee the provenance of all code in the new repos.
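
As a rough sketch of that last step and the audit check it enables (the helper names `finish_migration` and `is_pure_rename_commit` are made up for illustration, and this assumes everything in the split repo lives under one subdirectory):

```shell
# Hypothetical helper: make the final "flattening" commit in a split repo.
# Run from the repo root; $1 is the subdirectory holding all the files.
finish_migration() {
  subdir="$1"
  git mv "$subdir"/* . &&
    git commit -q -m "migration from incubator to new multi-repo structure complete"
}

# Hypothetical audit check: succeeds only if HEAD changed nothing except
# exact (100% similarity) renames relative to its parent.
is_pure_rename_commit() {
  ! git diff --name-status -M100% HEAD~1 HEAD | grep -qv '^R100'
}
```

If `is_pure_rename_commit` passes on the tip of a new repo, the final commit added or changed no content, so everything in the repo traces back to the incubator history.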

I've merged Alex's PR#1 and a few bug fixes of my own. The results are
looking better now; I've re-pushed to all my TEMP-* repositories for
people to take a look at the results.

On 7 December 2015 at 01:19, Alex Heneveld
<[email protected]> wrote:
* script to clean out big binary litter files (refer to notes in an email a
few weeks back, from Svet?)
I've prototyped a few options for that - they are commented out in
`split.sh` to speed up the processing just at the moment.
I misread this -- cool.

John McCabe also suggested that when we delete a binary artifact, we
drop an empty file <foobar.jar.BINARY.FILE.DELETED> to leave a clue
for anyone working in the history as to why a build isn't working.
I don't think there would be any benefit in that. It is ancient stuff. Also, it would prevent us from making the "subset" statement at the top of this mail.

* fix pom files on result of `rearrange-incubator.sh` script so it builds
(make this a diff / git cherry-pick we can just apply once all PRs are
merged?)
John also said to me that he would take a look at this problem. He was
temporarily prevented from doing so by a bug in the split script which
broke TEMP-brooklyn-server, but that dodgy repo should now be fixed.
John - any update?

* adjust move-w-history for new structure and whitelist
The problem with doing a move (rename) before splitting is that it
does obscure the history. I tried this at first, and the "Commits"
view on GitHub showed only as far as the rename. That's also the
default behaviour for `git log` - that has a `--follow` option but the
doc suggests that it only works when viewing the log for a single
file.
Correct re `--follow`. However, without this the folder structure becomes confused post-rename, especially wrt history, and git rebases won't work. I think it's better to accept the fact that we'll need to use `--follow` for (ancient) history.
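
For anyone who wants to see the `--follow` behaviour for themselves, here is a small sketch in a throwaway repository. The directory name, file name, and demo identity are all invented for the example; nothing here touches the real repos.

```shell
# Build a throwaway repo: one commit under a subdirectory, then a move to the root.
demo_follow() {
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    git init -q .
    # Demo-only identity so the commits work without any global git config.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
    mkdir subdir
    echo "hello" > subdir/file.txt
    git add subdir/file.txt
    git commit -q -m "add file in subdir"
    git mv subdir/file.txt file.txt
    git commit -q -m "move file to root"
    # Count the commits each form of `git log` reports for the moved file.
    echo "plain:  $(git log --oneline -- file.txt | wc -l | tr -d ' ')"
    echo "follow: $(git log --oneline --follow -- file.txt | wc -l | tr -d ' ')"
  )
  rm -rf "$repo"
}
```

Calling `demo_follow` should report one commit for the plain log (it stops at the rename) and two with `--follow`, which matches what the "Commits" view showed.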

Anything else?
I have wondered about rewriting the commit messages that say "This
closes #1234" to say "This closes
https://github.com/apache/incubator-brooklyn/pull/1234" - that way
when the commit history is viewed in GitHub the pull request links
continue to work.

`git filter-branch` has a `--msg-filter` option that can do this.
Interesting idea. A related idea is for every commit to add a message referring to the original commit ID for tracking.
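
A minimal sketch of what both message rewrites might look like. The function names and the "Original commit:" wording are invented here; `GIT_COMMIT` is the variable `git filter-branch` sets to the ID of the commit being rewritten.

```shell
# Rewrite short PR references to full URLs. Reads a commit message on stdin.
rewrite_pr_links() {
  sed -E 's|This closes #([0-9]+)|This closes https://github.com/apache/incubator-brooklyn/pull/\1|g'
}

# Pass the message through unchanged, then append the pre-rewrite commit ID.
append_original_id() {
  cat
  echo
  echo "Original commit: $GIT_COMMIT"
}

# Possible use. This rewrites history, so try it on a scratch clone first.
# The filter body is inlined because filter-branch evaluates it in its own
# shell, where the functions above are not defined:
#   git filter-branch --msg-filter '
#     sed -E "s|This closes #([0-9]+)|This closes https://github.com/apache/incubator-brooklyn/pull/\1|g"
#     echo; echo "Original commit: $GIT_COMMIT"
#   ' -- --all
```

The two functions can be tried on their own by piping a sample message through them, without running `filter-branch` at all.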

But in both cases I tend to feel it's not worth the effort. We wouldn't use them often and when we do it's easy enough to apply the change. (Also I'm not sure that incubator-brooklyn will still be mirrored at apache.)


Best
Alex
