Hello fellow Gentoo developers and subscribers of the gentoo-dev mailing list,
I've been wanting to write this email for a while but never got round to it for lack of motivation and time. It covers several topics, all revolving around git. I first want to go over some basic concepts about git and GitHub, and explain why we should be doing things differently if we want to avoid cluttering up our repository with useless commits.

* Background

As you know, a little while ago we migrated the main tree to git, the revision control tool which needs no introduction. A few months after the migration, our repository was mirrored to GitHub to give the project a bit more exposure to what some developers refer to as "the GitHub generation". The response from the community was extraordinary and, as a result, a massive number of Pull Requests came our way. We soon took on the duty of triaging and merging PRs, and started to make what the git world calls "merge commits".

Understanding merge commits requires understanding how GitHub considers a contribution. When a contributor sends a PR via GitHub, they essentially create a separate branch, work on it and eventually submit it. For those of you familiar with git, or who've already filed PRs on GitHub, this is old news. However, there are a number of different ways to deal with PRs on the receiving end (us) in order to keep a sane log history (a graph, actually).

When we first started working with git and GitHub, the tendency was to rely on merge commits to merge contributions back into the main repo. In my opinion this was, and still is, a bad idea. What's so special about merge commits?

* A short walk through merge commits

As you may know, merging one branch into another often results in creating a new commit. This commit is called a "merge commit" in git jargon. Let's pick for instance cf4cce36684de5e449ec60bde3421fa0e27bac74.
I'm not trying to put the blame on a particular developer: we've all used merge commits at one point or another, and I was one of the first! In the log graph, this commit is displayed as such:

$ git log --graph --oneline master
[snip]
* | | cf4cce3 Merge remote-tracking branch 'github/pr/1845'
|\ \ \
| * | | abf61de net-im/ejabberd: require <dev-lang/erlang-19
* | | | 72c688f app-cdr/xcdroast: remove old revisions
* | | | ced099c package.mask: update xcdroast p.mask
[snip]

The problem here is twofold. First off, we've created a commit which is pretty much meaningless. Merge commits often tell a story which says nothing interesting: Merge remote-tracking branch 'github/pr/1845'. OK, that's great, but do we care? Not really.

The second problem stems from the very nature of merge commits. A merge commit has two parents: the tip of master at the time of the merge, and the tip of the branch being merged. That branch, in turn, forked off whatever the tip of master was when it was created; in our case, this is when the contributor created the branch and started working on the contribution.

git log displays on the left-hand side what I shall call "rails" (no, I'm not a Ruby developer). A rail is essentially a path leading back from a merge commit through its parents. It is meant to be a visual aid to help you work out when two branches veered off and eventually got merged back together. As you might have noticed by running git log yourself in the Gentoo git repo and looking back 6 months or a year, there are rails all over the place, overlapping each other.

Why does it happen? As I just explained, each merged branch forked off the tip of master as it was when the branch was created. But because PRs, i.e. branches, are each created at a different time, that fork point is different for each of them. When we merge by using merge commits, git log tries really hard to put this information back together by tracing each merge commit back to the point where its branch forked off. This results in a gigantic and entangled mess shown by git log.
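To see the two shapes side by side without touching the real tree, here is a minimal sketch that builds a throwaway repository. Everything in it is made up for illustration (the temporary directory, the file names, and the branch names pr-demo, with-merge and with-am), and it only assumes a reasonably recent git and a POSIX shell. It merges the same one-commit "PR" once with a merge commit and once with git am, then counts the parents of the resulting commit.

```shell
#!/bin/sh
# Throwaway demo repository; all names below are hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.org"

# First commit on the main line.
echo one > file.txt
git add file.txt
git commit -q -m "initial commit"
main=$(git symbolic-ref --short HEAD)   # default branch name depends on git config

# The "contributor" forks here and commits on a side branch.
git checkout -q -b pr-demo
echo contribution > contrib.txt
git add contrib.txt
git commit -q -m "contribution from the PR"

# Meanwhile the main line moves on.
git checkout -q "$main"
echo two >> file.txt
git commit -q -a -m "unrelated change on the main line"

# Option 1: merge commit. The new commit records two parents.
git checkout -q -b with-merge
git merge -q --no-ff -m "Merge branch 'pr-demo'" pr-demo
# rev-list prints the commit followed by its parents: 3 words = 2 parents.
merge_words=$(git rev-list --parents -n 1 HEAD | wc -w)

# Option 2: turn the same commit into a patch and apply it with git am.
git checkout -q "$main"
git checkout -q -b with-am
git format-patch -1 --stdout pr-demo | git am -q
# 2 words = the commit plus a single parent, i.e. linear history.
am_words=$(git rev-list --parents -n 1 HEAD | wc -w)

# Compare the two graphs.
git log --graph --oneline with-merge
git log --graph --oneline with-am
```

The first graph has a fork and a join for a single one-commit contribution; the second is a straight line. Multiply the first shape by a few hundred PRs, each forked at a different time, and you get the rails described above.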
I often joke that the resulting graph looks like a messy version of the London Tube map: colourful yet upside down.

In some open source projects, it makes sense to leverage merge commits. The Linux kernel comes to mind, for instance. In this case, merge commits are a good way to track changes coming from a different branch. Given the sheer number of contributors working on the Linux kernel, this is useful information for someone new willing to tackle a new area of the kernel. Figuring out changes made to a file across several releases is extremely helpful and merge commits definitely fill this gap. Also, the Linux kernel doesn't have to deal with PRs since diffs are sent directly to a mailing list.

In the case of Gentoo though, it makes no sense. We should strive to keep a clean and linear history. I have yet to witness developers creating branches in the Gentoo main repository. Even though the GitHub model considers PRs as branches, they are in fact casual contributions and should be treated as such. By avoiding merge commits, we make sure the history stays linear, with no parent/child commits all over the place.

This leads us to the two remaining solutions for dealing with PRs in a clean fashion: cherry-picking and git am. These two solutions really shine at keeping a sane history.

Cherry-picking is not my go-to solution. It requires a bit of setup and is clearly tedious: you must know in advance the full SHA-1 of the commit(s) you want to cherry-pick. You must also set up remote repositories, pull from them every now and then, etc. For a Git newbie, it can be daunting. A few developers often opt for this solution (hi kensington!), which I do not vouch for.

Finally, we're left with git am. It is my favourite choice, since it requires very little work compared to cherry-picking or making merge commits. You may or may not know about it, but a PR can be fetched as a git am-compatible patch.
If you've ever read emails sent by the GitHub bots, they point to this URL:

https://github.com/gentoo/gentoo/pull/1234.patch

Once fetched with your favourite download tool, the patch can be applied directly via the git am command onto HEAD of the repository you're dealing with. There's this common idiom for fetching AND applying a patch all at once (-L makes curl follow redirects):

$ curl -L https://github.com/gentoo/gentoo/pull/1234.patch | git am

* This is where I'm meant to sell you my solution

Ultimately, I've decided to write a tool to leverage this way of fetching PRs and merging them. The tool is called Gentoo-App-Pram and is available in the tree:

# emerge Gentoo-App-Pram

It is written in Perl, works fairly well and has been used by a fair (growing?) number of developers so far. The tool is CLI-based, so you will need to feel at home with the command line. Once emerged, cd into your Gentoo git repo and type `pram' followed by the number of the PR you wish to merge:

$ cd /home/patrice/gentoo
$ pram 1234

pram will then fetch the PR as a patch and display it to you in your favourite $EDITOR. At this point, you can make any change to the PR, e.g. editing commit message(s), changing code in-line, etc.

pram also leverages the "Closes:" header. This header is recognised by GitHub, and by Larry the Cow, and automatically closes a PR when it appears in the body of a commit message. So for instance, the following header will automatically close PR 1234: "Closes: https://github.com/gentoo/gentoo/pull/1234". You don't need to add it manually as pram will do this for you.

After you save and exit $EDITOR, pram will ask you, with a yes/no question, whether the PR should be merged. "y" will launch git am and merge the patch whereas "n" will abort the operation and clean up. That's pretty much it. Make sure to read the man page since there are other options available (pram --man).

pram wouldn't have been possible without Kent Fredric's help.
He's assisted me in releasing the package on CPAN and contributed a few patches. Kudos to him!

To wrap up:

- Please stop making merge commits. This strategy is not useful in the case of Gentoo and does more harm than good.
- Cherry-pick or git-am external contributions such as PRs.
- Better yet, use Gentoo-App-Pram. :-)

If you want to contribute to Gentoo-App-Pram, send me a PR on GitHub at https://github.com/monsieurp/Gentoo-App-Pram or file a bug report at https://bugs.gentoo.org and assign it to me.

Comments and suggestions welcome.

Cheers,

-- 
Patrice Clement
Gentoo Linux developer
http://www.gentoo.org