Re: [DISCUSS] Pull Request merge policy for core

Oleg Nenashev Tue, 20 Oct 2015 07:45:08 -0700

Since Stephen mostly addresses me, I need to explain my position. We had a 
discussion with him in https://github.com/jenkinsci/jenkins/pull/1815 and 
https://github.com/jenkinsci/jenkins/pull/1804


I'm not against multi-commit changes when these commits are really 
reasonable on the atomic level (E.g. "Added new API method" + "Added unit 
tests for new API" + "Migrated stuff to the new API"). I'm against dozens 
of commits happening during the PR polishing: typo fixes, minor 
regressions, etc. I doubt this info will be ever required, but it makes the 
analysis very difficult when you want to investigate the issue without 
checking out the entire repo.

Since Jenkins project is being hosted on GitHub, we should rely on its 
features and make the project more comfortable to all types of contributors 
including developers and mostly silent *Jenkins users and instance 
maintainers*. They are may not have to have much experience with Git or to 
have local clones of repositories, so I'm against referring to 
advanced-level Git features allowing to work with unsquashed commit 
histories.

In order to simplify my work, I've developed the script 
<https://gist.github.com/oleg-nenashev/ec7dea01dcf272e9b891>, which allows 
to easily perform squashed merges of pull requests, which require only one 
logically atomic commit. This script mostly follows the approach being used 
by Cristopher.

>From my side, I would propose the following approach:

   - Jenkins contributors are eligible to do whatever they want in PRs => 
   we don't require contributors to squash commits
   - When we merge PRs, into master, we squash commits and reference the 
   original pull requests there 
   - => We keep the master branch clean
      - => We allow to dig into the history by going to a pull request if 
      anybody wants it
   
Would such approach be convenient enough?
If yes, I can convert the script to a more generic and reliable Java 
application, which will also pull GitHub for data and probably make 
references autmatically







вторник, 20 октября 2015 г., 13:09:07 UTC+3 пользователь Stephen Connolly 
написал:
>
> We seem to have some strong opinions on how pull requests should be 
> merged for core. 
>
> On the one hand, there is the view (which I share) that we should not 
> rewrite the work of contributors. 
>
> On the other hand, there is the view that we should squash all pull 
> requests before merging. 
>
> Now when I look at the other point of view, I believe I can see where 
> they are coming from: 
>
> * The GitHub UI for merging pull requests currently does not allow you 
> to edit the first line of the commit message, forcing it to be "Merge 
> pull request #1234 from joebloggs/jenkins-32198" 
>
> * The GitHub UI for browsing the commit history currently does not 
> allow you to view the history of the branch from the --first-parent 
> point of view 
>
> * The GitHub UI for looking at the blame on a file currently does not 
> allow you to view the blame of the file with the --first-parent point 
> of view 
>
> So if we restrict ourselves to using the GitHub UI, you need to trade 
> off between the ability to do code archeology from the release 
> viewpoint and the ability to do code archeology from the what were 
> they thinking viewpoint. 
>
> Since there is a slight majority of use cases looking at things from 
> the release viewpoint, it therefore makes sense *for users of the 
> GitHub UI* to request that pull requests be merged... in other words: 
>
> Assume that we are using the GitHub UI 
> Assume that release viewpoint archeology is more frequent 
> Then we should squash commits before merging 
>
> Now I have a number of issues with the above set of assumptions, but 
> the main one as I see it is this: 
>
> When you throw away data, you can never recover it. I remember a 
> science project where we recorded the temperature at different times 
> of the day over a month... every day we would take a number of 
> readings and record the average of the days readings... at the end of 
> the month we were introduced to statistical analysis (this was evil 
> teacher making us learn through making mistakes) so that we would 
> realise that *YOU NEVER THROW AWAY THE RAW DATA* because you do not 
> know what you will need it for in the future. 
>
> I agree with some things: 
>
> * The GitHub UI for merging commits is braindead and gives crappy 
> commit summary messages... it would totally be much better if the 
> merge commit summary was actually the PR branch description (i.e. 
> something like what you would get locally with `merge.branchdesc=true` 
> in your git config). I have lodged a feature request for that with 
> GitHub. 
>
> * The GitHub UI for browsing commits is a pain as you cannot see the 
> `git log --first-parent` view (but while we merge pull requests using 
> the GitHub UI with it's crappy "Merge pull request #1234 from 
> joebloggs/jenkins-32198" style summaries it's not like `git log 
> --first-parent` gives us a great history view either) 
>
> * The GitHub UI for attributing blame should allow for the 
> `--first-parent` `-w` `-C` and `-M` options to be configured. 
>
> So I see lots of issues with the GitHub UI... but I do not see that 
> the solution is to squash commits. 
>
> Firstly, if you are squashing commits, then you have to be merging 
> from the command line... and if you are merging from the command line, 
> well you can give a proper merge commit message and then `git log 
> --first-parent` is beautiful and we don't need to throw away the raw 
> data. 
>
> Secondly, the code archeology almost always starts from `git blame` 
> and the number of hours I have wasted with `git blame` when I really 
> wanted `git blame --first-parent -w` (sometimes with a mix of `-C` 
> `-M`) is such that I have learned *never* to use the GitHub UI to 
> track these things down. Attempting to do any code archeology from the 
> GitHub UI is a complete and utter waste of time and will cause you to 
> lose the will to live. 
>
> Thirdly, there is an argument about cherry picking which seems to 
> ignore two simple facts: 
>   * You can cherry pick merge commits - provided you know what number 
> to pass to `-m` 
>   * When you use `git log --first-parent` then you know it's always 
> `git cherry-pick -m 1` 
>
> So I see the advocates of squashing commits as looking for 
> * a short-term gain (because GitHub will eventually fix their crappy UI 
> bugs) 
> * of limited utility (because you almost always are better off with 
> the Git CLI anyway) 
> * with questionable value (because the commit history is still 
> distracted by all those "Merge pull request #... from .../..." noise) 
>
> Now be careful, because I am not against asking the pull request 
> author to tidy up their PR before we merge it... there is a big 
> difference between asking for a tidy-up and mandating a squash. 
>
> If you tidy-up a pull request you might probably: 
> * reorganise the history into a logical flow. 
> * clean up the commit messages to better reflect your thinking 
> * remove noise commits from formatting changes and reverts of those 
> formatting changes 
> * combine some commits together as they form a single logical change 
> * split some commits apart as they co-mix two logically separate changes 
>
> I have some concerns about mandating a tidy-up (I note that Jesse is 
> against rewriting the history of a PR branch as that means that GitHub 
> hides the code review comments) but I see nothing wrong with asking 
> for a tidy-up as being a choice available to the merger... 
>
> Aside: There is also a possible small question of the copyright of 
> commit messages and whether those commit messages are covered by the 
> MIT license that the code diff is covered by... after all, at the time 
> the MIT license was drafted there was no such thing as a pull request 
> containing multiple commits, so the concept of a license applying to 
> the patch metadata would not even have occurred. It's probably not too 
> important but it may mean that even under the current CLA for core we 
> do not have the legal right to modify the original author's commit 
> messages for the initial merge to master.... never mind the drive-by 
> pull requests we get from people without a CLA. 
>
> But really what do others think? 
>
> How do people want to merge pull requests against core? 
> Do we want to use the GitHub UI to merge the pull requests? 
> Do we want to mandate squashing commits (which will prevent us from 
> using the GitHub UI to merge pull requests and force manual closing of 
> the PRs)? 
> Do we want to mandate tidying up commits (which will hide code review 
> comments)? 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-dev/f9db72d8-e438-42cc-9763-7a19cc72d8b6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: [DISCUSS] Pull Request merge policy for core

Reply via email to