Anton makes a good point. We have been talking about policy for what we do,
but the real issue is what we want to come out of it: a clear history for
seeing where code came from and granular rollback. I think in both cases
the key thing is that each commit is a single clear change. How they get
there is not the point.

I have worked on multiple projects with a 1 commit = 1 review (and
sometimes even = 1 ticket). These pretty much never have a good history.
The best case is that each commit has a message that is a bullet point of
many separate changes, because it is simply too inefficient to review each
logical change separately. But since the messages become less useful, it
encourages a culture of not even bothering to write meaningful messages at
all.

Note that is trivial for a committer-reviewer to edit the history any way
they like without the button. "Allow edits by maintainers" is on by
default. The "Squash and merge" button just adds a button for something we
can already do.

Charles: super useful! Worth noting that for a PR with a good history it
will skip meaningful commits (but still give the summary line, which is
nice).

Kenn



On Fri, Sep 28, 2018 at 8:54 AM Anton Kedin <ke...@google.com> wrote:

> Is there an actual problem caused by squashing or not squashing the
> commits that we face in the project? I personally have never needed to
> revert something complicated that would be problematic either way (and
> don't have a strong opinion about which way we should do it). From what I
> see so far in the thread it doesn't look like reverting is a frequent major
> pain for anyone. Maybe it is exactly because we're mostly following some
> best practice and it makes it easy. If someone has concrete examples from
> their experience in the project, please share them, this way it would be
> easier to justify the choice.
>
> The PR and commit cleanliness, size and isolation are probably more
> important thing to have guidance and maybe enforcement for. There are well
> known practices and guidelines that I think we should follow, and I think
> they will make squashing or not squashing mostly irrelevant. For example,
> if we accept that commits should have description that actually describes
> what commit does, then "!fixup", "address comments" and similar should not
> be part of the history and should be squashed before submitting the PR no
> matter which way we decide to go in general. Also, I think that making
> commits isolated is also a good practice, and PR author should be able to
> relatively easily split the PR upon reviewer's request. And if we choose to
> keep whole PRs small and incremental with descriptive isolated commits,
> then there won't be too much difference how many commits there are.
>
> Regards,
> Anton
>
> On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud <apill...@google.com>
> wrote:
>
>> I brought up this discussion a few months ago from the other side: I
>> don't like my commits being squashed. I try to create logical commits that
>> each passes tests and could be broken up into multiple PRs. Keeping those
>> changes intact is useful from a history perspective and squashing may break
>> other PRs I have in flight. If the intent is clear (one commit with a
>> descriptive message and a bunch of "fixups"), then feel free to squash,
>> otherwise ask first. When you do squash, it would be good to leave a
>> comment as to how the author can avoid having their commits squashed in the
>> future.
>>
>>
>> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>>
>> Andrew
>>
>> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath <chamik...@google.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> I agree that we should create a good pointer for cleaning up PRs, and
>>>> request (though not require) that authors do it. It's unfortunate though
>>>> that squashing during a review makes things difficult to follow, so adds
>>>> one more round trip.
>>>>
>>>> We could consider for those PRs that make sense as a single logical
>>>> commit (most, but not all, of them) simply using the "squash and merge"
>>>> button even though it technically doesn't create a merge commit.
>>>>
>>>
>>> +1 for allowing "squash and merge" as an option. Most of the reviews (at
>>> least for me) consist of a single valid commit and several additional
>>> commits that get piled up during the review process which obviously should
>>> not be included in the commit history. Going through another round here
>>> just to ask the author to fixup everything is unnecessarily time consuming.
>>>
>>> - Cham
>>>
>>>
>>>>
>>>>
>>>> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira <danolive...@google.com>
>>>> wrote:
>>>>
>>>>> As a non-committer I think some automated squashing of commits sounds
>>>>> best since it lightens the load of regular contributors, by not having to
>>>>> always remember to squash, and lightens the load of committers so it
>>>>> doesn't take as long to have your PR approved by one.
>>>>>
>>>>> But for now I think the second best route would be making it PR
>>>>> author's responsibility to squash fixup commits. Having that expectation
>>>>> described clearly in the Contributor's Guide, along with some simple
>>>>> step-by-step instructions for how to do so should be enough. I mainly
>>>>> support this because I've been doing the squashing myself since I saw a
>>>>> thread about it here a few months ago. It's not nearly as huge a burden on
>>>>> me as it probably is for committers who have to merge in many more PRs,
>>>>> it's very easy to learn how to do, and it's one less barrier to having my
>>>>> code merged in.
>>>>>
>>>>> Of course I wouldn't expect that committers wait for PR authors to
>>>>> squash their fixup commits, but I think leaving a message like "For future
>>>>> pull requests you should squash any small fixup commits, as described 
>>>>> here:
>>>>> <link>" should be fine.
>>>>>
>>>>>
>>>>>> I was also thinking about the possibility of wanting to revert
>>>>>> individual commits from a merge commit. The solution you propose
>>>>>> works,
>>>>>> but only if you want to revert everything.
>>>>>
>>>>>
>>>>> Does this happen often? I might not have enough context since I'm not
>>>>> a committer, but it seems to me that often the person performing a revert
>>>>> is not the original author of a change and doesn't have the context or 
>>>>> time
>>>>> to pick out an individual commit to revert.
>>>>>
>>>>> On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels <m...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I tend to agree with you Lukasz. Of course we should try to follow
>>>>>> the
>>>>>> guide lines as much as possible but if it requires an extra back and
>>>>>> forth with the PR author for a cosmetic change, it may not be worth
>>>>>> the
>>>>>> time.
>>>>>>
>>>>>> On 19.09.18 22:17, Lukasz Cwik wrote:
>>>>>> > I have to say I'm guilty of not following the merge guidelines,
>>>>>> > sometimes doing merges without rebasing/flatten commits.
>>>>>> >
>>>>>> > I find that it is a few extra mins of my time to fix someones PR
>>>>>> history
>>>>>> > if they have more then one logical commit they want to be separate
>>>>>> and
>>>>>> > it usually takes days for the PR author to do merging  with the
>>>>>> extra
>>>>>> > burden as a committer to keep track of another PR and its state
>>>>>> (waiting
>>>>>> > for clean-up) is taxing. I really liked the idea of the mergebot
>>>>>> (even
>>>>>> > though it didn't work out in practice) because it could do all the
>>>>>> > policy work on my behalf.
>>>>>> >
>>>>>> > Anything that reduces my overhead as a committer is useful as for
>>>>>> the
>>>>>> > 100s of PRs that I have merged, I've only had to rollback a couple
>>>>>> so
>>>>>> > I'm for Charle's suggestion which makes the rollback flow slightly
>>>>>> more
>>>>>> > complicated for a significantly easier PR merge workflow.
>>>>>> >
>>>>>> > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen <c...@google.com
>>>>>> > <mailto:c...@google.com>> wrote:
>>>>>> >
>>>>>> >     What I mean is that if you get the first-parent commit using
>>>>>> "git
>>>>>> >     log --first-parent", it will incorporate any and all fix up
>>>>>> commits
>>>>>> >     so we don't need to worry about missing any.
>>>>>> >
>>>>>> >     On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels <
>>>>>> m...@apache.org
>>>>>> >     <mailto:m...@apache.org>> wrote:
>>>>>> >
>>>>>> >         Generally, +1 for isolated commits which are easy to revert.
>>>>>> >
>>>>>> >          > I don't think it's actually harder to roll back a set of
>>>>>> >         commits that are merged together.
>>>>>> >         I think Thomas was mainly concerned about "fixup" commits to
>>>>>> >         land in
>>>>>> >         master (as part of a merge). These indeed make reverting
>>>>>> commits
>>>>>> >         more
>>>>>> >         difficult because you have to check whether you missed a
>>>>>> "fixup".
>>>>>> >
>>>>>> >          > Ideally every commit should compile and pass tests
>>>>>> though, right?
>>>>>> >
>>>>>> >         That is definitely what we should strive for when doing a
>>>>>> merge
>>>>>> >         against
>>>>>> >         master.
>>>>>> >
>>>>>> >          > Perhaps the bigger issue is that we need better
>>>>>> documentation
>>>>>> >         and a playbook on how to do this these common tasks in git.
>>>>>> >
>>>>>> >         We do actually have basic documentation about this but most
>>>>>> >         people don't
>>>>>> >         read it. For example, the commit message of a Merge commit
>>>>>> >         should be:
>>>>>> >
>>>>>> >         Merge pull request #xxxx: [BEAM-yyyy] Issue title
>>>>>> >
>>>>>> >         But most merge commits don't comply with this rule :) See
>>>>>> >
>>>>>> https://beam.apache.org/contribute/committer-guide/#merging-it
>>>>>> >
>>>>>> >         On 19.09.18 21:34, Reuven Lax wrote:
>>>>>> >          > Ideally every commit should compile and pass tests
>>>>>> though, right?
>>>>>> >          >
>>>>>> >          > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka
>>>>>> >         <goe...@google.com <mailto:goe...@google.com>
>>>>>> >          > <mailto:goe...@google.com <mailto:goe...@google.com>>>
>>>>>> wrote:
>>>>>> >          >
>>>>>> >          >     I agree with the cleanliness of the Commit history.
>>>>>> >          >     "Fixup!", "Address comments", "Address even more
>>>>>> >         comments" type of
>>>>>> >          >     comments does not convey meaningful information and
>>>>>> are
>>>>>> >         not very
>>>>>> >          >     useful. Its a good idea to squash them.
>>>>>> >          >     However, I think its ok to keep separate commits for
>>>>>> >         different
>>>>>> >          >     logical pieces of the code which make reviewing and
>>>>>> >         revisiting code
>>>>>> >          >     easier.
>>>>>> >          >     Example PR: Support X in the pipeline
>>>>>> >          >     Commit 1: Restructuring a bunch of code without any
>>>>>> >         logical change.
>>>>>> >          >     Commit 2: Changing validation logic for pipeline.
>>>>>> >          >     Commit 3: Supporting new field "X" for pipeline.
>>>>>> >          >
>>>>>> >          >     On Wed, Sep 19, 2018 at 11:27 AM Charles Chen
>>>>>> >         <c...@google.com <mailto:c...@google.com>
>>>>>> >          >     <mailto:c...@google.com <mailto:c...@google.com>>>
>>>>>> wrote:
>>>>>> >          >
>>>>>> >          >         To be concrete, it is very easy to revert a
>>>>>> commit in
>>>>>> >         any case:
>>>>>> >          >
>>>>>> >          >          1. First, use "git log --first-parent" to find
>>>>>> the
>>>>>> >         first-parent
>>>>>> >          >             commit corresponding to a PR merge (this is a
>>>>>> >         one-to-one
>>>>>> >          >             correspondence).
>>>>>> >          >          2. Use "git revert -m 1 <commitid>" to revert
>>>>>> the
>>>>>> >         commit; this
>>>>>> >          >             selects the first parent as the base for a
>>>>>> merge
>>>>>> >         commit (in
>>>>>> >          >             the case where a single commit needs to be
>>>>>> >         reverted, just
>>>>>> >          >             use "git revert <commitid>" without the "-m
>>>>>> 1" flag).
>>>>>> >          >
>>>>>> >          >         In any case, as a general good engineering
>>>>>> practice,
>>>>>> >         I do agree
>>>>>> >          >         that it is highly desirable to have small
>>>>>> independent PRs
>>>>>> >          >         instead of large jumbo PRs whenever possible.
>>>>>> >          >
>>>>>> >          >         On Wed, Sep 19, 2018 at 11:20 AM Charles Chen
>>>>>> >         <c...@google.com <mailto:c...@google.com>
>>>>>> >          >         <mailto:c...@google.com <mailto:c...@google.com>>>
>>>>>> wrote:
>>>>>> >          >
>>>>>> >          >             I don't think it's actually harder to roll
>>>>>> back a
>>>>>> >         set of
>>>>>> >          >             commits that are merged together.  Git has
>>>>>> the
>>>>>> >         notion of
>>>>>> >          >             first-parent commits (you can see, for
>>>>>> example,
>>>>>> >         "git log
>>>>>> >          >             --first-parent", which filters out the
>>>>>> intermediate
>>>>>> >          >             commits).  In this sense, PRs still get
>>>>>> merged as
>>>>>> >         one unit
>>>>>> >          >             and this is preserved even if intermediate
>>>>>> >         commits are
>>>>>> >          >             kept.  Perhaps the bigger issue is that we
>>>>>> need
>>>>>> >         better
>>>>>> >          >             documentation and a playbook on how to do
>>>>>> this
>>>>>> >         these common
>>>>>> >          >             tasks in git.
>>>>>> >          >
>>>>>> >          >             On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise
>>>>>> >         <t...@apache.org <mailto:t...@apache.org>
>>>>>> >          >             <mailto:t...@apache.org <mailto:
>>>>>> t...@apache.org>>>
>>>>>> >         wrote:
>>>>>> >          >
>>>>>> >          >                 Wanted to bring this up as reminder as
>>>>>> well as
>>>>>> >          >                 opportunity to discuss potential changes
>>>>>> to our
>>>>>> >          >                 committer guide. It has been a while
>>>>>> since
>>>>>> >         last related
>>>>>> >          >                 discussion and we welcomed several new
>>>>>> >         committers since
>>>>>> >          >                 then.
>>>>>> >          >
>>>>>> >          >                 Finishing up pull requests pre-merge:
>>>>>> >          >
>>>>>> >          >
>>>>>> >
>>>>>> https://beam.apache.org/contribute/committer-guide/#finishing-touches
>>>>>> >          >
>>>>>> >          >                 PRs are worked on over time and may
>>>>>> >         accumulate many
>>>>>> >          >                 commits. Sometimes because scope expands,
>>>>>> >         sometimes just
>>>>>> >          >                 to separate independent changes but most
>>>>>> of
>>>>>> >         the time the
>>>>>> >          >                 commits are just fixups that are added as
>>>>>> >         review progresses.
>>>>>> >          >                 It is important that the latter get
>>>>>> squashed
>>>>>> >         prior to PR
>>>>>> >          >                 merge, as otherwise we lost the ability
>>>>>> to
>>>>>> >         roll back
>>>>>> >          >                 changes by reverting a single commit and
>>>>>> also
>>>>>> >         generally
>>>>>> >          >                 cause a lot of noise in the commit
>>>>>> history
>>>>>> >         that does not
>>>>>> >          >                 help other contributors. To be clear, I
>>>>>> refer
>>>>>> >         to the
>>>>>> >          >                 "Fixup!", "Address comments", "Address
>>>>>> even more
>>>>>> >          >                 comments" type of entries :)
>>>>>> >          >
>>>>>> >          >                 I would also propose that every commit
>>>>>> gets
>>>>>> >         tagged with
>>>>>> >          >                 a JIRA (except those fixups that will be
>>>>>> >         squashed).
>>>>>> >          >                 Having the JIRA and possibly other tags
>>>>>> makes
>>>>>> >         it easier
>>>>>> >          >                 for others not involved in the PR to
>>>>>> identify
>>>>>> >         changes
>>>>>> >          >                 after they were merged, for example when
>>>>>> >         looking at the
>>>>>> >          >                 revision history or annotated source.
>>>>>> >          >
>>>>>> >          >                 As for other scenarios of jumbo PRs with
>>>>>> many
>>>>>> >         commits,
>>>>>> >          >                 there are probably situations where work
>>>>>> >         needs to be
>>>>>> >          >                 broken down into smaller units, making
>>>>>> life
>>>>>> >         better for
>>>>>> >          >                 both, contributor and reviewer(s).
>>>>>> Ideally,
>>>>>> >         every PR
>>>>>> >          >                 would have only one commit, but that may
>>>>>> be a
>>>>>> >         bit much
>>>>>> >          >                 to mandate? Is the general expectation
>>>>>> >         something we need
>>>>>> >          >                 to document more clearly?
>>>>>> >          >
>>>>>> >          >                 Thanks,
>>>>>> >          >                 Thomas
>>>>>> >          >
>>>>>> >
>>>>>>
>>>>>

Reply via email to