Is there an actual problem caused by squashing or not squashing the commits
that we face in the project? I personally have never needed to revert
something complicated that would be problematic either way (and don't have
a strong opinion about which way we should do it). From what I see so far
in the thread it doesn't look like reverting is a frequent major pain for
anyone. Maybe it is exactly because we're mostly following some best
practice and it makes it easy. If someone has concrete examples from their
experience in the project, please share them, this way it would be easier
to justify the choice.

The PR and commit cleanliness, size and isolation are probably more
important thing to have guidance and maybe enforcement for. There are well
known practices and guidelines that I think we should follow, and I think
they will make squashing or not squashing mostly irrelevant. For example,
if we accept that commits should have description that actually describes
what commit does, then "!fixup", "address comments" and similar should not
be part of the history and should be squashed before submitting the PR no
matter which way we decide to go in general. Also, I think that making
commits isolated is also a good practice, and PR author should be able to
relatively easily split the PR upon reviewer's request. And if we choose to
keep whole PRs small and incremental with descriptive isolated commits,
then there won't be too much difference how many commits there are.

Regards,
Anton

On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud <apill...@google.com> wrote:

> I brought up this discussion a few months ago from the other side: I don't
> like my commits being squashed. I try to create logical commits that each
> passes tests and could be broken up into multiple PRs. Keeping those
> changes intact is useful from a history perspective and squashing may break
> other PRs I have in flight. If the intent is clear (one commit with a
> descriptive message and a bunch of "fixups"), then feel free to squash,
> otherwise ask first. When you do squash, it would be good to leave a
> comment as to how the author can avoid having their commits squashed in the
> future.
>
>
> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>
> Andrew
>
> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath <chamik...@google.com>
> wrote:
>
>>
>>
>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> I agree that we should create a good pointer for cleaning up PRs, and
>>> request (though not require) that authors do it. It's unfortunate though
>>> that squashing during a review makes things difficult to follow, so adds
>>> one more round trip.
>>>
>>> We could consider for those PRs that make sense as a single logical
>>> commit (most, but not all, of them) simply using the "squash and merge"
>>> button even though it technically doesn't create a merge commit.
>>>
>>
>> +1 for allowing "squash and merge" as an option. Most of the reviews (at
>> least for me) consist of a single valid commit and several additional
>> commits that get piled up during the review process which obviously should
>> not be included in the commit history. Going through another round here
>> just to ask the author to fixup everything is unnecessarily time consuming.
>>
>> - Cham
>>
>>
>>>
>>>
>>> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira <danolive...@google.com>
>>> wrote:
>>>
>>>> As a non-committer I think some automated squashing of commits sounds
>>>> best since it lightens the load of regular contributors, by not having to
>>>> always remember to squash, and lightens the load of committers so it
>>>> doesn't take as long to have your PR approved by one.
>>>>
>>>> But for now I think the second best route would be making it PR
>>>> author's responsibility to squash fixup commits. Having that expectation
>>>> described clearly in the Contributor's Guide, along with some simple
>>>> step-by-step instructions for how to do so should be enough. I mainly
>>>> support this because I've been doing the squashing myself since I saw a
>>>> thread about it here a few months ago. It's not nearly as huge a burden on
>>>> me as it probably is for committers who have to merge in many more PRs,
>>>> it's very easy to learn how to do, and it's one less barrier to having my
>>>> code merged in.
>>>>
>>>> Of course I wouldn't expect that committers wait for PR authors to
>>>> squash their fixup commits, but I think leaving a message like "For future
>>>> pull requests you should squash any small fixup commits, as described here:
>>>> <link>" should be fine.
>>>>
>>>>
>>>>> I was also thinking about the possibility of wanting to revert
>>>>> individual commits from a merge commit. The solution you propose
>>>>> works,
>>>>> but only if you want to revert everything.
>>>>
>>>>
>>>> Does this happen often? I might not have enough context since I'm not a
>>>> committer, but it seems to me that often the person performing a revert is
>>>> not the original author of a change and doesn't have the context or time to
>>>> pick out an individual commit to revert.
>>>>
>>>> On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels <m...@apache.org>
>>>> wrote:
>>>>
>>>>> I tend to agree with you Lukasz. Of course we should try to follow the
>>>>> guide lines as much as possible but if it requires an extra back and
>>>>> forth with the PR author for a cosmetic change, it may not be worth
>>>>> the
>>>>> time.
>>>>>
>>>>> On 19.09.18 22:17, Lukasz Cwik wrote:
>>>>> > I have to say I'm guilty of not following the merge guidelines,
>>>>> > sometimes doing merges without rebasing/flatten commits.
>>>>> >
>>>>> > I find that it is a few extra mins of my time to fix someones PR
>>>>> history
>>>>> > if they have more then one logical commit they want to be separate
>>>>> and
>>>>> > it usually takes days for the PR author to do merging  with the
>>>>> extra
>>>>> > burden as a committer to keep track of another PR and its state
>>>>> (waiting
>>>>> > for clean-up) is taxing. I really liked the idea of the mergebot
>>>>> (even
>>>>> > though it didn't work out in practice) because it could do all the
>>>>> > policy work on my behalf.
>>>>> >
>>>>> > Anything that reduces my overhead as a committer is useful as for
>>>>> the
>>>>> > 100s of PRs that I have merged, I've only had to rollback a couple
>>>>> so
>>>>> > I'm for Charle's suggestion which makes the rollback flow slightly
>>>>> more
>>>>> > complicated for a significantly easier PR merge workflow.
>>>>> >
>>>>> > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen <c...@google.com
>>>>> > <mailto:c...@google.com>> wrote:
>>>>> >
>>>>> >     What I mean is that if you get the first-parent commit using "git
>>>>> >     log --first-parent", it will incorporate any and all fix up
>>>>> commits
>>>>> >     so we don't need to worry about missing any.
>>>>> >
>>>>> >     On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels <m...@apache.org
>>>>> >     <mailto:m...@apache.org>> wrote:
>>>>> >
>>>>> >         Generally, +1 for isolated commits which are easy to revert.
>>>>> >
>>>>> >          > I don't think it's actually harder to roll back a set of
>>>>> >         commits that are merged together.
>>>>> >         I think Thomas was mainly concerned about "fixup" commits to
>>>>> >         land in
>>>>> >         master (as part of a merge). These indeed make reverting
>>>>> commits
>>>>> >         more
>>>>> >         difficult because you have to check whether you missed a
>>>>> "fixup".
>>>>> >
>>>>> >          > Ideally every commit should compile and pass tests
>>>>> though, right?
>>>>> >
>>>>> >         That is definitely what we should strive for when doing a
>>>>> merge
>>>>> >         against
>>>>> >         master.
>>>>> >
>>>>> >          > Perhaps the bigger issue is that we need better
>>>>> documentation
>>>>> >         and a playbook on how to do this these common tasks in git.
>>>>> >
>>>>> >         We do actually have basic documentation about this but most
>>>>> >         people don't
>>>>> >         read it. For example, the commit message of a Merge commit
>>>>> >         should be:
>>>>> >
>>>>> >         Merge pull request #xxxx: [BEAM-yyyy] Issue title
>>>>> >
>>>>> >         But most merge commits don't comply with this rule :) See
>>>>> >
>>>>> https://beam.apache.org/contribute/committer-guide/#merging-it
>>>>> >
>>>>> >         On 19.09.18 21:34, Reuven Lax wrote:
>>>>> >          > Ideally every commit should compile and pass tests
>>>>> though, right?
>>>>> >          >
>>>>> >          > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka
>>>>> >         <goe...@google.com <mailto:goe...@google.com>
>>>>> >          > <mailto:goe...@google.com <mailto:goe...@google.com>>>
>>>>> wrote:
>>>>> >          >
>>>>> >          >     I agree with the cleanliness of the Commit history.
>>>>> >          >     "Fixup!", "Address comments", "Address even more
>>>>> >         comments" type of
>>>>> >          >     comments does not convey meaningful information and
>>>>> are
>>>>> >         not very
>>>>> >          >     useful. Its a good idea to squash them.
>>>>> >          >     However, I think its ok to keep separate commits for
>>>>> >         different
>>>>> >          >     logical pieces of the code which make reviewing and
>>>>> >         revisiting code
>>>>> >          >     easier.
>>>>> >          >     Example PR: Support X in the pipeline
>>>>> >          >     Commit 1: Restructuring a bunch of code without any
>>>>> >         logical change.
>>>>> >          >     Commit 2: Changing validation logic for pipeline.
>>>>> >          >     Commit 3: Supporting new field "X" for pipeline.
>>>>> >          >
>>>>> >          >     On Wed, Sep 19, 2018 at 11:27 AM Charles Chen
>>>>> >         <c...@google.com <mailto:c...@google.com>
>>>>> >          >     <mailto:c...@google.com <mailto:c...@google.com>>>
>>>>> wrote:
>>>>> >          >
>>>>> >          >         To be concrete, it is very easy to revert a
>>>>> commit in
>>>>> >         any case:
>>>>> >          >
>>>>> >          >          1. First, use "git log --first-parent" to find
>>>>> the
>>>>> >         first-parent
>>>>> >          >             commit corresponding to a PR merge (this is a
>>>>> >         one-to-one
>>>>> >          >             correspondence).
>>>>> >          >          2. Use "git revert -m 1 <commitid>" to revert the
>>>>> >         commit; this
>>>>> >          >             selects the first parent as the base for a
>>>>> merge
>>>>> >         commit (in
>>>>> >          >             the case where a single commit needs to be
>>>>> >         reverted, just
>>>>> >          >             use "git revert <commitid>" without the "-m
>>>>> 1" flag).
>>>>> >          >
>>>>> >          >         In any case, as a general good engineering
>>>>> practice,
>>>>> >         I do agree
>>>>> >          >         that it is highly desirable to have small
>>>>> independent PRs
>>>>> >          >         instead of large jumbo PRs whenever possible.
>>>>> >          >
>>>>> >          >         On Wed, Sep 19, 2018 at 11:20 AM Charles Chen
>>>>> >         <c...@google.com <mailto:c...@google.com>
>>>>> >          >         <mailto:c...@google.com <mailto:c...@google.com>>>
>>>>> wrote:
>>>>> >          >
>>>>> >          >             I don't think it's actually harder to roll
>>>>> back a
>>>>> >         set of
>>>>> >          >             commits that are merged together.  Git has the
>>>>> >         notion of
>>>>> >          >             first-parent commits (you can see, for
>>>>> example,
>>>>> >         "git log
>>>>> >          >             --first-parent", which filters out the
>>>>> intermediate
>>>>> >          >             commits).  In this sense, PRs still get
>>>>> merged as
>>>>> >         one unit
>>>>> >          >             and this is preserved even if intermediate
>>>>> >         commits are
>>>>> >          >             kept.  Perhaps the bigger issue is that we
>>>>> need
>>>>> >         better
>>>>> >          >             documentation and a playbook on how to do this
>>>>> >         these common
>>>>> >          >             tasks in git.
>>>>> >          >
>>>>> >          >             On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise
>>>>> >         <t...@apache.org <mailto:t...@apache.org>
>>>>> >          >             <mailto:t...@apache.org <mailto:t...@apache.org
>>>>> >>>
>>>>> >         wrote:
>>>>> >          >
>>>>> >          >                 Wanted to bring this up as reminder as
>>>>> well as
>>>>> >          >                 opportunity to discuss potential changes
>>>>> to our
>>>>> >          >                 committer guide. It has been a while since
>>>>> >         last related
>>>>> >          >                 discussion and we welcomed several new
>>>>> >         committers since
>>>>> >          >                 then.
>>>>> >          >
>>>>> >          >                 Finishing up pull requests pre-merge:
>>>>> >          >
>>>>> >          >
>>>>> >
>>>>> https://beam.apache.org/contribute/committer-guide/#finishing-touches
>>>>> >          >
>>>>> >          >                 PRs are worked on over time and may
>>>>> >         accumulate many
>>>>> >          >                 commits. Sometimes because scope expands,
>>>>> >         sometimes just
>>>>> >          >                 to separate independent changes but most
>>>>> of
>>>>> >         the time the
>>>>> >          >                 commits are just fixups that are added as
>>>>> >         review progresses.
>>>>> >          >                 It is important that the latter get
>>>>> squashed
>>>>> >         prior to PR
>>>>> >          >                 merge, as otherwise we lost the ability to
>>>>> >         roll back
>>>>> >          >                 changes by reverting a single commit and
>>>>> also
>>>>> >         generally
>>>>> >          >                 cause a lot of noise in the commit history
>>>>> >         that does not
>>>>> >          >                 help other contributors. To be clear, I
>>>>> refer
>>>>> >         to the
>>>>> >          >                 "Fixup!", "Address comments", "Address
>>>>> even more
>>>>> >          >                 comments" type of entries :)
>>>>> >          >
>>>>> >          >                 I would also propose that every commit
>>>>> gets
>>>>> >         tagged with
>>>>> >          >                 a JIRA (except those fixups that will be
>>>>> >         squashed).
>>>>> >          >                 Having the JIRA and possibly other tags
>>>>> makes
>>>>> >         it easier
>>>>> >          >                 for others not involved in the PR to
>>>>> identify
>>>>> >         changes
>>>>> >          >                 after they were merged, for example when
>>>>> >         looking at the
>>>>> >          >                 revision history or annotated source.
>>>>> >          >
>>>>> >          >                 As for other scenarios of jumbo PRs with
>>>>> many
>>>>> >         commits,
>>>>> >          >                 there are probably situations where work
>>>>> >         needs to be
>>>>> >          >                 broken down into smaller units, making
>>>>> life
>>>>> >         better for
>>>>> >          >                 both, contributor and reviewer(s).
>>>>> Ideally,
>>>>> >         every PR
>>>>> >          >                 would have only one commit, but that may
>>>>> be a
>>>>> >         bit much
>>>>> >          >                 to mandate? Is the general expectation
>>>>> >         something we need
>>>>> >          >                 to document more clearly?
>>>>> >          >
>>>>> >          >                 Thanks,
>>>>> >          >                 Thomas
>>>>> >          >
>>>>> >
>>>>>
>>>>

Reply via email to