Generally, +1 for isolated commits which are easy to revert.

I don't think it's actually harder to roll back a set of commits that are merged together.
I think Thomas was mainly concerned about "fixup" commits to land in master (as part of a merge). These indeed make reverting commits more difficult because you have to check whether you missed a "fixup".

Ideally every commit should compile and pass tests though, right?

That is definitely what we should strive for when doing a merge against master.

Perhaps the bigger issue is that we need better documentation and a playbook on 
how to do this these common tasks in git.

We do actually have basic documentation about this but most people don't read it. For example, the commit message of a Merge commit should be:

Merge pull request #xxxx: [BEAM-yyyy] Issue title

But most merge commits don't comply with this rule :) See https://beam.apache.org/contribute/committer-guide/#merging-it

On 19.09.18 21:34, Reuven Lax wrote:
Ideally every commit should compile and pass tests though, right?

On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka <goe...@google.com <mailto:goe...@google.com>> wrote:

    I agree with the cleanliness of the Commit history.
    "Fixup!", "Address comments", "Address even more comments" type of
    comments does not convey meaningful information and are not very
    useful. Its a good idea to squash them.
    However, I think its ok to keep separate commits for different
    logical pieces of the code which make reviewing and revisiting code
    easier.
    Example PR: Support X in the pipeline
    Commit 1: Restructuring a bunch of code without any logical change.
    Commit 2: Changing validation logic for pipeline.
    Commit 3: Supporting new field "X" for pipeline.

    On Wed, Sep 19, 2018 at 11:27 AM Charles Chen <c...@google.com
    <mailto:c...@google.com>> wrote:

        To be concrete, it is very easy to revert a commit in any case:

         1. First, use "git log --first-parent" to find the first-parent
            commit corresponding to a PR merge (this is a one-to-one
            correspondence).
         2. Use "git revert -m 1 <commitid>" to revert the commit; this
            selects the first parent as the base for a merge commit (in
            the case where a single commit needs to be reverted, just
            use "git revert <commitid>" without the "-m 1" flag).

        In any case, as a general good engineering practice, I do agree
        that it is highly desirable to have small independent PRs
        instead of large jumbo PRs whenever possible.

        On Wed, Sep 19, 2018 at 11:20 AM Charles Chen <c...@google.com
        <mailto:c...@google.com>> wrote:

            I don't think it's actually harder to roll back a set of
            commits that are merged together.  Git has the notion of
            first-parent commits (you can see, for example, "git log
            --first-parent", which filters out the intermediate
            commits).  In this sense, PRs still get merged as one unit
            and this is preserved even if intermediate commits are
            kept.  Perhaps the bigger issue is that we need better
            documentation and a playbook on how to do this these common
            tasks in git.

            On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise <t...@apache.org
            <mailto:t...@apache.org>> wrote:

                Wanted to bring this up as reminder as well as
                opportunity to discuss potential changes to our
                committer guide. It has been a while since last related
                discussion and we welcomed several new committers since
                then.

                Finishing up pull requests pre-merge:

                
https://beam.apache.org/contribute/committer-guide/#finishing-touches

                PRs are worked on over time and may accumulate many
                commits. Sometimes because scope expands, sometimes just
                to separate independent changes but most of the time the
                commits are just fixups that are added as review progresses.
                It is important that the latter get squashed prior to PR
                merge, as otherwise we lost the ability to roll back
                changes by reverting a single commit and also generally
                cause a lot of noise in the commit history that does not
                help other contributors. To be clear, I refer to the
                "Fixup!", "Address comments", "Address even more
                comments" type of entries :)

                I would also propose that every commit gets tagged with
                a JIRA (except those fixups that will be squashed).
                Having the JIRA and possibly other tags makes it easier
                for others not involved in the PR to identify changes
                after they were merged, for example when looking at the
                revision history or annotated source.

                As for other scenarios of jumbo PRs with many commits,
                there are probably situations where work needs to be
                broken down into smaller units, making life better for
                both, contributor and reviewer(s). Ideally, every PR
                would have only one commit, but that may be a bit much
                to mandate? Is the general expectation something we need
                to document more clearly?

                Thanks,
                Thomas

Reply via email to