Generally, +1 for isolated commits which are easy to revert.
I don't think it's actually harder to roll back a set of commits that are merged together.
I think Thomas was mainly concerned about "fixup" commits to land in
master (as part of a merge). These indeed make reverting commits more
difficult because you have to check whether you missed a "fixup".
Ideally every commit should compile and pass tests though, right?
That is definitely what we should strive for when doing a merge against
master.
Perhaps the bigger issue is that we need better documentation and a playbook on
how to do this these common tasks in git.
We do actually have basic documentation about this but most people don't
read it. For example, the commit message of a Merge commit should be:
Merge pull request #xxxx: [BEAM-yyyy] Issue title
But most merge commits don't comply with this rule :) See
https://beam.apache.org/contribute/committer-guide/#merging-it
On 19.09.18 21:34, Reuven Lax wrote:
Ideally every commit should compile and pass tests though, right?
On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka <goe...@google.com
<mailto:goe...@google.com>> wrote:
I agree with the cleanliness of the Commit history.
"Fixup!", "Address comments", "Address even more comments" type of
comments does not convey meaningful information and are not very
useful. Its a good idea to squash them.
However, I think its ok to keep separate commits for different
logical pieces of the code which make reviewing and revisiting code
easier.
Example PR: Support X in the pipeline
Commit 1: Restructuring a bunch of code without any logical change.
Commit 2: Changing validation logic for pipeline.
Commit 3: Supporting new field "X" for pipeline.
On Wed, Sep 19, 2018 at 11:27 AM Charles Chen <c...@google.com
<mailto:c...@google.com>> wrote:
To be concrete, it is very easy to revert a commit in any case:
1. First, use "git log --first-parent" to find the first-parent
commit corresponding to a PR merge (this is a one-to-one
correspondence).
2. Use "git revert -m 1 <commitid>" to revert the commit; this
selects the first parent as the base for a merge commit (in
the case where a single commit needs to be reverted, just
use "git revert <commitid>" without the "-m 1" flag).
In any case, as a general good engineering practice, I do agree
that it is highly desirable to have small independent PRs
instead of large jumbo PRs whenever possible.
On Wed, Sep 19, 2018 at 11:20 AM Charles Chen <c...@google.com
<mailto:c...@google.com>> wrote:
I don't think it's actually harder to roll back a set of
commits that are merged together. Git has the notion of
first-parent commits (you can see, for example, "git log
--first-parent", which filters out the intermediate
commits). In this sense, PRs still get merged as one unit
and this is preserved even if intermediate commits are
kept. Perhaps the bigger issue is that we need better
documentation and a playbook on how to do this these common
tasks in git.
On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise <t...@apache.org
<mailto:t...@apache.org>> wrote:
Wanted to bring this up as reminder as well as
opportunity to discuss potential changes to our
committer guide. It has been a while since last related
discussion and we welcomed several new committers since
then.
Finishing up pull requests pre-merge:
https://beam.apache.org/contribute/committer-guide/#finishing-touches
PRs are worked on over time and may accumulate many
commits. Sometimes because scope expands, sometimes just
to separate independent changes but most of the time the
commits are just fixups that are added as review progresses.
It is important that the latter get squashed prior to PR
merge, as otherwise we lost the ability to roll back
changes by reverting a single commit and also generally
cause a lot of noise in the commit history that does not
help other contributors. To be clear, I refer to the
"Fixup!", "Address comments", "Address even more
comments" type of entries :)
I would also propose that every commit gets tagged with
a JIRA (except those fixups that will be squashed).
Having the JIRA and possibly other tags makes it easier
for others not involved in the PR to identify changes
after they were merged, for example when looking at the
revision history or annotated source.
As for other scenarios of jumbo PRs with many commits,
there are probably situations where work needs to be
broken down into smaller units, making life better for
both, contributor and reviewer(s). Ideally, every PR
would have only one commit, but that may be a bit much
to mandate? Is the general expectation something we need
to document more clearly?
Thanks,
Thomas