PR for this: https://github.com/apache/beam/pull/7129


On Tue, Oct 16, 2018 at 11:40 AM Robert Bradshaw <[email protected]>
wrote:

> Thanks for bringing this to a conclusion.
>
> On Mon, Oct 15, 2018 at 6:18 PM Thomas Weise <[email protected]> wrote:
> >
> > Here is my attempt to summarize the discussion, please see the TBDs.
> >
> > I would work on a PR with respective contributor and committer guideline
> updates next.
> >
> > Thanks,
> > Thomas
> >
> >
> > Goals:
> >
> > - Clear history with purpose and origin of changes
> > - Ability to perform a granular rollback, if necessary
> > - Efficiency and quality of review (avoid tiny or out-of-context changes
> or huge mega-changes)
> > - Efficiency of authoring (don't want to wait on a review for a tiny bit
> because GitHub makes it very hard to stack up reviews in sequence / don't
> want to have major changes blocked because of difficulty of review)
> > - Ease of new contribution ("OK for committers to do more, while
> new/one-time contributors shouldn't need to know or obey any policy"). TBD:
> I think that quote needs clarification. IMO contributors are expected to
> read and adhere to contribution guidelines.
> >
> > Clean history:
> >
> > - Commit messages should tag JIRAs and be otherwise descriptive. It
> should not be necessary to find a merge or first PR commit to find out what
> caused a change
> > - We want to squash "Fixup!", "Address comments" type of commits to
> achieve a clean history
> > - We prefer that PR authors squash such commits after review is
> complete. This expectation should be described clearly in the Contributor's
> Guide.
> > - The process should not burden committer due to back and forth with
> author and deal with cleaning up PR history and other cosmetics. We want to
> reduce committer overhead. But note that we don't want to shift the burden
> to first time contributors either.
> > - Committer can use the "squash and merge" option (or modify the PR
> commits in other ways). This should address the overhead concern. TBD: I
> would suggest that the author, when this is explicitly not desired, needs
> to indicate it upfront.
>
> I wouldn't require that the author note this on every PR. I would say
> if there are (obvious) fixup commits, the committer is free to squash
> unless asked not to. If the history is clean, favor trusting the
> author. Subject to the bullet point below.
>
> > - Committer is ultimately responsible (committer guidelines) and we
> "trust the committer's judgment"
> >
> > Granularity of changes:
> >
> > - We prefer small independent, incremental PRs with descriptive,
> isolated commits. Each commit is a single clear change.
> > - It is OK to keep separate commits for different logical pieces of the
> code, which make reviewing and revisiting code easier
> > - Making commits isolated is a good practice, PR author should be able
> to relatively easily split the PR upon reviewer's request
> > - For multiple commits in a PR, every commit that gets merged should
> compile and pass tests
>
> Generally +1, though there is an important exception. E,g. for large
> refactoring, often it's clearer (for both history and reviewing) to
> have a single commit that's "ran this tool/command line/script/..." on
> the codebase and a follow-up that's manual, human edits (e.g. to get
> it to compile and pass tests).
>
> > Git playbook
> >
> > - How to rollback multiple commits from single PR
> > - Simple step-by-step instructions for authors to cleanup history
>
> Regarding "rebase and merge," it's often the worst of both, as it
> gives multiple commits, but none of them matching what was in the
> author's branch. Generally those who care about history enough to
> create multiple distinct commits also prefer that the commits remain
> intact.
>
> > On Mon, Oct 1, 2018 at 11:50 AM Rui Wang <[email protected]> wrote:
> >>
> >> +1 to add JIRA issue as the part of commit message. it requires less
> effort but will help a lot on our commit history. I used to not do that but
> I will start to add JIRA info to my future commits.
> >>
> >> -Rui
> >>
> >> On Mon, Oct 1, 2018 at 2:59 AM Alexey Romanenko <
> [email protected]> wrote:
> >>>
> >>> +1 to what Anton said, I’m fully agree with this.
> >>>
> >>> Looking on the current commit history we can notice that there are
> many commits like “fix”, “add/edit/remove”, “clean up”, etc, that, seems,
> were introduced after the review process iterations. I think this makes
> commit history too much verbose and more difficult to follow.
> >>>
> >>> Also, most of such commits don’t have Jira issue name as part of
> commit message. So it requires to find a merge or fist PR commit in case we
> want to find out what caused this change.
> >>>
> >>> I believe we can improve our commit guidelines in this way and it
> should help to have commit history more clean and easy to read.
> >>>
> >>> On 1 Oct 2018, at 06:34, Kenneth Knowles <[email protected]> wrote:
> >>>
> >>> SGTM. I generally favor "trust the committer's judgment" aka (2)(d)
> when it is obvious. We had some small communication problem about this
> before so I just wanted to be careful to ask the author when it is not
> obvious.
> >>>
> >>> Kenn
> >>>
> >>> On Sat, Sep 29, 2018 at 9:06 AM Robert Bradshaw <[email protected]>
> wrote:
> >>>>
> >>>> There's two separate topics of discussion going on here.
> >>>>
> >>>> (1) Is it acceptable for PRs to have more than one commit. We've
> discussed this before, and the consensus seems to be that this is both
> allowed and intentional.
> >>>>
> >>>> (2) What to do about PRs that are clearly "[BEAM-NNNN] Implement
> feature" + (fixup/address comments/lint/...)*. I think this is easier (or
> at least quicker) to make a call on. Our options are
> >>>>
> >>>> (a) Merge as is.
> >>>> (b) Ask the author to do a final squash. (Doing a squash before an
> approval, however, makes reviewing more difficult, so this enforces another
> round.)
> >>>> (c) Having the committer manually pull, squash, (force push to PR?),
> and merge.
> >>>> (d) Use the "squash and merge" button in this case.
> >>>>
> >>>> Clearly, we shouldn't be doing any squashing unless the intent is
> obvious, but when it is, I think (d) is the sanest and safest route.
> >>>>
> >>>> - Robert
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Sep 28, 2018 at 9:53 PM Kenneth Knowles <[email protected]>
> wrote:
> >>>>>
> >>>>> On Fri, Sep 28, 2018 at 10:29 AM Thomas Weise <[email protected]>
> wrote:
> >>>>>>
> >>>>>> +1 for stating the goal of clear provenance and granular rollback.
> >>>>>
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> Also of course efficiency and quality of review (we don't to review
> tiny or out-of-context changes or huge mega-changes), efficiency of
> authoring (don't want to wait on a review for a tiny bit because GitHub
> makes it very hard to stack up reviews in sequence / don't want to have
> major changes blocked because of difficulty of review), ease of new
> contribution (OK for committers to do more IMO, while new/one-time
> contributors shouldn't need to know or obey any policy).
> >>>>>
> >>>>>> I think this discussion helps to remind/identify best practices how
> to get there. Where appropriate we should augment guidelines for both,
> contributor and committer.
> >>>>>
> >>>>>
> >>>>>> Kenn: would you elaborate on the "1 commit = 1 review (and
> sometimes even = 1 ticket)" a bit more. Is that a problem of insufficient
> task / ticket granularity or something else?
> >>>>>
> >>>>>
> >>>>> The problem isn't a failure to define tasks at the right
> granularity, but that they naturally and fundamentally exist with a
> different granularity (unit of change -> unit of review -> unit of
> tracking/delivery). I'd guess there's often a 5x jump in size from commit
> -> review -> ticket. There are many many easy-to-isolate changes that are
> wasteful to independent review or track.
> >>>>>
> >>>>> To get some concrete facts, I bet the one things we could probably
> find research on is the ideal review size. And we could also scrape logs
> for messages with bullet points (often each bullet is basically what would
> have been a commit). Generally getting a sense of what these ratios are in
> practice and idealized would be kind of interesting but maybe overkill.
> >>>>>
> >>>>> Kenn
> >>>>>
> >>>>>> Charles: my plan is to translate the outcome of this discussion to
> guideline updates and your log filter trick will be part of it. Though my
> hope is that it won't be needed if we get closer to the goals above.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Thomas
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Sep 28, 2018 at 9:44 AM Kenneth Knowles <[email protected]>
> wrote:
> >>>>>>>
> >>>>>>> Anton makes a good point. We have been talking about policy for
> what we do, but the real issue is what we want to come out of it: a clear
> history for seeing where code came from and granular rollback. I think in
> both cases the key thing is that each commit is a single clear change. How
> they get there is not the point.
> >>>>>>>
> >>>>>>> I have worked on multiple projects with a 1 commit = 1 review (and
> sometimes even = 1 ticket). These pretty much never have a good history.
> The best case is that each commit has a message that is a bullet point of
> many separate changes, because it is simply too inefficient to review each
> logical change separately. But since the messages become less useful, it
> encourages a culture of not even bothering to write meaningful messages at
> all.
> >>>>>>>
> >>>>>>> Note that is trivial for a committer-reviewer to edit the history
> any way they like without the button. "Allow edits by maintainers" is on by
> default. The "Squash and merge" button just adds a button for something we
> can already do.
> >>>>>>>
> >>>>>>> Charles: super useful! Worth noting that for a PR with a good
> history it will skip meaningful commits (but still give the summary line,
> which is nice).
> >>>>>>>
> >>>>>>> Kenn
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Sep 28, 2018 at 8:54 AM Anton Kedin <[email protected]>
> wrote:
> >>>>>>>>
> >>>>>>>> Is there an actual problem caused by squashing or not squashing
> the commits that we face in the project? I personally have never needed to
> revert something complicated that would be problematic either way (and
> don't have a strong opinion about which way we should do it). From what I
> see so far in the thread it doesn't look like reverting is a frequent major
> pain for anyone. Maybe it is exactly because we're mostly following some
> best practice and it makes it easy. If someone has concrete examples from
> their experience in the project, please share them, this way it would be
> easier to justify the choice.
> >>>>>>>>
> >>>>>>>> The PR and commit cleanliness, size and isolation are probably
> more important thing to have guidance and maybe enforcement for. There are
> well known practices and guidelines that I think we should follow, and I
> think they will make squashing or not squashing mostly irrelevant. For
> example, if we accept that commits should have description that actually
> describes what commit does, then "!fixup", "address comments" and similar
> should not be part of the history and should be squashed before submitting
> the PR no matter which way we decide to go in general. Also, I think that
> making commits isolated is also a good practice, and PR author should be
> able to relatively easily split the PR upon reviewer's request. And if we
> choose to keep whole PRs small and incremental with descriptive isolated
> commits, then there won't be too much difference how many commits there are.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Anton
> >>>>>>>>
> >>>>>>>> On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud <
> [email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> I brought up this discussion a few months ago from the other
> side: I don't like my commits being squashed. I try to create logical
> commits that each passes tests and could be broken up into multiple PRs.
> Keeping those changes intact is useful from a history perspective and
> squashing may break other PRs I have in flight. If the intent is clear (one
> commit with a descriptive message and a bunch of "fixups"), then feel free
> to squash, otherwise ask first. When you do squash, it would be good to
> leave a comment as to how the author can avoid having their commits
> squashed in the future.
> >>>>>>>>>
> >>>>>>>>>
> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
> >>>>>>>>>
> >>>>>>>>> Andrew
> >>>>>>>>>
> >>>>>>>>> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath <
> [email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw <
> [email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> I agree that we should create a good pointer for cleaning up
> PRs, and request (though not require) that authors do it. It's unfortunate
> though that squashing during a review makes things difficult to follow, so
> adds one more round trip.
> >>>>>>>>>>>
> >>>>>>>>>>> We could consider for those PRs that make sense as a single
> logical commit (most, but not all, of them) simply using the "squash and
> merge" button even though it technically doesn't create a merge commit.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> +1 for allowing "squash and merge" as an option. Most of the
> reviews (at least for me) consist of a single valid commit and several
> additional commits that get piled up during the review process which
> obviously should not be included in the commit history. Going through
> another round here just to ask the author to fixup everything is
> unnecessarily time consuming.
> >>>>>>>>>>
> >>>>>>>>>> - Cham
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira <
> [email protected]> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> As a non-committer I think some automated squashing of
> commits sounds best since it lightens the load of regular contributors, by
> not having to always remember to squash, and lightens the load of
> committers so it doesn't take as long to have your PR approved by one.
> >>>>>>>>>>>>
> >>>>>>>>>>>> But for now I think the second best route would be making it
> PR author's responsibility to squash fixup commits. Having that expectation
> described clearly in the Contributor's Guide, along with some simple
> step-by-step instructions for how to do so should be enough. I mainly
> support this because I've been doing the squashing myself since I saw a
> thread about it here a few months ago. It's not nearly as huge a burden on
> me as it probably is for committers who have to merge in many more PRs,
> it's very easy to learn how to do, and it's one less barrier to having my
> code merged in.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Of course I wouldn't expect that committers wait for PR
> authors to squash their fixup commits, but I think leaving a message like
> "For future pull requests you should squash any small fixup commits, as
> described here: <link>" should be fine.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I was also thinking about the possibility of wanting to
> revert
> >>>>>>>>>>>>> individual commits from a merge commit. The solution you
> propose works,
> >>>>>>>>>>>>> but only if you want to revert everything.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Does this happen often? I might not have enough context since
> I'm not a committer, but it seems to me that often the person performing a
> revert is not the original author of a change and doesn't have the context
> or time to pick out an individual commit to revert.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels <
> [email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I tend to agree with you Lukasz. Of course we should try to
> follow the
> >>>>>>>>>>>>> guide lines as much as possible but if it requires an extra
> back and
> >>>>>>>>>>>>> forth with the PR author for a cosmetic change, it may not
> be worth the
> >>>>>>>>>>>>> time.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 19.09.18 22:17, Lukasz Cwik wrote:
> >>>>>>>>>>>>> > I have to say I'm guilty of not following the merge
> guidelines,
> >>>>>>>>>>>>> > sometimes doing merges without rebasing/flatten commits.
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> > I find that it is a few extra mins of my time to fix
> someones PR history
> >>>>>>>>>>>>> > if they have more then one logical commit they want to be
> separate and
> >>>>>>>>>>>>> > it usually takes days for the PR author to do merging
> with the extra
> >>>>>>>>>>>>> > burden as a committer to keep track of another PR and its
> state (waiting
> >>>>>>>>>>>>> > for clean-up) is taxing. I really liked the idea of the
> mergebot (even
> >>>>>>>>>>>>> > though it didn't work out in practice) because it could do
> all the
> >>>>>>>>>>>>> > policy work on my behalf.
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> > Anything that reduces my overhead as a committer is useful
> as for the
> >>>>>>>>>>>>> > 100s of PRs that I have merged, I've only had to rollback
> a couple so
> >>>>>>>>>>>>> > I'm for Charle's suggestion which makes the rollback flow
> slightly more
> >>>>>>>>>>>>> > complicated for a significantly easier PR merge workflow.
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen <
> [email protected]
> >>>>>>>>>>>>> > <mailto:[email protected]>> wrote:
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >     What I mean is that if you get the first-parent commit
> using "git
> >>>>>>>>>>>>> >     log --first-parent", it will incorporate any and all
> fix up commits
> >>>>>>>>>>>>> >     so we don't need to worry about missing any.
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >     On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels <
> [email protected]
> >>>>>>>>>>>>> >     <mailto:[email protected]>> wrote:
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >         Generally, +1 for isolated commits which are easy
> to revert.
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >          > I don't think it's actually harder to roll back
> a set of
> >>>>>>>>>>>>> >         commits that are merged together.
> >>>>>>>>>>>>> >         I think Thomas was mainly concerned about "fixup"
> commits to
> >>>>>>>>>>>>> >         land in
> >>>>>>>>>>>>> >         master (as part of a merge). These indeed make
> reverting commits
> >>>>>>>>>>>>> >         more
> >>>>>>>>>>>>> >         difficult because you have to check whether you
> missed a "fixup".
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >          > Ideally every commit should compile and pass
> tests though, right?
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >         That is definitely what we should strive for when
> doing a merge
> >>>>>>>>>>>>> >         against
> >>>>>>>>>>>>> >         master.
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >          > Perhaps the bigger issue is that we need better
> documentation
> >>>>>>>>>>>>> >         and a playbook on how to do this these common
> tasks in git.
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >         We do actually have basic documentation about this
> but most
> >>>>>>>>>>>>> >         people don't
> >>>>>>>>>>>>> >         read it. For example, the commit message of a
> Merge commit
> >>>>>>>>>>>>> >         should be:
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >         Merge pull request #xxxx: [BEAM-yyyy] Issue title
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >         But most merge commits don't comply with this rule
> :) See
> >>>>>>>>>>>>> >
> https://beam.apache.org/contribute/committer-guide/#merging-it
> >>>>>>>>>>>>> >
> >>>>>>>>>>>>> >         On 19.09.18 21:34, Reuven Lax wrote:
> >>>>>>>>>>>>> >          > Ideally every commit should compile and pass
> tests though, right?
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka
> >>>>>>>>>>>>> >         <[email protected] <mailto:[email protected]>
> >>>>>>>>>>>>> >          > <mailto:[email protected] <mailto:
> [email protected]>>> wrote:
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >     I agree with the cleanliness of the Commit
> history.
> >>>>>>>>>>>>> >          >     "Fixup!", "Address comments", "Address even
> more
> >>>>>>>>>>>>> >         comments" type of
> >>>>>>>>>>>>> >          >     comments does not convey meaningful
> information and are
> >>>>>>>>>>>>> >         not very
> >>>>>>>>>>>>> >          >     useful. Its a good idea to squash them.
> >>>>>>>>>>>>> >          >     However, I think its ok to keep separate
> commits for
> >>>>>>>>>>>>> >         different
> >>>>>>>>>>>>> >          >     logical pieces of the code which make
> reviewing and
> >>>>>>>>>>>>> >         revisiting code
> >>>>>>>>>>>>> >          >     easier.
> >>>>>>>>>>>>> >          >     Example PR: Support X in the pipeline
> >>>>>>>>>>>>> >          >     Commit 1: Restructuring a bunch of code
> without any
> >>>>>>>>>>>>> >         logical change.
> >>>>>>>>>>>>> >          >     Commit 2: Changing validation logic for
> pipeline.
> >>>>>>>>>>>>> >          >     Commit 3: Supporting new field "X" for
> pipeline.
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >     On Wed, Sep 19, 2018 at 11:27 AM Charles
> Chen
> >>>>>>>>>>>>> >         <[email protected] <mailto:[email protected]>
> >>>>>>>>>>>>> >          >     <mailto:[email protected] <mailto:
> [email protected]>>> wrote:
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >         To be concrete, it is very easy to
> revert a commit in
> >>>>>>>>>>>>> >         any case:
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >          1. First, use "git log --first-parent"
> to find the
> >>>>>>>>>>>>> >         first-parent
> >>>>>>>>>>>>> >          >             commit corresponding to a PR merge
> (this is a
> >>>>>>>>>>>>> >         one-to-one
> >>>>>>>>>>>>> >          >             correspondence).
> >>>>>>>>>>>>> >          >          2. Use "git revert -m 1 <commitid>" to
> revert the
> >>>>>>>>>>>>> >         commit; this
> >>>>>>>>>>>>> >          >             selects the first parent as the
> base for a merge
> >>>>>>>>>>>>> >         commit (in
> >>>>>>>>>>>>> >          >             the case where a single commit
> needs to be
> >>>>>>>>>>>>> >         reverted, just
> >>>>>>>>>>>>> >          >             use "git revert <commitid>" without
> the "-m 1" flag).
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >         In any case, as a general good
> engineering practice,
> >>>>>>>>>>>>> >         I do agree
> >>>>>>>>>>>>> >          >         that it is highly desirable to have
> small independent PRs
> >>>>>>>>>>>>> >          >         instead of large jumbo PRs whenever
> possible.
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >         On Wed, Sep 19, 2018 at 11:20 AM
> Charles Chen
> >>>>>>>>>>>>> >         <[email protected] <mailto:[email protected]>
> >>>>>>>>>>>>> >          >         <mailto:[email protected] <mailto:
> [email protected]>>> wrote:
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >             I don't think it's actually harder
> to roll back a
> >>>>>>>>>>>>> >         set of
> >>>>>>>>>>>>> >          >             commits that are merged together.
> Git has the
> >>>>>>>>>>>>> >         notion of
> >>>>>>>>>>>>> >          >             first-parent commits (you can see,
> for example,
> >>>>>>>>>>>>> >         "git log
> >>>>>>>>>>>>> >          >             --first-parent", which filters out
> the intermediate
> >>>>>>>>>>>>> >          >             commits).  In this sense, PRs still
> get merged as
> >>>>>>>>>>>>> >         one unit
> >>>>>>>>>>>>> >          >             and this is preserved even if
> intermediate
> >>>>>>>>>>>>> >         commits are
> >>>>>>>>>>>>> >          >             kept.  Perhaps the bigger issue is
> that we need
> >>>>>>>>>>>>> >         better
> >>>>>>>>>>>>> >          >             documentation and a playbook on how
> to do this
> >>>>>>>>>>>>> >         these common
> >>>>>>>>>>>>> >          >             tasks in git.
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >             On Wed, Sep 19, 2018 at 9:27 AM
> Thomas Weise
> >>>>>>>>>>>>> >         <[email protected] <mailto:[email protected]>
> >>>>>>>>>>>>> >          >             <mailto:[email protected] <mailto:
> [email protected]>>>
> >>>>>>>>>>>>> >         wrote:
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >                 Wanted to bring this up as
> reminder as well as
> >>>>>>>>>>>>> >          >                 opportunity to discuss
> potential changes to our
> >>>>>>>>>>>>> >          >                 committer guide. It has been a
> while since
> >>>>>>>>>>>>> >         last related
> >>>>>>>>>>>>> >          >                 discussion and we welcomed
> several new
> >>>>>>>>>>>>> >         committers since
> >>>>>>>>>>>>> >          >                 then.
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >                 Finishing up pull requests
> pre-merge:
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >
> https://beam.apache.org/contribute/committer-guide/#finishing-touches
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >                 PRs are worked on over time and
> may
> >>>>>>>>>>>>> >         accumulate many
> >>>>>>>>>>>>> >          >                 commits. Sometimes because
> scope expands,
> >>>>>>>>>>>>> >         sometimes just
> >>>>>>>>>>>>> >          >                 to separate independent changes
> but most of
> >>>>>>>>>>>>> >         the time the
> >>>>>>>>>>>>> >          >                 commits are just fixups that
> are added as
> >>>>>>>>>>>>> >         review progresses.
> >>>>>>>>>>>>> >          >                 It is important that the latter
> get squashed
> >>>>>>>>>>>>> >         prior to PR
> >>>>>>>>>>>>> >          >                 merge, as otherwise we lost the
> ability to
> >>>>>>>>>>>>> >         roll back
> >>>>>>>>>>>>> >          >                 changes by reverting a single
> commit and also
> >>>>>>>>>>>>> >         generally
> >>>>>>>>>>>>> >          >                 cause a lot of noise in the
> commit history
> >>>>>>>>>>>>> >         that does not
> >>>>>>>>>>>>> >          >                 help other contributors. To be
> clear, I refer
> >>>>>>>>>>>>> >         to the
> >>>>>>>>>>>>> >          >                 "Fixup!", "Address comments",
> "Address even more
> >>>>>>>>>>>>> >          >                 comments" type of entries :)
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >                 I would also propose that every
> commit gets
> >>>>>>>>>>>>> >         tagged with
> >>>>>>>>>>>>> >          >                 a JIRA (except those fixups
> that will be
> >>>>>>>>>>>>> >         squashed).
> >>>>>>>>>>>>> >          >                 Having the JIRA and possibly
> other tags makes
> >>>>>>>>>>>>> >         it easier
> >>>>>>>>>>>>> >          >                 for others not involved in the
> PR to identify
> >>>>>>>>>>>>> >         changes
> >>>>>>>>>>>>> >          >                 after they were merged, for
> example when
> >>>>>>>>>>>>> >         looking at the
> >>>>>>>>>>>>> >          >                 revision history or annotated
> source.
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >                 As for other scenarios of jumbo
> PRs with many
> >>>>>>>>>>>>> >         commits,
> >>>>>>>>>>>>> >          >                 there are probably situations
> where work
> >>>>>>>>>>>>> >         needs to be
> >>>>>>>>>>>>> >          >                 broken down into smaller units,
> making life
> >>>>>>>>>>>>> >         better for
> >>>>>>>>>>>>> >          >                 both, contributor and
> reviewer(s). Ideally,
> >>>>>>>>>>>>> >         every PR
> >>>>>>>>>>>>> >          >                 would have only one commit, but
> that may be a
> >>>>>>>>>>>>> >         bit much
> >>>>>>>>>>>>> >          >                 to mandate? Is the general
> expectation
> >>>>>>>>>>>>> >         something we need
> >>>>>>>>>>>>> >          >                 to document more clearly?
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >          >                 Thanks,
> >>>>>>>>>>>>> >          >                 Thomas
> >>>>>>>>>>>>> >          >
> >>>>>>>>>>>>> >
> >>>
> >>>
>

Reply via email to