Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-11-26 Thread Thomas Weise
PR for this: https://github.com/apache/beam/pull/7129


On Tue, Oct 16, 2018 at 11:40 AM Robert Bradshaw 
wrote:

> Thanks for bringing this to a conclusion.
>
> On Mon, Oct 15, 2018 at 6:18 PM Thomas Weise  wrote:
> >
> > Here is my attempt to summarize the discussion, please see the TBDs.
> >
> > I would work on a PR with respective contributor and committer guideline
> updates next.
> >
> > Thanks,
> > Thomas
> >
> >
> > Goals:
> >
> > - Clear history with purpose and origin of changes
> > - Ability to perform a granular rollback, if necessary
> > - Efficiency and quality of review (avoid tiny or out-of-context changes
> or huge mega-changes)
> > - Efficiency of authoring (don't want to wait on a review for a tiny bit
> because GitHub makes it very hard to stack up reviews in sequence / don't
> want to have major changes blocked because of difficulty of review)
> > - Ease of new contribution ("OK for committers to do more, while
> new/one-time contributors shouldn't need to know or obey any policy"). TBD:
> I think that quote needs clarification. IMO contributors are expected to
> read and adhere to contribution guidelines.
> >
> > Clean history:
> >
> > - Commit messages should tag JIRAs and be otherwise descriptive. It
> should not be necessary to find a merge or first PR commit to find out what
> caused a change
> > - We want to squash "Fixup!", "Address comments" type of commits to
> achieve a clean history
> > - We prefer that PR authors squash such commits after review is
> complete. This expectation should be described clearly in the Contributor's
> Guide.
> > - The process should not burden committer due to back and forth with
> author and deal with cleaning up PR history and other cosmetics. We want to
> reduce committer overhead. But note that we don't want to shift the burden
> to first time contributors either.
> > - Committer can use the "squash and merge" option (or modify the PR
> commits in other ways). This should address the overhead concern. TBD: I
> would suggest that the author, when this is explicitly not desired, needs
> to indicate it upfront.
>
> I wouldn't require that the author note this on every PR. I would say
> if there are (obvious) fixup commits, the committer is free to squash
> unless asked not to. If the history is clean, favor trusting the
> author. Subject to the bullet point below.
>
> > - Committer is ultimately responsible (committer guidelines) and we
> "trust the committer's judgment"
> >
> > Granularity of changes:
> >
> > - We prefer small independent, incremental PRs with descriptive,
> isolated commits. Each commit is a single clear change.
> > - It is OK to keep separate commits for different logical pieces of the
> code, which make reviewing and revisiting code easier
> > - Making commits isolated is a good practice, PR author should be able
> to relatively easily split the PR upon reviewer's request
> > - For multiple commits in a PR, every commit that gets merged should
> compile and pass tests
>
> Generally +1, though there is an important exception. E,g. for large
> refactoring, often it's clearer (for both history and reviewing) to
> have a single commit that's "ran this tool/command line/script/..." on
> the codebase and a follow-up that's manual, human edits (e.g. to get
> it to compile and pass tests).
>
> > Git playbook
> >
> > - How to rollback multiple commits from single PR
> > - Simple step-by-step instructions for authors to cleanup history
>
> Regarding "rebase and merge," it's often the worst of both, as it
> gives multiple commits, but none of them matching what was in the
> author's branch. Generally those who care about history enough to
> create multiple distinct commits also prefer that the commits remain
> intact.
>
> > On Mon, Oct 1, 2018 at 11:50 AM Rui Wang  wrote:
> >>
> >> +1 to add JIRA issue as the part of commit message. it requires less
> effort but will help a lot on our commit history. I used to not do that but
> I will start to add JIRA info to my future commits.
> >>
> >> -Rui
> >>
> >> On Mon, Oct 1, 2018 at 2:59 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
> >>>
> >>> +1 to what Anton said, I’m fully agree with this.
> >>>
> >>> Looking on the current commit history we can notice that there are
> many commits like “fix”, “add/edit/remove”, “clean up”, etc, that, seems,
> were introduced after the review process iterations. I think this makes
> commit history too much verbose and more difficult to follow.
> >>>
> >>> Also, most of such commits don’t have Jira issue name as part of
> commit message. So it requires to find a merge or fist PR commit in case we
> want to find out what caused this change.
> >>>
> >>> I believe we can improve our commit guidelines in this way and it
> should help to have commit history more clean and easy to read.
> >>>
> >>> On 1 Oct 2018, at 06:34, Kenneth Knowles  wrote:
> >>>
> >>> SGTM. I generally favor "trust the committer's judgment" aka (2)(d)
> when it is obvious. We had

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-10-16 Thread Robert Bradshaw
Thanks for bringing this to a conclusion.

On Mon, Oct 15, 2018 at 6:18 PM Thomas Weise  wrote:
>
> Here is my attempt to summarize the discussion, please see the TBDs.
>
> I would work on a PR with respective contributor and committer guideline 
> updates next.
>
> Thanks,
> Thomas
>
>
> Goals:
>
> - Clear history with purpose and origin of changes
> - Ability to perform a granular rollback, if necessary
> - Efficiency and quality of review (avoid tiny or out-of-context changes or 
> huge mega-changes)
> - Efficiency of authoring (don't want to wait on a review for a tiny bit 
> because GitHub makes it very hard to stack up reviews in sequence / don't 
> want to have major changes blocked because of difficulty of review)
> - Ease of new contribution ("OK for committers to do more, while new/one-time 
> contributors shouldn't need to know or obey any policy"). TBD: I think that 
> quote needs clarification. IMO contributors are expected to read and adhere 
> to contribution guidelines.
>
> Clean history:
>
> - Commit messages should tag JIRAs and be otherwise descriptive. It should 
> not be necessary to find a merge or first PR commit to find out what caused a 
> change
> - We want to squash "Fixup!", "Address comments" type of commits to achieve a 
> clean history
> - We prefer that PR authors squash such commits after review is complete. 
> This expectation should be described clearly in the Contributor's Guide.
> - The process should not burden committer due to back and forth with author 
> and deal with cleaning up PR history and other cosmetics. We want to reduce 
> committer overhead. But note that we don't want to shift the burden to first 
> time contributors either.
> - Committer can use the "squash and merge" option (or modify the PR commits 
> in other ways). This should address the overhead concern. TBD: I would 
> suggest that the author, when this is explicitly not desired, needs to 
> indicate it upfront.

I wouldn't require that the author note this on every PR. I would say
if there are (obvious) fixup commits, the committer is free to squash
unless asked not to. If the history is clean, favor trusting the
author. Subject to the bullet point below.

> - Committer is ultimately responsible (committer guidelines) and we "trust 
> the committer's judgment"
>
> Granularity of changes:
>
> - We prefer small independent, incremental PRs with descriptive, isolated 
> commits. Each commit is a single clear change.
> - It is OK to keep separate commits for different logical pieces of the code, 
> which make reviewing and revisiting code easier
> - Making commits isolated is a good practice, PR author should be able to 
> relatively easily split the PR upon reviewer's request
> - For multiple commits in a PR, every commit that gets merged should compile 
> and pass tests

Generally +1, though there is an important exception. E,g. for large
refactoring, often it's clearer (for both history and reviewing) to
have a single commit that's "ran this tool/command line/script/..." on
the codebase and a follow-up that's manual, human edits (e.g. to get
it to compile and pass tests).

> Git playbook
>
> - How to rollback multiple commits from single PR
> - Simple step-by-step instructions for authors to cleanup history

Regarding "rebase and merge," it's often the worst of both, as it
gives multiple commits, but none of them matching what was in the
author's branch. Generally those who care about history enough to
create multiple distinct commits also prefer that the commits remain
intact.

> On Mon, Oct 1, 2018 at 11:50 AM Rui Wang  wrote:
>>
>> +1 to add JIRA issue as the part of commit message. it requires less effort 
>> but will help a lot on our commit history. I used to not do that but I will 
>> start to add JIRA info to my future commits.
>>
>> -Rui
>>
>> On Mon, Oct 1, 2018 at 2:59 AM Alexey Romanenko  
>> wrote:
>>>
>>> +1 to what Anton said, I’m fully agree with this.
>>>
>>> Looking on the current commit history we can notice that there are many 
>>> commits like “fix”, “add/edit/remove”, “clean up”, etc, that, seems, were 
>>> introduced after the review process iterations. I think this makes commit 
>>> history too much verbose and more difficult to follow.
>>>
>>> Also, most of such commits don’t have Jira issue name as part of commit 
>>> message. So it requires to find a merge or fist PR commit in case we want 
>>> to find out what caused this change.
>>>
>>> I believe we can improve our commit guidelines in this way and it should 
>>> help to have commit history more clean and easy to read.
>>>
>>> On 1 Oct 2018, at 06:34, Kenneth Knowles  wrote:
>>>
>>> SGTM. I generally favor "trust the committer's judgment" aka (2)(d) when it 
>>> is obvious. We had some small communication problem about this before so I 
>>> just wanted to be careful to ask the author when it is not obvious.
>>>
>>> Kenn
>>>
>>> On Sat, Sep 29, 2018 at 9:06 AM Robert Bradshaw  wrote:

 There's two separate topic

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-10-16 Thread Mikhail Gryzykhin
+1 in general.

However, I'd suggest to use "rebase and merge" more often.
Otherwise reading history is really inconvenient. Like in attached
screenshot. You can never easily understand what is included in any of
previous commits of remotes/upstream/master, unless you do explicit git log
on those and do more digging. Having a linear history on apache/beam:master
will remove such ambiguity.

--Mikhail

Have feedback ?


On Mon, Oct 15, 2018 at 9:18 AM Thomas Weise  wrote:

> Here is my attempt to summarize the discussion, please see the TBDs.
>
> I would work on a PR with respective contributor and committer guideline
> updates next.
>
> Thanks,
> Thomas
>
>
> Goals:
>
> - Clear history with purpose and origin of changes
> - Ability to perform a granular rollback, if necessary
> - Efficiency and quality of review (avoid tiny or out-of-context changes
> or huge mega-changes)
> - Efficiency of authoring (don't want to wait on a review for a tiny bit
> because GitHub makes it very hard to stack up reviews in sequence / don't
> want to have major changes blocked because of difficulty of review)
> - Ease of new contribution ("OK for committers to do more, while
> new/one-time contributors shouldn't need to know or obey any policy"). TBD:
> I think that quote needs clarification. IMO contributors are expected to
> read and adhere to contribution guidelines.
>
> Clean history:
>
> - Commit messages should tag JIRAs and be otherwise descriptive. It should
> not be necessary to find a merge or first PR commit to find out what caused
> a change
> - We want to squash "Fixup!", "Address comments" type of commits to
> achieve a clean history
> - We prefer that PR authors squash such commits after review is complete.
> This expectation should be described clearly in the Contributor's Guide.
> - The process should not burden committer due to back and forth with
> author and deal with cleaning up PR history and other cosmetics. We want to
> reduce committer overhead. But note that we don't want to shift the burden
> to first time contributors either.
> - Committer can use the "squash and merge" option (or modify the PR
> commits in other ways). This should address the overhead concern. TBD: I
> would suggest that the author, when this is explicitly not desired, needs
> to indicate it upfront.
> - Committer is ultimately responsible (committer guidelines) and we "trust
> the committer's judgment"
>
> Granularity of changes:
>
> - We prefer small independent, incremental PRs with descriptive, isolated
> commits. Each commit is a single clear change.
> - It is OK to keep separate commits for different logical pieces of the
> code, which make reviewing and revisiting code easier
> - Making commits isolated is a good practice, PR author should be able to
> relatively easily split the PR upon reviewer's request
> - For multiple commits in a PR, every commit that gets merged should
> compile and pass tests
>
> Git playbook
>
> - How to rollback multiple commits from single PR
> - Simple step-by-step instructions for authors to cleanup history
>
>
>
> On Mon, Oct 1, 2018 at 11:50 AM Rui Wang  wrote:
>
>> +1 to add JIRA issue as the part of commit message. it requires less
>> effort but will help a lot on our commit history. I used to not do that but
>> I will start to add JIRA info to my future commits.
>>
>> -Rui
>>
>> On Mon, Oct 1, 2018 at 2:59 AM Alexey Romanenko 
>> wrote:
>>
>>> +1 to what Anton said, I’m fully agree with this.
>>>
>>> Looking on the current commit history we can notice that there are many
>>> commits like “fix”, “add/edit/remove”, “clean up”, etc, that, seems, were
>>> introduced after the review process iterations. I think this makes commit
>>> history too much verbose and more difficult to follow.
>>>
>>> Also, most of such commits don’t have Jira issue name as part of commit
>>> message. So it requires to find a merge or fist PR commit in case we want
>>> to find out what caused this change.
>>>
>>> I believe we can improve our commit guidelines in this way and it should
>>> help to have commit history more clean and easy to read.
>>>
>>> On 1 Oct 2018, at 06:34, Kenneth Knowles  wrote:
>>>
>>> SGTM. I generally favor "trust the committer's judgment" aka (2)(d) when
>>> it is obvious. We had some small communication problem about this before so
>>> I just wanted to be careful to ask the author when it is not obvious.
>>>
>>> Kenn
>>>
>>> On Sat, Sep 29, 2018 at 9:06 AM Robert Bradshaw 
>>> wrote:
>>>
 There's two separate topics of discussion going on here.

 (1) Is it acceptable for PRs to have more than one commit. We've
 discussed this before, and the consensus seems to be that this is both
 allowed and intentional.

 (2) What to do about PRs that are clearly "[BEAM-] Implement
 feature" + (fixup/address comments/lint/...)*. I think this is easier (or
 at least quicker) to make a call on. Our options are

 (a) Merge as is.

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-10-15 Thread Thomas Weise
Here is my attempt to summarize the discussion, please see the TBDs.

I would work on a PR with respective contributor and committer guideline
updates next.

Thanks,
Thomas


Goals:

- Clear history with purpose and origin of changes
- Ability to perform a granular rollback, if necessary
- Efficiency and quality of review (avoid tiny or out-of-context changes or
huge mega-changes)
- Efficiency of authoring (don't want to wait on a review for a tiny bit
because GitHub makes it very hard to stack up reviews in sequence / don't
want to have major changes blocked because of difficulty of review)
- Ease of new contribution ("OK for committers to do more, while
new/one-time contributors shouldn't need to know or obey any policy"). TBD:
I think that quote needs clarification. IMO contributors are expected to
read and adhere to contribution guidelines.

Clean history:

- Commit messages should tag JIRAs and be otherwise descriptive. It should
not be necessary to find a merge or first PR commit to find out what caused
a change
- We want to squash "Fixup!", "Address comments" type of commits to achieve
a clean history
- We prefer that PR authors squash such commits after review is complete.
This expectation should be described clearly in the Contributor's Guide.
- The process should not burden committer due to back and forth with author
and deal with cleaning up PR history and other cosmetics. We want to reduce
committer overhead. But note that we don't want to shift the burden to
first time contributors either.
- Committer can use the "squash and merge" option (or modify the PR commits
in other ways). This should address the overhead concern. TBD: I would
suggest that the author, when this is explicitly not desired, needs to
indicate it upfront.
- Committer is ultimately responsible (committer guidelines) and we "trust
the committer's judgment"

Granularity of changes:

- We prefer small independent, incremental PRs with descriptive, isolated
commits. Each commit is a single clear change.
- It is OK to keep separate commits for different logical pieces of the
code, which make reviewing and revisiting code easier
- Making commits isolated is a good practice, PR author should be able to
relatively easily split the PR upon reviewer's request
- For multiple commits in a PR, every commit that gets merged should
compile and pass tests

Git playbook

- How to rollback multiple commits from single PR
- Simple step-by-step instructions for authors to cleanup history



On Mon, Oct 1, 2018 at 11:50 AM Rui Wang  wrote:

> +1 to add JIRA issue as the part of commit message. it requires less
> effort but will help a lot on our commit history. I used to not do that but
> I will start to add JIRA info to my future commits.
>
> -Rui
>
> On Mon, Oct 1, 2018 at 2:59 AM Alexey Romanenko 
> wrote:
>
>> +1 to what Anton said, I’m fully agree with this.
>>
>> Looking on the current commit history we can notice that there are many
>> commits like “fix”, “add/edit/remove”, “clean up”, etc, that, seems, were
>> introduced after the review process iterations. I think this makes commit
>> history too much verbose and more difficult to follow.
>>
>> Also, most of such commits don’t have Jira issue name as part of commit
>> message. So it requires to find a merge or fist PR commit in case we want
>> to find out what caused this change.
>>
>> I believe we can improve our commit guidelines in this way and it should
>> help to have commit history more clean and easy to read.
>>
>> On 1 Oct 2018, at 06:34, Kenneth Knowles  wrote:
>>
>> SGTM. I generally favor "trust the committer's judgment" aka (2)(d) when
>> it is obvious. We had some small communication problem about this before so
>> I just wanted to be careful to ask the author when it is not obvious.
>>
>> Kenn
>>
>> On Sat, Sep 29, 2018 at 9:06 AM Robert Bradshaw 
>> wrote:
>>
>>> There's two separate topics of discussion going on here.
>>>
>>> (1) Is it acceptable for PRs to have more than one commit. We've
>>> discussed this before, and the consensus seems to be that this is both
>>> allowed and intentional.
>>>
>>> (2) What to do about PRs that are clearly "[BEAM-] Implement
>>> feature" + (fixup/address comments/lint/...)*. I think this is easier (or
>>> at least quicker) to make a call on. Our options are
>>>
>>> (a) Merge as is.
>>> (b) Ask the author to do a final squash. (Doing a squash before an
>>> approval, however, makes reviewing more difficult, so this enforces another
>>> round.)
>>> (c) Having the committer manually pull, squash, (force push to PR?), and
>>> merge.
>>> (d) Use the "squash and merge" button in this case.
>>>
>>> Clearly, we shouldn't be doing any squashing unless the intent is
>>> obvious, but when it is, I think (d) is the sanest and safest route.
>>>
>>> - Robert
>>>
>>>
>>>
>>> On Fri, Sep 28, 2018 at 9:53 PM Kenneth Knowles  wrote:
>>>
 On Fri, Sep 28, 2018 at 10:29 AM Thomas Weise  wrote:

> +1 for stating the goal of clear provenance and

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-10-01 Thread Rui Wang
+1 to add JIRA issue as the part of commit message. it requires less effort
but will help a lot on our commit history. I used to not do that but I will
start to add JIRA info to my future commits.

-Rui

On Mon, Oct 1, 2018 at 2:59 AM Alexey Romanenko 
wrote:

> +1 to what Anton said, I’m fully agree with this.
>
> Looking on the current commit history we can notice that there are many
> commits like “fix”, “add/edit/remove”, “clean up”, etc, that, seems, were
> introduced after the review process iterations. I think this makes commit
> history too much verbose and more difficult to follow.
>
> Also, most of such commits don’t have Jira issue name as part of commit
> message. So it requires to find a merge or fist PR commit in case we want
> to find out what caused this change.
>
> I believe we can improve our commit guidelines in this way and it should
> help to have commit history more clean and easy to read.
>
> On 1 Oct 2018, at 06:34, Kenneth Knowles  wrote:
>
> SGTM. I generally favor "trust the committer's judgment" aka (2)(d) when
> it is obvious. We had some small communication problem about this before so
> I just wanted to be careful to ask the author when it is not obvious.
>
> Kenn
>
> On Sat, Sep 29, 2018 at 9:06 AM Robert Bradshaw 
> wrote:
>
>> There's two separate topics of discussion going on here.
>>
>> (1) Is it acceptable for PRs to have more than one commit. We've
>> discussed this before, and the consensus seems to be that this is both
>> allowed and intentional.
>>
>> (2) What to do about PRs that are clearly "[BEAM-] Implement feature"
>> + (fixup/address comments/lint/...)*. I think this is easier (or at least
>> quicker) to make a call on. Our options are
>>
>> (a) Merge as is.
>> (b) Ask the author to do a final squash. (Doing a squash before an
>> approval, however, makes reviewing more difficult, so this enforces another
>> round.)
>> (c) Having the committer manually pull, squash, (force push to PR?), and
>> merge.
>> (d) Use the "squash and merge" button in this case.
>>
>> Clearly, we shouldn't be doing any squashing unless the intent is
>> obvious, but when it is, I think (d) is the sanest and safest route.
>>
>> - Robert
>>
>>
>>
>> On Fri, Sep 28, 2018 at 9:53 PM Kenneth Knowles  wrote:
>>
>>> On Fri, Sep 28, 2018 at 10:29 AM Thomas Weise  wrote:
>>>
 +1 for stating the goal of clear provenance and granular rollback.

>>>
>>>


>>> Also of course efficiency and quality of review (we don't to review tiny
>>> or out-of-context changes or huge mega-changes), efficiency of authoring
>>> (don't want to wait on a review for a tiny bit because GitHub makes it very
>>> hard to stack up reviews in sequence / don't want to have major changes
>>> blocked because of difficulty of review), ease of new contribution (OK for
>>> committers to do more IMO, while new/one-time contributors shouldn't need
>>> to know or obey any policy).
>>>
>>> I think this discussion helps to remind/identify best practices how to
 get there. Where appropriate we should augment guidelines for both,
 contributor and committer.

>>>
>>> Kenn: would you elaborate on the "1 commit = 1 review (and sometimes
 even = 1 ticket)" a bit more. Is that a problem of insufficient task /
 ticket granularity or something else?

>>>
>>> The problem isn't a failure to define tasks at the right granularity,
>>> but that they naturally and fundamentally exist with a different
>>> granularity (unit of change -> unit of review -> unit of
>>> tracking/delivery). I'd guess there's often a 5x jump in size from commit
>>> -> review -> ticket. There are many many easy-to-isolate changes that are
>>> wasteful to independent review or track.
>>>
>>> To get some concrete facts, I bet the one things we could probably find
>>> research on is the ideal review size. And we could also scrape logs for
>>> messages with bullet points (often each bullet is basically what would have
>>> been a commit). Generally getting a sense of what these ratios are in
>>> practice and idealized would be kind of interesting but maybe overkill.
>>>
>>> Kenn
>>>
>>> Charles: my plan is to translate the outcome of this discussion to
 guideline updates and your log filter trick will be part of it. Though my
 hope is that it won't be needed if we get closer to the goals above.

 Thanks,
 Thomas


 On Fri, Sep 28, 2018 at 9:44 AM Kenneth Knowles 
 wrote:

> Anton makes a good point. We have been talking about policy for what
> we do, but the real issue is what we want to come out of it: a clear
> history for seeing where code came from and granular rollback. I think in
> both cases the key thing is that each commit is a single clear change. How
> they get there is not the point.
>
> I have worked on multiple projects with a 1 commit = 1 review (and
> sometimes even = 1 ticket). These pretty much never have a good history.
> The best case is that 

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-10-01 Thread Alexey Romanenko
+1 to what Anton said, I’m fully agree with this. 

Looking on the current commit history we can notice that there are many commits 
like “fix”, “add/edit/remove”, “clean up”, etc, that, seems, were introduced 
after the review process iterations. I think this makes commit history too much 
verbose and more difficult to follow. 

Also, most of such commits don’t have Jira issue name as part of commit 
message. So it requires to find a merge or fist PR commit in case we want to 
find out what caused this change. 

I believe we can improve our commit guidelines in this way and it should help 
to have commit history more clean and easy to read. 

> On 1 Oct 2018, at 06:34, Kenneth Knowles  wrote:
> 
> SGTM. I generally favor "trust the committer's judgment" aka (2)(d) when it 
> is obvious. We had some small communication problem about this before so I 
> just wanted to be careful to ask the author when it is not obvious.
> 
> Kenn
> 
> On Sat, Sep 29, 2018 at 9:06 AM Robert Bradshaw  > wrote:
> There's two separate topics of discussion going on here. 
> 
> (1) Is it acceptable for PRs to have more than one commit. We've discussed 
> this before, and the consensus seems to be that this is both allowed and 
> intentional. 
> 
> (2) What to do about PRs that are clearly "[BEAM-] Implement feature" + 
> (fixup/address comments/lint/...)*. I think this is easier (or at least 
> quicker) to make a call on. Our options are 
> 
> (a) Merge as is.
> (b) Ask the author to do a final squash. (Doing a squash before an approval, 
> however, makes reviewing more difficult, so this enforces another round.)
> (c) Having the committer manually pull, squash, (force push to PR?), and 
> merge. 
> (d) Use the "squash and merge" button in this case. 
> 
> Clearly, we shouldn't be doing any squashing unless the intent is obvious, 
> but when it is, I think (d) is the sanest and safest route. 
> 
> - Robert
> 
> 
> 
> On Fri, Sep 28, 2018 at 9:53 PM Kenneth Knowles  > wrote:
> On Fri, Sep 28, 2018 at 10:29 AM Thomas Weise  > wrote:
> +1 for stating the goal of clear provenance and granular rollback.
>
>  
> Also of course efficiency and quality of review (we don't to review tiny or 
> out-of-context changes or huge mega-changes), efficiency of authoring (don't 
> want to wait on a review for a tiny bit because GitHub makes it very hard to 
> stack up reviews in sequence / don't want to have major changes blocked 
> because of difficulty of review), ease of new contribution (OK for committers 
> to do more IMO, while new/one-time contributors shouldn't need to know or 
> obey any policy).
> 
> I think this discussion helps to remind/identify best practices how to get 
> there. Where appropriate we should augment guidelines for both, contributor 
> and committer. 
> 
> Kenn: would you elaborate on the "1 commit = 1 review (and sometimes even = 1 
> ticket)" a bit more. Is that a problem of insufficient task / ticket 
> granularity or something else?
> 
> The problem isn't a failure to define tasks at the right granularity, but 
> that they naturally and fundamentally exist with a different granularity 
> (unit of change -> unit of review -> unit of tracking/delivery). I'd guess 
> there's often a 5x jump in size from commit -> review -> ticket. There are 
> many many easy-to-isolate changes that are wasteful to independent review or 
> track.
> 
> To get some concrete facts, I bet the one things we could probably find 
> research on is the ideal review size. And we could also scrape logs for 
> messages with bullet points (often each bullet is basically what would have 
> been a commit). Generally getting a sense of what these ratios are in 
> practice and idealized would be kind of interesting but maybe overkill.
> 
> Kenn
> 
> Charles: my plan is to translate the outcome of this discussion to guideline 
> updates and your log filter trick will be part of it. Though my hope is that 
> it won't be needed if we get closer to the goals above.
> 
> Thanks,
> Thomas
>  
> 
> On Fri, Sep 28, 2018 at 9:44 AM Kenneth Knowles  > wrote:
> Anton makes a good point. We have been talking about policy for what we do, 
> but the real issue is what we want to come out of it: a clear history for 
> seeing where code came from and granular rollback. I think in both cases the 
> key thing is that each commit is a single clear change. How they get there is 
> not the point.
> 
> I have worked on multiple projects with a 1 commit = 1 review (and sometimes 
> even = 1 ticket). These pretty much never have a good history. The best case 
> is that each commit has a message that is a bullet point of many separate 
> changes, because it is simply too inefficient to review each logical change 
> separately. But since the messages become less useful, it encourages a 
> culture of not even bothering to write meaningful messages at all.
> 
> Note that is 

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-30 Thread Kenneth Knowles
SGTM. I generally favor "trust the committer's judgment" aka (2)(d) when it
is obvious. We had some small communication problem about this before so I
just wanted to be careful to ask the author when it is not obvious.

Kenn

On Sat, Sep 29, 2018 at 9:06 AM Robert Bradshaw  wrote:

> There's two separate topics of discussion going on here.
>
> (1) Is it acceptable for PRs to have more than one commit. We've discussed
> this before, and the consensus seems to be that this is both allowed and
> intentional.
>
> (2) What to do about PRs that are clearly "[BEAM-] Implement feature"
> + (fixup/address comments/lint/...)*. I think this is easier (or at least
> quicker) to make a call on. Our options are
>
> (a) Merge as is.
> (b) Ask the author to do a final squash. (Doing a squash before an
> approval, however, makes reviewing more difficult, so this enforces another
> round.)
> (c) Having the committer manually pull, squash, (force push to PR?), and
> merge.
> (d) Use the "squash and merge" button in this case.
>
> Clearly, we shouldn't be doing any squashing unless the intent is obvious,
> but when it is, I think (d) is the sanest and safest route.
>
> - Robert
>
>
>
> On Fri, Sep 28, 2018 at 9:53 PM Kenneth Knowles  wrote:
>
>> On Fri, Sep 28, 2018 at 10:29 AM Thomas Weise  wrote:
>>
>>> +1 for stating the goal of clear provenance and granular rollback.
>>>
>>
>>
>>>
>>>
>> Also of course efficiency and quality of review (we don't to review tiny
>> or out-of-context changes or huge mega-changes), efficiency of authoring
>> (don't want to wait on a review for a tiny bit because GitHub makes it very
>> hard to stack up reviews in sequence / don't want to have major changes
>> blocked because of difficulty of review), ease of new contribution (OK for
>> committers to do more IMO, while new/one-time contributors shouldn't need
>> to know or obey any policy).
>>
>> I think this discussion helps to remind/identify best practices how to
>>> get there. Where appropriate we should augment guidelines for both,
>>> contributor and committer.
>>>
>>
>> Kenn: would you elaborate on the "1 commit = 1 review (and sometimes even
>>> = 1 ticket)" a bit more. Is that a problem of insufficient task / ticket
>>> granularity or something else?
>>>
>>
>> The problem isn't a failure to define tasks at the right granularity, but
>> that they naturally and fundamentally exist with a different granularity
>> (unit of change -> unit of review -> unit of tracking/delivery). I'd guess
>> there's often a 5x jump in size from commit -> review -> ticket. There are
>> many many easy-to-isolate changes that are wasteful to independent review
>> or track.
>>
>> To get some concrete facts, I bet the one things we could probably find
>> research on is the ideal review size. And we could also scrape logs for
>> messages with bullet points (often each bullet is basically what would have
>> been a commit). Generally getting a sense of what these ratios are in
>> practice and idealized would be kind of interesting but maybe overkill.
>>
>> Kenn
>>
>> Charles: my plan is to translate the outcome of this discussion to
>>> guideline updates and your log filter trick will be part of it. Though my
>>> hope is that it won't be needed if we get closer to the goals above.
>>>
>>> Thanks,
>>> Thomas
>>>
>>>
>>> On Fri, Sep 28, 2018 at 9:44 AM Kenneth Knowles  wrote:
>>>
 Anton makes a good point. We have been talking about policy for what we
 do, but the real issue is what we want to come out of it: a clear history
 for seeing where code came from and granular rollback. I think in both
 cases the key thing is that each commit is a single clear change. How they
 get there is not the point.

 I have worked on multiple projects with a 1 commit = 1 review (and
 sometimes even = 1 ticket). These pretty much never have a good history.
 The best case is that each commit has a message that is a bullet point of
 many separate changes, because it is simply too inefficient to review each
 logical change separately. But since the messages become less useful, it
 encourages a culture of not even bothering to write meaningful messages at
 all.

 Note that is trivial for a committer-reviewer to edit the history any
 way they like without the button. "Allow edits by maintainers" is on by
 default. The "Squash and merge" button just adds a button for something we
 can already do.

 Charles: super useful! Worth noting that for a PR with a good history
 it will skip meaningful commits (but still give the summary line, which is
 nice).

 Kenn



 On Fri, Sep 28, 2018 at 8:54 AM Anton Kedin  wrote:

> Is there an actual problem caused by squashing or not squashing the
> commits that we face in the project? I personally have never needed to
> revert something complicated that would be problematic either way (and
> don't have a strong opinion abo

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-29 Thread Robert Bradshaw
There's two separate topics of discussion going on here.

(1) Is it acceptable for PRs to have more than one commit. We've discussed
this before, and the consensus seems to be that this is both allowed and
intentional.

(2) What to do about PRs that are clearly "[BEAM-] Implement feature" +
(fixup/address comments/lint/...)*. I think this is easier (or at least
quicker) to make a call on. Our options are

(a) Merge as is.
(b) Ask the author to do a final squash. (Doing a squash before an
approval, however, makes reviewing more difficult, so this enforces another
round.)
(c) Having the committer manually pull, squash, (force push to PR?), and
merge.
(d) Use the "squash and merge" button in this case.

Clearly, we shouldn't be doing any squashing unless the intent is obvious,
but when it is, I think (d) is the sanest and safest route.

- Robert



On Fri, Sep 28, 2018 at 9:53 PM Kenneth Knowles  wrote:

> On Fri, Sep 28, 2018 at 10:29 AM Thomas Weise  wrote:
>
>> +1 for stating the goal of clear provenance and granular rollback.
>>
>
>
>>
>>
> Also of course efficiency and quality of review (we don't to review tiny
> or out-of-context changes or huge mega-changes), efficiency of authoring
> (don't want to wait on a review for a tiny bit because GitHub makes it very
> hard to stack up reviews in sequence / don't want to have major changes
> blocked because of difficulty of review), ease of new contribution (OK for
> committers to do more IMO, while new/one-time contributors shouldn't need
> to know or obey any policy).
>
> I think this discussion helps to remind/identify best practices how to get
>> there. Where appropriate we should augment guidelines for both, contributor
>> and committer.
>>
>
> Kenn: would you elaborate on the "1 commit = 1 review (and sometimes even
>> = 1 ticket)" a bit more. Is that a problem of insufficient task / ticket
>> granularity or something else?
>>
>
> The problem isn't a failure to define tasks at the right granularity, but
> that they naturally and fundamentally exist with a different granularity
> (unit of change -> unit of review -> unit of tracking/delivery). I'd guess
> there's often a 5x jump in size from commit -> review -> ticket. There are
> many many easy-to-isolate changes that are wasteful to independent review
> or track.
>
> To get some concrete facts, I bet the one things we could probably find
> research on is the ideal review size. And we could also scrape logs for
> messages with bullet points (often each bullet is basically what would have
> been a commit). Generally getting a sense of what these ratios are in
> practice and idealized would be kind of interesting but maybe overkill.
>
> Kenn
>
> Charles: my plan is to translate the outcome of this discussion to
>> guideline updates and your log filter trick will be part of it. Though my
>> hope is that it won't be needed if we get closer to the goals above.
>>
>> Thanks,
>> Thomas
>>
>>
>> On Fri, Sep 28, 2018 at 9:44 AM Kenneth Knowles  wrote:
>>
>>> Anton makes a good point. We have been talking about policy for what we
>>> do, but the real issue is what we want to come out of it: a clear history
>>> for seeing where code came from and granular rollback. I think in both
>>> cases the key thing is that each commit is a single clear change. How they
>>> get there is not the point.
>>>
>>> I have worked on multiple projects with a 1 commit = 1 review (and
>>> sometimes even = 1 ticket). These pretty much never have a good history.
>>> The best case is that each commit has a message that is a bullet point of
>>> many separate changes, because it is simply too inefficient to review each
>>> logical change separately. But since the messages become less useful, it
>>> encourages a culture of not even bothering to write meaningful messages at
>>> all.
>>>
>>> Note that is trivial for a committer-reviewer to edit the history any
>>> way they like without the button. "Allow edits by maintainers" is on by
>>> default. The "Squash and merge" button just adds a button for something we
>>> can already do.
>>>
>>> Charles: super useful! Worth noting that for a PR with a good history it
>>> will skip meaningful commits (but still give the summary line, which is
>>> nice).
>>>
>>> Kenn
>>>
>>>
>>>
>>> On Fri, Sep 28, 2018 at 8:54 AM Anton Kedin  wrote:
>>>
 Is there an actual problem caused by squashing or not squashing the
 commits that we face in the project? I personally have never needed to
 revert something complicated that would be problematic either way (and
 don't have a strong opinion about which way we should do it). From what I
 see so far in the thread it doesn't look like reverting is a frequent major
 pain for anyone. Maybe it is exactly because we're mostly following some
 best practice and it makes it easy. If someone has concrete examples from
 their experience in the project, please share them, this way it would be
 easier to justify the choice.

 The PR a

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Kenneth Knowles
On Fri, Sep 28, 2018 at 10:29 AM Thomas Weise  wrote:

> +1 for stating the goal of clear provenance and granular rollback.
>


>
>
Also of course efficiency and quality of review (we don't to review tiny or
out-of-context changes or huge mega-changes), efficiency of authoring
(don't want to wait on a review for a tiny bit because GitHub makes it very
hard to stack up reviews in sequence / don't want to have major changes
blocked because of difficulty of review), ease of new contribution (OK for
committers to do more IMO, while new/one-time contributors shouldn't need
to know or obey any policy).

I think this discussion helps to remind/identify best practices how to get
> there. Where appropriate we should augment guidelines for both, contributor
> and committer.
>

Kenn: would you elaborate on the "1 commit = 1 review (and sometimes even =
> 1 ticket)" a bit more. Is that a problem of insufficient task / ticket
> granularity or something else?
>

The problem isn't a failure to define tasks at the right granularity, but
that they naturally and fundamentally exist with a different granularity
(unit of change -> unit of review -> unit of tracking/delivery). I'd guess
there's often a 5x jump in size from commit -> review -> ticket. There are
many many easy-to-isolate changes that are wasteful to independent review
or track.

To get some concrete facts, I bet the one things we could probably find
research on is the ideal review size. And we could also scrape logs for
messages with bullet points (often each bullet is basically what would have
been a commit). Generally getting a sense of what these ratios are in
practice and idealized would be kind of interesting but maybe overkill.

Kenn

Charles: my plan is to translate the outcome of this discussion to
> guideline updates and your log filter trick will be part of it. Though my
> hope is that it won't be needed if we get closer to the goals above.
>
> Thanks,
> Thomas
>
>
> On Fri, Sep 28, 2018 at 9:44 AM Kenneth Knowles  wrote:
>
>> Anton makes a good point. We have been talking about policy for what we
>> do, but the real issue is what we want to come out of it: a clear history
>> for seeing where code came from and granular rollback. I think in both
>> cases the key thing is that each commit is a single clear change. How they
>> get there is not the point.
>>
>> I have worked on multiple projects with a 1 commit = 1 review (and
>> sometimes even = 1 ticket). These pretty much never have a good history.
>> The best case is that each commit has a message that is a bullet point of
>> many separate changes, because it is simply too inefficient to review each
>> logical change separately. But since the messages become less useful, it
>> encourages a culture of not even bothering to write meaningful messages at
>> all.
>>
>> Note that is trivial for a committer-reviewer to edit the history any way
>> they like without the button. "Allow edits by maintainers" is on by
>> default. The "Squash and merge" button just adds a button for something we
>> can already do.
>>
>> Charles: super useful! Worth noting that for a PR with a good history it
>> will skip meaningful commits (but still give the summary line, which is
>> nice).
>>
>> Kenn
>>
>>
>>
>> On Fri, Sep 28, 2018 at 8:54 AM Anton Kedin  wrote:
>>
>>> Is there an actual problem caused by squashing or not squashing the
>>> commits that we face in the project? I personally have never needed to
>>> revert something complicated that would be problematic either way (and
>>> don't have a strong opinion about which way we should do it). From what I
>>> see so far in the thread it doesn't look like reverting is a frequent major
>>> pain for anyone. Maybe it is exactly because we're mostly following some
>>> best practice and it makes it easy. If someone has concrete examples from
>>> their experience in the project, please share them, this way it would be
>>> easier to justify the choice.
>>>
>>> The PR and commit cleanliness, size and isolation are probably more
>>> important thing to have guidance and maybe enforcement for. There are well
>>> known practices and guidelines that I think we should follow, and I think
>>> they will make squashing or not squashing mostly irrelevant. For example,
>>> if we accept that commits should have description that actually describes
>>> what commit does, then "!fixup", "address comments" and similar should not
>>> be part of the history and should be squashed before submitting the PR no
>>> matter which way we decide to go in general. Also, I think that making
>>> commits isolated is also a good practice, and PR author should be able to
>>> relatively easily split the PR upon reviewer's request. And if we choose to
>>> keep whole PRs small and incremental with descriptive isolated commits,
>>> then there won't be too much difference how many commits there are.
>>>
>>> Regards,
>>> Anton
>>>
>>> On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud 
>>> wrote:
>>>
 I brought up this d

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Thomas Weise
+1 for stating the goal of clear provenance and granular rollback.

I think this discussion helps to remind/identify best practices how to get
there. Where appropriate we should augment guidelines for both, contributor
and committer.

Kenn: would you elaborate on the "1 commit = 1 review (and sometimes even =
1 ticket)" a bit more. Is that a problem of insufficient task / ticket
granularity or something else?

Charles: my plan is to translate the outcome of this discussion to
guideline updates and your log filter trick will be part of it. Though my
hope is that it won't be needed if we get closer to the goals above.

Thanks,
Thomas


On Fri, Sep 28, 2018 at 9:44 AM Kenneth Knowles  wrote:

> Anton makes a good point. We have been talking about policy for what we
> do, but the real issue is what we want to come out of it: a clear history
> for seeing where code came from and granular rollback. I think in both
> cases the key thing is that each commit is a single clear change. How they
> get there is not the point.
>
> I have worked on multiple projects with a 1 commit = 1 review (and
> sometimes even = 1 ticket). These pretty much never have a good history.
> The best case is that each commit has a message that is a bullet point of
> many separate changes, because it is simply too inefficient to review each
> logical change separately. But since the messages become less useful, it
> encourages a culture of not even bothering to write meaningful messages at
> all.
>
> Note that is trivial for a committer-reviewer to edit the history any way
> they like without the button. "Allow edits by maintainers" is on by
> default. The "Squash and merge" button just adds a button for something we
> can already do.
>
> Charles: super useful! Worth noting that for a PR with a good history it
> will skip meaningful commits (but still give the summary line, which is
> nice).
>
> Kenn
>
>
>
> On Fri, Sep 28, 2018 at 8:54 AM Anton Kedin  wrote:
>
>> Is there an actual problem caused by squashing or not squashing the
>> commits that we face in the project? I personally have never needed to
>> revert something complicated that would be problematic either way (and
>> don't have a strong opinion about which way we should do it). From what I
>> see so far in the thread it doesn't look like reverting is a frequent major
>> pain for anyone. Maybe it is exactly because we're mostly following some
>> best practice and it makes it easy. If someone has concrete examples from
>> their experience in the project, please share them, this way it would be
>> easier to justify the choice.
>>
>> The PR and commit cleanliness, size and isolation are probably more
>> important thing to have guidance and maybe enforcement for. There are well
>> known practices and guidelines that I think we should follow, and I think
>> they will make squashing or not squashing mostly irrelevant. For example,
>> if we accept that commits should have description that actually describes
>> what commit does, then "!fixup", "address comments" and similar should not
>> be part of the history and should be squashed before submitting the PR no
>> matter which way we decide to go in general. Also, I think that making
>> commits isolated is also a good practice, and PR author should be able to
>> relatively easily split the PR upon reviewer's request. And if we choose to
>> keep whole PRs small and incremental with descriptive isolated commits,
>> then there won't be too much difference how many commits there are.
>>
>> Regards,
>> Anton
>>
>> On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud 
>> wrote:
>>
>>> I brought up this discussion a few months ago from the other side: I
>>> don't like my commits being squashed. I try to create logical commits that
>>> each passes tests and could be broken up into multiple PRs. Keeping those
>>> changes intact is useful from a history perspective and squashing may break
>>> other PRs I have in flight. If the intent is clear (one commit with a
>>> descriptive message and a bunch of "fixups"), then feel free to squash,
>>> otherwise ask first. When you do squash, it would be good to leave a
>>> comment as to how the author can avoid having their commits squashed in the
>>> future.
>>>
>>>
>>> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>>>
>>> Andrew
>>>
>>> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath 
>>> wrote:
>>>


 On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw 
 wrote:

> I agree that we should create a good pointer for cleaning up PRs, and
> request (though not require) that authors do it. It's unfortunate though
> that squashing during a review makes things difficult to follow, so adds
> one more round trip.
>
> We could consider for those PRs that make sense as a single logical
> commit (most, but not all, of them) simply using the "squash and merge"
> button even though it technically doesn't c

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Kenneth Knowles
Anton makes a good point. We have been talking about policy for what we do,
but the real issue is what we want to come out of it: a clear history for
seeing where code came from and granular rollback. I think in both cases
the key thing is that each commit is a single clear change. How they get
there is not the point.

I have worked on multiple projects with a 1 commit = 1 review (and
sometimes even = 1 ticket). These pretty much never have a good history.
The best case is that each commit has a message that is a bullet point of
many separate changes, because it is simply too inefficient to review each
logical change separately. But since the messages become less useful, it
encourages a culture of not even bothering to write meaningful messages at
all.

Note that is trivial for a committer-reviewer to edit the history any way
they like without the button. "Allow edits by maintainers" is on by
default. The "Squash and merge" button just adds a button for something we
can already do.

Charles: super useful! Worth noting that for a PR with a good history it
will skip meaningful commits (but still give the summary line, which is
nice).

Kenn



On Fri, Sep 28, 2018 at 8:54 AM Anton Kedin  wrote:

> Is there an actual problem caused by squashing or not squashing the
> commits that we face in the project? I personally have never needed to
> revert something complicated that would be problematic either way (and
> don't have a strong opinion about which way we should do it). From what I
> see so far in the thread it doesn't look like reverting is a frequent major
> pain for anyone. Maybe it is exactly because we're mostly following some
> best practice and it makes it easy. If someone has concrete examples from
> their experience in the project, please share them, this way it would be
> easier to justify the choice.
>
> The PR and commit cleanliness, size and isolation are probably more
> important thing to have guidance and maybe enforcement for. There are well
> known practices and guidelines that I think we should follow, and I think
> they will make squashing or not squashing mostly irrelevant. For example,
> if we accept that commits should have description that actually describes
> what commit does, then "!fixup", "address comments" and similar should not
> be part of the history and should be squashed before submitting the PR no
> matter which way we decide to go in general. Also, I think that making
> commits isolated is also a good practice, and PR author should be able to
> relatively easily split the PR upon reviewer's request. And if we choose to
> keep whole PRs small and incremental with descriptive isolated commits,
> then there won't be too much difference how many commits there are.
>
> Regards,
> Anton
>
> On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud 
> wrote:
>
>> I brought up this discussion a few months ago from the other side: I
>> don't like my commits being squashed. I try to create logical commits that
>> each passes tests and could be broken up into multiple PRs. Keeping those
>> changes intact is useful from a history perspective and squashing may break
>> other PRs I have in flight. If the intent is clear (one commit with a
>> descriptive message and a bunch of "fixups"), then feel free to squash,
>> otherwise ask first. When you do squash, it would be good to leave a
>> comment as to how the author can avoid having their commits squashed in the
>> future.
>>
>>
>> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>>
>> Andrew
>>
>> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath 
>> wrote:
>>
>>>
>>>
>>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw 
>>> wrote:
>>>
 I agree that we should create a good pointer for cleaning up PRs, and
 request (though not require) that authors do it. It's unfortunate though
 that squashing during a review makes things difficult to follow, so adds
 one more round trip.

 We could consider for those PRs that make sense as a single logical
 commit (most, but not all, of them) simply using the "squash and merge"
 button even though it technically doesn't create a merge commit.

>>>
>>> +1 for allowing "squash and merge" as an option. Most of the reviews (at
>>> least for me) consist of a single valid commit and several additional
>>> commits that get piled up during the review process which obviously should
>>> not be included in the commit history. Going through another round here
>>> just to ask the author to fixup everything is unnecessarily time consuming.
>>>
>>> - Cham
>>>
>>>


 On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
 wrote:

> As a non-committer I think some automated squashing of commits sounds
> best since it lightens the load of regular contributors, by not having to
> always remember to squash, and lightens the load of committers so it
> doesn't take as long to have your PR approved by on

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Charles Chen
+1 to Anton's points.  It looks like the main concern with unsquashed
commits is aesthetic, in that having "!fixup" commits produces noise and
clutters the code tree.  I would like to point out again for those unaware,
that "git log --first-parent" filters the commit history so that each PR
corresponds to one and only one commit (which would not show these fixups),
which seems to be the view that people are looking for.

On Fri, Sep 28, 2018 at 11:54 AM Anton Kedin  wrote:

> Is there an actual problem caused by squashing or not squashing the
> commits that we face in the project? I personally have never needed to
> revert something complicated that would be problematic either way (and
> don't have a strong opinion about which way we should do it). From what I
> see so far in the thread it doesn't look like reverting is a frequent major
> pain for anyone. Maybe it is exactly because we're mostly following some
> best practice and it makes it easy. If someone has concrete examples from
> their experience in the project, please share them, this way it would be
> easier to justify the choice.
>
> The PR and commit cleanliness, size and isolation are probably more
> important thing to have guidance and maybe enforcement for. There are well
> known practices and guidelines that I think we should follow, and I think
> they will make squashing or not squashing mostly irrelevant. For example,
> if we accept that commits should have description that actually describes
> what commit does, then "!fixup", "address comments" and similar should not
> be part of the history and should be squashed before submitting the PR no
> matter which way we decide to go in general. Also, I think that making
> commits isolated is also a good practice, and PR author should be able to
> relatively easily split the PR upon reviewer's request. And if we choose to
> keep whole PRs small and incremental with descriptive isolated commits,
> then there won't be too much difference how many commits there are.
>
> Regards,
> Anton
>
> On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud 
> wrote:
>
>> I brought up this discussion a few months ago from the other side: I
>> don't like my commits being squashed. I try to create logical commits that
>> each passes tests and could be broken up into multiple PRs. Keeping those
>> changes intact is useful from a history perspective and squashing may break
>> other PRs I have in flight. If the intent is clear (one commit with a
>> descriptive message and a bunch of "fixups"), then feel free to squash,
>> otherwise ask first. When you do squash, it would be good to leave a
>> comment as to how the author can avoid having their commits squashed in the
>> future.
>>
>>
>> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>>
>> Andrew
>>
>> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath 
>> wrote:
>>
>>>
>>>
>>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw 
>>> wrote:
>>>
 I agree that we should create a good pointer for cleaning up PRs, and
 request (though not require) that authors do it. It's unfortunate though
 that squashing during a review makes things difficult to follow, so adds
 one more round trip.

 We could consider for those PRs that make sense as a single logical
 commit (most, but not all, of them) simply using the "squash and merge"
 button even though it technically doesn't create a merge commit.

>>>
>>> +1 for allowing "squash and merge" as an option. Most of the reviews (at
>>> least for me) consist of a single valid commit and several additional
>>> commits that get piled up during the review process which obviously should
>>> not be included in the commit history. Going through another round here
>>> just to ask the author to fixup everything is unnecessarily time consuming.
>>>
>>> - Cham
>>>
>>>


 On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
 wrote:

> As a non-committer I think some automated squashing of commits sounds
> best since it lightens the load of regular contributors, by not having to
> always remember to squash, and lightens the load of committers so it
> doesn't take as long to have your PR approved by one.
>
> But for now I think the second best route would be making it PR
> author's responsibility to squash fixup commits. Having that expectation
> described clearly in the Contributor's Guide, along with some simple
> step-by-step instructions for how to do so should be enough. I mainly
> support this because I've been doing the squashing myself since I saw a
> thread about it here a few months ago. It's not nearly as huge a burden on
> me as it probably is for committers who have to merge in many more PRs,
> it's very easy to learn how to do, and it's one less barrier to having my
> code merged in.
>
> Of course I wouldn't expect that committers wait for PR authors to
> 

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Anton Kedin
Is there an actual problem caused by squashing or not squashing the commits
that we face in the project? I personally have never needed to revert
something complicated that would be problematic either way (and don't have
a strong opinion about which way we should do it). From what I see so far
in the thread it doesn't look like reverting is a frequent major pain for
anyone. Maybe it is exactly because we're mostly following some best
practice and it makes it easy. If someone has concrete examples from their
experience in the project, please share them, this way it would be easier
to justify the choice.

The PR and commit cleanliness, size and isolation are probably more
important thing to have guidance and maybe enforcement for. There are well
known practices and guidelines that I think we should follow, and I think
they will make squashing or not squashing mostly irrelevant. For example,
if we accept that commits should have description that actually describes
what commit does, then "!fixup", "address comments" and similar should not
be part of the history and should be squashed before submitting the PR no
matter which way we decide to go in general. Also, I think that making
commits isolated is also a good practice, and PR author should be able to
relatively easily split the PR upon reviewer's request. And if we choose to
keep whole PRs small and incremental with descriptive isolated commits,
then there won't be too much difference how many commits there are.

Regards,
Anton

On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud  wrote:

> I brought up this discussion a few months ago from the other side: I don't
> like my commits being squashed. I try to create logical commits that each
> passes tests and could be broken up into multiple PRs. Keeping those
> changes intact is useful from a history perspective and squashing may break
> other PRs I have in flight. If the intent is clear (one commit with a
> descriptive message and a bunch of "fixups"), then feel free to squash,
> otherwise ask first. When you do squash, it would be good to leave a
> comment as to how the author can avoid having their commits squashed in the
> future.
>
>
> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>
> Andrew
>
> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath 
> wrote:
>
>>
>>
>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw 
>> wrote:
>>
>>> I agree that we should create a good pointer for cleaning up PRs, and
>>> request (though not require) that authors do it. It's unfortunate though
>>> that squashing during a review makes things difficult to follow, so adds
>>> one more round trip.
>>>
>>> We could consider for those PRs that make sense as a single logical
>>> commit (most, but not all, of them) simply using the "squash and merge"
>>> button even though it technically doesn't create a merge commit.
>>>
>>
>> +1 for allowing "squash and merge" as an option. Most of the reviews (at
>> least for me) consist of a single valid commit and several additional
>> commits that get piled up during the review process which obviously should
>> not be included in the commit history. Going through another round here
>> just to ask the author to fixup everything is unnecessarily time consuming.
>>
>> - Cham
>>
>>
>>>
>>>
>>> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
>>> wrote:
>>>
 As a non-committer I think some automated squashing of commits sounds
 best since it lightens the load of regular contributors, by not having to
 always remember to squash, and lightens the load of committers so it
 doesn't take as long to have your PR approved by one.

 But for now I think the second best route would be making it PR
 author's responsibility to squash fixup commits. Having that expectation
 described clearly in the Contributor's Guide, along with some simple
 step-by-step instructions for how to do so should be enough. I mainly
 support this because I've been doing the squashing myself since I saw a
 thread about it here a few months ago. It's not nearly as huge a burden on
 me as it probably is for committers who have to merge in many more PRs,
 it's very easy to learn how to do, and it's one less barrier to having my
 code merged in.

 Of course I wouldn't expect that committers wait for PR authors to
 squash their fixup commits, but I think leaving a message like "For future
 pull requests you should squash any small fixup commits, as described here:
 " should be fine.


> I was also thinking about the possibility of wanting to revert
> individual commits from a merge commit. The solution you propose
> works,
> but only if you want to revert everything.


 Does this happen often? I might not have enough context since I'm not a
 committer, but it seems to me that often the person performing a revert is
 not the original author of a chang

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Thomas Weise
Thanks for linking the previous discussion.

I have also seen a few cases where the intention was to make individual
changes that can be applied independently. But why not create separate PRs
for those, so they can also be reviewed and merged independently?

Also, if the intention is to make independent, non-squashable commits,
these commits should be tagged appropriately.  That along with "edit
allowed by maintainers" could give sufficient indication to the reviewer.

Thomas



On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud  wrote:

> I brought up this discussion a few months ago from the other side: I don't
> like my commits being squashed. I try to create logical commits that each
> passes tests and could be broken up into multiple PRs. Keeping those
> changes intact is useful from a history perspective and squashing may break
> other PRs I have in flight. If the intent is clear (one commit with a
> descriptive message and a bunch of "fixups"), then feel free to squash,
> otherwise ask first. When you do squash, it would be good to leave a
> comment as to how the author can avoid having their commits squashed in the
> future.
>
>
> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>
> Andrew
>
> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath 
> wrote:
>
>>
>>
>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw 
>> wrote:
>>
>>> I agree that we should create a good pointer for cleaning up PRs, and
>>> request (though not require) that authors do it. It's unfortunate though
>>> that squashing during a review makes things difficult to follow, so adds
>>> one more round trip.
>>>
>>> We could consider for those PRs that make sense as a single logical
>>> commit (most, but not all, of them) simply using the "squash and merge"
>>> button even though it technically doesn't create a merge commit.
>>>
>>
>> +1 for allowing "squash and merge" as an option. Most of the reviews (at
>> least for me) consist of a single valid commit and several additional
>> commits that get piled up during the review process which obviously should
>> not be included in the commit history. Going through another round here
>> just to ask the author to fixup everything is unnecessarily time consuming.
>>
>> - Cham
>>
>>
>>>
>>>
>>> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
>>> wrote:
>>>
 As a non-committer I think some automated squashing of commits sounds
 best since it lightens the load of regular contributors, by not having to
 always remember to squash, and lightens the load of committers so it
 doesn't take as long to have your PR approved by one.

 But for now I think the second best route would be making it PR
 author's responsibility to squash fixup commits. Having that expectation
 described clearly in the Contributor's Guide, along with some simple
 step-by-step instructions for how to do so should be enough. I mainly
 support this because I've been doing the squashing myself since I saw a
 thread about it here a few months ago. It's not nearly as huge a burden on
 me as it probably is for committers who have to merge in many more PRs,
 it's very easy to learn how to do, and it's one less barrier to having my
 code merged in.

 Of course I wouldn't expect that committers wait for PR authors to
 squash their fixup commits, but I think leaving a message like "For future
 pull requests you should squash any small fixup commits, as described here:
 " should be fine.


> I was also thinking about the possibility of wanting to revert
> individual commits from a merge commit. The solution you propose
> works,
> but only if you want to revert everything.


 Does this happen often? I might not have enough context since I'm not a
 committer, but it seems to me that often the person performing a revert is
 not the original author of a change and doesn't have the context or time to
 pick out an individual commit to revert.

 On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels 
 wrote:

> I tend to agree with you Lukasz. Of course we should try to follow the
> guide lines as much as possible but if it requires an extra back and
> forth with the PR author for a cosmetic change, it may not be worth
> the
> time.
>
> On 19.09.18 22:17, Lukasz Cwik wrote:
> > I have to say I'm guilty of not following the merge guidelines,
> > sometimes doing merges without rebasing/flatten commits.
> >
> > I find that it is a few extra mins of my time to fix someones PR
> history
> > if they have more then one logical commit they want to be separate
> and
> > it usually takes days for the PR author to do merging  with the
> extra
> > burden as a committer to keep track of another PR and its state
> (waiting
> > for clean-up) is taxing. I really liked the idea of the me

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Robert Bradshaw
Fully agree, if there is a logical commit history, we should keep it. I
think this is speaking to the large number of PRs that have a single "real"
commit, a bunch of fixups, and specifically authors who haven't gone
through and cleaned up their history.

(Now if the commits could logically be broken up into separate PRs, well,
maybe they should be, but that's a separate discussion. There are
definitely times where multiple commits make sense as a single PR too.)

On Fri, Sep 28, 2018 at 5:21 PM Andrew Pilloud  wrote:

> I brought up this discussion a few months ago from the other side: I don't
> like my commits being squashed. I try to create logical commits that each
> passes tests and could be broken up into multiple PRs. Keeping those
> changes intact is useful from a history perspective and squashing may break
> other PRs I have in flight. If the intent is clear (one commit with a
> descriptive message and a bunch of "fixups"), then feel free to squash,
> otherwise ask first. When you do squash, it would be good to leave a
> comment as to how the author can avoid having their commits squashed in the
> future.
>
>
> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>
> Andrew
>
> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath 
> wrote:
>
>>
>>
>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw 
>> wrote:
>>
>>> I agree that we should create a good pointer for cleaning up PRs, and
>>> request (though not require) that authors do it. It's unfortunate though
>>> that squashing during a review makes things difficult to follow, so adds
>>> one more round trip.
>>>
>>> We could consider for those PRs that make sense as a single logical
>>> commit (most, but not all, of them) simply using the "squash and merge"
>>> button even though it technically doesn't create a merge commit.
>>>
>>
>> +1 for allowing "squash and merge" as an option. Most of the reviews (at
>> least for me) consist of a single valid commit and several additional
>> commits that get piled up during the review process which obviously should
>> not be included in the commit history. Going through another round here
>> just to ask the author to fixup everything is unnecessarily time consuming.
>>
>> - Cham
>>
>>
>>>
>>>
>>> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
>>> wrote:
>>>
 As a non-committer I think some automated squashing of commits sounds
 best since it lightens the load of regular contributors, by not having to
 always remember to squash, and lightens the load of committers so it
 doesn't take as long to have your PR approved by one.

 But for now I think the second best route would be making it PR
 author's responsibility to squash fixup commits. Having that expectation
 described clearly in the Contributor's Guide, along with some simple
 step-by-step instructions for how to do so should be enough. I mainly
 support this because I've been doing the squashing myself since I saw a
 thread about it here a few months ago. It's not nearly as huge a burden on
 me as it probably is for committers who have to merge in many more PRs,
 it's very easy to learn how to do, and it's one less barrier to having my
 code merged in.

 Of course I wouldn't expect that committers wait for PR authors to
 squash their fixup commits, but I think leaving a message like "For future
 pull requests you should squash any small fixup commits, as described here:
 " should be fine.


> I was also thinking about the possibility of wanting to revert
> individual commits from a merge commit. The solution you propose
> works,
> but only if you want to revert everything.


 Does this happen often? I might not have enough context since I'm not a
 committer, but it seems to me that often the person performing a revert is
 not the original author of a change and doesn't have the context or time to
 pick out an individual commit to revert.

 On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels 
 wrote:

> I tend to agree with you Lukasz. Of course we should try to follow the
> guide lines as much as possible but if it requires an extra back and
> forth with the PR author for a cosmetic change, it may not be worth
> the
> time.
>
> On 19.09.18 22:17, Lukasz Cwik wrote:
> > I have to say I'm guilty of not following the merge guidelines,
> > sometimes doing merges without rebasing/flatten commits.
> >
> > I find that it is a few extra mins of my time to fix someones PR
> history
> > if they have more then one logical commit they want to be separate
> and
> > it usually takes days for the PR author to do merging  with the
> extra
> > burden as a committer to keep track of another PR and its state
> (waiting
> > for clean-up) is taxing. I really liked the idea of the mergebot
> (e

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Andrew Pilloud
I brought up this discussion a few months ago from the other side: I don't
like my commits being squashed. I try to create logical commits that each
passes tests and could be broken up into multiple PRs. Keeping those
changes intact is useful from a history perspective and squashing may break
other PRs I have in flight. If the intent is clear (one commit with a
descriptive message and a bunch of "fixups"), then feel free to squash,
otherwise ask first. When you do squash, it would be good to leave a
comment as to how the author can avoid having their commits squashed in the
future.

https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E

Andrew

On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath 
wrote:

>
>
> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw 
> wrote:
>
>> I agree that we should create a good pointer for cleaning up PRs, and
>> request (though not require) that authors do it. It's unfortunate though
>> that squashing during a review makes things difficult to follow, so adds
>> one more round trip.
>>
>> We could consider for those PRs that make sense as a single logical
>> commit (most, but not all, of them) simply using the "squash and merge"
>> button even though it technically doesn't create a merge commit.
>>
>
> +1 for allowing "squash and merge" as an option. Most of the reviews (at
> least for me) consist of a single valid commit and several additional
> commits that get piled up during the review process which obviously should
> not be included in the commit history. Going through another round here
> just to ask the author to fixup everything is unnecessarily time consuming.
>
> - Cham
>
>
>>
>>
>> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
>> wrote:
>>
>>> As a non-committer I think some automated squashing of commits sounds
>>> best since it lightens the load of regular contributors, by not having to
>>> always remember to squash, and lightens the load of committers so it
>>> doesn't take as long to have your PR approved by one.
>>>
>>> But for now I think the second best route would be making it PR author's
>>> responsibility to squash fixup commits. Having that expectation described
>>> clearly in the Contributor's Guide, along with some simple step-by-step
>>> instructions for how to do so should be enough. I mainly support this
>>> because I've been doing the squashing myself since I saw a thread about it
>>> here a few months ago. It's not nearly as huge a burden on me as it
>>> probably is for committers who have to merge in many more PRs, it's very
>>> easy to learn how to do, and it's one less barrier to having my code merged
>>> in.
>>>
>>> Of course I wouldn't expect that committers wait for PR authors to
>>> squash their fixup commits, but I think leaving a message like "For future
>>> pull requests you should squash any small fixup commits, as described here:
>>> " should be fine.
>>>
>>>
 I was also thinking about the possibility of wanting to revert
 individual commits from a merge commit. The solution you propose works,
 but only if you want to revert everything.
>>>
>>>
>>> Does this happen often? I might not have enough context since I'm not a
>>> committer, but it seems to me that often the person performing a revert is
>>> not the original author of a change and doesn't have the context or time to
>>> pick out an individual commit to revert.
>>>
>>> On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels 
>>> wrote:
>>>
 I tend to agree with you Lukasz. Of course we should try to follow the
 guide lines as much as possible but if it requires an extra back and
 forth with the PR author for a cosmetic change, it may not be worth the
 time.

 On 19.09.18 22:17, Lukasz Cwik wrote:
 > I have to say I'm guilty of not following the merge guidelines,
 > sometimes doing merges without rebasing/flatten commits.
 >
 > I find that it is a few extra mins of my time to fix someones PR
 history
 > if they have more then one logical commit they want to be separate
 and
 > it usually takes days for the PR author to do merging  with the extra
 > burden as a committer to keep track of another PR and its state
 (waiting
 > for clean-up) is taxing. I really liked the idea of the mergebot
 (even
 > though it didn't work out in practice) because it could do all the
 > policy work on my behalf.
 >
 > Anything that reduces my overhead as a committer is useful as for the
 > 100s of PRs that I have merged, I've only had to rollback a couple so
 > I'm for Charle's suggestion which makes the rollback flow slightly
 more
 > complicated for a significantly easier PR merge workflow.
 >
 > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen >>> > > wrote:
 >
 > What I mean is that if you get the first-parent commit using "git
 > log --first-parent", it will incorporate a

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Chamikara Jayalath
On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw  wrote:

> I agree that we should create a good pointer for cleaning up PRs, and
> request (though not require) that authors do it. It's unfortunate though
> that squashing during a review makes things difficult to follow, so adds
> one more round trip.
>
> We could consider for those PRs that make sense as a single logical commit
> (most, but not all, of them) simply using the "squash and merge" button
> even though it technically doesn't create a merge commit.
>

+1 for allowing "squash and merge" as an option. Most of the reviews (at
least for me) consist of a single valid commit and several additional
commits that get piled up during the review process which obviously should
not be included in the commit history. Going through another round here
just to ask the author to fixup everything is unnecessarily time consuming.

- Cham


>
>
> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
> wrote:
>
>> As a non-committer I think some automated squashing of commits sounds
>> best since it lightens the load of regular contributors, by not having to
>> always remember to squash, and lightens the load of committers so it
>> doesn't take as long to have your PR approved by one.
>>
>> But for now I think the second best route would be making it PR author's
>> responsibility to squash fixup commits. Having that expectation described
>> clearly in the Contributor's Guide, along with some simple step-by-step
>> instructions for how to do so should be enough. I mainly support this
>> because I've been doing the squashing myself since I saw a thread about it
>> here a few months ago. It's not nearly as huge a burden on me as it
>> probably is for committers who have to merge in many more PRs, it's very
>> easy to learn how to do, and it's one less barrier to having my code merged
>> in.
>>
>> Of course I wouldn't expect that committers wait for PR authors to squash
>> their fixup commits, but I think leaving a message like "For future pull
>> requests you should squash any small fixup commits, as described here:
>> " should be fine.
>>
>>
>>> I was also thinking about the possibility of wanting to revert
>>> individual commits from a merge commit. The solution you propose works,
>>> but only if you want to revert everything.
>>
>>
>> Does this happen often? I might not have enough context since I'm not a
>> committer, but it seems to me that often the person performing a revert is
>> not the original author of a change and doesn't have the context or time to
>> pick out an individual commit to revert.
>>
>> On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels 
>> wrote:
>>
>>> I tend to agree with you Lukasz. Of course we should try to follow the
>>> guide lines as much as possible but if it requires an extra back and
>>> forth with the PR author for a cosmetic change, it may not be worth the
>>> time.
>>>
>>> On 19.09.18 22:17, Lukasz Cwik wrote:
>>> > I have to say I'm guilty of not following the merge guidelines,
>>> > sometimes doing merges without rebasing/flatten commits.
>>> >
>>> > I find that it is a few extra mins of my time to fix someones PR
>>> history
>>> > if they have more then one logical commit they want to be separate and
>>> > it usually takes days for the PR author to do merging  with the extra
>>> > burden as a committer to keep track of another PR and its state
>>> (waiting
>>> > for clean-up) is taxing. I really liked the idea of the mergebot (even
>>> > though it didn't work out in practice) because it could do all the
>>> > policy work on my behalf.
>>> >
>>> > Anything that reduces my overhead as a committer is useful as for the
>>> > 100s of PRs that I have merged, I've only had to rollback a couple so
>>> > I'm for Charle's suggestion which makes the rollback flow slightly
>>> more
>>> > complicated for a significantly easier PR merge workflow.
>>> >
>>> > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen >> > > wrote:
>>> >
>>> > What I mean is that if you get the first-parent commit using "git
>>> > log --first-parent", it will incorporate any and all fix up commits
>>> > so we don't need to worry about missing any.
>>> >
>>> > On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels >> > > wrote:
>>> >
>>> > Generally, +1 for isolated commits which are easy to revert.
>>> >
>>> >  > I don't think it's actually harder to roll back a set of
>>> > commits that are merged together.
>>> > I think Thomas was mainly concerned about "fixup" commits to
>>> > land in
>>> > master (as part of a merge). These indeed make reverting
>>> commits
>>> > more
>>> > difficult because you have to check whether you missed a
>>> "fixup".
>>> >
>>> >  > Ideally every commit should compile and pass tests though,
>>> right?
>>> >
>>> > That is definitely what we should strive for when doing a merge
>>> > against
>>> >

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-27 Thread Robert Bradshaw
I agree that we should create a good pointer for cleaning up PRs, and
request (though not require) that authors do it. It's unfortunate though
that squashing during a review makes things difficult to follow, so adds
one more round trip.

We could consider for those PRs that make sense as a single logical commit
(most, but not all, of them) simply using the "squash and merge" button
even though it technically doesn't create a merge commit.


On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
wrote:

> As a non-committer I think some automated squashing of commits sounds best
> since it lightens the load of regular contributors, by not having to always
> remember to squash, and lightens the load of committers so it doesn't take
> as long to have your PR approved by one.
>
> But for now I think the second best route would be making it PR author's
> responsibility to squash fixup commits. Having that expectation described
> clearly in the Contributor's Guide, along with some simple step-by-step
> instructions for how to do so should be enough. I mainly support this
> because I've been doing the squashing myself since I saw a thread about it
> here a few months ago. It's not nearly as huge a burden on me as it
> probably is for committers who have to merge in many more PRs, it's very
> easy to learn how to do, and it's one less barrier to having my code merged
> in.
>
> Of course I wouldn't expect that committers wait for PR authors to squash
> their fixup commits, but I think leaving a message like "For future pull
> requests you should squash any small fixup commits, as described here:
> " should be fine.
>
>
>> I was also thinking about the possibility of wanting to revert
>> individual commits from a merge commit. The solution you propose works,
>> but only if you want to revert everything.
>
>
> Does this happen often? I might not have enough context since I'm not a
> committer, but it seems to me that often the person performing a revert is
> not the original author of a change and doesn't have the context or time to
> pick out an individual commit to revert.
>
> On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels  wrote:
>
>> I tend to agree with you Lukasz. Of course we should try to follow the
>> guide lines as much as possible but if it requires an extra back and
>> forth with the PR author for a cosmetic change, it may not be worth the
>> time.
>>
>> On 19.09.18 22:17, Lukasz Cwik wrote:
>> > I have to say I'm guilty of not following the merge guidelines,
>> > sometimes doing merges without rebasing/flatten commits.
>> >
>> > I find that it is a few extra mins of my time to fix someones PR
>> history
>> > if they have more then one logical commit they want to be separate and
>> > it usually takes days for the PR author to do merging  with the extra
>> > burden as a committer to keep track of another PR and its state
>> (waiting
>> > for clean-up) is taxing. I really liked the idea of the mergebot (even
>> > though it didn't work out in practice) because it could do all the
>> > policy work on my behalf.
>> >
>> > Anything that reduces my overhead as a committer is useful as for the
>> > 100s of PRs that I have merged, I've only had to rollback a couple so
>> > I'm for Charle's suggestion which makes the rollback flow slightly more
>> > complicated for a significantly easier PR merge workflow.
>> >
>> > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen > > > wrote:
>> >
>> > What I mean is that if you get the first-parent commit using "git
>> > log --first-parent", it will incorporate any and all fix up commits
>> > so we don't need to worry about missing any.
>> >
>> > On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels > > > wrote:
>> >
>> > Generally, +1 for isolated commits which are easy to revert.
>> >
>> >  > I don't think it's actually harder to roll back a set of
>> > commits that are merged together.
>> > I think Thomas was mainly concerned about "fixup" commits to
>> > land in
>> > master (as part of a merge). These indeed make reverting commits
>> > more
>> > difficult because you have to check whether you missed a
>> "fixup".
>> >
>> >  > Ideally every commit should compile and pass tests though,
>> right?
>> >
>> > That is definitely what we should strive for when doing a merge
>> > against
>> > master.
>> >
>> >  > Perhaps the bigger issue is that we need better documentation
>> > and a playbook on how to do this these common tasks in git.
>> >
>> > We do actually have basic documentation about this but most
>> > people don't
>> > read it. For example, the commit message of a Merge commit
>> > should be:
>> >
>> > Merge pull request #: [BEAM-] Issue title
>> >
>> > But most merge commits don't comply with this rule :) See
>> > https://beam.apache.org/contrib

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-21 Thread Daniel Oliveira
As a non-committer I think some automated squashing of commits sounds best
since it lightens the load of regular contributors, by not having to always
remember to squash, and lightens the load of committers so it doesn't take
as long to have your PR approved by one.

But for now I think the second best route would be making it PR author's
responsibility to squash fixup commits. Having that expectation described
clearly in the Contributor's Guide, along with some simple step-by-step
instructions for how to do so should be enough. I mainly support this
because I've been doing the squashing myself since I saw a thread about it
here a few months ago. It's not nearly as huge a burden on me as it
probably is for committers who have to merge in many more PRs, it's very
easy to learn how to do, and it's one less barrier to having my code merged
in.

Of course I wouldn't expect that committers wait for PR authors to squash
their fixup commits, but I think leaving a message like "For future pull
requests you should squash any small fixup commits, as described here:
" should be fine.


> I was also thinking about the possibility of wanting to revert
> individual commits from a merge commit. The solution you propose works,
> but only if you want to revert everything.


Does this happen often? I might not have enough context since I'm not a
committer, but it seems to me that often the person performing a revert is
not the original author of a change and doesn't have the context or time to
pick out an individual commit to revert.

On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels  wrote:

> I tend to agree with you Lukasz. Of course we should try to follow the
> guide lines as much as possible but if it requires an extra back and
> forth with the PR author for a cosmetic change, it may not be worth the
> time.
>
> On 19.09.18 22:17, Lukasz Cwik wrote:
> > I have to say I'm guilty of not following the merge guidelines,
> > sometimes doing merges without rebasing/flatten commits.
> >
> > I find that it is a few extra mins of my time to fix someones PR history
> > if they have more then one logical commit they want to be separate and
> > it usually takes days for the PR author to do merging  with the extra
> > burden as a committer to keep track of another PR and its state (waiting
> > for clean-up) is taxing. I really liked the idea of the mergebot (even
> > though it didn't work out in practice) because it could do all the
> > policy work on my behalf.
> >
> > Anything that reduces my overhead as a committer is useful as for the
> > 100s of PRs that I have merged, I've only had to rollback a couple so
> > I'm for Charle's suggestion which makes the rollback flow slightly more
> > complicated for a significantly easier PR merge workflow.
> >
> > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen  > > wrote:
> >
> > What I mean is that if you get the first-parent commit using "git
> > log --first-parent", it will incorporate any and all fix up commits
> > so we don't need to worry about missing any.
> >
> > On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels  > > wrote:
> >
> > Generally, +1 for isolated commits which are easy to revert.
> >
> >  > I don't think it's actually harder to roll back a set of
> > commits that are merged together.
> > I think Thomas was mainly concerned about "fixup" commits to
> > land in
> > master (as part of a merge). These indeed make reverting commits
> > more
> > difficult because you have to check whether you missed a "fixup".
> >
> >  > Ideally every commit should compile and pass tests though,
> right?
> >
> > That is definitely what we should strive for when doing a merge
> > against
> > master.
> >
> >  > Perhaps the bigger issue is that we need better documentation
> > and a playbook on how to do this these common tasks in git.
> >
> > We do actually have basic documentation about this but most
> > people don't
> > read it. For example, the commit message of a Merge commit
> > should be:
> >
> > Merge pull request #: [BEAM-] Issue title
> >
> > But most merge commits don't comply with this rule :) See
> > https://beam.apache.org/contribute/committer-guide/#merging-it
> >
> > On 19.09.18 21:34, Reuven Lax wrote:
> >  > Ideally every commit should compile and pass tests though,
> right?
> >  >
> >  > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka
> > mailto:goe...@google.com>
> >  > >> wrote:
> >  >
> >  > I agree with the cleanliness of the Commit history.
> >  > "Fixup!", "Address comments", "Address even more
> > comments" type of
> >  > comments does not convey meaningful information and are
> >  

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Maximilian Michels
I tend to agree with you Lukasz. Of course we should try to follow the 
guide lines as much as possible but if it requires an extra back and 
forth with the PR author for a cosmetic change, it may not be worth the 
time.


On 19.09.18 22:17, Lukasz Cwik wrote:
I have to say I'm guilty of not following the merge guidelines, 
sometimes doing merges without rebasing/flatten commits.


I find that it is a few extra mins of my time to fix someones PR history 
if they have more then one logical commit they want to be separate and 
it usually takes days for the PR author to do merging  with the extra 
burden as a committer to keep track of another PR and its state (waiting 
for clean-up) is taxing. I really liked the idea of the mergebot (even 
though it didn't work out in practice) because it could do all the 
policy work on my behalf.


Anything that reduces my overhead as a committer is useful as for the 
100s of PRs that I have merged, I've only had to rollback a couple so 
I'm for Charle's suggestion which makes the rollback flow slightly more 
complicated for a significantly easier PR merge workflow.


On Wed, Sep 19, 2018 at 1:13 PM Charles Chen > wrote:


What I mean is that if you get the first-parent commit using "git
log --first-parent", it will incorporate any and all fix up commits
so we don't need to worry about missing any.

On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels mailto:m...@apache.org>> wrote:

Generally, +1 for isolated commits which are easy to revert.

 > I don't think it's actually harder to roll back a set of
commits that are merged together.
I think Thomas was mainly concerned about "fixup" commits to
land in
master (as part of a merge). These indeed make reverting commits
more
difficult because you have to check whether you missed a "fixup".

 > Ideally every commit should compile and pass tests though, right?

That is definitely what we should strive for when doing a merge
against
master.

 > Perhaps the bigger issue is that we need better documentation
and a playbook on how to do this these common tasks in git.

We do actually have basic documentation about this but most
people don't
read it. For example, the commit message of a Merge commit
should be:

Merge pull request #: [BEAM-] Issue title

But most merge commits don't comply with this rule :) See
https://beam.apache.org/contribute/committer-guide/#merging-it

On 19.09.18 21:34, Reuven Lax wrote:
 > Ideally every commit should compile and pass tests though, right?
 >
 > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka
mailto:goe...@google.com>
 > >> wrote:
 >
 >     I agree with the cleanliness of the Commit history.
 >     "Fixup!", "Address comments", "Address even more
comments" type of
 >     comments does not convey meaningful information and are
not very
 >     useful. Its a good idea to squash them.
 >     However, I think its ok to keep separate commits for
different
 >     logical pieces of the code which make reviewing and
revisiting code
 >     easier.
 >     Example PR: Support X in the pipeline
 >     Commit 1: Restructuring a bunch of code without any
logical change.
 >     Commit 2: Changing validation logic for pipeline.
 >     Commit 3: Supporting new field "X" for pipeline.
 >
 >     On Wed, Sep 19, 2018 at 11:27 AM Charles Chen
mailto:c...@google.com>
 >     >> wrote:
 >
 >         To be concrete, it is very easy to revert a commit in
any case:
 >
 >          1. First, use "git log --first-parent" to find the
first-parent
 >             commit corresponding to a PR merge (this is a
one-to-one
 >             correspondence).
 >          2. Use "git revert -m 1 " to revert the
commit; this
 >             selects the first parent as the base for a merge
commit (in
 >             the case where a single commit needs to be
reverted, just
 >             use "git revert " without the "-m 1" flag).
 >
 >         In any case, as a general good engineering practice,
I do agree
 >         that it is highly desirable to have small independent PRs
 >         instead of large jumbo PRs whenever possible.
 >
 >         On Wed, Sep 19, 2018 at 11:20 AM Charles Chen
mailto:c...@google.com>
 >         >> wrote:
 >
 >             I don't think it

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Maximilian Michels
I was also thinking about the possibility of wanting to revert 
individual commits from a merge commit. The solution you propose works, 
but only if you want to revert everything.


On 19.09.18 22:12, Charles Chen wrote:
What I mean is that if you get the first-parent commit using "git log 
--first-parent", it will incorporate any and all fix up commits so we 
don't need to worry about missing any.


On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels > wrote:


Generally, +1 for isolated commits which are easy to revert.

 > I don't think it's actually harder to roll back a set of commits
that are merged together.
I think Thomas was mainly concerned about "fixup" commits to land in
master (as part of a merge). These indeed make reverting commits more
difficult because you have to check whether you missed a "fixup".

 > Ideally every commit should compile and pass tests though, right?

That is definitely what we should strive for when doing a merge against
master.

 > Perhaps the bigger issue is that we need better documentation and
a playbook on how to do this these common tasks in git.

We do actually have basic documentation about this but most people
don't
read it. For example, the commit message of a Merge commit should be:

Merge pull request #: [BEAM-] Issue title

But most merge commits don't comply with this rule :) See
https://beam.apache.org/contribute/committer-guide/#merging-it

On 19.09.18 21:34, Reuven Lax wrote:
 > Ideally every commit should compile and pass tests though, right?
 >
 > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka mailto:goe...@google.com>
 > >> wrote:
 >
 >     I agree with the cleanliness of the Commit history.
 >     "Fixup!", "Address comments", "Address even more comments"
type of
 >     comments does not convey meaningful information and are not very
 >     useful. Its a good idea to squash them.
 >     However, I think its ok to keep separate commits for different
 >     logical pieces of the code which make reviewing and
revisiting code
 >     easier.
 >     Example PR: Support X in the pipeline
 >     Commit 1: Restructuring a bunch of code without any logical
change.
 >     Commit 2: Changing validation logic for pipeline.
 >     Commit 3: Supporting new field "X" for pipeline.
 >
 >     On Wed, Sep 19, 2018 at 11:27 AM Charles Chen mailto:c...@google.com>
 >     >> wrote:
 >
 >         To be concrete, it is very easy to revert a commit in any
case:
 >
 >          1. First, use "git log --first-parent" to find the
first-parent
 >             commit corresponding to a PR merge (this is a one-to-one
 >             correspondence).
 >          2. Use "git revert -m 1 " to revert the
commit; this
 >             selects the first parent as the base for a merge
commit (in
 >             the case where a single commit needs to be reverted, just
 >             use "git revert " without the "-m 1" flag).
 >
 >         In any case, as a general good engineering practice, I do
agree
 >         that it is highly desirable to have small independent PRs
 >         instead of large jumbo PRs whenever possible.
 >
 >         On Wed, Sep 19, 2018 at 11:20 AM Charles Chen
mailto:c...@google.com>
 >         >> wrote:
 >
 >             I don't think it's actually harder to roll back a set of
 >             commits that are merged together.  Git has the notion of
 >             first-parent commits (you can see, for example, "git log
 >             --first-parent", which filters out the intermediate
 >             commits).  In this sense, PRs still get merged as one
unit
 >             and this is preserved even if intermediate commits are
 >             kept.  Perhaps the bigger issue is that we need better
 >             documentation and a playbook on how to do this these
common
 >             tasks in git.
 >
 >             On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise
mailto:t...@apache.org>
 >             >> wrote:
 >
 >                 Wanted to bring this up as reminder as well as
 >                 opportunity to discuss potential changes to our
 >                 committer guide. It has been a while since last
related
 >                 discussion and we welcomed several new committers
since
 >                 then.
 >
 >                 Finishing up pull requests pre-merge:
 >
 > https://beam.apache.org/contribute/committer-guide/#finishing-touches
 >
 >                 PRs are worked on 

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Lukasz Cwik
I have to say I'm guilty of not following the merge guidelines, sometimes
doing merges without rebasing/flatten commits.

I find that it is a few extra mins of my time to fix someones PR history if
they have more then one logical commit they want to be separate and it
usually takes days for the PR author to do merging  with the extra burden
as a committer to keep track of another PR and its state (waiting for
clean-up) is taxing. I really liked the idea of the mergebot (even though
it didn't work out in practice) because it could do all the policy work on
my behalf.

Anything that reduces my overhead as a committer is useful as for the 100s
of PRs that I have merged, I've only had to rollback a couple so I'm for
Charle's suggestion which makes the rollback flow slightly more complicated
for a significantly easier PR merge workflow.

On Wed, Sep 19, 2018 at 1:13 PM Charles Chen  wrote:

> What I mean is that if you get the first-parent commit using "git log
> --first-parent", it will incorporate any and all fix up commits so we don't
> need to worry about missing any.
>
> On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels  wrote:
>
>> Generally, +1 for isolated commits which are easy to revert.
>>
>> > I don't think it's actually harder to roll back a set of commits that
>> are merged together.
>> I think Thomas was mainly concerned about "fixup" commits to land in
>> master (as part of a merge). These indeed make reverting commits more
>> difficult because you have to check whether you missed a "fixup".
>>
>> > Ideally every commit should compile and pass tests though, right?
>>
>> That is definitely what we should strive for when doing a merge against
>> master.
>>
>> > Perhaps the bigger issue is that we need better documentation and a
>> playbook on how to do this these common tasks in git.
>>
>> We do actually have basic documentation about this but most people don't
>> read it. For example, the commit message of a Merge commit should be:
>>
>> Merge pull request #: [BEAM-] Issue title
>>
>> But most merge commits don't comply with this rule :) See
>> https://beam.apache.org/contribute/committer-guide/#merging-it
>>
>> On 19.09.18 21:34, Reuven Lax wrote:
>> > Ideally every commit should compile and pass tests though, right?
>> >
>> > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka > > > wrote:
>> >
>> > I agree with the cleanliness of the Commit history.
>> > "Fixup!", "Address comments", "Address even more comments" type of
>> > comments does not convey meaningful information and are not very
>> > useful. Its a good idea to squash them.
>> > However, I think its ok to keep separate commits for different
>> > logical pieces of the code which make reviewing and revisiting code
>> > easier.
>> > Example PR: Support X in the pipeline
>> > Commit 1: Restructuring a bunch of code without any logical change.
>> > Commit 2: Changing validation logic for pipeline.
>> > Commit 3: Supporting new field "X" for pipeline.
>> >
>> > On Wed, Sep 19, 2018 at 11:27 AM Charles Chen > > > wrote:
>> >
>> > To be concrete, it is very easy to revert a commit in any case:
>> >
>> >  1. First, use "git log --first-parent" to find the first-parent
>> > commit corresponding to a PR merge (this is a one-to-one
>> > correspondence).
>> >  2. Use "git revert -m 1 " to revert the commit; this
>> > selects the first parent as the base for a merge commit (in
>> > the case where a single commit needs to be reverted, just
>> > use "git revert " without the "-m 1" flag).
>> >
>> > In any case, as a general good engineering practice, I do agree
>> > that it is highly desirable to have small independent PRs
>> > instead of large jumbo PRs whenever possible.
>> >
>> > On Wed, Sep 19, 2018 at 11:20 AM Charles Chen > > > wrote:
>> >
>> > I don't think it's actually harder to roll back a set of
>> > commits that are merged together.  Git has the notion of
>> > first-parent commits (you can see, for example, "git log
>> > --first-parent", which filters out the intermediate
>> > commits).  In this sense, PRs still get merged as one unit
>> > and this is preserved even if intermediate commits are
>> > kept.  Perhaps the bigger issue is that we need better
>> > documentation and a playbook on how to do this these common
>> > tasks in git.
>> >
>> > On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise <
>> t...@apache.org
>> > > wrote:
>> >
>> > Wanted to bring this up as reminder as well as
>> > opportunity to discuss potential changes to our
>> > committer guide. It has been a while since last related
>

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Charles Chen
What I mean is that if you get the first-parent commit using "git log
--first-parent", it will incorporate any and all fix up commits so we don't
need to worry about missing any.

On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels  wrote:

> Generally, +1 for isolated commits which are easy to revert.
>
> > I don't think it's actually harder to roll back a set of commits that
> are merged together.
> I think Thomas was mainly concerned about "fixup" commits to land in
> master (as part of a merge). These indeed make reverting commits more
> difficult because you have to check whether you missed a "fixup".
>
> > Ideally every commit should compile and pass tests though, right?
>
> That is definitely what we should strive for when doing a merge against
> master.
>
> > Perhaps the bigger issue is that we need better documentation and a
> playbook on how to do this these common tasks in git.
>
> We do actually have basic documentation about this but most people don't
> read it. For example, the commit message of a Merge commit should be:
>
> Merge pull request #: [BEAM-] Issue title
>
> But most merge commits don't comply with this rule :) See
> https://beam.apache.org/contribute/committer-guide/#merging-it
>
> On 19.09.18 21:34, Reuven Lax wrote:
> > Ideally every commit should compile and pass tests though, right?
> >
> > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka  > > wrote:
> >
> > I agree with the cleanliness of the Commit history.
> > "Fixup!", "Address comments", "Address even more comments" type of
> > comments does not convey meaningful information and are not very
> > useful. Its a good idea to squash them.
> > However, I think its ok to keep separate commits for different
> > logical pieces of the code which make reviewing and revisiting code
> > easier.
> > Example PR: Support X in the pipeline
> > Commit 1: Restructuring a bunch of code without any logical change.
> > Commit 2: Changing validation logic for pipeline.
> > Commit 3: Supporting new field "X" for pipeline.
> >
> > On Wed, Sep 19, 2018 at 11:27 AM Charles Chen  > > wrote:
> >
> > To be concrete, it is very easy to revert a commit in any case:
> >
> >  1. First, use "git log --first-parent" to find the first-parent
> > commit corresponding to a PR merge (this is a one-to-one
> > correspondence).
> >  2. Use "git revert -m 1 " to revert the commit; this
> > selects the first parent as the base for a merge commit (in
> > the case where a single commit needs to be reverted, just
> > use "git revert " without the "-m 1" flag).
> >
> > In any case, as a general good engineering practice, I do agree
> > that it is highly desirable to have small independent PRs
> > instead of large jumbo PRs whenever possible.
> >
> > On Wed, Sep 19, 2018 at 11:20 AM Charles Chen  > > wrote:
> >
> > I don't think it's actually harder to roll back a set of
> > commits that are merged together.  Git has the notion of
> > first-parent commits (you can see, for example, "git log
> > --first-parent", which filters out the intermediate
> > commits).  In this sense, PRs still get merged as one unit
> > and this is preserved even if intermediate commits are
> > kept.  Perhaps the bigger issue is that we need better
> > documentation and a playbook on how to do this these common
> > tasks in git.
> >
> > On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise  > > wrote:
> >
> > Wanted to bring this up as reminder as well as
> > opportunity to discuss potential changes to our
> > committer guide. It has been a while since last related
> > discussion and we welcomed several new committers since
> > then.
> >
> > Finishing up pull requests pre-merge:
> >
> >
> https://beam.apache.org/contribute/committer-guide/#finishing-touches
> >
> > PRs are worked on over time and may accumulate many
> > commits. Sometimes because scope expands, sometimes just
> > to separate independent changes but most of the time the
> > commits are just fixups that are added as review
> progresses.
> > It is important that the latter get squashed prior to PR
> > merge, as otherwise we lost the ability to roll back
> > changes by reverting a single commit and also generally
> > cause a lot of noise in the commit history that does not
> > help other contributors. To be clear, I refer to the
> > "Fixup!", "Address comments", "Address even more
> > 

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Maximilian Michels

Generally, +1 for isolated commits which are easy to revert.

I don't think it's actually harder to roll back a set of commits that are merged together. 
I think Thomas was mainly concerned about "fixup" commits to land in 
master (as part of a merge). These indeed make reverting commits more 
difficult because you have to check whether you missed a "fixup".



Ideally every commit should compile and pass tests though, right?


That is definitely what we should strive for when doing a merge against 
master.



Perhaps the bigger issue is that we need better documentation and a playbook on 
how to do this these common tasks in git.


We do actually have basic documentation about this but most people don't 
read it. For example, the commit message of a Merge commit should be:


Merge pull request #: [BEAM-] Issue title

But most merge commits don't comply with this rule :) See 
https://beam.apache.org/contribute/committer-guide/#merging-it


On 19.09.18 21:34, Reuven Lax wrote:

Ideally every commit should compile and pass tests though, right?

On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka > wrote:


I agree with the cleanliness of the Commit history.
"Fixup!", "Address comments", "Address even more comments" type of
comments does not convey meaningful information and are not very
useful. Its a good idea to squash them.
However, I think its ok to keep separate commits for different
logical pieces of the code which make reviewing and revisiting code
easier.
Example PR: Support X in the pipeline
Commit 1: Restructuring a bunch of code without any logical change.
Commit 2: Changing validation logic for pipeline.
Commit 3: Supporting new field "X" for pipeline.

On Wed, Sep 19, 2018 at 11:27 AM Charles Chen mailto:c...@google.com>> wrote:

To be concrete, it is very easy to revert a commit in any case:

 1. First, use "git log --first-parent" to find the first-parent
commit corresponding to a PR merge (this is a one-to-one
correspondence).
 2. Use "git revert -m 1 " to revert the commit; this
selects the first parent as the base for a merge commit (in
the case where a single commit needs to be reverted, just
use "git revert " without the "-m 1" flag).

In any case, as a general good engineering practice, I do agree
that it is highly desirable to have small independent PRs
instead of large jumbo PRs whenever possible.

On Wed, Sep 19, 2018 at 11:20 AM Charles Chen mailto:c...@google.com>> wrote:

I don't think it's actually harder to roll back a set of
commits that are merged together.  Git has the notion of
first-parent commits (you can see, for example, "git log
--first-parent", which filters out the intermediate
commits).  In this sense, PRs still get merged as one unit
and this is preserved even if intermediate commits are
kept.  Perhaps the bigger issue is that we need better
documentation and a playbook on how to do this these common
tasks in git.

On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise mailto:t...@apache.org>> wrote:

Wanted to bring this up as reminder as well as
opportunity to discuss potential changes to our
committer guide. It has been a while since last related
discussion and we welcomed several new committers since
then.

Finishing up pull requests pre-merge:


https://beam.apache.org/contribute/committer-guide/#finishing-touches

PRs are worked on over time and may accumulate many
commits. Sometimes because scope expands, sometimes just
to separate independent changes but most of the time the
commits are just fixups that are added as review progresses.
It is important that the latter get squashed prior to PR
merge, as otherwise we lost the ability to roll back
changes by reverting a single commit and also generally
cause a lot of noise in the commit history that does not
help other contributors. To be clear, I refer to the
"Fixup!", "Address comments", "Address even more
comments" type of entries :)

I would also propose that every commit gets tagged with
a JIRA (except those fixups that will be squashed).
Having the JIRA and possibly other tags makes it easier
for others not involved in the PR to identify changes
after they were merged, for example when looking at the
revision history or annotated source.

As for other scenarios of jumbo PRs with many commits,
there are p

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Reuven Lax
Ideally every commit should compile and pass tests though, right?

On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka  wrote:

> I agree with the cleanliness of the Commit history.
> "Fixup!", "Address comments", "Address even more comments" type of
> comments does not convey meaningful information and are not very useful.
> Its a good idea to squash them.
> However, I think its ok to keep separate commits for different logical
> pieces of the code which make reviewing and revisiting code easier.
> Example PR: Support X in the pipeline
> Commit 1: Restructuring a bunch of code without any logical change.
> Commit 2: Changing validation logic for pipeline.
> Commit 3: Supporting new field "X" for pipeline.
>
> On Wed, Sep 19, 2018 at 11:27 AM Charles Chen  wrote:
>
>> To be concrete, it is very easy to revert a commit in any case:
>>
>>1. First, use "git log --first-parent" to find the first-parent
>>commit corresponding to a PR merge (this is a one-to-one correspondence).
>>2. Use "git revert -m 1 " to revert the commit; this
>>selects the first parent as the base for a merge commit (in the case where
>>a single commit needs to be reverted, just use "git revert "
>>without the "-m 1" flag).
>>
>> In any case, as a general good engineering practice, I do agree that it
>> is highly desirable to have small independent PRs instead of large jumbo
>> PRs whenever possible.
>>
>> On Wed, Sep 19, 2018 at 11:20 AM Charles Chen  wrote:
>>
>>> I don't think it's actually harder to roll back a set of commits that
>>> are merged together.  Git has the notion of first-parent commits (you can
>>> see, for example, "git log --first-parent", which filters out the
>>> intermediate commits).  In this sense, PRs still get merged as one unit and
>>> this is preserved even if intermediate commits are kept.  Perhaps the
>>> bigger issue is that we need better documentation and a playbook on how to
>>> do this these common tasks in git.
>>>
>>> On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise  wrote:
>>>
 Wanted to bring this up as reminder as well as opportunity to discuss
 potential changes to our committer guide. It has been a while since last
 related discussion and we welcomed several new committers since then.

 Finishing up pull requests pre-merge:

 https://beam.apache.org/contribute/committer-guide/#finishing-touches

 PRs are worked on over time and may accumulate many commits. Sometimes
 because scope expands, sometimes just to separate independent changes but
 most of the time the commits are just fixups that are added as review
 progresses.

 It is important that the latter get squashed prior to PR merge, as
 otherwise we lost the ability to roll back changes by reverting a single
 commit and also generally cause a lot of noise in the commit history that
 does not help other contributors. To be clear, I refer to the "Fixup!",
 "Address comments", "Address even more comments" type of entries :)

 I would also propose that every commit gets tagged with a JIRA (except
 those fixups that will be squashed). Having the JIRA and possibly other
 tags makes it easier for others not involved in the PR to identify changes
 after they were merged, for example when looking at the revision history or
 annotated source.

 As for other scenarios of jumbo PRs with many commits, there are
 probably situations where work needs to be broken down into smaller units,
 making life better for both, contributor and reviewer(s). Ideally, every PR
 would have only one commit, but that may be a bit much to mandate? Is the
 general expectation something we need to document more clearly?

 Thanks,
 Thomas




Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Ankur Goenka
I agree with the cleanliness of the Commit history.
"Fixup!", "Address comments", "Address even more comments" type of comments
does not convey meaningful information and are not very useful. Its a good
idea to squash them.
However, I think its ok to keep separate commits for different logical
pieces of the code which make reviewing and revisiting code easier.
Example PR: Support X in the pipeline
Commit 1: Restructuring a bunch of code without any logical change.
Commit 2: Changing validation logic for pipeline.
Commit 3: Supporting new field "X" for pipeline.

On Wed, Sep 19, 2018 at 11:27 AM Charles Chen  wrote:

> To be concrete, it is very easy to revert a commit in any case:
>
>1. First, use "git log --first-parent" to find the first-parent commit
>corresponding to a PR merge (this is a one-to-one correspondence).
>2. Use "git revert -m 1 " to revert the commit; this selects
>the first parent as the base for a merge commit (in the case where a single
>commit needs to be reverted, just use "git revert " without the
>"-m 1" flag).
>
> In any case, as a general good engineering practice, I do agree that it is
> highly desirable to have small independent PRs instead of large jumbo PRs
> whenever possible.
>
> On Wed, Sep 19, 2018 at 11:20 AM Charles Chen  wrote:
>
>> I don't think it's actually harder to roll back a set of commits that are
>> merged together.  Git has the notion of first-parent commits (you can see,
>> for example, "git log --first-parent", which filters out the intermediate
>> commits).  In this sense, PRs still get merged as one unit and this is
>> preserved even if intermediate commits are kept.  Perhaps the bigger issue
>> is that we need better documentation and a playbook on how to do this these
>> common tasks in git.
>>
>> On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise  wrote:
>>
>>> Wanted to bring this up as reminder as well as opportunity to discuss
>>> potential changes to our committer guide. It has been a while since last
>>> related discussion and we welcomed several new committers since then.
>>>
>>> Finishing up pull requests pre-merge:
>>>
>>> https://beam.apache.org/contribute/committer-guide/#finishing-touches
>>>
>>> PRs are worked on over time and may accumulate many commits. Sometimes
>>> because scope expands, sometimes just to separate independent changes but
>>> most of the time the commits are just fixups that are added as review
>>> progresses.
>>>
>>> It is important that the latter get squashed prior to PR merge, as
>>> otherwise we lost the ability to roll back changes by reverting a single
>>> commit and also generally cause a lot of noise in the commit history that
>>> does not help other contributors. To be clear, I refer to the "Fixup!",
>>> "Address comments", "Address even more comments" type of entries :)
>>>
>>> I would also propose that every commit gets tagged with a JIRA (except
>>> those fixups that will be squashed). Having the JIRA and possibly other
>>> tags makes it easier for others not involved in the PR to identify changes
>>> after they were merged, for example when looking at the revision history or
>>> annotated source.
>>>
>>> As for other scenarios of jumbo PRs with many commits, there are
>>> probably situations where work needs to be broken down into smaller units,
>>> making life better for both, contributor and reviewer(s). Ideally, every PR
>>> would have only one commit, but that may be a bit much to mandate? Is the
>>> general expectation something we need to document more clearly?
>>>
>>> Thanks,
>>> Thomas
>>>
>>>


Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Charles Chen
To be concrete, it is very easy to revert a commit in any case:

   1. First, use "git log --first-parent" to find the first-parent commit
   corresponding to a PR merge (this is a one-to-one correspondence).
   2. Use "git revert -m 1 " to revert the commit; this selects
   the first parent as the base for a merge commit (in the case where a single
   commit needs to be reverted, just use "git revert " without the
   "-m 1" flag).

In any case, as a general good engineering practice, I do agree that it is
highly desirable to have small independent PRs instead of large jumbo PRs
whenever possible.

On Wed, Sep 19, 2018 at 11:20 AM Charles Chen  wrote:

> I don't think it's actually harder to roll back a set of commits that are
> merged together.  Git has the notion of first-parent commits (you can see,
> for example, "git log --first-parent", which filters out the intermediate
> commits).  In this sense, PRs still get merged as one unit and this is
> preserved even if intermediate commits are kept.  Perhaps the bigger issue
> is that we need better documentation and a playbook on how to do this these
> common tasks in git.
>
> On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise  wrote:
>
>> Wanted to bring this up as reminder as well as opportunity to discuss
>> potential changes to our committer guide. It has been a while since last
>> related discussion and we welcomed several new committers since then.
>>
>> Finishing up pull requests pre-merge:
>>
>> https://beam.apache.org/contribute/committer-guide/#finishing-touches
>>
>> PRs are worked on over time and may accumulate many commits. Sometimes
>> because scope expands, sometimes just to separate independent changes but
>> most of the time the commits are just fixups that are added as review
>> progresses.
>>
>> It is important that the latter get squashed prior to PR merge, as
>> otherwise we lost the ability to roll back changes by reverting a single
>> commit and also generally cause a lot of noise in the commit history that
>> does not help other contributors. To be clear, I refer to the "Fixup!",
>> "Address comments", "Address even more comments" type of entries :)
>>
>> I would also propose that every commit gets tagged with a JIRA (except
>> those fixups that will be squashed). Having the JIRA and possibly other
>> tags makes it easier for others not involved in the PR to identify changes
>> after they were merged, for example when looking at the revision history or
>> annotated source.
>>
>> As for other scenarios of jumbo PRs with many commits, there are probably
>> situations where work needs to be broken down into smaller units, making
>> life better for both, contributor and reviewer(s). Ideally, every PR would
>> have only one commit, but that may be a bit much to mandate? Is the general
>> expectation something we need to document more clearly?
>>
>> Thanks,
>> Thomas
>>
>>


Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-19 Thread Charles Chen
I don't think it's actually harder to roll back a set of commits that are
merged together.  Git has the notion of first-parent commits (you can see,
for example, "git log --first-parent", which filters out the intermediate
commits).  In this sense, PRs still get merged as one unit and this is
preserved even if intermediate commits are kept.  Perhaps the bigger issue
is that we need better documentation and a playbook on how to do this these
common tasks in git.

On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise  wrote:

> Wanted to bring this up as reminder as well as opportunity to discuss
> potential changes to our committer guide. It has been a while since last
> related discussion and we welcomed several new committers since then.
>
> Finishing up pull requests pre-merge:
>
> https://beam.apache.org/contribute/committer-guide/#finishing-touches
>
> PRs are worked on over time and may accumulate many commits. Sometimes
> because scope expands, sometimes just to separate independent changes but
> most of the time the commits are just fixups that are added as review
> progresses.
>
> It is important that the latter get squashed prior to PR merge, as
> otherwise we lost the ability to roll back changes by reverting a single
> commit and also generally cause a lot of noise in the commit history that
> does not help other contributors. To be clear, I refer to the "Fixup!",
> "Address comments", "Address even more comments" type of entries :)
>
> I would also propose that every commit gets tagged with a JIRA (except
> those fixups that will be squashed). Having the JIRA and possibly other
> tags makes it easier for others not involved in the PR to identify changes
> after they were merged, for example when looking at the revision history or
> annotated source.
>
> As for other scenarios of jumbo PRs with many commits, there are probably
> situations where work needs to be broken down into smaller units, making
> life better for both, contributor and reviewer(s). Ideally, every PR would
> have only one commit, but that may be a bit much to mandate? Is the general
> expectation something we need to document more clearly?
>
> Thanks,
> Thomas
>
>