> On 22 Apr 2021, at 20:18, Robert Bradshaw <rober...@google.com> wrote:
> 
> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles <k...@apache.org 
> <mailto:k...@apache.org>> wrote:
> Heuristic CI that says "this commit history looks OK" might solve a lot of 
> the problem (I see that Robert already started on this).
> 
> And finally I was to repeat my agreement with Ismaël and Alexey that the root 
> problem is this: we need to actually care about the commit history and 
> communication of PR/commit titles and descriptions. We use tools to help us 
> to implement our intentions and to communicate them to newcomers, but I don't 
> think this will replace taking care of the repo.
> 
> Committers should care about taking care of the repo more than the average 
> contributor, but even there there is high variance. I think the issue is "oh, 
> I didn't think to squash vs. merge" rather than "who cares, I always press 
> merge anyway" in which case a timely reminder will go a long way. 

+100 (sorry) and if we can additionally have something like a warning before 
merge it would be helpful too. 
I don’t think we really want to add more complexity to development process and 
“light” version with just a warning perhaps will be enough.

> 
> Kenn
> 
> [1] 
> https://lists.apache.org/thread.html/4a65fb0b66935c9dc61568a3067538775edc3e685c6ac03dd3fa4725%40%3Cdev.beam.apache.org%3E
>  
> <https://lists.apache.org/thread.html/4a65fb0b66935c9dc61568a3067538775edc3e685c6ac03dd3fa4725%40%3Cdev.beam.apache.org%3E>
>  
> 
> As for me, I’d prefer that every committer paid more attention (if not yet) 
> on these “non code” things before reviewing/merging a PR.
> 
> [1] https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py 
> <https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py>
> 
> 
>> On 22 Apr 2021, at 01:28, Robert Bradshaw <rober...@google.com 
>> <mailto:rober...@google.com>> wrote:
>> 
>> I am also in the camp that it often makes sense to have more than 1 commit 
>> per PR, but rather than enforce a 1 commit per PR policy, I would say that 
>> it's too much bother to look at the commit history whether it should be 
>> squashed or merged (though I think it is almost always very obvious which is 
>> preferable for a given PR), go ahead and squash rather than merge by 
>> default. 
>> 
>> 
>> On Wed, Apr 21, 2021 at 2:23 PM Kenneth Knowles <k...@apache.org 
>> <mailto:k...@apache.org>> wrote:
>> This seems to come up a lot. Maybe we should change something?
>> 
>> Having worked on a number of projects and at companies with this policy, 
>> companies using non-distributed source control, and companies that just "use 
>> git like git", I know all these ways of life pretty well.
>> 
>> TL;DR my experience is:
>>  - when people care about the commit history and take care of it, then just 
>> "use git like git" results in faster development and clearer history, 
>> despite intermediate commits not being tested by Jenkins/Travis/GHA
>>  - when people see git as an inconvenience, view the history as an 
>> implementation detail, or think in linear history of PR merges and view the 
>> commits as an implementation detail, it becomes a mess
>> 
>> Empirically, this is what I expect from a 1 commit = 1 PR policy (and how I 
>> feel about each point):
>>  - fewer commits with bad messages (yay!)
>>  - simpler git graph if we squash + rebase (meh)
>>  - larger commits of related-but-independent changes (could be OK)
>>  - commits with bullet points in their description that bundle unrelated 
>> changes (sad face)
>>  - slowdown of development (neutral - slow can be good)
>>  - fewer "quality of life" improvements, since those would add lines of diff 
>> to a PR and are off topic; when they have to be done in a separate PR they 
>> don't get done and they don't get reviewed with the same priority (extra sad 
>> face)
>> 
>> <rant>I know I am in the minority. I tend to have a lot of PRs where there 
>> are 2-5 fairly independent commits. It is "to aid code review" but not in 
>> the way you might think: The best size for code review is pretty big, 
>> compared to the best size for commit. A commit is the unit of roll-forward, 
>> roll-back, cherry-pick, etc. Brian's point about commits not being 
>> independently tested is important: this is a tooling issue, but not that 
>> easy to change. Here is why I am not that worried about it: I believe 
>> strongly in a "rollback first" policy to restore greenness, but also that 
>> the rollback change itself must be verified to restore greenness. When a 
>> multi-commit PR fails, you can easily open a revert of the whole PR as well 
>> as reverts of individual suspect commits. The CI for these will finish 
>> around the same time, and if you manage a smaller revert, great! Imagine if 
>> to revert a PR you had to revert _every_ change between HEAD and that PR. It 
>> would restore to a known green state. Yet we don't do this, because we have 
>> technology that makes it unnecessary. Ultimately, single large commits with 
>> bullet points are just an unstructured version of multi-commit PRs. So I 
>> favor the structure. But people seem to be more likely to write good bullet 
>> points than to write independent commits. Perhaps because it is 
>> easier.</rant>
>> 
>> So at this point, I think I am OK with a 1 commit per PR policy. I think the 
>> net benefits to our commit history would be good. I have grown tired of 
>> repeating the conversation. Rebase-and-squash edits commit ids in ways that 
>> confuses tools, so I do not favor this. Tooling that merges one commit at a 
>> time (without altering commit id) would also be super cool and not that 
>> hard. It would prevent intermediate results from merging, solving both 
>> problems.
>> 
>> Kenn
>> 
>> 
>> On Wed, Apr 21, 2021 at 1:25 PM Brian Hulette <bhule...@google.com 
>> <mailto:bhule...@google.com>> wrote:
>> I'd argue that the history is almost always "most useful" when one PR == one 
>> commit on master. Intermediate commits from a PR may be useful to aid code 
>> review, but they're not verified by presubmits and thus aren't necessarily 
>> independently revertible, so I see little value in keeping them around on 
>> master. In fact if you're breaking up a PR into multiple commits to aid code 
>> review, it's worth considering if they could/should be separately reviewed 
>> and verified PRs.
>> We could solve the unwanted commit issue if we have a policy to always 
>> "Squash and Merge" PRs with rare exceptions.
>> 
>> I agree jira/PR titles could be better, I'm not sure what we can do about it 
>> aside from reminding committers of this responsibility. Perhaps the triage 
>> process can help catch poorly titled jiras?
>> 
>> On Wed, Apr 21, 2021 at 11:38 AM Robert Bradshaw <rober...@google.com 
>> <mailto:rober...@google.com>> wrote:
>> +1 to better descriptions for JIRA (and PRs). Thanks for bringing this up. 
>> 
>> For merging unwanted commits, can we automate a simple check (e.g. with 
>> github actions)? 
>> 
>> On Wed, Apr 21, 2021 at 8:00 AM Tomo Suzuki <suzt...@google.com 
>> <mailto:suzt...@google.com>> wrote:
>> BEAM-12173 is on me. I'm sorry about that. Re-reading committer guide
>> [1], I see I was not following this
>> 
>> > The reviewer should give the LGTM and then request that the author of the 
>> > pull request rebase, squash, split, etc, the commits, so that the history 
>> > is most useful
>> 
>> 
>> Thank you for the feedback on this matter! (And I don't think we
>> should change the contribution guide)
>> 
>> [1] https://beam.apache.org/contribute/committer-guide/ 
>> <https://beam.apache.org/contribute/committer-guide/>
>> 
>> On Wed, Apr 21, 2021 at 10:35 AM Ismaël Mejía <ieme...@gmail.com 
>> <mailto:ieme...@gmail.com>> wrote:
>> >
>> > Hello,
>> >
>> > I have noticed an ongoing pattern of carelessness around issues/PR titles 
>> > and
>> > descriptions. It is really painful to see more and more examples like:
>> >
>> > BEAM-12160 Add TODO for fixing the warning
>> > BEAM-12165 Fix ParquetIO
>> > BEAM-12173 avoid intermediate conversion (PR) and BEAM-12173 use
>> > toMinutes (commit)
>> >
>> > In all these cases with just a bit of detail in the title it would be 
>> > enough to
>> > make other contributors or reviewers life easierm as well as to have a 
>> > better
>> > project history.  What astonishes me apart of the lack of care is that 
>> > some of
>> > those are from Beam commmitters.
>> >
>> > We already have discussed about not paying attention during commit merges 
>> > where
>> > some PRs end up merging tons of 'unwanted' fixup commits, and nothing has
>> > changed so I am wondering if we should maybe just totally remove that rule 
>> > (for
>> > commits) and also eventually for titles and descriptions.
>> >
>> > Ismaël
>> >
>> > [1] https://beam.apache.org/contribute/ 
>> > <https://beam.apache.org/contribute/>
>> 
>> 
>> 
>> -- 
>> Regards,
>> Tomo

Reply via email to