> On 22 Apr 2021, at 20:18, Robert Bradshaw <[email protected]> wrote: > > On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles <[email protected] > <mailto:[email protected]>> wrote: > Heuristic CI that says "this commit history looks OK" might solve a lot of > the problem (I see that Robert already started on this). > > And finally I was to repeat my agreement with Ismaël and Alexey that the root > problem is this: we need to actually care about the commit history and > communication of PR/commit titles and descriptions. We use tools to help us > to implement our intentions and to communicate them to newcomers, but I don't > think this will replace taking care of the repo. > > Committers should care about taking care of the repo more than the average > contributor, but even there there is high variance. I think the issue is "oh, > I didn't think to squash vs. merge" rather than "who cares, I always press > merge anyway" in which case a timely reminder will go a long way.
+100 (sorry) and if we can additionally have something like a warning before merge it would be helpful too. I don’t think we really want to add more complexity to development process and “light” version with just a warning perhaps will be enough. > > Kenn > > [1] > https://lists.apache.org/thread.html/4a65fb0b66935c9dc61568a3067538775edc3e685c6ac03dd3fa4725%40%3Cdev.beam.apache.org%3E > > <https://lists.apache.org/thread.html/4a65fb0b66935c9dc61568a3067538775edc3e685c6ac03dd3fa4725%40%3Cdev.beam.apache.org%3E> > > > As for me, I’d prefer that every committer paid more attention (if not yet) > on these “non code” things before reviewing/merging a PR. > > [1] https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py > <https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py> > > >> On 22 Apr 2021, at 01:28, Robert Bradshaw <[email protected] >> <mailto:[email protected]>> wrote: >> >> I am also in the camp that it often makes sense to have more than 1 commit >> per PR, but rather than enforce a 1 commit per PR policy, I would say that >> it's too much bother to look at the commit history whether it should be >> squashed or merged (though I think it is almost always very obvious which is >> preferable for a given PR), go ahead and squash rather than merge by >> default. >> >> >> On Wed, Apr 21, 2021 at 2:23 PM Kenneth Knowles <[email protected] >> <mailto:[email protected]>> wrote: >> This seems to come up a lot. Maybe we should change something? >> >> Having worked on a number of projects and at companies with this policy, >> companies using non-distributed source control, and companies that just "use >> git like git", I know all these ways of life pretty well. >> >> TL;DR my experience is: >> - when people care about the commit history and take care of it, then just >> "use git like git" results in faster development and clearer history, >> despite intermediate commits not being tested by Jenkins/Travis/GHA >> - when people see git as an inconvenience, view the history as an >> implementation detail, or think in linear history of PR merges and view the >> commits as an implementation detail, it becomes a mess >> >> Empirically, this is what I expect from a 1 commit = 1 PR policy (and how I >> feel about each point): >> - fewer commits with bad messages (yay!) >> - simpler git graph if we squash + rebase (meh) >> - larger commits of related-but-independent changes (could be OK) >> - commits with bullet points in their description that bundle unrelated >> changes (sad face) >> - slowdown of development (neutral - slow can be good) >> - fewer "quality of life" improvements, since those would add lines of diff >> to a PR and are off topic; when they have to be done in a separate PR they >> don't get done and they don't get reviewed with the same priority (extra sad >> face) >> >> <rant>I know I am in the minority. I tend to have a lot of PRs where there >> are 2-5 fairly independent commits. It is "to aid code review" but not in >> the way you might think: The best size for code review is pretty big, >> compared to the best size for commit. A commit is the unit of roll-forward, >> roll-back, cherry-pick, etc. Brian's point about commits not being >> independently tested is important: this is a tooling issue, but not that >> easy to change. Here is why I am not that worried about it: I believe >> strongly in a "rollback first" policy to restore greenness, but also that >> the rollback change itself must be verified to restore greenness. When a >> multi-commit PR fails, you can easily open a revert of the whole PR as well >> as reverts of individual suspect commits. The CI for these will finish >> around the same time, and if you manage a smaller revert, great! Imagine if >> to revert a PR you had to revert _every_ change between HEAD and that PR. It >> would restore to a known green state. Yet we don't do this, because we have >> technology that makes it unnecessary. Ultimately, single large commits with >> bullet points are just an unstructured version of multi-commit PRs. So I >> favor the structure. But people seem to be more likely to write good bullet >> points than to write independent commits. Perhaps because it is >> easier.</rant> >> >> So at this point, I think I am OK with a 1 commit per PR policy. I think the >> net benefits to our commit history would be good. I have grown tired of >> repeating the conversation. Rebase-and-squash edits commit ids in ways that >> confuses tools, so I do not favor this. Tooling that merges one commit at a >> time (without altering commit id) would also be super cool and not that >> hard. It would prevent intermediate results from merging, solving both >> problems. >> >> Kenn >> >> >> On Wed, Apr 21, 2021 at 1:25 PM Brian Hulette <[email protected] >> <mailto:[email protected]>> wrote: >> I'd argue that the history is almost always "most useful" when one PR == one >> commit on master. Intermediate commits from a PR may be useful to aid code >> review, but they're not verified by presubmits and thus aren't necessarily >> independently revertible, so I see little value in keeping them around on >> master. In fact if you're breaking up a PR into multiple commits to aid code >> review, it's worth considering if they could/should be separately reviewed >> and verified PRs. >> We could solve the unwanted commit issue if we have a policy to always >> "Squash and Merge" PRs with rare exceptions. >> >> I agree jira/PR titles could be better, I'm not sure what we can do about it >> aside from reminding committers of this responsibility. Perhaps the triage >> process can help catch poorly titled jiras? >> >> On Wed, Apr 21, 2021 at 11:38 AM Robert Bradshaw <[email protected] >> <mailto:[email protected]>> wrote: >> +1 to better descriptions for JIRA (and PRs). Thanks for bringing this up. >> >> For merging unwanted commits, can we automate a simple check (e.g. with >> github actions)? >> >> On Wed, Apr 21, 2021 at 8:00 AM Tomo Suzuki <[email protected] >> <mailto:[email protected]>> wrote: >> BEAM-12173 is on me. I'm sorry about that. Re-reading committer guide >> [1], I see I was not following this >> >> > The reviewer should give the LGTM and then request that the author of the >> > pull request rebase, squash, split, etc, the commits, so that the history >> > is most useful >> >> >> Thank you for the feedback on this matter! (And I don't think we >> should change the contribution guide) >> >> [1] https://beam.apache.org/contribute/committer-guide/ >> <https://beam.apache.org/contribute/committer-guide/> >> >> On Wed, Apr 21, 2021 at 10:35 AM Ismaël Mejía <[email protected] >> <mailto:[email protected]>> wrote: >> > >> > Hello, >> > >> > I have noticed an ongoing pattern of carelessness around issues/PR titles >> > and >> > descriptions. It is really painful to see more and more examples like: >> > >> > BEAM-12160 Add TODO for fixing the warning >> > BEAM-12165 Fix ParquetIO >> > BEAM-12173 avoid intermediate conversion (PR) and BEAM-12173 use >> > toMinutes (commit) >> > >> > In all these cases with just a bit of detail in the title it would be >> > enough to >> > make other contributors or reviewers life easierm as well as to have a >> > better >> > project history. What astonishes me apart of the lack of care is that >> > some of >> > those are from Beam commmitters. >> > >> > We already have discussed about not paying attention during commit merges >> > where >> > some PRs end up merging tons of 'unwanted' fixup commits, and nothing has >> > changed so I am wondering if we should maybe just totally remove that rule >> > (for >> > commits) and also eventually for titles and descriptions. >> > >> > Ismaël >> > >> > [1] https://beam.apache.org/contribute/ >> > <https://beam.apache.org/contribute/> >> >> >> >> -- >> Regards, >> Tomo
