This is a great discussion, thanks for raising the question Naveen. >From my experience working in all kinds of software project of varying sizes and flavours:
1. People should be aware of git rebase interactive and have more hygiene in their PRs. If a PR is hygienic and is separated in a set of semantic commits, it's better not squashed as it helps finding bugs / bisecting. A "dirty" PR with WIP commits, is better squashed, or requested to rebase interactively on CR. 2. Mixing different semantic changes on a single PR is an anti-pattern, for example fixing whitespace or reformatting many lines and changing some code which is hidden in a bunch of trivial changes and could cause a bug or major change of behaviour. 3. Agreed what with others have mentioned about small incremental steps better than huge PRs. We also have JIRA or issues to relate a set of PRs together. If not possible, PR with a set of non squashed commits is also good. Hope it helps. Pedro. On Thu, Jul 12, 2018 at 11:47 PM Naveen Swamy <mnnav...@gmail.com> wrote: > @Aaron, I do not think most contributors(SDE or not) were even aware that > their commits are getting squashed into one and thereby others losing > credit on that work. I would assume no bad intentions there. > > @Hao, > Agree to breaking down into small and reasonable sized PRs, but I think > very small PRs will overwhelm the CI as it stands since it runs > everything(this is a separate topic and needs fixing). > > For cases similar to Aaron's he can raise the PR for the doc > work(regardless of fork or not) and add other contributors as co-authors. > > @Marco, co-author might not be suitable always suitable and agree with Hao > that should be used on exceptions. co-author also makes it hard to see the > contributions individually. > > @Marco, why can we not have Rebase merge enabled on the repo and use that > when there are multiple contributors. This discussion is only about Not > Squashing when there are multiple contributors and agree to continue the > practice of Squashing in general. > > Until the tooling is fixed, I suggest to use the co-author feature when > collaborating on a fork. > > Also, I just want to reiterate and request contributors to have meaningful > and fewer commits on PRs. > > Thanks, Naveen > > > On Thu, Jul 12, 2018 at 11:40 AM, Marco de Abreu < > marco.g.ab...@googlemail.com.invalid> wrote: > > > As of now it will require the usage of a merge bot as far as I understand > > this issue: https://github.com/python/miss-islington/issues/16. Instead > of > > using the web interface do merge, we'd then trigger the bot to do the > merge > > on our behalf. There are pre-made solutions on the internet we might be > > able to leverage, but it would take some time to get into it and adapt > our > > process. > > > > Additionally, GitHub has this feature request tracked internally. Let's > see > > when they get to add it. > > > > -Marco > > > > > > > > Aaron Markham <aaron.s.mark...@gmail.com> schrieb am Do., 12. Juli 2018, > > 21:33: > > > > > My point was about collaboration, or the lack thereof, and how the > > tooling > > > and apparent rewards from contribution statistics can reinforce lone > > ranger > > > behavior. People can and should be proud of their work, but why does it > > > have to be alone? Working within the context of a team is something to > be > > > proud of too, but if your stats and standing are graded by how the > > commits > > > and merges land, and counting lines of code, then incentives of the > > system > > > are skewed. > > > How do we go about implementing the co-author process? It sounds like > > > something worth doing! > > > > > > On Thu, Jul 12, 2018 at 11:15 AM, Hao Jin <hjjn.a...@gmail.com> wrote: > > > > > > > Hi Aaron, > > > > To be fair, this discussion has nothing to do with "pride" of SDEs, > but > > > > rather a discussion on what is a better software engineering practice > > for > > > > the project. Breaking a large project into smaller tasks is a good > > > software > > > > engineering practice. This article(https://arxiv.org/pdf/ > > 1702.01715.pdf) > > > > on > > > > Google's software engineering practice says: "Engineers are > encouraged > > to > > > > keep each individual change small, with larger changes preferably > > broken > > > > into a series of smaller changes that a reviewer can easily review in > > one > > > > go.". If you have concerns or your comments on this practice, we can > > take > > > > the discussion offline. On the other hand, an important spirit of > > Apache > > > > community is that "one must interact with others, and share vision > and > > > > knowledge"(https://community.apache.org/contributors/), if you > > observed > > > > that a majority of the contributors have serious problems with their > > > > writing maybe you can share some tips and hints on how to write > better > > > > documentations, in that way not only a lot of us within the community > > can > > > > have some growth but also you can save some time for writing more > good > > > > documentations and blogposts for MXNet, don't you think so? Or, if > you > > > only > > > > have to fix the doc for someone once in a while, I definitely agree > > that > > > > you should be given the credit for that, and that's where the > co-author > > > > field can be useful, which was exactly what I was saying in my > previous > > > > email. Thanks! > > > > Hao > > > > > > > > On Thu, Jul 12, 2018 at 8:16 AM, Aaron Markham < > > > aaron.s.mark...@gmail.com> > > > > wrote: > > > > > > > > > This is a great discussion, and close to my heart, because I want > to > > > > > include more writers and editors in our community, and I'm > struggling > > > to > > > > > see how to best manage this. It seems like being the sole > contributor > > > on > > > > a > > > > > feature is like an engineer's Lone Ranger badge of pride! I feel > like > > > it > > > > > should be a red flag. > > > > > > > > > > I work in collaboration with dozens of engineers to improve their > > > > > documentation, explain their features, change flows to improve user > > > > > experience, and sometimes fix bugs and write code. When the PR's > docs > > > has > > > > > great coverage, is clear, and not loaded with bad grammar and > > spelling > > > > > mistakes, I will put comments in a review. But sometimes there > needs > > to > > > > be > > > > > a complete rework such that hundreds of review comments isn't > > feasible, > > > > and > > > > > they can't be properly addressed. In a lot of these cases, I commit > > my > > > > > changes to someone else's fork. Now I know the people I work with > on > > > > their > > > > > PRs appreciate my help, but when all of this work gets squashed > down > > to > > > > one > > > > > commit, I'm pretty much regularly getting squashed out. I'm not > sure > > > who > > > > > else is in this boat, but does the community recognize our > > > contributions > > > > > when this is the general mode of operation? > > > > > > > > > > Regarding co-author - I'm not sure how people would feel about me > > > being a > > > > > co-author on their work. Documentation and clarity in your work > > product > > > > is > > > > > important, but people's views on their personal *code* contribution > > > seems > > > > > to be more important than the overall code & feature quality itself > > > when > > > > > docs are part of the package. I feel that if I do follow-on PRs > > instead > > > > of > > > > > fixing things before they are merged, that we would be releasing > > > > incorrect, > > > > > incomplete, and inferior product. But, in absence of a better > > solution, > > > > > maybe that's the pill we have to swallow. > > > > > > > > > > -1 on lots of small PRs (until we expand our range of committers > and > > > > reduce > > > > > the latency in reviews and merges). > > > > > +1 on improving the quality of commit messages, so we don't have to > > > > squash > > > > > & merge all of the time. > > > > > +1 on improving the practice of better commit & merge management so > > > that > > > > > the commit history is concise and meaningful. (I'm guilty of this, > > and > > > > > certainly need to improve here.) > > > > > +1 on co-author - assuming that's what most everyone thinks is a > good > > > > > solution for now. I'm unclear on how this gets managed though. > > > > > > > > > > Cheers, > > > > > Aaron > > > > > > > > > > > > > > > On Thu, Jul 12, 2018 at 5:19 AM, Anton Chernov < > mecher...@gmail.com> > > > > > wrote: > > > > > > > > > > > Unfortunately there has been significant push back for small > > > iterative > > > > > PR's > > > > > > and as a result they have grown substantially involving multiple > > > > > > contributors, multiple commits, sometimes multiple branches to be > > > > worked > > > > > > on. > > > > > > > > > > > > I fully agree and support all points that Jin raised: > > > > > > > > > > > > 1) Most contributions should be broken down into smallest > possible > > > > PR's. > > > > > > 2) If a PR is small enough a single person can complete it. > > > > > > 3) If a majority of PR is done by someone, and there's some minor > > > issue > > > > > > he/she needs help with it could be done in a follow up PR by > > anybody, > > > > > > including the reviewer. > > > > > > > > > > > > Best regards, > > > > > > Anton > > > > > > > > > > > > чт, 12 июл. 2018 г. в 10:11, Hao Jin <hjjn.a...@gmail.com>: > > > > > > > > > > > > > +1 for Marco's proposal on using the co-author field. IMHO the > > > usage > > > > of > > > > > > > co-author field should not be that often, consider: > > > > > > > 1) If a PR is so big that it needs multiple people to > contribute > > a > > > > > > > substantial part of it, why can't that PR be broken down into > > > smaller > > > > > PRs > > > > > > > each submitted by single one of them? > > > > > > > 2) If a PR is small enough (for example, < 300 lines), why > can't > > a > > > > > single > > > > > > > person complete it? > > > > > > > 3) If a majority of PR is done by someone, and there's some > minor > > > > issue > > > > > > > he/she needs help with(a small bug, incomplete doc), why can't > > that > > > > be > > > > > > done > > > > > > > through code reviews? > > > > > > > From the above 3 situations we can see that collaborations on > > > > > individual > > > > > > > PRs should not be quite often, but I do agree that it could > still > > > be > > > > > > > necessary when someone lacks the related expertise/knowledge on > > > some > > > > > > > necessary part of that PR given that the required knowledge > > cannot > > > be > > > > > > > picked up in a short period of time. > > > > > > > > > > > > > > I do agree that keeping the commit histories of PRs clean is > very > > > > > > important > > > > > > > as it could be confusing when reviewing PRs, but that really > > > depends > > > > on > > > > > > > personal preferences (For my case, I usually do "git commit > > > --amend" > > > > on > > > > > > > trivial changes and get a new commit for bigger changes such > as a > > > > > > > checkpoint of my whole PR). With growing the community and > > > attracting > > > > > > more > > > > > > > contributors as a high priority for our project, I would prefer > > > that > > > > we > > > > > > do > > > > > > > not put even more burden on the contributors by asking them to > > > manage > > > > > and > > > > > > > squash the commits themselves just for the not-so-often cases > of > > > > having > > > > > > > multiple contributors on a single PR. > > > > > > > Best regards, > > > > > > > Hao > > > > > > > > > > > > > > On Thu, Jul 12, 2018 at 12:35 AM, Marco de Abreu < > > > > > > > marco.g.ab...@googlemail.com.invalid> wrote: > > > > > > > > > > > > > > > Hi Naveen, > > > > > > > > > > > > > > > > I'm in favour of the squashing, considering the number of > > commits > > > > in > > > > > > some > > > > > > > > PRs and especially because of some people making commit > > messages > > > a > > > > la > > > > > > > "fix" > > > > > > > > "fix" "fix" all the time. Additionally, it gets hard (not > > > > impossible, > > > > > > > just > > > > > > > > more inconvenient) to determine the atomic states of master - > > > aka, > > > > > > which > > > > > > > > commits are separate from each other. You should consider > that > > > > > > > intermediary > > > > > > > > commits are unstable (fail CI) and thus it could be very hard > > to > > > > > bisect > > > > > > > > failures in future - and the commit history gets cluttered. > > > > > > > > > > > > > > > > As alternative, I'd like to suggest the co-author field for > > these > > > > > > cases. > > > > > > > > Further documentation is available at > > > > > > > > https://blog.github.com/2018-01-29-commit-together-with-co- > > > > authors/. > > > > > > > > > > > > > > > > I definitely agree with the second part. We should all lead > by > > > > > example > > > > > > > and > > > > > > > > maintain a high quality by keeping our commit messages clean > > and > > > > > > > > meaningful. When I receive an email notification that a new > > > commit > > > > > has > > > > > > > been > > > > > > > > added and it only contains "fix" as title, it's not that > > helpful > > > > and > > > > > > also > > > > > > > > it's hard to track the development of a PR overtime. E.g., > why > > > has > > > > > > > > something been changed? Was there maybe a bug that we didn't > > > cover > > > > > with > > > > > > > > tests but the author just hacked something to get it to work > > but > > > > the > > > > > > > > problem still lays somewhere? We won't know that way and it > > makes > > > > it > > > > > > > harder > > > > > > > > for us to review. > > > > > > > > > > > > > > > > Best regards, > > > > > > > > Marco > > > > > > > > > > > > > > > > > > > > > > > > Naveen Swamy <mnnav...@gmail.com> schrieb am Do., 12. Juli > > 2018, > > > > > > 10:09: > > > > > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > > > > > I am seeing that maintainers merge PRs into the repo, they > > are > > > > > > > squashing > > > > > > > > > the commits in the PR, which I understand and agree is to > > keep > > > a > > > > > sane > > > > > > > > > commit history, however this is causing problem when there > > are > > > > > > multiple > > > > > > > > > contributors involved on a PR(by contributing to a fork of > > the > > > > > repo) > > > > > > > this > > > > > > > > > effectively removes credit for multiple contributors > involved > > > and > > > > > > shows > > > > > > > > all > > > > > > > > > code as authored by the contributor who created the PR. > > > > > > > > > > > > > > > > > > Can I request maintainers to not squash PRs if there are > > > multiple > > > > > > > > > contributors involved on the PR. > > > > > > > > > > > > > > > > > > Also on the same note, I request contributors(regardless of > > > > > multiple > > > > > > > > > contributors or not) to keep a clean commit history by > > > squashing > > > > > the > > > > > > > > > commits and not push all your WIP commits to the PR. this > > will > > > > help > > > > > > us > > > > > > > > keep > > > > > > > > > our commit history clean and meaningful. > > > > > > > > > > > > > > > > > > Let me know your thoughts/better approach or If I have > > > > > misunderstood > > > > > > > how > > > > > > > > > this works. > > > > > > > > > > > > > > > > > > Thanks, Naveen > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >