+1 I think this provides a reasonable set of expectations for the author and committer.
-Kyle On Tue, Jan 31, 2017 at 10:46 AM, Nick Allen <n...@nickallen.org> wrote: > I am in agreement with what I am reading. Something along the lines of the > following. > > - Best effort should be made to create a pull request that can be > attributed to a single author. > - If the work of multiple authors cannot be split into separate pull > requests for justifiable reasons, then it is the author's > responsibility to > create a pull request with the minimal number of commits required to > provide appropriate attribution to multiple authors. > - Each commit should be attributed to a single author and follow the > commit message guidelines identifying the issue associated with the > change. > - When merging the pull request, the committer will not squash these > commits so that attribution to each author can be maintained in the > revision history. > > > On Tue, Jan 31, 2017 at 9:42 AM, Otto Fowler <ottobackwa...@gmail.com> > wrote: > > > Just to be clear - I’m not pushing for this, I was more curious about how > > to do it if we wanted to. My main interest is that the process we agree > to > > fosters and encourages collaboration rather than make it such a pain that > > contributors think twice before helping out on large PR efforts. > > > > How did we do it for the Storm efforts? > > > > > > On January 31, 2017 at 09:21:17, Casey Stella (ceste...@gmail.com) > wrote: > > > > To my reading, that script only notes additional authors, it does not > > create separate commits for those authors. I feel that we should give > > authors the ability to construct pull requests in such a way that > > attribution in the form of github statistics are preserved. > > > > Also, and this is for the mentors, does the zookeeper methodology abide > by > > the requirement of having the commit log be a place of record for > > authorization? > > > > Casey > > > > On Tue, Jan 31, 2017 at 9:16 AM, Otto Fowler <ottobackwa...@gmail.com> > > wrote: > > > > > https://cwiki.apache.org/confluence/display/ZOOKEEPER/ > > > Merging+Github+Pull+Requests > > > > > > > > > On January 31, 2017 at 09:04:38, Casey Stella (ceste...@gmail.com) > > wrote: > > > > > > The problem is that a single commit cannot have multiple authors and > it's > > > difficult to construct a minimal set of commits for multiple authors in > > the > > > same PR by the committers. I would be in favor of not automating this > > and, > > > rather, insisting that the pull request either be split up into > multiple > > > PRs or have the individual PR's commits squashed and named > appropriately > > by > > > the authors. > > > > > > For reference, here's how Apache Apex does this. Lines item 7 and 8 > under > > > "Opening a Pull Request" at https://apex.apache.org/contributing.html > : > > > > > > 1. > > > > > > After all review is complete, combine all new commits into one squashed > > > commit except when there are multiple contributors, and include the > Jira > > > number in the commit message. There are several ways to squash > > > commits, but here > > > is one explanation from git-scm.com > > > <https://git-scm.com/book/en/v2/Git-Tools-Rewriting- > > > History#Squashing-Commits> > > > and > > > a simple example is illustrated below: > > > > > > If tracking upstream/master then run git rebase -i. Else run git rebase > > > -i upstream/master. This command opens the text editor which lists the > > > multiple commits: > > > > > > pick 67cd79b change1 > > > pick 6f98905 change2 > > > > > > # Rebase e13748b..3463fbf onto e13748b (2 command(s)) > > > # > > > # Commands: > > > # p, pick = use commit > > > # r, reword = use commit, but edit the commit message > > > # e, edit = use commit, but stop for amending > > > # s, squash = use commit, but meld into previous commit > > > # f, fixup = like "squash", but discard this commit's log message > > > # x, exec = run command (the rest of the line) using shell > > > # > > > # These lines can be re-ordered; they are executed from top to bottom. > > > > > > Squash 'change2' to 'change1' and save. > > > > > > pick 67cd79b change1 > > > squash 6f98905 change2 > > > > > > 2. If there are multiple contributors in a pull request preserve > > > individual attributions. Try to squash the commits to the minimum > number > > of > > > commits required to preserve attribution and the contribution to still > be > > > functionally correct. > > > > > > Their language is similar to the ones proposed by Dave and I like it. > > This > > > is, I consider, a special case and we should avoid trying to automate > it, > > > but rather delegate it to meatspace. Thoughts? > > > > > > On Tue, Jan 31, 2017 at 5:58 AM, zeo...@gmail.com <zeo...@gmail.com> > > > wrote: > > > > > > > I thought the idea would be to maintain the commit mapping to > > individuals > > > > so things like the GitHub statistics are accurate regarding # of > > > commits, # > > > > of lines affected +/-, # of contributors, etc. > > > > > > > > Jon > > > > > > > > On Tue, Jan 31, 2017, 12:20 AM Otto Fowler <ottobackwa...@gmail.com> > > > > wrote: > > > > > > > > > I mean multi-author of course > > > > > > > > > > > > > > > On January 31, 2017 at 00:19:17, Otto Fowler ( > > ottobackwa...@gmail.com) > > > > > wrote: > > > > > > > > > > So, if we use a script like this…. > > > > > > > > > > import sys > > > > > import requests > > > > > import json > > > > > from sets import Set > > > > > > > > > > if len(sys.argv) == 1: > > > > > print "Must pass in pull request number" > > > > > exit(1) > > > > > > > > > > next_url = 'https://api.github.com/repos/ > > > apache/incubator-metron/pulls/' > > > > > + sys.argv[1] + '/commits' > > > > > authors = Set() > > > > > while next_url: > > > > > response = requests.get(next_url) > > > > > if response.status_code == 200: > > > > > m = json.loads(response.content) > > > > > for c in m: > > > > > authors.add(c["commit"]["author"]["name"] + " <" + > > > > > c["commit"]["author"]["email"] + ">") > > > > > if 'next' in response.links: > > > > > next_url = response.links['next']['url'] > > > > > else: > > > > > next_url = '' > > > > > > > > > > for author in authors: > > > > > print author > > > > > > > > > > > > > > > and call it from our bash scripts ( this is a test script, would > need > > > > error > > > > > check in the real thing ) > > > > > > > > > > 1 #!/bin/local/bin bash > > > > > > > > > > 2 > > > > > > > > > > 3 my_array=() > > > > > > > > > > 4 while IFS= read -r line; do > > > > > > > > > > 5 my_array+=( "$line" ) > > > > > > > > > > 6 done < <( python test.py 316 ) > > > > > > > > > > 7 > > > > > > > > > > 8 for element in "${my_array[@]}" > > > > > > > > > > 9 do > > > > > > > > > > 10 echo "${element}" > > > > > > > > > > 11 done > > > > > > > > > > I get the following output for a multi commit PR like #316 > > > > > > > > > > *➜* *ottofowler**@**Winterfell * *~/src/github/forks * bash test.sh > > > > > > > > > > JJ <jjmey...@gmail.com> > > > > > > > > > > merrimanr <merrim...@gmail.com> > > > > > > > > > > rmerriman <rmerri...@hortonworks.com> > > > > > > > > > > > > > > > Would putting that information in the commit message be good > enough? > > > > The > > > > > idea that we integrate this with Nick’s scripts. > > > > > > > > > > > > > > > On January 30, 2017 at 07:00:35, Otto Fowler ( > > ottobackwa...@gmail.com) > > > > > wrote: > > > > > > > > > > Maybe using something like this? > > > > > https://developer.github.com/v3/pulls/#list-commits-on-a- > > pull-request > > > > > > > > > > > > > > > On January 28, 2017 at 09:52:16, Otto Fowler ( > > ottobackwa...@gmail.com) > > > > > wrote: > > > > > > > > > > Would it be possible to get all the authors out of the pr before > > > > squashing > > > > > and change out commit messages? Maybe using an extended version of > > > > nick's > > > > > scripts? > > > > > > > > > > On January 27, 2017 at 19:02:54, Casey Stella (ceste...@gmail.com) > > > > wrote: > > > > > > > > > > > Having thought about it a minute longer, maybe you're right, > Dave. > > I > > > > like > > > > > > this. > > > > > > > > > > > > I'd add one sentence more: > > > > > > If the contribution does not have a single author, please ask > that > > > the > > > > > > contribution be squashed by the author into the minimum number of > > > > commits > > > > > > required to preserve correct attribution each following the same > > > > standard > > > > > > naming convention of: JIRA Description (e.g. METRON-123: Foo all > > the > > > > > Bars). > > > > > > The committer should not squash commits in this instance so that > > > > > ownership > > > > > > is maintained. > > > > > > > > > > > > On Fri, Jan 27, 2017 at 6:55 PM, Casey Stella < > ceste...@gmail.com> > > > > > wrote: > > > > > > > > > > > > Yeah, this is sort of the reason I was going for multiple PRs > > rather > > > > than > > > > > > having contributors do their own squashing and commits. I > recognize > > > > that > > > > > > multiple PRs are difficult to review. > > > > > > > > > > > > That being said, I don't know that the actual commits are > required > > to > > > > > have > > > > > > all of a person's commit. I imagined the process being: > > > > > > * The contributors squashes their commits owned by one of them > > > > > > * The others add an empty commit for each of the contributors > > > > > > * The committer then just does a straight pull rather than a > > squashed > > > > > pull. > > > > > > > > > > > > My understanding is that we need to keep a record of who > > contributed > > > to > > > > > > what JIRA not that the actual commits reflect the actual code > > > > contributed > > > > > > (which can be quite difficult to disentangle, I imagine). Please > > > > correct > > > > > > me if I'm wrong here, though. > > > > > > > > > > > > I'm being prescriptive because I'm worried that "do the right > > thing" > > > > will > > > > > > result in us not doing the right thing, sadly. > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > On Fri, Jan 27, 2017 at 6:03 PM, David Lyle < > dlyle65...@gmail.com> > > > > > wrote: > > > > > > > > > > > > We may be trying to be too prescriptive, it won't always be > > possible > > > to > > > > > > have a single commit per author (interleaved changes). > > > > > > > > > > > > Maybe > > > > > > > > > > > > If the contribution does not have a single author, please ask > that > > > the > > > > > > contribution be squashed by the author into the minimum number of > > > > commits > > > > > > required to preserve correct attribution each following the same > > > > standard > > > > > > naming convention of: JIRA Description (e.g. METRON-123: Foo all > > the > > > > > > Bars). > > > > > > > > > > > > ? > > > > > > > > > > > > I really don't know- the real message is, do the right thing to > > make > > > > sure > > > > > > everyone's contribution is duly recorded. So the commit log would > > be > > > > > > something like > > > > > > Author: 1 METRON-123 Foo all the bars - foo'd some bars > > > > > > Author: 2 METRON-123 Foo all the bars - helped for a better foo > > > > > > Author: 1 METRON-123 Fool all the bars - But, there was this > little > > > > deal > > > > > > > > > > > > I think it's hard to cover all the cases, so maybe a statement of > > the > > > > > > principle we're trying to maintain? > > > > > > > > > > > > -D... > > > > > > > > > > > > > > > > > > On Fri, Jan 27, 2017 at 5:50 PM, Casey Stella < > ceste...@gmail.com> > > > > > wrote: > > > > > > > > > > > > So, then the wording would be: > > > > > > > > > > > > If the contribution does not have a single author, please ask > that > > > the > > > > > > contribution be squashed by the author into one commit per author > > > > > > each following the same standard naming convention of: JIRA > > > > > > Description (e.g. METRON-123: Foo all the Bars). > > > > > > > > > > > > Fair or did I get it wrong? > > > > > > > > > > > > On Fri, Jan 27, 2017 at 5:47 PM, David Lyle < > dlyle65...@gmail.com> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > The bolded didn't come through, but I'm guessing it's here: > > > > > > > > > > > > *If the contribution does not have a single author, please ask > that > > > > > > > > > > > > the > > > > > > > > > > > > contribution be separated into multiple separate pull requests > (one > > > > > > > > > > > > per > > > > > > > > > > > > author with their contributions) that may have (non-cyclical) > > > > > > > > > > > > dependencies, > > > > > > > > > > > > each following the same standard naming convention of JIRA: JIRA > > > > > > Description (e.g. METRON-123: Foo all the Bars).* > > > > > > > > > > > > I'd recommend against multiple PRs. Creates all kinds of > > difficulties > > > > > > reviewing and testing. The changeset should be reviewed and > tested > > as > > > > > > > > > > > > a > > > > > > > > > > > > single unit. If there is more than one author, simply don't > squash. > > > > > > > > > > > > -D... > > > > > > > > > > > > > > > > > > On Fri, Jan 27, 2017 at 5:44 PM, Casey Stella < > ceste...@gmail.com> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > It is important for every contribution to attribute the author, > our > > > > > > > > > > > > git > > > > > > > > > > > > log > > > > > > > > > > > > is a legal papertrail for attribution. As a consequence of that, > I > > > > > > recommend the Development Guidelines section 1.2 be amended to > the > > > > > > following (bolded change): > > > > > > > > > > > > Everyone is encouraged to review open pull requests. We only ask > > > > > > > > > > > > that > > > > > > > > > > > > you > > > > > > > > > > > > try and think carefully, ask questions and are excellent to one > > > > > > > > > > > > another. > > > > > > > > > > > > Code review is our opportunity to share knowledge, design ideas > and > > > > > > > > > > > > make > > > > > > > > > > > > friends. The instructions on how to checkout a patch for review > > > > > > > > > > > > can be > > > > > > > > > > > > found here <https://github.com/nickwallen/metron-commit-stuff>. > > > > > > > > > > > > When reviewing a patch try to keep each of these concepts in > mind: > > > > > > > > > > > > > > > > > > > > > > > > - Is the proposed change being made in the correct place? Is it a > > > > > > > > > > > > fix > > > > > > > > > > > > in > > > > > > > > > > > > a backend when it should be in the primitives? In Kafka vs > > > > > > > > > > > > Storm? > > > > > > > > > > > > - What is the change being proposed? Is it based on Community > > > > > > recognized issues? > > > > > > - Do we want this feature or is the bug they’re fixing really a > > > > > > > > > > > > bug? > > > > > > > > > > > > - Does the change do what the author claims? > > > > > > - Are there sufficient tests? > > > > > > - Has it been documented? > > > > > > - Will this change introduce new bugs? > > > > > > - *Does this contribution have a single author?* > > > > > > > > > > > > > > > > > > *If the contribution does not have a single author, please ask > that > > > > > > > > > > > > the > > > > > > > > > > > > contribution be separated into multiple separate pull requests > (one > > > > > > > > > > > > per > > > > > > > > > > > > author with their contributions) that may have (non-cyclical) > > > > > > > > > > > > dependencies, > > > > > > > > > > > > each following the same standard naming convention of JIRA: JIRA > > > > > > Description (e.g. METRON-123: Foo all the Bars).* > > > > > > > > > > > > Also, please review if the submitter correctly flagged impacts to > > > > > > > > > > > > the > > > > > > > > > > > > following (if exist): > > > > > > > > > > > > - System Configuration Changes > > > > > > - Metron Configuration > > > > > > - Metron Component Configuration (sensors, etc) > > > > > > - Tech Stack Configuration (Storm, Hbase, etc) > > > > > > - Environmental Changes > > > > > > - Helper Shell Scripts > > > > > > - RPM Packaging > > > > > > - Ansible, Ambari, AWS, Docker > > > > > > - Documentation Impacts > > > > > > - Changes to Wiki Documentation > > > > > > - Revisions in Tutorials > > > > > > - Developer Guide > > > > > > - Expansions in readme's > > > > > > - Changes to System Interfaces > > > > > > - Stellar Shell > > > > > > - REST APIs > > > > > > - Etc... > > > > > > > > > > > > > > > > > > Thoughts? Better way to handle this scenario while still > > > > > > > > > > > > maintaining > > > > > > > > > > > > unambiguous authorship? > > > > > > > > > > > > Casey > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Jon > > > > > > > > Sent from my mobile device > > > > > > > > > > > > > > > > -- > Nick Allen <n...@nickallen.org> >