Is everyone ok with modifying the document with the following? I can make the changes and put it up for a vote.
Thanks, James 31.01.2017, 08:46, "Nick Allen" <n...@nickallen.org>: > I am in agreement with what I am reading. Something along the lines of the > following. > > - Best effort should be made to create a pull request that can be > attributed to a single author. > - If the work of multiple authors cannot be split into separate pull > requests for justifiable reasons, then it is the author's responsibility to > create a pull request with the minimal number of commits required to > provide appropriate attribution to multiple authors. > - Each commit should be attributed to a single author and follow the > commit message guidelines identifying the issue associated with the change. > - When merging the pull request, the committer will not squash these > commits so that attribution to each author can be maintained in the > revision history. > > On Tue, Jan 31, 2017 at 9:42 AM, Otto Fowler <ottobackwa...@gmail.com> > wrote: > >> Just to be clear - I’m not pushing for this, I was more curious about how >> to do it if we wanted to. My main interest is that the process we agree to >> fosters and encourages collaboration rather than make it such a pain that >> contributors think twice before helping out on large PR efforts. >> >> How did we do it for the Storm efforts? >> >> On January 31, 2017 at 09:21:17, Casey Stella (ceste...@gmail.com) wrote: >> >> To my reading, that script only notes additional authors, it does not >> create separate commits for those authors. I feel that we should give >> authors the ability to construct pull requests in such a way that >> attribution in the form of github statistics are preserved. >> >> Also, and this is for the mentors, does the zookeeper methodology abide by >> the requirement of having the commit log be a place of record for >> authorization? >> >> Casey >> >> On Tue, Jan 31, 2017 at 9:16 AM, Otto Fowler <ottobackwa...@gmail.com> >> wrote: >> >> > https://cwiki.apache.org/confluence/display/ZOOKEEPER/ >> > Merging+Github+Pull+Requests >> > >> > >> > On January 31, 2017 at 09:04:38, Casey Stella (ceste...@gmail.com) >> wrote: >> > >> > The problem is that a single commit cannot have multiple authors and it's >> > difficult to construct a minimal set of commits for multiple authors in >> the >> > same PR by the committers. I would be in favor of not automating this >> and, >> > rather, insisting that the pull request either be split up into multiple >> > PRs or have the individual PR's commits squashed and named appropriately >> by >> > the authors. >> > >> > For reference, here's how Apache Apex does this. Lines item 7 and 8 under >> > "Opening a Pull Request" at https://apex.apache.org/contributing.html : >> > >> > 1. >> > >> > After all review is complete, combine all new commits into one squashed >> > commit except when there are multiple contributors, and include the Jira >> > number in the commit message. There are several ways to squash >> > commits, but here >> > is one explanation from git-scm.com >> > <https://git-scm.com/book/en/v2/Git-Tools-Rewriting- >> > History#Squashing-Commits> >> > and >> > a simple example is illustrated below: >> > >> > If tracking upstream/master then run git rebase -i. Else run git rebase >> > -i upstream/master. This command opens the text editor which lists the >> > multiple commits: >> > >> > pick 67cd79b change1 >> > pick 6f98905 change2 >> > >> > # Rebase e13748b..3463fbf onto e13748b (2 command(s)) >> > # >> > # Commands: >> > # p, pick = use commit >> > # r, reword = use commit, but edit the commit message >> > # e, edit = use commit, but stop for amending >> > # s, squash = use commit, but meld into previous commit >> > # f, fixup = like "squash", but discard this commit's log message >> > # x, exec = run command (the rest of the line) using shell >> > # >> > # These lines can be re-ordered; they are executed from top to bottom. >> > >> > Squash 'change2' to 'change1' and save. >> > >> > pick 67cd79b change1 >> > squash 6f98905 change2 >> > >> > 2. If there are multiple contributors in a pull request preserve >> > individual attributions. Try to squash the commits to the minimum number >> of >> > commits required to preserve attribution and the contribution to still be >> > functionally correct. >> > >> > Their language is similar to the ones proposed by Dave and I like it. >> This >> > is, I consider, a special case and we should avoid trying to automate it, >> > but rather delegate it to meatspace. Thoughts? >> > >> > On Tue, Jan 31, 2017 at 5:58 AM, zeo...@gmail.com <zeo...@gmail.com> >> > wrote: >> > >> > > I thought the idea would be to maintain the commit mapping to >> individuals >> > > so things like the GitHub statistics are accurate regarding # of >> > commits, # >> > > of lines affected +/-, # of contributors, etc. >> > > >> > > Jon >> > > >> > > On Tue, Jan 31, 2017, 12:20 AM Otto Fowler <ottobackwa...@gmail.com> >> > > wrote: >> > > >> > > > I mean multi-author of course >> > > > >> > > > >> > > > On January 31, 2017 at 00:19:17, Otto Fowler ( >> ottobackwa...@gmail.com) >> > > > wrote: >> > > > >> > > > So, if we use a script like this…. >> > > > >> > > > import sys >> > > > import requests >> > > > import json >> > > > from sets import Set >> > > > >> > > > if len(sys.argv) == 1: >> > > > print "Must pass in pull request number" >> > > > exit(1) >> > > > >> > > > next_url = 'https://api.github.com/repos/ >> > apache/incubator-metron/pulls/' >> > > > + sys.argv[1] + '/commits' >> > > > authors = Set() >> > > > while next_url: >> > > > response = requests.get(next_url) >> > > > if response.status_code == 200: >> > > > m = json.loads(response.content) >> > > > for c in m: >> > > > authors.add(c["commit"]["author"]["name"] + " <" + >> > > > c["commit"]["author"]["email"] + ">") >> > > > if 'next' in response.links: >> > > > next_url = response.links['next']['url'] >> > > > else: >> > > > next_url = '' >> > > > >> > > > for author in authors: >> > > > print author >> > > > >> > > > >> > > > and call it from our bash scripts ( this is a test script, would need >> > > error >> > > > check in the real thing ) >> > > > >> > > > 1 #!/bin/local/bin bash >> > > > >> > > > 2 >> > > > >> > > > 3 my_array=() >> > > > >> > > > 4 while IFS= read -r line; do >> > > > >> > > > 5 my_array+=( "$line" ) >> > > > >> > > > 6 done < <( python test.py 316 ) >> > > > >> > > > 7 >> > > > >> > > > 8 for element in "${my_array[@]}" >> > > > >> > > > 9 do >> > > > >> > > > 10 echo "${element}" >> > > > >> > > > 11 done >> > > > >> > > > I get the following output for a multi commit PR like #316 >> > > > >> > > > *➜* *ottofowler**@**Winterfell * *~/src/github/forks * bash test.sh >> > > > >> > > > JJ <jjmey...@gmail.com> >> > > > >> > > > merrimanr <merrim...@gmail.com> >> > > > >> > > > rmerriman <rmerri...@hortonworks.com> >> > > > >> > > > >> > > > Would putting that information in the commit message be good enough? >> > > The >> > > > idea that we integrate this with Nick’s scripts. >> > > > >> > > > >> > > > On January 30, 2017 at 07:00:35, Otto Fowler ( >> ottobackwa...@gmail.com) >> > > > wrote: >> > > > >> > > > Maybe using something like this? >> > > > https://developer.github.com/v3/pulls/#list-commits-on-a- >> pull-request >> > > > >> > > > >> > > > On January 28, 2017 at 09:52:16, Otto Fowler ( >> ottobackwa...@gmail.com) >> > > > wrote: >> > > > >> > > > Would it be possible to get all the authors out of the pr before >> > > squashing >> > > > and change out commit messages? Maybe using an extended version of >> > > nick's >> > > > scripts? >> > > > >> > > > On January 27, 2017 at 19:02:54, Casey Stella (ceste...@gmail.com) >> > > wrote: >> > > > >> > > > > Having thought about it a minute longer, maybe you're right, Dave. >> I >> > > like >> > > > > this. >> > > > > >> > > > > I'd add one sentence more: >> > > > > If the contribution does not have a single author, please ask that >> > the >> > > > > contribution be squashed by the author into the minimum number of >> > > commits >> > > > > required to preserve correct attribution each following the same >> > > standard >> > > > > naming convention of: JIRA Description (e.g. METRON-123: Foo all >> the >> > > > Bars). >> > > > > The committer should not squash commits in this instance so that >> > > > ownership >> > > > > is maintained. >> > > > > >> > > > > On Fri, Jan 27, 2017 at 6:55 PM, Casey Stella <ceste...@gmail.com> >> > > > wrote: >> > > > > >> > > > > Yeah, this is sort of the reason I was going for multiple PRs >> rather >> > > than >> > > > > having contributors do their own squashing and commits. I recognize >> > > that >> > > > > multiple PRs are difficult to review. >> > > > > >> > > > > That being said, I don't know that the actual commits are required >> to >> > > > have >> > > > > all of a person's commit. I imagined the process being: >> > > > > * The contributors squashes their commits owned by one of them >> > > > > * The others add an empty commit for each of the contributors >> > > > > * The committer then just does a straight pull rather than a >> squashed >> > > > pull. >> > > > > >> > > > > My understanding is that we need to keep a record of who >> contributed >> > to >> > > > > what JIRA not that the actual commits reflect the actual code >> > > contributed >> > > > > (which can be quite difficult to disentangle, I imagine). Please >> > > correct >> > > > > me if I'm wrong here, though. >> > > > > >> > > > > I'm being prescriptive because I'm worried that "do the right >> thing" >> > > will >> > > > > result in us not doing the right thing, sadly. >> > > > > >> > > > > Thoughts? >> > > > > >> > > > > On Fri, Jan 27, 2017 at 6:03 PM, David Lyle <dlyle65...@gmail.com> >> > > > wrote: >> > > > > >> > > > > We may be trying to be too prescriptive, it won't always be >> possible >> > to >> > > > > have a single commit per author (interleaved changes). >> > > > > >> > > > > Maybe >> > > > > >> > > > > If the contribution does not have a single author, please ask that >> > the >> > > > > contribution be squashed by the author into the minimum number of >> > > commits >> > > > > required to preserve correct attribution each following the same >> > > standard >> > > > > naming convention of: JIRA Description (e.g. METRON-123: Foo all >> the >> > > > > Bars). >> > > > > >> > > > > ? >> > > > > >> > > > > I really don't know- the real message is, do the right thing to >> make >> > > sure >> > > > > everyone's contribution is duly recorded. So the commit log would >> be >> > > > > something like >> > > > > Author: 1 METRON-123 Foo all the bars - foo'd some bars >> > > > > Author: 2 METRON-123 Foo all the bars - helped for a better foo >> > > > > Author: 1 METRON-123 Fool all the bars - But, there was this little >> > > deal >> > > > > >> > > > > I think it's hard to cover all the cases, so maybe a statement of >> the >> > > > > principle we're trying to maintain? >> > > > > >> > > > > -D... >> > > > > >> > > > > >> > > > > On Fri, Jan 27, 2017 at 5:50 PM, Casey Stella <ceste...@gmail.com> >> > > > wrote: >> > > > > >> > > > > So, then the wording would be: >> > > > > >> > > > > If the contribution does not have a single author, please ask that >> > the >> > > > > contribution be squashed by the author into one commit per author >> > > > > each following the same standard naming convention of: JIRA >> > > > > Description (e.g. METRON-123: Foo all the Bars). >> > > > > >> > > > > Fair or did I get it wrong? >> > > > > >> > > > > On Fri, Jan 27, 2017 at 5:47 PM, David Lyle <dlyle65...@gmail.com> >> > > > > >> > > > > wrote: >> > > > > >> > > > > >> > > > > The bolded didn't come through, but I'm guessing it's here: >> > > > > >> > > > > *If the contribution does not have a single author, please ask that >> > > > > >> > > > > the >> > > > > >> > > > > contribution be separated into multiple separate pull requests (one >> > > > > >> > > > > per >> > > > > >> > > > > author with their contributions) that may have (non-cyclical) >> > > > > >> > > > > dependencies, >> > > > > >> > > > > each following the same standard naming convention of JIRA: JIRA >> > > > > Description (e.g. METRON-123: Foo all the Bars).* >> > > > > >> > > > > I'd recommend against multiple PRs. Creates all kinds of >> difficulties >> > > > > reviewing and testing. The changeset should be reviewed and tested >> as >> > > > > >> > > > > a >> > > > > >> > > > > single unit. If there is more than one author, simply don't squash. >> > > > > >> > > > > -D... >> > > > > >> > > > > >> > > > > On Fri, Jan 27, 2017 at 5:44 PM, Casey Stella <ceste...@gmail.com> >> > > > > >> > > > > wrote: >> > > > > >> > > > > >> > > > > It is important for every contribution to attribute the author, our >> > > > > >> > > > > git >> > > > > >> > > > > log >> > > > > >> > > > > is a legal papertrail for attribution. As a consequence of that, I >> > > > > recommend the Development Guidelines section 1.2 be amended to the >> > > > > following (bolded change): >> > > > > >> > > > > Everyone is encouraged to review open pull requests. We only ask >> > > > > >> > > > > that >> > > > > >> > > > > you >> > > > > >> > > > > try and think carefully, ask questions and are excellent to one >> > > > > >> > > > > another. >> > > > > >> > > > > Code review is our opportunity to share knowledge, design ideas and >> > > > > >> > > > > make >> > > > > >> > > > > friends. The instructions on how to checkout a patch for review >> > > > > >> > > > > can be >> > > > > >> > > > > found here <https://github.com/nickwallen/metron-commit-stuff>. >> > > > > >> > > > > When reviewing a patch try to keep each of these concepts in mind: >> > > > > >> > > > > >> > > > > >> > > > > - Is the proposed change being made in the correct place? Is it a >> > > > > >> > > > > fix >> > > > > >> > > > > in >> > > > > >> > > > > a backend when it should be in the primitives? In Kafka vs >> > > > > >> > > > > Storm? >> > > > > >> > > > > - What is the change being proposed? Is it based on Community >> > > > > recognized issues? >> > > > > - Do we want this feature or is the bug they’re fixing really a >> > > > > >> > > > > bug? >> > > > > >> > > > > - Does the change do what the author claims? >> > > > > - Are there sufficient tests? >> > > > > - Has it been documented? >> > > > > - Will this change introduce new bugs? >> > > > > - *Does this contribution have a single author?* >> > > > > >> > > > > >> > > > > *If the contribution does not have a single author, please ask that >> > > > > >> > > > > the >> > > > > >> > > > > contribution be separated into multiple separate pull requests (one >> > > > > >> > > > > per >> > > > > >> > > > > author with their contributions) that may have (non-cyclical) >> > > > > >> > > > > dependencies, >> > > > > >> > > > > each following the same standard naming convention of JIRA: JIRA >> > > > > Description (e.g. METRON-123: Foo all the Bars).* >> > > > > >> > > > > Also, please review if the submitter correctly flagged impacts to >> > > > > >> > > > > the >> > > > > >> > > > > following (if exist): >> > > > > >> > > > > - System Configuration Changes >> > > > > - Metron Configuration >> > > > > - Metron Component Configuration (sensors, etc) >> > > > > - Tech Stack Configuration (Storm, Hbase, etc) >> > > > > - Environmental Changes >> > > > > - Helper Shell Scripts >> > > > > - RPM Packaging >> > > > > - Ansible, Ambari, AWS, Docker >> > > > > - Documentation Impacts >> > > > > - Changes to Wiki Documentation >> > > > > - Revisions in Tutorials >> > > > > - Developer Guide >> > > > > - Expansions in readme's >> > > > > - Changes to System Interfaces >> > > > > - Stellar Shell >> > > > > - REST APIs >> > > > > - Etc... >> > > > > >> > > > > >> > > > > Thoughts? Better way to handle this scenario while still >> > > > > >> > > > > maintaining >> > > > > >> > > > > unambiguous authorship? >> > > > > >> > > > > Casey >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > >> > > -- >> > > >> > > Jon >> > > >> > > Sent from my mobile device >> > > >> > >> > > > -- > Nick Allen <n...@nickallen.org> ------------------- Thank you, James Sirota PPMC- Apache Metron (Incubating) jsirota AT apache DOT org