That's a good question. I have been under the working assumption that we can do that, Nick. I've done it a number of times, so I *hope* we can. Can a mentor clarify?
On Fri, Feb 3, 2017 at 3:27 PM, Nick Allen <n...@nickallen.org> wrote: > One point of clarification. Is there an Apache requirement that all work > *must* be attributed to each author? Or can an author choose to concede > this and allow their work to be committed under someone else's name? > > For example, if I help someone out with a small portion of a larger PR, I'd > prefer to just concede that work to them rather than go through the hassle > of splitting commits. > > On Tue, Jan 31, 2017 at 3:28 PM, Otto Fowler <ottobackwa...@gmail.com> > wrote: > > > Along with the committer guide/steps > > > > > > On January 31, 2017 at 14:03:38, James Sirota (jsir...@apache.org) > wrote: > > > > Is everyone ok with modifying the document with the following? I can make > > the changes and put it up for a vote. > > > > Thanks, > > James > > > > 31.01.2017, 08:46, "Nick Allen" <n...@nickallen.org>: > > > I am in agreement with what I am reading. Something along the lines of > > the > > > following. > > > > > > - Best effort should be made to create a pull request that can be > > > attributed to a single author. > > > - If the work of multiple authors cannot be split into separate pull > > > requests for justifiable reasons, then it is the author's > > responsibility to > > > create a pull request with the minimal number of commits required to > > > provide appropriate attribution to multiple authors. > > > - Each commit should be attributed to a single author and follow the > > > commit message guidelines identifying the issue associated with the > > change. > > > - When merging the pull request, the committer will not squash these > > > commits so that attribution to each author can be maintained in the > > > revision history. > > > > > > On Tue, Jan 31, 2017 at 9:42 AM, Otto Fowler <ottobackwa...@gmail.com> > > > wrote: > > > > > >> Just to be clear - I’m not pushing for this, I was more curious about > > how > > >> to do it if we wanted to. My main interest is that the process we > agree > > to > > >> fosters and encourages collaboration rather than make it such a pain > > that > > >> contributors think twice before helping out on large PR efforts. > > >> > > >> How did we do it for the Storm efforts? > > >> > > >> On January 31, 2017 at 09:21:17, Casey Stella (ceste...@gmail.com) > > wrote: > > >> > > >> To my reading, that script only notes additional authors, it does not > > >> create separate commits for those authors. I feel that we should give > > >> authors the ability to construct pull requests in such a way that > > >> attribution in the form of github statistics are preserved. > > >> > > >> Also, and this is for the mentors, does the zookeeper methodology > abide > > by > > >> the requirement of having the commit log be a place of record for > > >> authorization? > > >> > > >> Casey > > >> > > >> On Tue, Jan 31, 2017 at 9:16 AM, Otto Fowler < > ottobackwa...@gmail.com> > > >> wrote: > > >> > > >> > https://cwiki.apache.org/confluence/display/ZOOKEEPER/ > > >> > Merging+Github+Pull+Requests > > >> > > > >> > > > >> > On January 31, 2017 at 09:04:38, Casey Stella (ceste...@gmail.com) > > >> wrote: > > >> > > > >> > The problem is that a single commit cannot have multiple authors > and > > it's > > >> > difficult to construct a minimal set of commits for multiple > authors > > in > > >> the > > >> > same PR by the committers. I would be in favor of not automating > this > > >> and, > > >> > rather, insisting that the pull request either be split up into > > multiple > > >> > PRs or have the individual PR's commits squashed and named > > appropriately > > >> by > > >> > the authors. > > >> > > > >> > For reference, here's how Apache Apex does this. Lines item 7 and 8 > > under > > >> > "Opening a Pull Request" at https://apex.apache.org/ > > contributing.html > > : > > >> > > > >> > 1. > > >> > > > >> > After all review is complete, combine all new commits into one > > squashed > > >> > commit except when there are multiple contributors, and include the > > Jira > > >> > number in the commit message. There are several ways to squash > > >> > commits, but here > > >> > is one explanation from git-scm.com > > >> > <https://git-scm.com/book/en/v2/Git-Tools-Rewriting- > > >> > History#Squashing-Commits> > > >> > and > > >> > a simple example is illustrated below: > > >> > > > >> > If tracking upstream/master then run git rebase -i. Else run git > > rebase > > >> > -i upstream/master. This command opens the text editor which lists > > the > > >> > multiple commits: > > >> > > > >> > pick 67cd79b change1 > > >> > pick 6f98905 change2 > > >> > > > >> > # Rebase e13748b..3463fbf onto e13748b (2 command(s)) > > >> > # > > >> > # Commands: > > >> > # p, pick = use commit > > >> > # r, reword = use commit, but edit the commit message > > >> > # e, edit = use commit, but stop for amending > > >> > # s, squash = use commit, but meld into previous commit > > >> > # f, fixup = like "squash", but discard this commit's log message > > >> > # x, exec = run command (the rest of the line) using shell > > >> > # > > >> > # These lines can be re-ordered; they are executed from top to > > bottom. > > >> > > > >> > Squash 'change2' to 'change1' and save. > > >> > > > >> > pick 67cd79b change1 > > >> > squash 6f98905 change2 > > >> > > > >> > 2. If there are multiple contributors in a pull request preserve > > >> > individual attributions. Try to squash the commits to the minimum > > number > > >> of > > >> > commits required to preserve attribution and the contribution to > > still be > > >> > functionally correct. > > >> > > > >> > Their language is similar to the ones proposed by Dave and I like > it. > > >> This > > >> > is, I consider, a special case and we should avoid trying to > automate > > it, > > >> > but rather delegate it to meatspace. Thoughts? > > >> > > > >> > On Tue, Jan 31, 2017 at 5:58 AM, zeo...@gmail.com < > zeo...@gmail.com> > > >> > wrote: > > >> > > > >> > > I thought the idea would be to maintain the commit mapping to > > >> individuals > > >> > > so things like the GitHub statistics are accurate regarding # of > > >> > commits, # > > >> > > of lines affected +/-, # of contributors, etc. > > >> > > > > >> > > Jon > > >> > > > > >> > > On Tue, Jan 31, 2017, 12:20 AM Otto Fowler < > > ottobackwa...@gmail.com> > > > > >> > > wrote: > > >> > > > > >> > > > I mean multi-author of course > > >> > > > > > >> > > > > > >> > > > On January 31, 2017 at 00:19:17, Otto Fowler ( > > >> ottobackwa...@gmail.com) > > >> > > > wrote: > > >> > > > > > >> > > > So, if we use a script like this…. > > >> > > > > > >> > > > import sys > > >> > > > import requests > > >> > > > import json > > >> > > > from sets import Set > > >> > > > > > >> > > > if len(sys.argv) == 1: > > >> > > > print "Must pass in pull request number" > > >> > > > exit(1) > > >> > > > > > >> > > > next_url = 'https://api.github.com/repos/ > > >> > apache/incubator-metron/pulls/' > > >> > > > + sys.argv[1] + '/commits' > > >> > > > authors = Set() > > >> > > > while next_url: > > >> > > > response = requests.get(next_url) > > >> > > > if response.status_code == 200: > > >> > > > m = json.loads(response.content) > > >> > > > for c in m: > > >> > > > authors.add(c["commit"]["author"]["name"] + " <" + > > >> > > > c["commit"]["author"]["email"] + ">") > > >> > > > if 'next' in response.links: > > >> > > > next_url = response.links['next']['url'] > > >> > > > else: > > >> > > > next_url = '' > > >> > > > > > >> > > > for author in authors: > > >> > > > print author > > >> > > > > > >> > > > > > >> > > > and call it from our bash scripts ( this is a test script, > would > > need > > >> > > error > > >> > > > check in the real thing ) > > >> > > > > > >> > > > 1 #!/bin/local/bin bash > > >> > > > > > >> > > > 2 > > >> > > > > > >> > > > 3 my_array=() > > >> > > > > > >> > > > 4 while IFS= read -r line; do > > >> > > > > > >> > > > 5 my_array+=( "$line" ) > > >> > > > > > >> > > > 6 done < <( python test.py 316 ) > > >> > > > > > >> > > > 7 > > >> > > > > > >> > > > 8 for element in "${my_array[@]}" > > >> > > > > > >> > > > 9 do > > >> > > > > > >> > > > 10 echo "${element}" > > >> > > > > > >> > > > 11 done > > >> > > > > > >> > > > I get the following output for a multi commit PR like #316 > > >> > > > > > >> > > > *➜* *ottofowler**@**Winterfell * *~/src/github/forks * bash > > test.sh > > >> > > > > > >> > > > JJ <jjmey...@gmail.com> > > >> > > > > > >> > > > merrimanr <merrim...@gmail.com> > > >> > > > > > >> > > > rmerriman <rmerri...@hortonworks.com> > > >> > > > > > >> > > > > > >> > > > Would putting that information in the commit message be good > > enough? > > >> > > The > > >> > > > idea that we integrate this with Nick’s scripts. > > >> > > > > > >> > > > > > >> > > > On January 30, 2017 at 07:00:35, Otto Fowler ( > > >> ottobackwa...@gmail.com) > > >> > > > wrote: > > >> > > > > > >> > > > Maybe using something like this? > > >> > > > https://developer.github.com/v3/pulls/#list-commits-on-a- > > >> pull-request > > >> > > > > > >> > > > > > >> > > > On January 28, 2017 at 09:52:16, Otto Fowler ( > > >> ottobackwa...@gmail.com) > > >> > > > wrote: > > >> > > > > > >> > > > Would it be possible to get all the authors out of the pr > before > > >> > > squashing > > >> > > > and change out commit messages? Maybe using an extended version > > of > > >> > > nick's > > >> > > > scripts? > > >> > > > > > >> > > > On January 27, 2017 at 19:02:54, Casey Stella ( > > ceste...@gmail.com) > > > > >> > > wrote: > > >> > > > > > >> > > > > Having thought about it a minute longer, maybe you're right, > > Dave. > > >> I > > >> > > like > > >> > > > > this. > > >> > > > > > > >> > > > > I'd add one sentence more: > > >> > > > > If the contribution does not have a single author, please ask > > that > > >> > the > > >> > > > > contribution be squashed by the author into the minimum > number > > of > > >> > > commits > > >> > > > > required to preserve correct attribution each following the > > same > > >> > > standard > > >> > > > > naming convention of: JIRA Description (e.g. METRON-123: Foo > > all > > >> the > > >> > > > Bars). > > >> > > > > The committer should not squash commits in this instance so > > that > > >> > > > ownership > > >> > > > > is maintained. > > >> > > > > > > >> > > > > On Fri, Jan 27, 2017 at 6:55 PM, Casey Stella < > > ceste...@gmail.com> > > >> > > > wrote: > > >> > > > > > > >> > > > > Yeah, this is sort of the reason I was going for multiple PRs > > >> rather > > >> > > than > > >> > > > > having contributors do their own squashing and commits. I > > recognize > > >> > > that > > >> > > > > multiple PRs are difficult to review. > > >> > > > > > > >> > > > > That being said, I don't know that the actual commits are > > required > > >> to > > >> > > > have > > >> > > > > all of a person's commit. I imagined the process being: > > >> > > > > * The contributors squashes their commits owned by one of > them > > >> > > > > * The others add an empty commit for each of the contributors > > >> > > > > * The committer then just does a straight pull rather than a > > >> squashed > > >> > > > pull. > > >> > > > > > > >> > > > > My understanding is that we need to keep a record of who > > >> contributed > > >> > to > > >> > > > > what JIRA not that the actual commits reflect the actual code > > >> > > contributed > > >> > > > > (which can be quite difficult to disentangle, I imagine). > > Please > > >> > > correct > > >> > > > > me if I'm wrong here, though. > > >> > > > > > > >> > > > > I'm being prescriptive because I'm worried that "do the right > > >> thing" > > >> > > will > > >> > > > > result in us not doing the right thing, sadly. > > >> > > > > > > >> > > > > Thoughts? > > >> > > > > > > >> > > > > On Fri, Jan 27, 2017 at 6:03 PM, David Lyle < > > dlyle65...@gmail.com> > > >> > > > wrote: > > >> > > > > > > >> > > > > We may be trying to be too prescriptive, it won't always be > > >> possible > > >> > to > > >> > > > > have a single commit per author (interleaved changes). > > >> > > > > > > >> > > > > Maybe > > >> > > > > > > >> > > > > If the contribution does not have a single author, please ask > > that > > >> > the > > >> > > > > contribution be squashed by the author into the minimum > number > > of > > >> > > commits > > >> > > > > required to preserve correct attribution each following the > > same > > >> > > standard > > >> > > > > naming convention of: JIRA Description (e.g. METRON-123: Foo > > all > > >> the > > >> > > > > Bars). > > >> > > > > > > >> > > > > ? > > >> > > > > > > >> > > > > I really don't know- the real message is, do the right thing > to > > >> make > > >> > > sure > > >> > > > > everyone's contribution is duly recorded. So the commit log > > would > > >> be > > >> > > > > something like > > >> > > > > Author: 1 METRON-123 Foo all the bars - foo'd some bars > > >> > > > > Author: 2 METRON-123 Foo all the bars - helped for a better > foo > > >> > > > > Author: 1 METRON-123 Fool all the bars - But, there was this > > little > > >> > > deal > > >> > > > > > > >> > > > > I think it's hard to cover all the cases, so maybe a > statement > > of > > >> the > > >> > > > > principle we're trying to maintain? > > >> > > > > > > >> > > > > -D... > > >> > > > > > > >> > > > > > > >> > > > > On Fri, Jan 27, 2017 at 5:50 PM, Casey Stella < > > ceste...@gmail.com> > > >> > > > wrote: > > >> > > > > > > >> > > > > So, then the wording would be: > > >> > > > > > > >> > > > > If the contribution does not have a single author, please ask > > that > > >> > the > > >> > > > > contribution be squashed by the author into one commit per > > author > > >> > > > > each following the same standard naming convention of: JIRA > > >> > > > > Description (e.g. METRON-123: Foo all the Bars). > > >> > > > > > > >> > > > > Fair or did I get it wrong? > > >> > > > > > > >> > > > > On Fri, Jan 27, 2017 at 5:47 PM, David Lyle < > > dlyle65...@gmail.com> > > >> > > > > > > >> > > > > wrote: > > >> > > > > > > >> > > > > > > >> > > > > The bolded didn't come through, but I'm guessing it's here: > > >> > > > > > > >> > > > > *If the contribution does not have a single author, please > ask > > that > > >> > > > > > > >> > > > > the > > >> > > > > > > >> > > > > contribution be separated into multiple separate pull > requests > > (one > > >> > > > > > > >> > > > > per > > >> > > > > > > >> > > > > author with their contributions) that may have (non-cyclical) > > >> > > > > > > >> > > > > dependencies, > > >> > > > > > > >> > > > > each following the same standard naming convention of JIRA: > > JIRA > > >> > > > > Description (e.g. METRON-123: Foo all the Bars).* > > >> > > > > > > >> > > > > I'd recommend against multiple PRs. Creates all kinds of > > >> difficulties > > >> > > > > reviewing and testing. The changeset should be reviewed and > > tested > > >> as > > >> > > > > > > >> > > > > a > > >> > > > > > > >> > > > > single unit. If there is more than one author, simply don't > > squash. > > >> > > > > > > >> > > > > -D... > > >> > > > > > > >> > > > > > > >> > > > > On Fri, Jan 27, 2017 at 5:44 PM, Casey Stella < > > ceste...@gmail.com> > > >> > > > > > > >> > > > > wrote: > > >> > > > > > > >> > > > > > > >> > > > > It is important for every contribution to attribute the > author, > > our > > >> > > > > > > >> > > > > git > > >> > > > > > > >> > > > > log > > >> > > > > > > >> > > > > is a legal papertrail for attribution. As a consequence of > > that, I > > >> > > > > recommend the Development Guidelines section 1.2 be amended > to > > the > > >> > > > > following (bolded change): > > >> > > > > > > >> > > > > Everyone is encouraged to review open pull requests. We only > > ask > > >> > > > > > > >> > > > > that > > >> > > > > > > >> > > > > you > > >> > > > > > > >> > > > > try and think carefully, ask questions and are excellent to > one > > >> > > > > > > >> > > > > another. > > >> > > > > > > >> > > > > Code review is our opportunity to share knowledge, design > ideas > > and > > >> > > > > > > >> > > > > make > > >> > > > > > > >> > > > > friends. The instructions on how to checkout a patch for > review > > >> > > > > > > >> > > > > can be > > >> > > > > > > >> > > > > found here <https://github.com/ > nickwallen/metron-commit-stuff > > >. > > >> > > > > > > >> > > > > When reviewing a patch try to keep each of these concepts in > > mind: > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > - Is the proposed change being made in the correct place? Is > it > > a > > >> > > > > > > >> > > > > fix > > >> > > > > > > >> > > > > in > > >> > > > > > > >> > > > > a backend when it should be in the primitives? In Kafka vs > > >> > > > > > > >> > > > > Storm? > > >> > > > > > > >> > > > > - What is the change being proposed? Is it based on Community > > >> > > > > recognized issues? > > >> > > > > - Do we want this feature or is the bug they’re fixing > really a > > >> > > > > > > >> > > > > bug? > > >> > > > > > > >> > > > > - Does the change do what the author claims? > > >> > > > > - Are there sufficient tests? > > >> > > > > - Has it been documented? > > >> > > > > - Will this change introduce new bugs? > > >> > > > > - *Does this contribution have a single author?* > > >> > > > > > > >> > > > > > > >> > > > > *If the contribution does not have a single author, please > ask > > that > > >> > > > > > > >> > > > > the > > >> > > > > > > >> > > > > contribution be separated into multiple separate pull > requests > > (one > > >> > > > > > > >> > > > > per > > >> > > > > > > >> > > > > author with their contributions) that may have (non-cyclical) > > >> > > > > > > >> > > > > dependencies, > > >> > > > > > > >> > > > > each following the same standard naming convention of JIRA: > > JIRA > > >> > > > > Description (e.g. METRON-123: Foo all the Bars).* > > >> > > > > > > >> > > > > Also, please review if the submitter correctly flagged > impacts > > to > > >> > > > > > > >> > > > > the > > >> > > > > > > >> > > > > following (if exist): > > >> > > > > > > >> > > > > - System Configuration Changes > > >> > > > > - Metron Configuration > > >> > > > > - Metron Component Configuration (sensors, etc) > > >> > > > > - Tech Stack Configuration (Storm, Hbase, etc) > > >> > > > > - Environmental Changes > > >> > > > > - Helper Shell Scripts > > >> > > > > - RPM Packaging > > >> > > > > - Ansible, Ambari, AWS, Docker > > >> > > > > - Documentation Impacts > > >> > > > > - Changes to Wiki Documentation > > >> > > > > - Revisions in Tutorials > > >> > > > > - Developer Guide > > >> > > > > - Expansions in readme's > > >> > > > > - Changes to System Interfaces > > >> > > > > - Stellar Shell > > >> > > > > - REST APIs > > >> > > > > - Etc... > > >> > > > > > > >> > > > > > > >> > > > > Thoughts? Better way to handle this scenario while still > > >> > > > > > > >> > > > > maintaining > > >> > > > > > > >> > > > > unambiguous authorship? > > >> > > > > > > >> > > > > Casey > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > >> > > -- > > >> > > > > >> > > Jon > > >> > > > > >> > > Sent from my mobile device > > >> > > > > >> > > > >> > > > > > > > -- > > > Nick Allen <n...@nickallen.org> > > > > ------------------- > > Thank you, > > > > James Sirota > > PPMC- Apache Metron (Incubating) > > jsirota AT apache DOT org > > >