I think the ASF's concern, and the reason it has this IP clearance process, is that the copyright ownership of these files is not clear. The work wasn't done in a fork of a Daffodil repo, and there are contributions from multiple developers, with little public oversight from Daffodil. From the ASF's perspective, this code did not follow the ASF process and is assumed to be owned by the contributors or their companies. There is no license or copyright notice in the code specifying otherwise, so I think the ASF must assume the worst and require a software grant.
Also note that even prototype code has a copyright owner and a license. Copying it into a PR doesn't change that. If you were to throw away this code and start from scratch following the ASF process, then it wouldn't be a problem. But if the plan is to copy prototype code not owned by ASF into a PR, then there are ownership concerns. If all this work was done in a fork of the apache/daffodil-vscode repo from a single contributor, then I think maybe the assumption from ASF is the code was intended to be part of the main repo and implicitly granted to ASF via the PR process.

On 9/9/21 4:05 PM, John Wass wrote:
> Yeah I was thinking of the example repo as a prototype, just as if I was
> working on a feature in my fork of Daffodil. The main project doesn't own
> the feature until it crosses the PR threshold, and once it does cross over
> the state of my fork is of no concern to it.
>
> On Thu, Sep 9, 2021 at 3:54 PM Steve Lawrence <slawre...@apache.org> wrote:
>
>> The concern is that this code was developed outside of Apache and so
>> didn't follow standard Apache process. From the IP clearance page:
>>
>> https://incubator.apache.org/ip-clearance/
>>
>>> Any code that was developed outside of the ASF SVN repository and
>>> our public mailing lists must be processed like this, even if the
>>> external developer is already an ASF committer.
>>
>> I suppose that submitting it as a PR does follow some of that process,
>> but there is maybe less assurance of ownership. Because it was not
>> developed in an ASF repository, that code is presumed to be owned by
>> you, multiple developers, or a company, and so that ownership must be
>> granted to ASF via the IP clearance process, with appropriate software
>> grant, CLA's, etc. (At least, that's my admittedly limited understanding
>> of the process).
>>
>> - Steve
>>
>> On 9/9/21 3:34 PM, John Wass wrote:
>>> Couldn't we (the vscode contributors) submit a series of PRs against the
>>> new repo to move the code, and just archive the example repo as-is?
>>>
>>> I noted some thoughts on that a while back
>>> https://github.com/jw3/example-daffodil-vscode/issues/77
>>>
>>> On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike <mbecke...@owlcyberdefense.com> wrote:
>>>
>>>> I know of one file in the repo which will have to be removed which is the
>>>> jpeg.dfdl.xsd file, which is there just as an example workspace.
>>>>
>>>> The copyright and provisions of that are not compatible with Apache
>>>> licensing.
>>>>
>>>> We can find a DFDL schema that we created that has Apache license to use
>>>> instead.
>>>>
>>>> For the other files under src, server, and build, can we generate a list
>>>> of files identifying which are:
>>>>
>>>> (a) original MIT-licensed, unmodified
>>>> (b) new - can be ASL
>>>> (c) blended - started from MIT-licensed source, modified with
>>>>     daffodil-vscode-specific changes.
>>>>
>>>> It is these blended files that are the problematic ones.
>>>>
>>>> ________________________________
>>>> From: Steve Lawrence <slawre...@apache.org>
>>>> Sent: Thursday, September 9, 2021 1:38 PM
>>>> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>>>> Subject: Re: daffodil-vscode - how to package and identify the
>>>> contribution - some git questions
>>>>
>>>> Correct. For more information about Apache license compatibility:
>>>>
>>>> https://www.apache.org/legal/resolved.html
>>>>
>>>> MIT is Category A and is fine. EPL is Category B and is also okay, but
>>>> generally only in its binary form. So these top-level dependencies look
>>>> okay, assuming their transitive dependencies are also okay.
>>>>
>>>> We'll also need to verify the licenses of all code in the repo.
>>>> Hopefully little of that is original Microsoft MIT and can be granted to
>>>> ASF and relicensed.
>>>>
>>>> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
>>>>> The requirement is that the entire dependency tree (transitively)
>>>>> cannot depend on any software that has an Apache-incompatible (aka
>>>>> restrictive) license.
>>>>>
>>>>> So we need the transitive closure of all dependencies.
>>>>>
>>>>> ________________________________
>>>>> From: Adam Rosien <a...@rosien.net>
>>>>> Sent: Thursday, September 9, 2021 12:44 PM
>>>>> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>>>>> Subject: Re: daffodil-vscode - how to package and identify the
>>>>> contribution - some git questions
>>>>>
>>>>> (I don't understand the requirements of licencing + transitive
>>>>> dependencies, so I'm giving some surface level license info)
>>>>>
>>>>> "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
>>>>> http://logback.qos.ch/license.html
>>>>> "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL 1.0
>>>>> "co.fs2" %% "fs2-io" % "3.0.4" - MIT
>>>>> "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
>>>>> "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
>>>>>
>>>>> On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien <a...@rosien.net> wrote:
>>>>>
>>>>>> I can relay the list of dependencies and their licenses.
>>>>>>
>>>>>> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence <slawre...@apache.org> wrote:
>>>>>>
>>>>>>> I personally don't care too much about having the existing git history
>>>>>>> once it's part of ASF, especially if it makes things any easier (as you
>>>>>>> mention, squash/rebase can be difficult through merges). So I'd say we
>>>>>>> just do plan B--create a tarball of the current state (without the git
>>>>>>> history), and the content of that tarball is what goes through the IP
>>>>>>> clearance process, and is the content of the initial commit when adding
>>>>>>> to the apache/daffodil-vscode repo.
>>>>>>>
>>>>>>> Note that I think the incubator will still want access to the existing
>>>>>>> repo so they can view the full git history. Understanding where
>>>>>>> everything came from and verifying the provenance is important to
>>>>>>> ensuring we have all the appropriate CLA's. So while the tarball is
>>>>>>> maybe what is officially voted on, they will want access to the repo.
>>>>>>>
>>>>>>> That said, I don't think we are going to get CLA's for any Microsoft
>>>>>>> contributed code. So either all Microsoft contributed code will need to
>>>>>>> be kept MIT, or removed from the codebase. And it feels a bit odd to
>>>>>>> grant something to ASF where the original codebase stays MIT and isn't
>>>>>>> part of that grant.
>>>>>>>
>>>>>>> I think understanding how much code still exists that is Microsoft/MIT
>>>>>>> is going to be important to getting this through the IP clearance process.
>>>>>>>
>>>>>>> So I'm curious how much of that original Microsoft code still exists? I
>>>>>>> assume since it was just example code it has mostly been replaced? If
>>>>>>> that's the case, we could potentially say Microsoft has no ownership of
>>>>>>> this code, and so their CLA and MIT license aren't necessary?
>>>>>>>
>>>>>>> We should also have a good understanding of the dependencies. If any of
>>>>>>> them are not compatible with ALv2, then going through this process isn't
>>>>>>> even worth it until they are replaced. Do you have a list of the
>>>>>>> dependencies?
>>>>>>>
>>>>>>> On 9/9/21 11:16 AM, Beckerle, Mike wrote:
>>>>>>>> So the daffodil-vscode code-base wants to be granted to become part of the
>>>>>>>> Daffodil project.
>>>>>>>>
>>>>>>>> One question arises which is "what is the contribution?" exactly.
>>>>>>>>
>>>>>>>> The normal way this is identified is by creating a tarball of the source files
>>>>>>>> and specifying an sha or md5 hash of that file.
>>>>>>>>
>>>>>>>> However, this code base is perhaps different from usual.
>>>>>>>>
>>>>>>>> It started by creating a detached fork of the vscode debugger example code base.
>>>>>>>> This is MIT-Licensed which is a compatible license.
>>>>>>>>
>>>>>>>> The files are then edited. There are around 100 commits on top of the base that
>>>>>>>> came from the vscode debugger repository.
>>>>>>>>
>>>>>>>> So the contribution is that set of 100 commits - the patches/change-sets they
>>>>>>>> represent.
>>>>>>>>
>>>>>>>> These commits often edit the original files of the vscode debugger example to
>>>>>>>> add the daffodil-specific functionality. That is, the contribution material is
>>>>>>>> in several cases intermingled in the lines of the existing files. That's ok I
>>>>>>>> think so long as the modified file had MIT license.
>>>>>>>>
>>>>>>>> There's some value in preserving the 100 commits by our contributors, not
>>>>>>>> squashing it down to one commit, though if it's really not sensible to proceed
>>>>>>>> otherwise, we can choose to squash it down to one commit.
>>>>>>>>
>>>>>>>> Furthermore, the vscode debugger example repo itself had many commits in it. The
>>>>>>>> current daffodil-vscode repo preserves all these commits as well. I don't see
>>>>>>>> value in preserving these commits, and would rather they were squashed into a
>>>>>>>> single "starting point" commit, with a dependencies file specifying the githash
>>>>>>>> where we forked from, just so we can refer back if necessary.
>>>>>>>>
>>>>>>>> So a starting suggestion (subject to discussion of other alternatives) is this:
>>>>>>>>
>>>>>>>> Plan A:
>>>>>>>>
>>>>>>>> 1. squash all commits up to and including the last Microsoft commit, together
>>>>>>>>    into one.
>>>>>>>> 2. rebase the remaining commits on top of that.
>>>>>>>>    1. I'm a bit worried about this rebase.
>>>>>>>>       There are merge commits, etc. in the history. I'm not sure this will
>>>>>>>>       just all rebase while preserving all the commits, but maybe it will
>>>>>>>>       "just work"
>>>>>>>> 3. create a "patch set" corresponding to the 100 or so commits that make up the
>>>>>>>>    "contribution".
>>>>>>>>    1. I don't know if this is even feasible for this many commits.
>>>>>>>> 4. create a tar/zip of this aggregate patch set.
>>>>>>>> 5. compute an md5 of this patch set.
>>>>>>>>
>>>>>>>> The patch set tar/zip file and its md5 hash are "the granted software".
>>>>>>>>
>>>>>>>> The problem with this idea is that there's no obvious way to review a patch set,
>>>>>>>> shy of applying it.
>>>>>>>>
>>>>>>>> A better way may be to change steps 3 - 5 above to
>>>>>>>>
>>>>>>>> Plan B:
>>>>>>>>
>>>>>>>> 3. push the main branch to a new empty git repository
>>>>>>>>    The point of this is to remove all historic stuff from the repository,
>>>>>>>>    i.e., have a minimal git repo that contains only the contribution and the
>>>>>>>>    single other commit it must be based on.
>>>>>>>>
>>>>>>>> 4. create a tarball of this git repository, and md5 hash of it
>>>>>>>>
>>>>>>>> 5. document that the contribution is from githash X (after the first commit)
>>>>>>>>    to githash Y (the final commit) of this repository
>>>>>>>>
>>>>>>>> This has the advantage that the contribution is a self-contained review-able thing.
>>>>>>>>
>>>>>>>> Other ideas are welcome. (Plans C, D, etc) The only requirements I know of are:
>>>>>>>>
>>>>>>>> 1. a single file containing the contribution, and its md5 hash
>>>>>>>> 2. a sensible way one can review the contents of this contribution file
>>>>>>>> 3. preserve history of derivation from the vscode debugger example.
>>>>>>>>
>>>>>>>> Mike Beckerle | Principal Engineer
>>>>>>>>
>>>>>>>> mbecke...@owlcyberdefense.com <mailto:bhum...@owlcyberdefense.com>
>>>>>>>>
>>>>>>>> P +1-781-330-0412