Re: daffodil-vscode - how to package and identify the contribution - some git questions

Beckerle, Mike Thu, 09 Sep 2021 11:11:43 -0700

I know of one file in the repo that will have to be removed: jpeg.dfdl.xsd,
which is there only as an example workspace.


Its copyright and license provisions are not compatible with Apache licensing.

Instead, we can find a DFDL schema that we created ourselves and that is
Apache-licensed.

For the other files under src, server, and build, can we generate a list of 
files identifying which are:

(a) original MIT-licensed, unmodified
(b) new - can be ASL
(c) blended - started from MIT-licensed source, modified with 
daffodil-vscode-specific changes.

It is these blended files that are the problematic ones.
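
One rough way to generate that list is sketched below in Python. It is only a sketch under stated assumptions: the fork point (the last Microsoft commit) and the current main branch have each been exported to a plain directory with no .git folder, and the names hash_tree, classify, and the bucket labels are made up for illustration. A file that was renamed or copied from an MIT-licensed original would land in the "new" bucket, so that bucket still needs a manual look.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def classify(base, head):
    """Bucket every file in `head` by comparing it with the fork-point tree `base`.

    Both arguments are path -> content-hash mappings, e.g. from hash_tree().
    """
    buckets = {"a_unmodified_mit": [], "b_new": [], "c_blended": []}
    for path, digest in sorted(head.items()):
        if path not in base:
            # Not present at the fork point: candidate for (b), new.
            buckets["b_new"].append(path)
        elif base[path] == digest:
            # Byte-identical to the fork point: (a), original and unmodified.
            buckets["a_unmodified_mit"].append(path)
        else:
            # Present at the fork point but changed since: (c), blended.
            buckets["c_blended"].append(path)
    return buckets
```

Calling classify(hash_tree(fork_dir), hash_tree(main_dir)) on the two exported directories (hypothetical variable names) would give the three lists; files deleted since the fork point are ignored because they are not part of the contribution.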



________________________________
From: Steve Lawrence <slawre...@apache.org>
Sent: Thursday, September 9, 2021 1:38 PM
To: dev@daffodil.apache.org <dev@daffodil.apache.org>
Subject: Re: daffodil-vscode - how to package and identify the contribution - 
some git questions

Correct. For more information about Apache license compatibility:

  https://www.apache.org/legal/resolved.html

MIT is Category A and is fine. EPL is Category B and is also okay, but
generally only in its binary form. So these top-level dependencies look
okay, assuming their transitive dependencies are also okay.
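
As a rough illustration of checking the transitive dependencies as well, here is a minimal Python sketch. It assumes the dependency graph and a per-dependency license label have already been collected by some other means; the function names, the license label strings, and the data shapes are all hypothetical. Only the category assignments for MIT, Apache 2.0, and EPL 1.0 come from the ASF policy page above.

```python
from collections import deque

# ASF categories for the licenses named in this thread.
CATEGORY = {"MIT": "A", "Apache-2.0": "A", "EPL-1.0": "B"}


def transitive_closure(graph, roots):
    """Return every dependency reachable from `roots` (roots included)."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        for child in graph.get(queue.popleft(), ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def audit(graph, licenses, roots):
    """Map each reachable dependency to 'A', 'B', or 'unknown'."""
    return {
        dep: CATEGORY.get(licenses.get(dep), "unknown")
        for dep in sorted(transitive_closure(graph, roots))
    }
```

Anything reported as "unknown" needs a manual license check, and anything in Category B should only be included in binary form.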

We'll also need to verify the licenses of all code in the repo.
Hopefully little of that is original Microsoft MIT code, so the code can be
granted to the ASF and relicensed.


On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> The requirement is that the entire dependency tree (transitively) cannot 
> depend on any software that has an Apache-incompatible (aka restrictive) 
> license.
>
> So we need the transitive closure of all dependencies.
>
>
> ________________________________
> From: Adam Rosien <a...@rosien.net>
> Sent: Thursday, September 9, 2021 12:44 PM
> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
> Subject: Re: daffodil-vscode - how to package and identify the contribution - 
> some git questions
>
> (I don't understand the requirements of licencing + transitive
> dependencies, so I'm giving some surface level license info)
>
> "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
> http://logback.qos.ch/license.html
> "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL 1.0
> "co.fs2" %% "fs2-io" % "3.0.4" - MIT
> "com.monovore" %% "decline-effect" % "2.1.0" - Apache 2.0
> "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - Apache 2.0
>
> On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien <a...@rosien.net> wrote:
>
>> I can relay the list of dependencies and their licenses.
>>
>> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence <slawre...@apache.org>
>> wrote:
>>
>>> I personally don't care too much about having the existing git history
>>> once it's part of ASF, especially if it makes things any easier (as you
>>> mention, squash/rebase can be difficult through merges). So I'd say we
>>> just do plan B--create a tarball of the current state (without the git
>>> history), and the content of that tarball is what goes through the IP
>>> clearance process, and is the content of the initial commit when adding
>>> to the apache/daffodil-vscode repo.
>>>
>>> Note that I think the incubator will still want access to the existing
>>> repo so they can view the full git history. Understanding where
>>> everything came from and verifying the provenance is important to
>>> ensuring we have all the appropriate CLAs. So while the tarball is
>>> maybe what is officially voted on, they will want access to the repo.
>>>
>>> That said, I don't think we are going to get CLAs for any Microsoft
>>> contributed code. So either all Microsoft contributed code will need to
>>> be kept MIT, or removed from the codebase. And it feels a bit odd to
>>> grant something to ASF where the original codebase stays MIT and isn't
>>> part of that grant.
>>>
>>> I think understanding how much code still exists that is Microsoft/MIT
>>> is going to be important to getting this through the IP clearance process.
>>>
>>> So I'm curious how much of that original Microsoft code still exists? I
>>> assume since it was just example code it has mostly been replaced? If
>>> that's the case, we could potentially say Microsoft has no ownership of
>>> this code, and so their CLA and MIT license aren't necessary?
>>>
>>> We should also have a good understanding of the dependencies. If any of
>>> them are not compatible with ALv2, then going through this process isn't
>>> even worth it until they are replaced. Do you have a list of the
>>> dependencies?
>>>
>>>
>>> On 9/9/21 11:16 AM, Beckerle, Mike wrote:
>>>> So the daffodil-vscode code-base wants to be granted to become part of
>>>> the Daffodil project.
>>>>
>>>> One question arises which is "what is the contribution?" exactly.
>>>>
>>>> The normal way this is identified is by creating a tarball of the source
>>>> files and specifying a SHA or MD5 hash of that file.
>>>>
>>>> However, this code base is perhaps different from usual.
>>>>
>>>> It started by creating a detached fork of the vscode debugger example
>>>> code base. This is MIT-licensed, which is a compatible license.
>>>>
>>>> The files are then edited. There are around 100 commits on top of the
>>>> base that came from the vscode debugger repository.
>>>>
>>>> So the contribution is that set of 100 commits - the patches/change-sets
>>>> they represent.
>>>>
>>>> These commits often edit the original files of the vscode debugger
>>>> example to add the daffodil-specific functionality. That is, the
>>>> contribution material is in several cases intermingled in the lines of
>>>> the existing files. That's ok I think so long as the modified file had
>>>> MIT license.
>>>>
>>>> There's some value in preserving the 100 commits by our contributors,
>>>> not squashing it down to one commit, though if it's really not sensible
>>>> to proceed otherwise, we can choose to squash it down to one commit.
>>>>
>>>> Furthermore, the vscode debugger example repo itself had many commits in
>>>> it. The current daffodil-vscode repo preserves all these commits as
>>>> well. I don't see value in preserving these commits, and would rather
>>>> they were squashed into a single "starting point" commit, with a
>>>> dependencies file specifying the githash where we forked from, just so
>>>> we can refer back if necessary.
>>>>
>>>> So a starting suggestion (subject to discussion of other alternatives)
>>>> is this:
>>>>
>>>> Plan A:
>>>>
>>>>  1. squash all commits up to and including the last Microsoft commit,
>>>>     together into one.
>>>>  2. rebase the remaining commits on top of that.
>>>>      1. I'm a bit worried about this rebase. There are merge commits,
>>>>         etc. in the history. I'm not sure this will just all rebase
>>>>         while preserving all the commits, but maybe it will "just work"
>>>>  3. create a "patch set" corresponding to the 100 or so commits that
>>>>     make up the "contribution".
>>>>      1. I don't know if this is even feasible for this many commits.
>>>>  4. create a tar/zip of this aggregate patch set.
>>>>  5. compute an md5 of this patch set.
>>>>
>>>> The patch set tar/zip file and its md5 hash are "the granted software".
>>>>
>>>> The problem with this idea is that there's no obvious way to review a
>>>> patch set, shy of applying it.
>>>>
>>>> A better way may be to change steps 3 - 5 above to
>>>>
>>>> Plan B:
>>>>
>>>>     3. push the main branch to a new empty git repository
>>>>          The point of this is to remove all historic stuff from the
>>>>     repository, i.e., have a minimal git repo that contains only the
>>>>     contribution and the single other commit it must be based on.
>>>>
>>>>     4. create a tarball of this git repository, and md5 hash of it
>>>>
>>>>     5. document that the contribution is from githash X (after the
>>>>     first commit) to githash Y (the final commit) of this repository
>>>>
>>>>
>>>> This has the advantage that the contribution is a self-contained
>>>> review-able thing.
>>>>
>>>> Other ideas are welcome. (Plans C, D, etc.) The only requirements I know
>>>> of are:
>>>>
>>>>  1. a single file containing the contribution, and its md5 hash
>>>>  2. a sensible way one can review the contents of this contribution file
>>>>  3. preserve history of derivation from the vscode debugger example.
>>>>
>>>>
>>>> Mike Beckerle | Principal Engineer
>>>>
>>>> mbecke...@owlcyberdefense.com <mailto:bhum...@owlcyberdefense.com>
>>>>
>>>> P +1-781-330-0412
>>>>
>>>
>>>
>
