Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread John Wass
I had a few more source files (6) showing as modified:

extension.ts
debugAdapter.ts
daffodilRuntime.ts
daffodilDebug.ts
adapter.test.ts
activateDaffodilDebug.ts

> It would seem an IDE (probably vscode!) decided to restyle/reindent this
> code.

We added opinionated code formatting... apparently trying to make this
process as hard as possible :/

That reformat commit was made on 08/25/2021; the PR was titled "Prettier".
Looking at the history prior to that commit might give a better idea of what
actually changed.


> squash/rebase can be difficult through merges

Here is a quick pass at (1) squashing the MS source into a single commit, (2)
placing that commit on top of an init commit in a repo, and (3) rewriting our
commits on top of all of that.

It preserves our authorship.  Can be cleaned up a little bit still but I am
not going to put time into it if we don't want this.  I just wanted to note
how it could look.

https://github.com/jw3/rewrite-daffodil-vscode-1

One issue I could see here is the linking of the example repo PR IDs in the
commit messages will conflict once we start adding PRs in the new repo.
Now would be the time to rewrite these commit messages and strip/modify
those #ID tags.
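One way the #ID rewrite could look, as a minimal sketch only: it assumes the old PR/issue numbers should be kept but qualified with the example repo's slug (taken from the issue URL quoted in this thread), so they no longer collide with numbers in the new repo. The function name and the choice to qualify rather than strip are assumptions, and wiring it into a history-rewriting tool is left out.

```python
import re

# Assumption: slug of the example repo, as it appears in the issue URL in this thread.
OLD_REPO = "jw3/example-daffodil-vscode"

# A bare "#123" that is not already written as "owner/repo#123"
# and is not glued to the end of another word.
_BARE_REF = re.compile(r"(?<![\w/])#(\d+)\b")


def qualify_refs(message: str, repo: str = OLD_REPO) -> str:
    """Rewrite bare '#N' references in a commit message to 'repo#N'."""
    return _BARE_REF.sub(lambda m: f"{repo}#{m.group(1)}", message)
```

Stripping the tags instead would only need the replacement changed to an empty string; qualifying them keeps the link back to the archived example repo.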

Thoughts on that rewrite repo?





On Thu, Sep 9, 2021 at 5:42 PM Beckerle, Mike wrote:

> So via some git trickery I was able to determine the "blended" files.
>
> I'm ignoring the various configuration files which are generally json
> files.
>
> Of the ".ts" files only 3 are blended:
>
> src/debugAdapter.ts - 72 lines - only maybe 6 lines are different
> src/extension.ts - 179 lines
> src/tests/adapter.test.ts - 137 lines (50 of which are commented-out code)
>
> The deltas between these files and the original files of the same name are
> larger than expected due to changes in whitespace, and removal of ";" at
> end of line (which I guess is optional in many places in TypeScript).
>
> It would seem an IDE (probably vscode!) decided to restyle/reindent this
> code.
>
> So it's a bit hard to figure out what the "real" deltas are.
>
> src/debugAdapter.ts appears to be only trivially different. The name
> MockDebugSession was replaced by DaffodilDebugSession, and "./mockDebug"
> was changed to "./daffodilDebug".
>
> The other two files do appear to be where all the real blended code is.
>
>
>
> 
> From: Beckerle, Mike 
> Sent: Thursday, September 9, 2021 4:21 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
>
> Whether it's a PR or series of PRs, or a software grant, that still
> doesn't resolve the issue of the blended files which are part MIT-licensed
> original code, and part new code deltas by the daffodil-vscode contributors.
>
> We need to understand whether those blended files can be teased apart
> somehow so that it is clear going forward what is an MIT-licensed library
> and what is Apache Licensed.
>
> I just did a grep -R -i microsoft  in a clone of the
> openwhisk-vscode-extension and got zero hits. So no files still carry
> microsoft copyright and in fact their NOTICES.txt file does not indicate
> any dependency on MIT-licensed code at all.  So I think
> openwhisk-vscode-extension is not going to help us figure out how to surf
> this issue.
>
>
> 
> From: Steve Lawrence 
> Sent: Thursday, September 9, 2021 3:54 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
>
> The concern is that this code was developed outside of Apache and so
> didn't follow standard Apache process. From the IP clearance page:
>
> https://incubator.apache.org/ip-clearance/
>
> > Any code that was developed outside of the ASF SVN repository and
> > our public mailing lists must be processed like this, even if the
> > external developer is already an ASF committer.
>
> I suppose that submitting it as a PR does follow some of that process,
> but there is maybe less assurance of ownership. Because it was not
> developed in an ASF repository, that code is presumed to be owned by
> you, multiple developers, or a company, and so that ownership must be
> granted to ASF via the IP clearance process, with appropriate software
> grant, CLA's, etc. (At least, that's my admittedly limited understanding
> of the process).
>
> - Steve
>
>
> On 9/9/21 3:34 PM, John Wass wrote:
> > Couldn't we (the vscode contributors) submit a series of PRs against the
> > new repo to move the code, and just archive the example repo as-is?
> >
> > I noted some thoughts on that a while back
> > https://github.com/jw3/example-daffodil-vscode/issues/77
> >
> >
> >
> > On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike <
> mbecke...@owlcyberdefense.com>
> > wrote:
> >
> >> I know of one file in the repo which will have to be removed which is
> the
> >> jpeg.dfdl.xsd file, which is there just as an example workspace.
> >>
> >> The copyright and provisions of that are 

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Beckerle, Mike
So via some git trickery I was able to determine the "blended" files.

I'm ignoring the various configuration files which are generally json files.

Of the ".ts" files only 3 are blended:

src/debugAdapter.ts - 72 lines - only maybe 6 lines are different
src/extension.ts - 179 lines
src/tests/adapter.test.ts - 137 lines (50 of which are commented-out code)

The deltas between these files and the original files of the same name are
larger than expected due to changes in whitespace, and removal of ";" at end of
line (which I guess is optional in many places in TypeScript).

It would seem an IDE (probably vscode!) decided to restyle/reindent this code.

So it's a bit hard to figure out what the "real" deltas are.
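A rough way to see past the restyling, sketched here under the assumption that the reformat only touched indentation, trailing whitespace, blank lines, and end-of-line ";" (other changes a formatter might make, such as quote style or line wrapping, are not handled). The helper names are made up for illustration.

```python
import difflib


def normalize(source: str) -> list[str]:
    """Drop indentation, trailing whitespace, a trailing ';', and blank lines."""
    lines = []
    for line in source.splitlines():
        stripped = line.strip().rstrip(";").rstrip()
        if stripped:
            lines.append(stripped)
    return lines


def real_delta(original: str, modified: str) -> list[str]:
    """Return only the added/removed lines that survive normalization."""
    diff = list(
        difflib.unified_diff(normalize(original), normalize(modified), lineterm="", n=0)
    )
    # The first two entries are the '---' / '+++' file headers; hunk headers start with '@@'.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

Feeding it the upstream and current text of one file should leave just the lines whose content changed, which is the part that matters for the licensing question.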

src/debugAdapter.ts appears to be only trivially different. The name 
MockDebugSession was replaced by DaffodilDebugSession, and "./mockDebug" was 
changed to "./daffodilDebug".

The other two files do appear to be where all the real blended code is.




From: Beckerle, Mike 
Sent: Thursday, September 9, 2021 4:21 PM
To: dev@daffodil.apache.org 
Subject: Re: daffodil-vscode - how to package and identify the contribution - 
some git questions

Whether it's a PR or series of PRs, or a software grant, that still doesn't 
resolve the issue of the blended files which are part MIT-licensed original 
code, and part new code deltas by the daffodil-vscode contributors.

We need to understand whether those blended files can be teased apart somehow 
so that it is clear going forward what is an MIT-licensed library and what is 
Apache Licensed.

I just ran grep -R -i microsoft in a clone of the openwhisk-vscode-extension
and got zero hits. So no files still carry a Microsoft copyright, and in fact
their NOTICES.txt file does not indicate any dependency on MIT-licensed code at
all. So I think openwhisk-vscode-extension is not going to help us figure out
how to surf this issue.



From: Steve Lawrence 
Sent: Thursday, September 9, 2021 3:54 PM
To: dev@daffodil.apache.org 
Subject: Re: daffodil-vscode - how to package and identify the contribution - 
some git questions

The concern is that this code was developed outside of Apache and so
didn't follow standard Apache process. From the IP clearance page:

https://incubator.apache.org/ip-clearance/

> Any code that was developed outside of the ASF SVN repository and
> our public mailing lists must be processed like this, even if the
> external developer is already an ASF committer.

I suppose that submitting it as a PR does follow some of that process,
but there is maybe less assurance of ownership. Because it was not
developed in an ASF repository, that code is presumed to be owned by
you, multiple developers, or a company, and so that ownership must be
granted to ASF via the IP clearance process, with appropriate software
grant, CLA's, etc. (At least, that's my admittedly limited understanding
of the process).

- Steve


On 9/9/21 3:34 PM, John Wass wrote:
> Couldn't we (the vscode contributors) submit a series of PRs against the
> new repo to move the code, and just archive the example repo as-is?
>
> I noted some thoughts on that a while back
> https://github.com/jw3/example-daffodil-vscode/issues/77
>
>
>
> On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike 
> wrote:
>
>> I know of one file in the repo which will have to be removed which is the
>> jpeg.dfdl.xsd file, which is there just as an example workspace.
>>
>> The copyright and provisions of that are not compatible with Apache
>> licensing.
>>
>> We can find a DFDL schema that we created that has Apache license to use
>> instead.
>>
>> For the other files under src, server, and build, can we generate a list
>> of files identifying which are:
>>
>> (a) original MIT-licensed, unmodified
>> (b) new - can be ASL
>> (c) blended - started from MIT-licensed source, modified with
>> daffodil-vscode-specific changes.
>>
>> It is these blended files that are the problematic ones.
>>
>>
>>
>> 
>> From: Steve Lawrence 
>> Sent: Thursday, September 9, 2021 1:38 PM
>> To: dev@daffodil.apache.org 
>> Subject: Re: daffodil-vscode - how to package and identify the
>> contribution - some git questions
>>
>> Correct. For more information about Apache license compatibility:
>>
>>   https://www.apache.org/legal/resolved.html
>>
>> MIT is Category A and is fine. EPL is Category B and is also okay, but
>> generally only in its binary form. So these top-level dependencies look
>> okay, assuming their transitive dependencies are also okay.
>>
>> We'll also need to verify the licenses of all code in the repo.
>> Hopefully little of that is original microsoft MIT and can be granted to
>> ASF and relicensed.
>>
>>
>> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
>>> The requirement, is that the entire dependency tree (transitively)
>> cannot depend on any software that has an Apache-incompatible (aka
>> restrictive) license.
>>>
>>> So we 

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Beckerle, Mike
Whether it's a PR or series of PRs, or a software grant, that still doesn't 
resolve the issue of the blended files which are part MIT-licensed original 
code, and part new code deltas by the daffodil-vscode contributors.

We need to understand whether those blended files can be teased apart somehow 
so that it is clear going forward what is an MIT-licensed library and what is 
Apache Licensed.

I just ran grep -R -i microsoft in a clone of the openwhisk-vscode-extension
and got zero hits. So no files still carry a Microsoft copyright, and in fact
their NOTICES.txt file does not indicate any dependency on MIT-licensed code at
all. So I think openwhisk-vscode-extension is not going to help us figure out
how to surf this issue.
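For running the same check against other clones (daffodil-vscode itself, say), here is a small sketch of a case-insensitive scan that returns the matching file names rather than the matching lines. It is only an approximation of the grep above: the function name is invented, and like grep -R it walks everything under the root, including any .git directory.

```python
from pathlib import Path


def files_mentioning(root: str, needle: str = "microsoft") -> list[str]:
    """List files under root whose text contains needle, ignoring case."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            # Undecodable bytes are skipped so binary files do not abort the scan.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue
        if needle.lower() in text.lower():
            hits.append(str(path.relative_to(root)))
    return hits
```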



From: Steve Lawrence 
Sent: Thursday, September 9, 2021 3:54 PM
To: dev@daffodil.apache.org 
Subject: Re: daffodil-vscode - how to package and identify the contribution - 
some git questions

The concern is that this code was developed outside of Apache and so
didn't follow standard Apache process. From the IP clearance page:

https://incubator.apache.org/ip-clearance/

> Any code that was developed outside of the ASF SVN repository and
> our public mailing lists must be processed like this, even if the
> external developer is already an ASF committer.

I suppose that submitting it as a PR does follow some of that process,
but there is maybe less assurance of ownership. Because it was not
developed in an ASF repository, that code is presumed to be owned by
you, multiple developers, or a company, and so that ownership must be
granted to ASF via the IP clearance process, with appropriate software
grant, CLA's, etc. (At least, that's my admittedly limited understanding
of the process).

- Steve


On 9/9/21 3:34 PM, John Wass wrote:
> Couldn't we (the vscode contributors) submit a series of PRs against the
> new repo to move the code, and just archive the example repo as-is?
>
> I noted some thoughts on that a while back
> https://github.com/jw3/example-daffodil-vscode/issues/77
>
>
>
> On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike 
> wrote:
>
>> I know of one file in the repo which will have to be removed which is the
>> jpeg.dfdl.xsd file, which is there just as an example workspace.
>>
>> The copyright and provisions of that are not compatible with Apache
>> licensing.
>>
>> We can find a DFDL schema that we created that has Apache license to use
>> instead.
>>
>> For the other files under src, server, and build, can we generate a list
>> of files identifying which are:
>>
>> (a) original MIT-licensed, unmodified
>> (b) new - can be ASL
>> (c) blended - started from MIT-licensed source, modified with
>> daffodil-vscode-specific changes.
>>
>> It is these blended files that are the problematic ones.
>>
>>
>>
>> 
>> From: Steve Lawrence 
>> Sent: Thursday, September 9, 2021 1:38 PM
>> To: dev@daffodil.apache.org 
>> Subject: Re: daffodil-vscode - how to package and identify the
>> contribution - some git questions
>>
>> Correct. For more information about Apache license compatibility:
>>
>>   https://www.apache.org/legal/resolved.html
>>
>> MIT is Category A and is fine. EPL is Category B and is also okay, but
>> generally only in its binary form. So these top-level dependencies look
>> okay, assuming their transitive dependencies are also okay.
>>
>> We'll also need to verify the licenses of all code in the repo.
>> Hopefully little of that is original microsoft MIT and can be granted to
>> ASF and relicensed.
>>
>>
>> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
>>> The requirement is that the entire dependency tree (transitively)
>>> cannot depend on any software that has an Apache-incompatible (aka
>>> restrictive) license.
>>>
>>> So we need the transitive closure of all dependencies.
>>>
>>>
>>> 
>>> From: Adam Rosien 
>>> Sent: Thursday, September 9, 2021 12:44 PM
>>> To: dev@daffodil.apache.org 
>>> Subject: Re: daffodil-vscode - how to package and identify the
>> contribution - some git questions
>>>
>>> (I don't understand the requirements of licencing + transitive
>>> dependencies, so I'm giving some surface level license info)
>>>
>>> "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
>>> http://logback.qos.ch/license.html
>>> "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL
>> 1.0
>>> "co.fs2" %% "fs2-io" % "3.0.4" - MIT
>>> "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
>>> "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
>>>
>>> On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:
>>>
>>>> I can relay the list of dependencies and their licenses.
>>>>
>>>> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence wrote:
>>>>
>>>>> I personally don't care too much about having the existing git history
>>>>> once it's part of ASF, especially if it makes things any easier (as you
>>>>> mention, squash/rebase can be difficult through merges). So I'd 

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread John Wass
Yeah I was thinking of the example repo as a prototype, just as if I was
working on a feature in my fork of Daffodil.  The main project doesn't own
the feature until it crosses the PR threshold, and once it does cross over
the state of my fork is of no concern to it.



On Thu, Sep 9, 2021 at 3:54 PM Steve Lawrence  wrote:

> The concern is that this code was developed outside of Apache and so
> didn't follow standard Apache process. From the IP clearance page:
>
> https://incubator.apache.org/ip-clearance/
>
> > Any code that was developed outside of the ASF SVN repository and
> > our public mailing lists must be processed like this, even if the
> > external developer is already an ASF committer.
>
> I suppose that submitting it as a PR does follow some of that process,
> but there is maybe less assurance of ownership. Because it was not
> developed in an ASF repository, that code is presumed to be owned by
> you, multiple developers, or a company, and so that ownership must be
> granted to ASF via the IP clearance process, with appropriate software
> grant, CLA's, etc. (At least, that's my admittedly limited understanding
> of the process).
>
> - Steve
>
>
> On 9/9/21 3:34 PM, John Wass wrote:
> > Couldn't we (the vscode contributors) submit a series of PRs against the
> > new repo to move the code, and just archive the example repo as-is?
> >
> > I noted some thoughts on that a while back
> > https://github.com/jw3/example-daffodil-vscode/issues/77
> >
> >
> >
> > On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike <
> mbecke...@owlcyberdefense.com>
> > wrote:
> >
> >> I know of one file in the repo which will have to be removed which is
> the
> >> jpeg.dfdl.xsd file, which is there just as an example workspace.
> >>
> >> The copyright and provisions of that are not compatible with Apache
> >> licensing.
> >>
> >> We can find a DFDL schema that we created that has Apache license to use
> >> instead.
> >>
> >> For the other files under src, server, and build, can we generate a list
> >> of files identifying which are:
> >>
> >> (a) original MIT-licensed, unmodified
> >> (b) new - can be ASL
> >> (c) blended - started from MIT-licensed source, modified with
> >> daffodil-vscode-specific changes.
> >>
> >> It is these blended files that are the problematic ones.
> >>
> >>
> >>
> >> 
> >> From: Steve Lawrence 
> >> Sent: Thursday, September 9, 2021 1:38 PM
> >> To: dev@daffodil.apache.org 
> >> Subject: Re: daffodil-vscode - how to package and identify the
> >> contribution - some git questions
> >>
> >> Correct. For more information about Apache license compatibility:
> >>
> >>   https://www.apache.org/legal/resolved.html
> >>
> >> MIT is Category A and is fine. EPL is Category B and is also okay, but
> >> generally only in its binary form. So these top-level dependencies look
> >> okay, assuming their transitive dependencies are also okay.
> >>
> >> We'll also need to verify the licenses of all code in the repo.
> >> Hopefully little of that is original microsoft MIT and can be granted to
> >> ASF and relicensed.
> >>
> >>
> >> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> >>> The requirement is that the entire dependency tree (transitively)
> >>> cannot depend on any software that has an Apache-incompatible (aka
> >>> restrictive) license.
> >>>
> >>> So we need the transitive closure of all dependencies.
> >>>
> >>>
> >>> 
> >>> From: Adam Rosien 
> >>> Sent: Thursday, September 9, 2021 12:44 PM
> >>> To: dev@daffodil.apache.org 
> >>> Subject: Re: daffodil-vscode - how to package and identify the
> >> contribution - some git questions
> >>>
> >>> (I don't understand the requirements of licencing + transitive
> >>> dependencies, so I'm giving some surface level license info)
> >>>
> >>> "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
> >>> http://logback.qos.ch/license.html
> >>> "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL
> >> 1.0
> >>> "co.fs2" %% "fs2-io" % "3.0.4" - MIT
> >>> "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
> >>> "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
> >>>
> >>> On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:
> >>>
> >>>> I can relay the list of dependencies and their licenses.
> >>>>
> >>>> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence wrote:
> >>>>
> >>>>> I personally don't care too much about having the existing git history
> >>>>> once it's part of ASF, especially if it makes things any easier (as you
> >>>>> mention, squash/rebase can be difficult through merges). So I'd say we
> >>>>> just do plan B--create a tarball of the current state (without the git
> >>>>> history), and the content of that tarball is what goes through the IP
> >>>>> clearance process, and is the content of the initial commit when adding
> >>>>> to the apache/daffodil-vscode repo.
> >>>>>
> >>>>> Note that I think the incubator will still want access to the existing
> >>>>> repo 

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Steve Lawrence
The concern is that this code was developed outside of Apache and so
didn't follow standard Apache process. From the IP clearance page:

https://incubator.apache.org/ip-clearance/

> Any code that was developed outside of the ASF SVN repository and
> our public mailing lists must be processed like this, even if the
> external developer is already an ASF committer.

I suppose that submitting it as a PR does follow some of that process,
but there is maybe less assurance of ownership. Because it was not
developed in an ASF repository, that code is presumed to be owned by
you, multiple developers, or a company, and so that ownership must be
granted to ASF via the IP clearance process, with appropriate software
grant, CLA's, etc. (At least, that's my admittedly limited understanding
of the process).

- Steve


On 9/9/21 3:34 PM, John Wass wrote:
> Couldn't we (the vscode contributors) submit a series of PRs against the
> new repo to move the code, and just archive the example repo as-is?
> 
> I noted some thoughts on that a while back
> https://github.com/jw3/example-daffodil-vscode/issues/77
> 
> 
> 
> On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike 
> wrote:
> 
>> I know of one file in the repo which will have to be removed which is the
>> jpeg.dfdl.xsd file, which is there just as an example workspace.
>>
>> The copyright and provisions of that are not compatible with Apache
>> licensing.
>>
>> We can find a DFDL schema that we created that has Apache license to use
>> instead.
>>
>> For the other files under src, server, and build, can we generate a list
>> of files identifying which are:
>>
>> (a) original MIT-licensed, unmodified
>> (b) new - can be ASL
>> (c) blended - started from MIT-licensed source, modified with
>> daffodil-vscode-specific changes.
>>
>> It is these blended files that are the problematic ones.
>>
>>
>>
>> 
>> From: Steve Lawrence 
>> Sent: Thursday, September 9, 2021 1:38 PM
>> To: dev@daffodil.apache.org 
>> Subject: Re: daffodil-vscode - how to package and identify the
>> contribution - some git questions
>>
>> Correct. For more information about Apache license compatibility:
>>
>>   https://www.apache.org/legal/resolved.html
>>
>> MIT is Category A and is fine. EPL is Category B and is also okay, but
>> generally only in its binary form. So these top-level dependencies look
>> okay, assuming their transitive dependencies are also okay.
>>
>> We'll also need to verify the licenses of all code in the repo.
>> Hopefully little of that is original microsoft MIT and can be granted to
>> ASF and relicensed.
>>
>>
>> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
>>> The requirement is that the entire dependency tree (transitively)
>>> cannot depend on any software that has an Apache-incompatible (aka
>>> restrictive) license.
>>>
>>> So we need the transitive closure of all dependencies.
>>>
>>>
>>> 
>>> From: Adam Rosien 
>>> Sent: Thursday, September 9, 2021 12:44 PM
>>> To: dev@daffodil.apache.org 
>>> Subject: Re: daffodil-vscode - how to package and identify the
>> contribution - some git questions
>>>
>>> (I don't understand the requirements of licencing + transitive
>>> dependencies, so I'm giving some surface level license info)
>>>
>>> "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
>>> http://logback.qos.ch/license.html
>>> "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL
>> 1.0
>>> "co.fs2" %% "fs2-io" % "3.0.4" - MIT
>>> "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
>>> "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
>>>
>>> On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:
>>>
>>>> I can relay the list of dependencies and their licenses.
>>>>
>>>> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence wrote:
>>>>
>>>>> I personally don't care too much about having the existing git history
>>>>> once it's part of ASF, especially if it makes things any easier (as you
>>>>> mention, squash/rebase can be difficult through merges). So I'd say we
>>>>> just do plan B--create a tarball of the current state (without the git
>>>>> history), and the content of that tarball is what goes through the IP
>>>>> clearance process, and is the content of the initial commit when adding
>>>>> to the apache/daffodil-vscode repo.
>>>>>
>>>>> Note that I think the incubator will still want access to the existing
>>>>> repo so they can view the full git history. Understanding where
>>>>> everything came from and verifying the provenance is important to
>>>>> ensuring we have all the appropriate CLA's. So while the tarball is
>>>>> maybe what is officially voted on, they will want access to the repo.
>>>>>
>>>>> That said, I don't think we are going to get CLA's for any Microsoft
>>>>> contributed code. So either all Microsoft contributed code will need to
>>>>> be kept MIT, or removed from the codebase. And it feels a bit odd to
>>>>> grant something to ASF where the original codebase stays MIT and isn't

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread John Wass
Couldn't we (the vscode contributors) submit a series of PRs against the
new repo to move the code, and just archive the example repo as-is?

I noted some thoughts on that a while back
https://github.com/jw3/example-daffodil-vscode/issues/77



On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike wrote:

> I know of one file in the repo which will have to be removed which is the
> jpeg.dfdl.xsd file, which is there just as an example workspace.
>
> The copyright and provisions of that are not compatible with Apache
> licensing.
>
> We can find a DFDL schema that we created that has Apache license to use
> instead.
>
> For the other files under src, server, and build, can we generate a list
> of files identifying which are:
>
> (a) original MIT-licensed, unmodified
> (b) new - can be ASL
> (c) blended - started from MIT-licensed source, modified with
> daffodil-vscode-specific changes.
>
> It is these blended files that are the problematic ones.
>
>
>
> 
> From: Steve Lawrence 
> Sent: Thursday, September 9, 2021 1:38 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
>
> Correct. For more information about Apache license compatibility:
>
>   https://www.apache.org/legal/resolved.html
>
> MIT is Category A and is fine. EPL is Category B and is also okay, but
> generally only in its binary form. So these top-level dependencies look
> okay, assuming their transitive dependencies are also okay.
>
> We'll also need to verify the licenses of all code in the repo.
> Hopefully little of that is original microsoft MIT and can be granted to
> ASF and relicensed.
>
>
> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> > The requirement is that the entire dependency tree (transitively)
> > cannot depend on any software that has an Apache-incompatible (aka
> > restrictive) license.
> >
> > So we need the transitive closure of all dependencies.
> >
> >
> > 
> > From: Adam Rosien 
> > Sent: Thursday, September 9, 2021 12:44 PM
> > To: dev@daffodil.apache.org 
> > Subject: Re: daffodil-vscode - how to package and identify the
> contribution - some git questions
> >
> > (I don't understand the requirements of licencing + transitive
> > dependencies, so I'm giving some surface level license info)
> >
> > "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
> > http://logback.qos.ch/license.html
> > "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL
> 1.0
> > "co.fs2" %% "fs2-io" % "3.0.4" - MIT
> > "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
> > "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
> >
> > On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:
> >
> >> I can relay the list of dependencies and their licenses.
> >>
> >> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence 
> >> wrote:
> >>
> >>> I personally don't care too much about having the existing git history
> >>> once it's part of ASF, especially if it makes things any easier (as you
> >>> mention, squash/rebase can be difficult through merges). So I'd say we
> >>> just do plan B--create a tarball of the current state (without the git
> >>> history), and the content of that tarball is what goes through the IP
> >>> clearance process, and is the content of the initial commit when adding
> >>> to the apache/daffodil-vscode repo.
> >>>
> >>> Note that I think the incubator will still want access to the existing
> >>> repo so they can view the full git history. Understanding where
> >>> everything came from and verifying the provenance is important to
> >>> ensuring we have all the appropriate CLA's. So while the tarball is
> >>> maybe what is officially voted on, they will want access to the repo.
> >>>
> >>> That said, I don't think we are going to get CLA's for any Microsoft
> >>> contributed code. So either all Microsoft contributed code will need to
> >>> be kept MIT, or removed from the codebase. And it feels a bit odd to
> >>> grant something to ASF where the original codebase stays MIT and isn't
> >>> part of that grant.
> >>>
> >>> I think understanding how much code still exists that is Microsoft/MIT
> >>> is going to be important to getting this through the IP clearance
> process.
> >>>
> >>> So I'm curious how much of that original Microsoft code still exists? I
> >>> assume since it was just example code it has mostly been replaced? If
> >>> that's the case, we could potentially say Microsoft has no ownership of
> >>> this code, and so their CLA and MIT license aren't necessary?
> >>>
> >>> We should also have a good understanding of the dependencies. If any of
> >>> them are not compatible with ALv2, then going through this process
> isn't
> >>> even worth it until they are replaced. Do you have a list of the
> >>> dependencies?
> >>>
> >>>
> >>> On 9/9/21 11:16 AM, Beckerle, Mike wrote:
> >>>> So the daffodil-vscode code-base wants to be granted to become part of the
> >>>> Daffodil 

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Beckerle, Mike
I know of one file in the repo which will have to be removed which is the 
jpeg.dfdl.xsd file, which is there just as an example workspace.

The copyright and provisions of that are not compatible with Apache licensing.

We can find a DFDL schema that we created that has Apache license to use 
instead.

For the other files under src, server, and build, can we generate a list of 
files identifying which are:

(a) original MIT-licensed, unmodified
(b) new - can be ASL
(c) blended - started from MIT-licensed source, modified with 
daffodil-vscode-specific changes.

It is these blended files that are the problematic ones.
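A first cut at producing that list could look like the sketch below. It assumes both trees have already been read into path-to-bytes dictionaries (how they are loaded is left out), and it matches files purely by path, so a file that was renamed after being copied from the MIT-licensed source would be reported as "new" and still needs a manual look. Files that only changed through reformatting show up as "blended". The function name and labels are invented for illustration.

```python
def classify(upstream: dict[str, bytes], current: dict[str, bytes]) -> dict[str, str]:
    """Label each current file as 'original', 'new', or 'blended' relative to upstream.

    upstream: path -> content of the original MIT-licensed tree
    current:  path -> content of the daffodil-vscode tree
    """
    labels = {}
    for path, content in current.items():
        if path not in upstream:
            labels[path] = "new"        # (b) no upstream file at this path
        elif content == upstream[path]:
            labels[path] = "original"   # (a) byte-for-byte unmodified
        else:
            labels[path] = "blended"    # (c) started upstream, since modified
    return labels
```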




From: Steve Lawrence 
Sent: Thursday, September 9, 2021 1:38 PM
To: dev@daffodil.apache.org 
Subject: Re: daffodil-vscode - how to package and identify the contribution - 
some git questions

Correct. For more information about Apache license compatibility:

  https://www.apache.org/legal/resolved.html

MIT is Category A and is fine. EPL is Category B and is also okay, but
generally only in its binary form. So these top-level dependencies look
okay, assuming their transitive dependencies are also okay.

We'll also need to verify the licenses of all code in the repo.
Hopefully little of that is original microsoft MIT and can be granted to
ASF and relicensed.


On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> The requirement is that the entire dependency tree (transitively) cannot
> depend on any software that has an Apache-incompatible (aka restrictive)
> license.
>
> So we need the transitive closure of all dependencies.
>
>
> 
> From: Adam Rosien 
> Sent: Thursday, September 9, 2021 12:44 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the contribution - 
> some git questions
>
> (I don't understand the requirements of licencing + transitive
> dependencies, so I'm giving some surface level license info)
>
> "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
> http://logback.qos.ch/license.html
> "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL 1.0
> "co.fs2" %% "fs2-io" % "3.0.4" - MIT
> "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
> "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
>
> On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:
>
>> I can relay the list of dependencies and their licenses.
>>
>> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence 
>> wrote:
>>
>>> I personally don't care too much about having the existing git history
>>> once it's part of ASF, especially if it makes things any easier (as you
>>> mention, squash/rebase can be difficult through merges). So I'd say we
>>> just do plan B--create a tarball of the current state (without the git
>>> history), and the content of that tarball is what goes through the IP
>>> clearance process, and is the content of the initial commit when adding
>>> to the apache/daffodil-vscode repo.
>>>
>>> Note that I think the incubator will still want access to the existing
>>> repo so they can view the full git history. Understanding where
>>> everything came from and verifying the provenance is important to
>>> ensuring we have all the appropriate CLA's. So while the tarball is
>>> maybe what is officially voted on, they will want access to the repo.
>>>
>>> That said, I don't think we are going to get CLA's for any Microsoft
>>> contributed code. So either all Microsoft contributed code will need to
>>> be kept MIT, or removed from the codebase. And it feels a bit odd to
>>> grant something to ASF where the original codebase stays MIT and isn't
>>> part of that grant.
>>>
>>> I think understanding how much code still exists that is Microsoft/MIT
>>> is going to be important to getting this through the IP clearance process.
>>>
>>> So I'm curious how much of that original Microsoft code still exists? I
>>> assume since it was just example code it has mostly been replaced? If
>>> that's the case, we could potentially say Microsoft has no ownership of
>>> this code, and so their CLA and MIT license aren't necessary?
>>>
>>> We should also have a good understanding of the dependencies. If any of
>>> them are not compatible with ALv2, then going through this process isn't
>>> even worth it until they are replaced. Do you have a list of the
>>> dependencies?

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Steve Lawrence
For comparison, it looks like Apache OpenWhisk has a vscode extension:

  https://github.com/apache/openwhisk-vscode-extension

And here's the IP Clearance page for that:

  https://incubator.apache.org/ip-clearance/openwhisk-vscode-extension.html

It looks like they simply imported the 1.0 release without any git
history. And it looks like it was already licensed as ALv2.

I think it would be worth the effort to remove as much Microsoft MIT
licensed code as possible, and then license whatever is left over as
ALv2 (assuming the authors have the rights to do so). This will make it
more clear what is actually MIT vs ALv2 and will likely ease the IP
clearance process.


On 9/9/21 11:16 AM, Beckerle, Mike wrote:
> So the daffodil-vscode code-base wants to be granted to become part of the 
> Daffodil project.
> 
> One question arises which is "what is the contribution?" exactly.
> 
> The normal way this is identified is by creating a tarball of the source 
> files 
> and specifying an sha or md5 hash of that file.
> 
> However, this code base is perhaps different from usual.
> 
> It started by creating a detached fork of the vscode debugger example code 
> base. 
> This is MIT-Licensed which is a compatible license.
> 
> The files are then edited. There are around 100 commits on top of the base 
> that 
> came from the vscode debugger repository.
> 
> So the contribution is that set of 100 commits - the patches/change-sets they 
> represent.
> 
> These commits often edit the original files of the vscode debugger example to 
> add the daffodil-specific functionality. That is, the contribution material 
> is 
> in several cases intermingled in the lines of the existing files.  That's ok 
> I 
> think so long as the modified file had MIT license.
> 
> There's some value in preserving the 100 commits by our contributors, not 
> squashing it down to one commit, though if it's really not sensible to 
> proceed 
> otherwise, we can choose to squash it down to one commit.
> 
> Furthermore, the vscode debugger example repo itself had many commits in it. 
> The 
> current daffodil-vscode repo preserves all these commits as well. I don't see 
> value in preserving these commits, and would rather they were squashed into a 
> single "starting point" commit, with a dependencies file specifying the 
> githash 
> where we forked from, just so we can refer back if necessary.
> 
> So as a starting suggestion (subject to discussion of other alternatives) is 
> this:
> 
> Plan A:
> 
>  1. squash all commits up to and including the last Microsoft commit together
>     into one.
>  2. rebase the remaining commits on top of that.
>      1. I'm a bit worried about this rebase. There are merge commits, etc. in
>         the history. I'm not sure this will just all rebase while preserving
>         all the commits, but maybe it will "just work"
>  3. create a "patch set" corresponding to the 100 or so commits that make up
>     the "contribution".
>      1. I don't know if this is even feasible for this many commits.
>  4. create a tar/zip of this aggregate patch set.
>  5. compute an md5 of this patch set.
> 
> The patch set tar/zip file and its md5 hash are "the granted software".
> 
> The problem with this idea is that there's no obvious way to review a patch 
> set, 
> shy of applying it.
> 
> A better way may be to change steps 3 - 5 above to
> 
> Plan B:
> 
> 3. push the main branch to a new empty git repository
>    The point of this is to remove all historic stuff from the repository,
>    i.e., have a minimal git repo that contains only the contribution and the
>    single other commit it must be based on.
>
> 4. create a tarball of this git repository, and md5 hash of it
>
> 5. document that the contribution is from githash X (after the first commit)
>    to githash Y (the final commit) of this repository
> 
> 
> This has the advantage that the contribution is a self-contained review-able 
> thing.
> 
> Other ideas are welcome. (Plans C, D, etc) The only requirements I know of 
> are:
> 
>  1. a single file containing the contribution, and its md5 hash
>  2. a sensible way one can review the contents of this contribution file
>  3. preserve history of derivation from the vscode debugger example.
> 



Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Steve Lawrence
Correct. For more information about Apache license compatibility:

  https://www.apache.org/legal/resolved.html

MIT is Category A and is fine. EPL is Category B and is also okay, but
generally only in its binary form. So these top-level dependencies look
okay, assuming their transitive dependencies are also okay.
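
As a sketch of that check, the top-level dependencies quoted below could be sorted by category. The SPDX-style identifiers, the shortened dependency names, and the `review` helper are assumptions made for illustration; only the three licenses named in this thread are mapped, and anything else is flagged for manual review:

```python
from typing import Dict, List

# Categories per https://www.apache.org/legal/resolved.html, limited to the
# licenses that come up in this thread.
CATEGORY: Dict[str, str] = {
    "Apache-2.0": "A",
    "MIT": "A",
    "EPL-1.0": "B",
}


def review(dependencies: Dict[str, str]) -> Dict[str, List[str]]:
    """Group dependency names by how their license category is handled."""
    result: Dict[str, List[str]] = {"ok": [], "binary-only": [], "needs-review": []}
    for name, license_id in sorted(dependencies.items()):
        category = CATEGORY.get(license_id)
        if category == "A":
            result["ok"].append(name)
        elif category == "B":
            result["binary-only"].append(name)   # acceptable, generally only as a binary
        else:
            result["needs-review"].append(name)  # unknown here, check by hand
    return result


# Top-level dependencies from the thread; the identifiers are assumptions
# (the logback entry was listed only as "EPL").
top_level = {
    "logback-classic": "EPL-1.0",
    "java.debug.core": "EPL-1.0",
    "fs2-io": "MIT",
    "decline-effect": "Apache-2.0",
    "log4cats-slf4j": "Apache-2.0",
}
report = review(top_level)
for bucket, names in report.items():
    print(bucket, names)
```

This only covers the top level; each transitive dependency would need the same lookup.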

We'll also need to verify the licenses of all code in the repo.
Hopefully little of that is original Microsoft MIT code, so that the rest can
be granted to ASF and relicensed.


On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> The requirement is that the entire dependency tree (transitively) cannot
> depend on any software that has an Apache-incompatible (aka restrictive)
> license.
> 
> So we need the transitive closure of all dependencies.
> 
> 
> 
> From: Adam Rosien 
> Sent: Thursday, September 9, 2021 12:44 PM
> To: dev@daffodil.apache.org 
> Subject: Re: daffodil-vscode - how to package and identify the contribution - 
> some git questions
> 
> (I don't understand the requirements of licencing + transitive
> dependencies, so I'm giving some surface level license info)
> 
> "ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
> http://logback.qos.ch/license.html
> "com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL 1.0
> "co.fs2" %% "fs2-io" % "3.0.4" - MIT
> "com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
> "org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0
> 
> On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:
> 
>> I can relay the list of dependencies and their licenses.
>>
>> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence 
>> wrote:
>>
>>> I personally don't care too much about having the existing git history
>>> once it's part of ASF, especially if it makes things any easier (as you
>>> mention, squash/rebase can be difficult through merges). So I'd say we
>>> just do plan B--create a tarball of the current state (without the git
>>> history), and the content of that tarball is what goes through the IP
>>> clearance process, and is the content of the initial commit when adding
>>> to the apache/daffodil-vscode repo.
>>>
>>> Note that I think the incubator will still want access to the existing
>>> repo so they can view the full git history. Understanding where
>>> everything came from and verifying the provenance is important to
>>> ensuring we have all the appropriate CLA's. So while the tarball is
>>> maybe what is officially voted on, they will want access to the repo.
>>>
>>> That said, I don't think we are going to get CLA's for any Microsoft
>>> contributed code. So either all Microsoft contributed code will need to
>>> be kept MIT, or removed from the codebase. And it feels a bit odd to
>>> grant something to ASF where the original codebase stays MIT and isn't
>>> part of that grant.
>>>
>>> I think understanding how much code still exists that is Microsoft/MIT
>>> is going to be important to getting this through the IP clearance process.
>>>
>>> So I'm curious how much of that original Microsoft code still exists? I
>>> assume since it was just example code it has mostly been replaced? If
>>> that's the case, we could potentially say Microsoft has no ownership of
>>> this code, and so their CLA and MIT license aren't necessary?
>>>
>>> We should also have a good understanding of the dependencies. If any of
>>> them are not compatible with ALv2, then going through this process isn't
>>> even worth it until they are replaced. Do you have a list of the
>>> dependencies?

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Beckerle, Mike
The requirement is that the entire dependency tree (transitively) cannot
depend on any software that has an Apache-incompatible (aka restrictive)
license.

So we need the transitive closure of all dependencies.
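
To illustrate, a minimal sketch of computing that closure over a dependency graph and flagging anything whose license is not on an allowed list. The graph, package names, and licenses are made up for illustration; a real graph would come from the build tool:

```python
from typing import Dict, Iterable, List, Set


def transitive_closure(graph: Dict[str, List[str]], roots: Iterable[str]) -> Set[str]:
    """Return the roots plus everything reachable from them (cycles are fine)."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        stack.extend(graph.get(dep, []))
    return seen


def incompatible(closure: Set[str], licenses: Dict[str, str], allowed: Set[str]) -> List[str]:
    """Dependencies whose license is missing or not in the allowed set."""
    return sorted(dep for dep in closure if licenses.get(dep) not in allowed)


# Hypothetical graph: lib-c is only reachable through lib-a.
graph = {"lib-a": ["lib-c"], "lib-b": [], "lib-c": ["lib-a"]}
licenses = {"lib-a": "MIT", "lib-b": "Apache-2.0", "lib-c": "GPL-3.0"}
allowed = {"Apache-2.0", "MIT"}

closure = transitive_closure(graph, ["lib-a", "lib-b"])
print(sorted(closure))
print(incompatible(closure, licenses, allowed))
```

A dependency with no recorded license is reported as well, on the assumption that an unknown license should be treated as a problem until someone has checked it.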



From: Adam Rosien 
Sent: Thursday, September 9, 2021 12:44 PM
To: dev@daffodil.apache.org 
Subject: Re: daffodil-vscode - how to package and identify the contribution - 
some git questions

(I don't understand the requirements of licencing + transitive
dependencies, so I'm giving some surface level license info)

"ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
http://logback.qos.ch/license.html
"com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL 1.0
"co.fs2" %% "fs2-io" % "3.0.4" - MIT
"com.monovore" %% "decline-effect" % "2.1.0" - APL 2.0
"org.typelevel" %% "log4cats-slf4j" % "2.1.0" - APL 2.0

On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:

> I can relay the list of dependencies and their licenses.
>
> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence 
> wrote:
>
>> I personally don't care too much about having the existing git history
>> once it's part of ASF, especially if it makes things any easier (as you
>> mention, squash/rebase can be difficult through merges). So I'd say we
>> just do plan B--create a tarball of the current state (without the git
>> history), and the content of that tarball is what goes through the IP
>> clearance process, and is the content of the initial commit when adding
>> to the apache/daffodil-vscode repo.
>>
>> Note that I think the incubator will still want access to the existing
>> repo so they can view the full git history. Understanding where
>> everything came from and verifying the provenance is important to
>> ensuring we have all the appropriate CLA's. So while the tarball is
>> maybe what is officially voted on, they will want access to the repo.
>>
>> That said, I don't think we are going to get CLA's for any Microsoft
>> contributed code. So either all Microsoft contributed code will need to
>> be kept MIT, or removed from the codebase. And it feels a bit odd to
>> grant something to ASF where the original codebase stays MIT and isn't
>> part of that grant.
>>
>> I think understanding how much code still exists that is Microsoft/MIT
>> is going to be important to getting this through the IP clearance process.
>>
>> So I'm curious how much of that original Microsoft code still exists? I
>> assume since it was just example code it has mostly been replaced? If
>> that's the case, we could potentially say Microsoft has no ownership of
>> this code, and so their CLA and MIT license aren't necessary?
>>
>> We should also have a good understanding of the dependencies. If any of
>> them are not compatible with ALv2, then going through this process isn't
>> even worth it until they are replaced. Do you have a list of the
>> dependencies?

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Adam Rosien
(I don't understand the requirements of licensing + transitive
dependencies, so I'm giving some surface-level license info)

"ch.qos.logback" % "logback-classic" % "1.2.3" - EPL
  (http://logback.qos.ch/license.html)
"com.microsoft.java" % "com.microsoft.java.debug.core" % "0.31.1" - EPL 1.0
"co.fs2" %% "fs2-io" % "3.0.4" - MIT
"com.monovore" %% "decline-effect" % "2.1.0" - Apache 2.0
"org.typelevel" %% "log4cats-slf4j" % "2.1.0" - Apache 2.0

On Thu, Sep 9, 2021 at 9:35 AM Adam Rosien  wrote:

> I can relay the list of dependencies and their licenses.
>
> On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence 
> wrote:
>
>> I personally don't care too much about having the existing git history
>> once it's part of ASF, especially if it makes things any easier (as you
>> mention, squash/rebase can be difficult through merges). So I'd say we
>> just do plan B--create a tarball of the current state (without the git
>> history), and the content of that tarball is what goes through the IP
>> clearance process, and is the content of the initial commit when adding
>> to the apache/daffodil-vscode repo.
>>
>> Note that I think the incubator will still want access to the existing
>> repo so they can view the full git history. Understanding where
>> everything came from and verifying the provenance is important to
>> ensuring we have all the appropriate CLA's. So while the tarball is
>> maybe what is officially voted on, they will want access to the repo.
>>
>> That said, I don't think we are going to get CLA's for any Microsoft
>> contributed code. So either all Microsoft contributed code will need to
>> be kept MIT, or removed from the codebase. And it feels a bit odd to
>> grant something to ASF where the original codebase stays MIT and isn't
>> part of that grant.
>>
>> I think understanding how much code still exists that is Microsoft/MIT
>> is going to be important to getting this through the IP clearance process.
>>
>> So I'm curious how much of that original Microsoft code still exists? I
>> assume since it was just example code it has mostly been replaced? If
>> that's the case, we could potentially say Microsoft has no ownership of
>> this code, and so their CLA and MIT license aren't necessary?
>>
>> We should also have a good understanding of the dependencies. If any of
>> them are not compatible with ALv2, then going through this process isn't
>> even worth it until they are replaced. Do you have a list of the
>> dependencies?

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Adam Rosien
I can relay the list of dependencies and their licenses.

On Thu, Sep 9, 2021 at 9:20 AM Steve Lawrence  wrote:

> I personally don't care too much about having the existing git history
> once it's part of ASF, especially if it makes things any easier (as you
> mention, squash/rebase can be difficult through merges). So I'd say we
> just do plan B--create a tarball of the current state (without the git
> history), and the content of that tarball is what goes through the IP
> clearance process, and is the content of the initial commit when adding
> to the apache/daffodil-vscode repo.
>
> Note that I think the incubator will still want access to the existing
> repo so they can view the full git history. Understanding where
> everything came from and verifying the provenance is important to
> ensuring we have all the appropriate CLA's. So while the tarball is
> maybe what is officially voted on, they will want access to the repo.
>
> That said, I don't think we are going to get CLA's for any Microsoft
> contributed code. So either all Microsoft contributed code will need to
> be kept MIT, or removed from the codebase. And it feels a bit odd to
> grant something to ASF where the original codebase stays MIT and isn't
> part of that grant.
>
> I think understanding how much code still exists that is Microsoft/MIT
> is going to be important to getting this through the IP clearance process.
>
> So I'm curious how much of that original Microsoft code still exists? I
> assume since it was just example code it has mostly been replaced? If
> that's the case, we could potentially say Microsoft has no ownership of
> this code, and so their CLA and MIT license aren't necessary?
>
> We should also have a good understanding of the dependencies. If any of
> them are not compatible with ALv2, then going through this process isn't
> even worth it until they are replaced. Do you have a list of the
> dependencies?

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Steve Lawrence
I personally don't care too much about having the existing git history
once it's part of ASF, especially if it makes things any easier (as you
mention, squash/rebase can be difficult through merges). So I'd say we
just do plan B--create a tarball of the current state (without the git
history), and the content of that tarball is what goes through the IP
clearance process, and is the content of the initial commit when adding
to the apache/daffodil-vscode repo.

Note that I think the incubator will still want access to the existing
repo so they can view the full git history. Understanding where
everything came from and verifying the provenance is important to
ensuring we have all the appropriate CLA's. So while the tarball is
maybe what is officially voted on, they will want access to the repo.

That said, I don't think we are going to get CLA's for any Microsoft
contributed code. So either all Microsoft contributed code will need to
be kept MIT, or removed from the codebase. And it feels a bit odd to
grant something to ASF where the original codebase stays MIT and isn't
part of that grant.

I think understanding how much code still exists that is Microsoft/MIT
is going to be important to getting this through the IP clearance process.

So I'm curious how much of that original Microsoft code still exists? I
assume since it was just example code it has mostly been replaced? If
that's the case, we could potentially say Microsoft has no ownership of
this code, and so their CLA and MIT license aren't necessary?
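
One rough way to put a number on that, sketched under the assumption that both versions of a file are available as text: count how many lines of the original survive in the current file. Ignoring indentation and trailing semicolons is an assumption made here so that pure reformatting does not count as change. The helper names and sample sources are hypothetical, and the ratio is only an indicator, not a determination of ownership:

```python
import difflib
from typing import List


def normalize(source: str) -> List[str]:
    """Strip indentation, trailing semicolons, and blank lines."""
    lines: List[str] = []
    for line in source.splitlines():
        stripped = line.strip().rstrip(";").rstrip()
        if stripped:
            lines.append(stripped)
    return lines


def surviving_fraction(original: str, current: str) -> float:
    """Fraction of the original's normalized lines still present, in order."""
    old, new = normalize(original), normalize(current)
    if not old:
        return 0.0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(old)


# Hypothetical before/after sources for one file.
original_src = "import a;\nconst x = 1;\nrun(x);\nstop();\n"
current_src = "import a\n  const x = 1\nsetup()\nrun(x)\n"
fraction = surviving_fraction(original_src, current_src)
print(f"{fraction:.0%} of the original lines survive")
```

Running this per file and summing the line counts would give an overall figure for the repository.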

We should also have a good understanding of the dependencies. If any of
them are not compatible with ALv2, then going through this process isn't
even worth it until they are replaced. Do you have a list of the
dependencies?


On 9/9/21 11:16 AM, Beckerle, Mike wrote:
> So the daffodil-vscode code-base wants to be granted to become part of the 
> Daffodil project.
> 
> One question arises which is "what is the contribution?" exactly.
> 
> The normal way this is identified is by creating a tarball of the source 
> files 
> and specifying an sha or md5 hash of that file.
> 
> However, this code base is perhaps different from usual.
> 
> It started by creating a detached fork of the vscode debugger example code 
> base. 
> This is MIT-Licensed which is a compatible license.
> 
> The files are then edited. There are around 100 commits on top of the base 
> that 
> came from the vscode debugger repository.
> 
> So the contribution is that set of 100 commits - the patches/change-sets they 
> represent.
> 
> These commits often edit the original files of the vscode debugger example to 
> add the daffodil-specific functionality. That is, the contribution material 
> is 
> in several cases intermingled in the lines of the existing files.  That's ok 
> I 
> think so long as the modified file had MIT license.
> 
> There's some value in preserving the 100 commits by our contributors, not 
> squashing it down to one commit, though if it's really not sensible to 
> proceed 
> otherwise, we can choose to squash it down to one commit.
> 
> Furthermore, the vscode debugger example repo itself had many commits in it. 
> The 
> current daffodil-vscode repo preserves all these commits as well. I don't see 
> value in preserving these commits, and would rather they were squashed into a 
> single "starting point" commit, with a dependencies file specifying the 
> githash 
> where we forked from, just so we can refer back if necessary.
> 
> So as a starting suggestion (subject to discussion of other alternatives) is 
> this:
> 
> Plan A:
> 
>  1. squash all commits up to and including the last Microsoft commit together
>     into one.
>  2. rebase the remaining commits on top of that.
>      1. I'm a bit worried about this rebase. There are merge commits, etc. in
>         the history. I'm not sure this will just all rebase while preserving
>         all the commits, but maybe it will "just work"
>  3. create a "patch set" corresponding to the 100 or so commits that make up
>     the "contribution".
>      1. I don't know if this is even feasible for this many commits.
>  4. create a tar/zip of this aggregate patch set.
>  5. compute an md5 of this patch set.
> 
> The patch set tar/zip file and its md5 hash are "the granted software".
> 
> The problem with this idea is that there's no obvious way to review a patch
> set, shy of applying it.
> 
> A better way may be to change steps 3 - 5 above to
> 
> Plan B:
> 
> 3. push the main branch to a new empty git repository
>    The point of this is to remove all historic stuff from the repository,
>    i.e., have a minimal git repo that contains only the contribution and the
>    single other commit it must be based on.
>
> 4. create a tarball of this git repository, and md5 hash of it
>
> 5. document that the contribution is from githash X (after the first commit)
>    to githash Y (the final commit) of this repository
> 
> 
> This has the advantage 

daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-09 Thread Beckerle, Mike
So the daffodil-vscode code base is to be granted so that it can become part of
the Daffodil project.

One question that arises is: what exactly is "the contribution"?

The normal way this is identified is by creating a tarball of the source files 
and specifying a SHA or MD5 hash of that file.

However, this code base is perhaps different from usual.

It started by creating a detached fork of the vscode debugger example code 
base. This is MIT-Licensed which is a compatible license.

The files are then edited. There are around 100 commits on top of the base that 
came from the vscode debugger repository.

So the contribution is that set of 100 commits - the patches/change-sets they 
represent.

These commits often edit the original files of the vscode debugger example to 
add the daffodil-specific functionality. That is, the contribution material is 
in several cases intermingled in the lines of the existing files. That's OK, I 
think, so long as the modified file had the MIT license.

There's some value in preserving the 100 commits by our contributors, not 
squashing them down to one commit, though if it's really not sensible to proceed 
otherwise, we can choose to squash them down to one commit.

Furthermore, the vscode debugger example repo itself had many commits in it. 
The current daffodil-vscode repo preserves all these commits as well. I don't 
see value in preserving these commits, and would rather they were squashed into 
a single "starting point" commit, with a dependencies file specifying the 
githash where we forked from, just so we can refer back if necessary.

So a starting suggestion (subject to discussion of other alternatives) is 
this:

Plan A:

  1.  squash all commits up to and including the last Microsoft commit,
      together into one.
  2.  rebase the remaining commits on top of that.
       *   I'm a bit worried about this rebase. There are merge commits, etc.
           in the history. I'm not sure this will just all rebase while
           preserving all the commits, but maybe it will "just work"
  3.  create a "patch set" corresponding to the 100 or so commits that make up
      the "contribution".
       *   I don't know if this is even feasible for this many commits.
  4.  create a tar/zip of this aggregate patch set.
  5.  compute an md5 of this patch set.

The patch set tar/zip file and its md5 hash are "the granted software".

The problem with this idea is that there's no obvious way to review a patch 
set, short of applying it.
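
For reference, steps 1 to 3 of Plan A could be sketched roughly as below. This
is only a sketch under assumptions, not a procedure tested against this repo:
`plan_a`, the `run` hook, and the hash/branch/directory arguments are
hypothetical names made up for illustration. The git subcommands themselves
(`commit-tree`, `rebase --rebase-merges --onto`, `format-patch`) are standard;
`--rebase-merges` needs git 2.18 or later. Two caveats: the rebase may still
stop on conflicts, and `git format-patch` does not emit patches for merge
commits, so a patch set alone would not reproduce the merges.

```python
import subprocess
from typing import Callable, List

# A runner takes git arguments and returns the command's stdout.
Runner = Callable[[List[str]], str]


def git(args: List[str]) -> str:
    """Run one git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def plan_a(last_upstream: str, branch: str, patch_dir: str,
           run: Runner = git) -> str:
    """Sketch of Plan A steps 1-3; returns the hash of the new root commit.

    last_upstream: hash of the last Microsoft commit (placeholder).
    branch:        branch holding the contribution commits (placeholder).
    patch_dir:     directory to write the patch files to (placeholder).
    """
    # Step 1: create a parentless commit whose tree is identical to the
    # tree of the last upstream commit, i.e. all upstream history in one.
    new_root = run([
        "commit-tree", f"{last_upstream}^{{tree}}",
        "-m", f"Import vscode debugger example at {last_upstream}",
    ])
    # Step 2: replay every commit after last_upstream onto the new root,
    # asking git to recreate merge commits instead of flattening them.
    run(["rebase", "--rebase-merges", "--onto", new_root,
         last_upstream, branch])
    # Step 3: write one patch file per non-merge commit of the contribution.
    run(["format-patch", "-o", patch_dir, f"{new_root}..{branch}"])
    return new_root
```

Passing a different `run` function makes it possible to log or dry-run the
commands before letting them touch a real clone.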

A better way may be to change steps 3 - 5 above to:

Plan B:

3. push the main branch to a new empty git repository
   The point of this is to remove all historic stuff from the repository,
   i.e., have a minimal git repo that contains only the contribution and the
   single other commit it must be based on.

4. create a tarball of this git repository, and md5 hash of it

5. document that the contribution is from githash X (after the first commit)
   to githash Y (the final commit) of this repository

This has the advantage that the contribution is a self-contained, reviewable 
thing.
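
The tarball-plus-hash part of Plan B (step 4) could look something like the
following. It is a minimal sketch, assuming the new repository has been cloned
into a local directory; the function names and any paths are hypothetical.
The hash is taken over the tarball file exactly as written, so it is that one
file that has to be kept and distributed: re-creating the tarball later will
generally give a different hash, because gzip and tar record timestamps.

```python
import hashlib
import tarfile
from pathlib import Path


def make_tarball(src_dir: str, tar_path: str) -> None:
    """Pack src_dir (including its .git directory, if any) into a .tar.gz."""
    with tarfile.open(tar_path, "w:gz") as tar:
        # arcname keeps only the directory's own name inside the archive,
        # not the full path on the machine that created it.
        tar.add(src_dir, arcname=Path(src_dir).name)


def file_digest(path: str, algorithm: str = "md5") -> str:
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

Usage would be along the lines of `make_tarball("daffodil-vscode",
"daffodil-vscode.tar.gz")` followed by `file_digest("daffodil-vscode.tar.gz")`,
or `file_digest(..., "sha256")` if a SHA hash is preferred over md5.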

Other ideas are welcome (Plans C, D, etc.). The only requirements I know of are:

  1.  a single file containing the contribution, and its md5 hash
  2.  a sensible way one can review the contents of this contribution file
  3.  a preserved history of derivation from the vscode debugger example.

Mike Beckerle | Principal Engineer

mbecke...@owlcyberdefense.com

P +1-781-330-0412