Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-10 Thread Beckerle, Mike
How hard is it to refactor these 6 files so that all new code is in separate 
files from all preserved original code?

Assume one-liner changes to original files (like calling MockDebugger changed 
to call DaffodilDebugger) are allowed.

We either have to separate these 6 blended files, or convince legal and the 
incubator-pmc that blended files are ok because they originally had the MIT 
license.

I definitely don't want to bother with that unless the refactoring exercise 
here is hard.
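
A rough way to produce the original / new / blended split discussed in this thread is to compare the example repo against a checkout of the upstream sample it started from. The sketch below is only an illustration: the two directory arguments and the three labels are assumptions, and a file that was renamed (such as the ones that used to have "mock" in their names) will show up as "new" unless renames are mapped separately.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def classify(upstream_root: Path, current_root: Path) -> dict[str, str]:
    """Label every file under current_root by comparing it to upstream_root.

    "original": same relative path and identical bytes upstream
    "blended":  same relative path upstream, but different bytes
    "new":      no file at that relative path upstream
    """
    upstream = {
        p.relative_to(upstream_root).as_posix(): file_digest(p)
        for p in upstream_root.rglob("*")
        if p.is_file()
    }
    labels: dict[str, str] = {}
    for p in sorted(current_root.rglob("*")):
        if not p.is_file():
            continue
        rel = p.relative_to(current_root).as_posix()
        if rel not in upstream:
            labels[rel] = "new"
        elif upstream[rel] == file_digest(p):
            labels[rel] = "original"
        else:
            labels[rel] = "blended"
    return labels
```

Byte-for-byte comparison is deliberately strict, so a pure reformatting pass would also mark a file as "blended"; that matches the situation described elsewhere in the thread.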

From: John Wass 
Sent: Friday, September 10, 2021 1:02 PM
To: dev@daffodil.apache.org 
Subject: Re: daffodil-vscode - how to package and identify the contribution - 
some git questions

Mike - Those were renames from the original versions that had "mock" in
their names.

commit 383fd4882a8fe51adf21b5ae31fe252056800447

On Fri, Sep 10, 2021 at 12:54 PM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

>
> John Wass said:
>
> I had a few more (6) source files as modified..
>
> extension.ts
> debugAdapter.ts
> daffodilRuntime.ts
> daffodilDebug.ts
> adapter.test.ts
> activateDaffodilDebug.ts
>
> The 3 files with daffodil or Daffodil in their names, aren't those new
> files? Or were those based on provided files, but the file was renamed as
> well as the content modified?
>
> ...mikeb
>
>


Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-10 Thread Steve Lawrence
Yeah, I think having accurate license/copyright headers on all the files
will make it much easier to determine the ownership and ASF
acceptability, especially for the ASF group that must approve the IP
clearance process.

On 9/10/21 12:34 PM, Adam Rosien wrote:
> Would it help if we developers added copyright information to the repo? Or
> would it be redundant. (Most projects start out without explicit copyright
> notices anyway, and add them after some period of time).
> 
> .. Adam
> 
> On Fri, Sep 10, 2021 at 4:42 AM Steve Lawrence  wrote:
> 
>> I think the concern from ASF, and why they have this IP clearance
>> process, is that the copyright ownership of these files is not clear. It
>> wasn't done in a fork of a Daffodil repo, and there are contributions
>> from multiple developers, with little public oversight from Daffodil. I
>> think from the ASF's perspective, this code did not follow the ASF
>> process, and is assumed to be owned by the contributors or their
>> companies. There is no license/copyright in the code specifying
>> otherwise, so I think ASF must assume the worst, and require a software
>> grant.
>>
>> Also note that even prototype code has a copyright owner and a license.
>> Copying it into a PR doesn't change that. If you were to throw away this
>> code and start from scratch following the ASF process, then it wouldn't
>> be a problem. But if the plan is to copy prototype code not owned by ASF
>> into a PR, then there are ownership concerns.
>>
>> If all this work was done in a fork of the apache/daffodil-vscode repo
>> from a single contributor, then I think maybe the assumption from ASF is
>> the code was intended to be part of the main repo and implicitly granted
>> to ASF via the PR process.
>>
>>
>> On 9/9/21 4:05 PM, John Wass wrote:
>>> Yeah I was thinking of the example repo as a prototype, just as if I was
>>> working on a feature in my fork of Daffodil.  The main project doesn't
>> own
>>> the feature until it crosses the PR threshold, and once it does cross
>> over
>>> the state of my fork is of no concern to it.
>>>
>>>
>>>
>>> On Thu, Sep 9, 2021 at 3:54 PM Steve Lawrence 
>> wrote:
>>>
>>>> The concern is that this code was developed outside of Apache and so
>>>> didn't follow standard Apache process. From the IP clearance page:
>>>>
>>>> https://incubator.apache.org/ip-clearance/
>>>>
>>>>> Any code that was developed outside of the ASF SVN repository and
>>>>> our public mailing lists must be processed like this, even if the
>>>>> external developer is already an ASF committer.
>>>>
>>>> I suppose that submitting it as a PR does follow some of that process,
>>>> but there is maybe less assurance of ownership. Because it was not
>>>> developed in an ASF repository, that code is presumed to be owned by
>>>> you, multiple developers, or a company, and so that ownership must be
>>>> granted to ASF via the IP clearance process, with appropriate software
>>>> grant, CLA's, etc. (At least, that's my admittedly limited understanding
>>>> of the process).
>>>>
>>>> - Steve
>>>>
>>>>
>>>> On 9/9/21 3:34 PM, John Wass wrote:
>>>>> Couldn't we (the vscode contributors) submit a series of PRs against
>> the
>>>>> new repo to move the code, and just archive the example repo as-is?
>>>>>
>>>>> I noted some thoughts on that a while back
>>>>> https://github.com/jw3/example-daffodil-vscode/issues/77
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike <
>>>> mbecke...@owlcyberdefense.com>
>>>>> wrote:
>>>>>
>>>>>> I know of one file in the repo which will have to be removed which is
>>>> the
>>>>>> jpeg.dfdl.xsd file, which is there just as an example workspace.
>>>>>>
>>>>>> The copyright and provisions of that are not compatible with Apache
>>>>>> licensing.
>>>>>>
>>>>>> We can find a DFDL schema that we created that has Apache license to
>> use
>>>>>> instead.
>>>>>>
>>>>>> For the other files under src, server, and build, can we generate a
>> list
>>>>>> of files identifying which are:
>>>>>>
>>>>>> (a) original MIT-licensed, unmodified
>>>>>> (b) new - can be ASL
>>>>>> (c) blended - started from MIT-licensed source, modified with
>>>>>> daffodil-vscode-specific changes.
>>>>>>
>>>>>> It is these blended files that are the problematic ones.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Steve Lawrence
>>>>>> Sent: Thursday, September 9, 2021 1:38 PM
>>>>>> To: dev@daffodil.apache.org
>>>>>> Subject: Re: daffodil-vscode - how to package and identify the
>>>>>> contribution - some git questions
>>>>>>
>>>>>> Correct. For more information about Apache license compatibility:
>>>>>>
>>>>>>   https://www.apache.org/legal/resolved.html
>>>>>>
>>>>>> MIT is Category A and is fine. EPL is Category B and is also okay, but
>>>>>> generally only in its binary form. So these top-level dependencies
>> look
>>>>>> okay, assuming their transitive dependencies are also okay.
>>>>>>
>>>>>> We'll also need to verify the licenses of all code in the

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-10 Thread John Wass
Mike - Those were renames from the original versions that had "mock" in
their names.

commit 383fd4882a8fe51adf21b5ae31fe252056800447

On Fri, Sep 10, 2021 at 12:54 PM Beckerle, Mike <
mbecke...@owlcyberdefense.com> wrote:

>
> John Wass said:
>
> I had a few more (6) source files as modified..
>
> extension.ts
> debugAdapter.ts
> daffodilRuntime.ts
> daffodilDebug.ts
> adapter.test.ts
> activateDaffodilDebug.ts
>
> The 3 files with daffodil or Daffodil in their names, aren't those new
> files? Or were those based on provided files, but the file was renamed as
> well as the content modified?
>
> ...mikeb
>
>


Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-10 Thread Beckerle, Mike

John Wass said:

I had a few more (6) source files as modified..

extension.ts
debugAdapter.ts
daffodilRuntime.ts
daffodilDebug.ts
adapter.test.ts
activateDaffodilDebug.ts

The 3 files with daffodil or Daffodil in their names, aren't those new files? 
Or were those based on provided files, but the file was renamed as well as the 
content modified?

...mikeb
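
One way to check whether a "daffodil" file was derived from a renamed original is to compare the two texts after stripping the formatting noise mentioned in this thread (whitespace changes and dropped trailing semicolons). The following is a minimal sketch under that assumption; the function names are made up for illustration, and the ratio is only a hint, not a legal determination.

```python
import difflib
import re


def normalize(source: str) -> list[str]:
    """Collapse whitespace and drop trailing semicolons, line by line."""
    lines = []
    for line in source.splitlines():
        collapsed = re.sub(r"\s+", " ", line).strip()
        collapsed = collapsed.rstrip(";").rstrip()
        if collapsed:  # skip lines that are blank after normalization
            lines.append(collapsed)
    return lines


def similarity(original: str, candidate: str) -> float:
    """Return a 0.0 to 1.0 line-level similarity of two normalized sources."""
    matcher = difflib.SequenceMatcher(
        None, normalize(original), normalize(candidate), autojunk=False
    )
    return matcher.ratio()
```

A ratio near 1.0 between, say, the original mock file and its renamed counterpart would suggest a rename with light edits, while a low ratio would suggest mostly new content.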



Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-10 Thread Adam Rosien
Would it help if we developers added copyright information to the repo? Or
would it be redundant. (Most projects start out without explicit copyright
notices anyway, and add them after some period of time).

.. Adam

On Fri, Sep 10, 2021 at 4:42 AM Steve Lawrence  wrote:

> I think the concern from ASF, and why they have this IP clearance
> process, is that the copyright ownership of these files is not clear. It
> wasn't done in a fork of a Daffodil repo, and there are contributions
> from multiple developers, with little public oversight from Daffodil. I
> think from the ASF's perspective, this code did not follow the ASF
> process, and is assumed to be owned by the contributors or their
> companies. There is no license/copyright in the code specifying
> otherwise, so I think ASF must assume the worst, and require a software
> grant.
>
> Also note that even prototype code has a copyright owner and a license.
> Copying it into a PR doesn't change that. If you were to throw away this
> code and start from scratch following the ASF process, then it wouldn't
> be a problem. But if the plan is to copy prototype code not owned by ASF
> into a PR, then there are ownership concerns.
>
> If all this work was done in a fork of the apache/daffodil-vscode repo
> from a single contributor, then I think maybe the assumption from ASF is
> the code was intended to be part of the main repo and implicitly granted
> to ASF via the PR process.
>
>
> On 9/9/21 4:05 PM, John Wass wrote:
> > Yeah I was thinking of the example repo as a prototype, just as if I was
> > working on a feature in my fork of Daffodil.  The main project doesn't
> own
> > the feature until it crosses the PR threshold, and once it does cross
> over
> > the state of my fork is of no concern to it.
> >
> >
> >
> > On Thu, Sep 9, 2021 at 3:54 PM Steve Lawrence 
> wrote:
> >
> >> The concern is that this code was developed outside of Apache and so
> >> didn't follow standard Apache process. From the IP clearance page:
> >>
> >> https://incubator.apache.org/ip-clearance/
> >>
> >>> Any code that was developed outside of the ASF SVN repository and
> >>> our public mailing lists must be processed like this, even if the
> >>> external developer is already an ASF committer.
> >>
> >> I suppose that submitting it as a PR does follow some of that process,
> >> but there is maybe less assurance of ownership. Because it was not
> >> developed in an ASF repository, that code is presumed to be owned by
> >> you, multiple developers, or a company, and so that ownership must be
> >> granted to ASF via the IP clearance process, with appropriate software
> >> grant, CLA's, etc. (At least, that's my admittedly limited understanding
> >> of the process).
> >>
> >> - Steve
> >>
> >>
> >> On 9/9/21 3:34 PM, John Wass wrote:
> >>> Couldn't we (the vscode contributors) submit a series of PRs against
> the
> >>> new repo to move the code, and just archive the example repo as-is?
> >>>
> >>> I noted some thoughts on that a while back
> >>> https://github.com/jw3/example-daffodil-vscode/issues/77
> >>>
> >>>
> >>>
> >>> On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike <
> >> mbecke...@owlcyberdefense.com>
> >>> wrote:
> >>>
> >>>> I know of one file in the repo which will have to be removed which is
> >> the
> >>>> jpeg.dfdl.xsd file, which is there just as an example workspace.
> >>>>
> >>>> The copyright and provisions of that are not compatible with Apache
> >>>> licensing.
> >>>>
> >>>> We can find a DFDL schema that we created that has Apache license to
> use
> >>>> instead.
> >>>>
> >>>> For the other files under src, server, and build, can we generate a
> list
> >>>> of files identifying which are:
> >>>>
> >>>> (a) original MIT-licensed, unmodified
> >>>> (b) new - can be ASL
> >>>> (c) blended - started from MIT-licensed source, modified with
> >>>> daffodil-vscode-specific changes.
> >>>>
> >>>> It is these blended files that are the problematic ones.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> From: Steve Lawrence
> >>>> Sent: Thursday, September 9, 2021 1:38 PM
> >>>> To: dev@daffodil.apache.org
> >>>> Subject: Re: daffodil-vscode - how to package and identify the
> >>>> contribution - some git questions
> >>>>
> >>>> Correct. For more information about Apache license compatibility:
> >>>>
> >>>>   https://www.apache.org/legal/resolved.html
> >>>>
> >>>> MIT is Category A and is fine. EPL is Category B and is also okay, but
> >>>> generally only in its binary form. So these top-level dependencies
> look
> >>>> okay, assuming their transitive dependencies are also okay.
> >>>>
> >>>> We'll also need to verify the licenses of all code in the repo.
> >>>> Hopefully little of that is original microsoft MIT and can be granted
> to
> >>>> ASF and relicensed.
> >>>>
> >>>>
> >>>> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
> >>>>> The requirement, is that the entire dependency tree (transitively)
> >>>> cannot depend on any software that has an

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-10 Thread Steve Lawrence
I don't think I would try squashing anything. The full and accurate git
history will likely be wanted by the IP clearance process. A tarball
will be the result of that process, and we can add it to the git repo in
one commit. Trying to maintain git history just complicates things for, I
think, not much benefit, especially if things like bug/PR IDs in commit
messages are incorrect.

It's more important that files in the repo have correct license headers
added, to make it clear who the copyright holder is for each file.
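
As a starting point for auditing headers, something like the sketch below could report which source files carry which notice. The marker strings are assumptions: the Apache one is the opening phrase of the standard ASF source header, while "Microsoft Corporation" is only a guess at what the original sample files contain and should be adjusted after looking at them.

```python
from pathlib import Path

# Assumed marker substrings; adjust after inspecting the real headers.
MARKERS = {
    "apache": "Licensed to the Apache Software Foundation",
    "microsoft": "Microsoft Corporation",
}


def header_kind(text: str, head_lines: int = 20) -> str:
    """Classify a file by the notices found in its first head_lines lines."""
    head = "\n".join(text.splitlines()[:head_lines])
    found = [name for name, marker in MARKERS.items() if marker in head]
    if not found:
        return "missing"
    if len(found) > 1:
        return "mixed"
    return found[0]


def scan(root: Path, suffix: str = ".ts") -> dict[str, str]:
    """Map each file under root with the given suffix to its header kind."""
    return {
        p.relative_to(root).as_posix(): header_kind(
            p.read_text(encoding="utf-8", errors="replace")
        )
        for p in sorted(root.rglob(f"*{suffix}"))
        if p.is_file()
    }
```

Files reported as "missing" or "mixed" would be the ones to look at by hand first.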


On 9/9/21 9:44 PM, John Wass wrote:
> I had a few more (6) source files as modified..
> 
> extension.ts
> debugAdapter.ts
> daffodilRuntime.ts
> daffodilDebug.ts
> adapter.test.ts
> activateDaffodilDebug.ts
> 
>> It would seem an IDE (probably vscode!) decided to restyle/reindent this
> code.
> 
> We added opinionated code formatting... apparently trying to make this
> process as hard as possible :/
> 
> That reformat commit was done on 08/25/2021, title of PR was Prettier.
> Looking prior to that commit might give a little better idea of what
> changed.
> 
> 
>> squash/rebase can be difficult through merges
> 
> Here is a quick pass on (1) squashing the MS source into a single commit, (2)
> placing that commit on top of an init commit in a repo, and then (3)
> rewriting our commits on top of all of that.
> 
> It preserves our authorship.  Can be cleaned up a little bit still but I am
> not going to put time into it if we don't want this.  I just wanted to note
> how it could look.
> 
> https://github.com/jw3/rewrite-daffodil-vscode-1
> 
> One issue I could see here is the linking of the example repo PR IDs in the
> commit messages will conflict once we start adding PRs in the new repo.
> Now would be the time to rewrite these commit messages and strip/modify
> those #ID tags.
> 
> Thoughts on that rewrite repo?
> 
> 
> 
> 
> 
> On Thu, Sep 9, 2021 at 5:42 PM Beckerle, Mike 
> wrote:
> 
>> So via some git trickery I was able to determine the "blended" files.
>>
>> I'm ignoring the various configuration files which are generally json
>> files.
>>
>> Of the ".ts" files only 3 are blended:
>>
>> src/debugAdapter.ts - 72 lines - only maybe 6 lines are different
>> src/extension.ts - 179 lines
>> src/tests/adapter.test.ts - 137 lines (50 of which are commented-out code)
>>
>> The delta between these files and the original files of the same name are
>> larger than expected due to changes in whitespace, and removal of ";" at
>> end of line (which I guess are optional in many places in typescript).
>>
>> It would seem an IDE (probably vscode!) decided to restyle/reindent this
>> code.
>>
>> So it's a bit hard to figure out what the "real" deltas are.
>>
>> src/debugAdapter.ts appears to be only trivially different. The name
>> MockDebugSession was replaced by DaffodilDebugSession, and "./mockDebug"
>> was changed to "./daffodilDebug".
>>
>> The other two files do appear to be where all the real blended code is.
>>
>>
>>
>> 
>> From: Beckerle, Mike 
>> Sent: Thursday, September 9, 2021 4:21 PM
>> To: dev@daffodil.apache.org 
>> Subject: Re: daffodil-vscode - how to package and identify the
>> contribution - some git questions
>>
>> Whether it's a PR or series of PRs, or a software grant, that still
>> doesn't resolve the issue of the blended files which are part MIT-licensed
>> original code, and part new code deltas by the daffodil-vscode contributors.
>>
>> We need to understand whether those blended files can be teased apart
>> somehow so that it is clear going forward what is an MIT-licensed library
>> and what is Apache Licensed.
>>
>> I just did a grep -R -i microsoft  in a clone of the
>> openwhisk-vscode-extension and got zero hits. So no files still carry
>> microsoft copyright and in fact their NOTICES.txt file does not indicate
>> any dependency on MIT-licensed code at all.  So I think
>> openwhisk-vscode-extension is not going to help us figure out how to surf
>> this issue.
>>
>>
>> 
>> From: Steve Lawrence 
>> Sent: Thursday, September 9, 2021 3:54 PM
>> To: dev@daffodil.apache.org 
>> Subject: Re: daffodil-vscode - how to package and identify the
>> contribution - some git questions
>>
>> The concern is that this code was developed outside of Apache and so
>> didn't follow standard Apache process. From the IP clearance page:
>>
>> https://incubator.apache.org/ip-clearance/
>>
>>> Any code that was developed outside of the ASF SVN repository and
>>> our public mailing lists must be processed like this, even if the
>>> external developer is already an ASF committer.
>>
>> I suppose that submitting it as a PR does follow some of that process,
>> but there is maybe less assurance of ownership. Because it was not
>> developed in an ASF repository, that code is presumed to be owned by
>> you, multiple developers, or a company, and so that ownership must be
>> granted to ASF via the IP clearance process, with appropriate software

Re: daffodil-vscode - how to package and identify the contribution - some git questions

2021-09-10 Thread Steve Lawrence
I think the concern from ASF, and why they have this IP clearance
process, is that the copyright ownership of these files is not clear. It
wasn't done in a fork of a Daffodil repo, and there are contributions
from multiple developers, with little public oversight from Daffodil. I
think from the ASF's perspective, this code did not follow the ASF
process, and is assumed to be owned by the contributors or their
companies. There is no license/copyright in the code specifying
otherwise, so I think ASF must assume the worst, and require a software
grant.

Also note that even prototype code has a copyright owner and a license.
Copying it into a PR doesn't change that. If you were to throw away this
code and start from scratch following the ASF process, then it wouldn't
be a problem. But if the plan is to copy prototype code not owned by ASF
into a PR, then there are ownership concerns.

If all this work was done in a fork of the apache/daffodil-vscode repo
from a single contributor, then I think maybe the assumption from ASF is
the code was intended to be part of the main repo and implicitly granted
to ASF via the PR process.


On 9/9/21 4:05 PM, John Wass wrote:
> Yeah I was thinking of the example repo as a prototype, just as if I was
> working on a feature in my fork of Daffodil.  The main project doesn't own
> the feature until it crosses the PR threshold, and once it does cross over
> the state of my fork is of no concern to it.
> 
> 
> 
> On Thu, Sep 9, 2021 at 3:54 PM Steve Lawrence  wrote:
> 
>> The concern is that this code was developed outside of Apache and so
>> didn't follow standard Apache process. From the IP clearance page:
>>
>> https://incubator.apache.org/ip-clearance/
>>
>>> Any code that was developed outside of the ASF SVN repository and
>>> our public mailing lists must be processed like this, even if the
>>> external developer is already an ASF committer.
>>
>> I suppose that submitting it as a PR does follow some of that process,
>> but there is maybe less assurance of ownership. Because it was not
>> developed in an ASF repository, that code is presumed to be owned by
>> you, multiple developers, or a company, and so that ownership must be
>> granted to ASF via the IP clearance process, with appropriate software
>> grant, CLA's, etc. (At least, that's my admittedly limited understanding
>> of the process).
>>
>> - Steve
>>
>>
>> On 9/9/21 3:34 PM, John Wass wrote:
>>> Couldn't we (the vscode contributors) submit a series of PRs against the
>>> new repo to move the code, and just archive the example repo as-is?
>>>
>>> I noted some thoughts on that a while back
>>> https://github.com/jw3/example-daffodil-vscode/issues/77
>>>
>>>
>>>
>>> On Thu, Sep 9, 2021 at 2:11 PM Beckerle, Mike <
>> mbecke...@owlcyberdefense.com>
>>> wrote:
>>>
>>>> I know of one file in the repo which will have to be removed which is
>> the
>>>> jpeg.dfdl.xsd file, which is there just as an example workspace.
>>>>
>>>> The copyright and provisions of that are not compatible with Apache
>>>> licensing.
>>>>
>>>> We can find a DFDL schema that we created that has Apache license to use
>>>> instead.
>>>>
>>>> For the other files under src, server, and build, can we generate a list
>>>> of files identifying which are:
>>>>
>>>> (a) original MIT-licensed, unmodified
>>>> (b) new - can be ASL
>>>> (c) blended - started from MIT-licensed source, modified with
>>>> daffodil-vscode-specific changes.
>>>>
>>>> It is these blended files that are the problematic ones.
>>>>
>>>>
>>>>
>>>>
>>>> From: Steve Lawrence
>>>> Sent: Thursday, September 9, 2021 1:38 PM
>>>> To: dev@daffodil.apache.org
>>>> Subject: Re: daffodil-vscode - how to package and identify the
>>>> contribution - some git questions
>>>>
>>>> Correct. For more information about Apache license compatibility:
>>>>
>>>>   https://www.apache.org/legal/resolved.html
>>>>
>>>> MIT is Category A and is fine. EPL is Category B and is also okay, but
>>>> generally only in its binary form. So these top-level dependencies look
>>>> okay, assuming their transitive dependencies are also okay.
>>>>
>>>> We'll also need to verify the licenses of all code in the repo.
>>>> Hopefully little of that is original microsoft MIT and can be granted to
>>>> ASF and relicensed.
>>>>
>>>>
>>>> On 9/9/21 1:30 PM, Beckerle, Mike wrote:
>>>>> The requirement, is that the entire dependency tree (transitively)
>>>> cannot depend on any software that has an Apache-incompatible (aka
>>>> restrictive) license.
>>>>>
>>>>> So we need the transitive closure of all dependencies.
>>>>>
>>>>>
>>>>>
>>>>> From: Adam Rosien
>>>>> Sent: Thursday, September 9, 2021 12:44 PM
>>>>> To: dev@daffodil.apache.org
>>>>> Subject: Re: daffodil-vscode - how to package and identify the
>>>> contribution - some git questions
>>>>>
>>>>> (I don't understand the requirements of licencing + transitive
>>>>> dependencies, so I'm giving some surface