daffodil-vscode - how to package and identify the contribution - some git questions

Beckerle, Mike Thu, 09 Sep 2021 08:16:58 -0700

So the daffodil-vscode code base is to be granted to, and become part of, the 
Daffodil project.


One question that arises is: what exactly is "the contribution"?

The normal way this is identified is by creating a tarball of the source files 
and specifying an sha or md5 hash of that file.
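
For concreteness, here is a minimal sketch of that usual procedure using only 
the Python standard library. The function names and the directory layout are my 
own illustration, not anything the grant process prescribes.

```python
import hashlib
import tarfile
from pathlib import Path


def make_tarball(src_dir: str, out_path: str) -> str:
    """Pack src_dir into a gzipped tarball at out_path and return out_path."""
    src = Path(src_dir).resolve()
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname stores paths relative to the directory's own name
        # instead of the absolute path on the local machine.
        tar.add(str(src), arcname=src.name)
    return out_path


def file_digest(path: str, algorithm: str = "md5") -> str:
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

One would call make_tarball() once on the source tree and then file_digest() on 
the resulting file, with "md5" or "sha256". The hash identifies that one file; 
re-creating the tarball later will generally give a different hash because of 
the timestamps stored in the archive.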

However, this code base is perhaps different from usual.

It started as a detached fork of the vscode debugger example code base. That 
code base is MIT-licensed, which is a compatible license.

The files are then edited. There are around 100 commits on top of the base that 
came from the vscode debugger repository.

So the contribution is that set of 100 commits - the patches/change-sets they 
represent.

These commits often edit the original files of the vscode debugger example to 
add the daffodil-specific functionality. That is, the contributed material is 
in several cases intermingled with the lines of the existing files. That's OK, 
I think, so long as the modified file had the MIT license.

There's some value in preserving the 100 commits by our contributors, rather 
than squashing them down to one commit, though if it's really not sensible to 
proceed otherwise, we can choose to squash them down to one commit.

Furthermore, the vscode debugger example repo itself had many commits in it. 
The current daffodil-vscode repo preserves all these commits as well. I don't 
see value in preserving these commits, and would rather they were squashed into 
a single "starting point" commit, with a dependencies file specifying the 
githash where we forked from, just so we can refer back if necessary.

So a starting suggestion (subject to discussion of other alternatives) is 
this:

Plan A:

  1.  squash all commits up to and including the last Microsoft commit, 
      together into one.
  2.  rebase the remaining commits on top of that.
     *   I'm a bit worried about this rebase. There are merge commits, etc. in 
         the history. I'm not sure this will just all rebase while preserving 
         all the commits, but maybe it will "just work"
  3.  create a "patch set" corresponding to the 100 or so commits that make up 
      the "contribution".
     *   I don't know if this is even feasible for this many commits.
  4.  create a tar/zip of this aggregate patch set.
  5.  compute an md5 of this patch set.

The patch set tar/zip file and its md5 hash are "the granted software".
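
Steps 1 to 3 of Plan A might be sketched roughly as below. This only builds the 
git command lines; it does not run them, and none of it has been tried on the 
actual repository. The function name, the NEW_ROOT placeholder, the branch name 
"main", and the "patches" directory are assumptions for illustration. It also 
assumes a git recent enough to support --rebase-merges.

```python
import shlex


def plan_a_commands(last_ms_commit, new_root="NEW_ROOT",
                    branch="main", patch_dir="patches"):
    """Return git commands (argv lists) for Plan A steps 1 to 3.

    last_ms_commit: hash of the last Microsoft commit (placeholder).
    new_root: hash printed by the first command; it is only known after
              that command has been run, hence the placeholder default.
    """
    return [
        # Step 1: create a parentless commit that has the same tree as the
        # last Microsoft commit, i.e. all earlier history squashed into one.
        ["git", "commit-tree", f"{last_ms_commit}^{{tree}}",
         "-m", f"Starting point: vscode debugger example at {last_ms_commit}"],
        # Step 2: replay everything after the last Microsoft commit onto
        # the new root, asking git to recreate merge commits.
        ["git", "rebase", "--rebase-merges", "--onto", new_root,
         last_ms_commit, branch],
        # Step 3: write one patch file per non-merge commit in the range.
        ["git", "format-patch", "-o", patch_dir, f"{new_root}..{branch}"],
    ]


def show(commands):
    """Print the commands in copy-pasteable shell form."""
    for argv in commands:
        print(shlex.join(argv))
```

Note that git format-patch does not emit patches for merge commits, so the 
patch set would not be a one-to-one image of a history that contains merges. 
That is one more reason to be cautious about step 3.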

The problem with this idea is that there's no obvious way to review a patch 
set, short of applying it.

A better way may be to replace steps 3 - 5 above with the following:

Plan B:

  3.  push the main branch to a new empty git repository.
     *   The point of this is to remove all historic stuff from the repository, 
         i.e., have a minimal git repo that contains only the contribution and 
         the single other commit it must be based on.
  4.  create a tarball of this git repository, and an md5 hash of it.
  5.  document that the contribution is from githash X (after the first commit) 
      to githash Y (the final commit) of this repository.

This has the advantage that the contribution is a self-contained review-able 
thing.
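
As a rough sketch of Plan B steps 3 and 5, the git commands could look like the 
ones built below. Again this only constructs the command lines and is untested 
against the real repository. The function name, the repository path, and the 
branch name "main" are assumptions; X and Y stand for the two hashes mentioned 
in step 5.

```python
def plan_b_commands(new_repo, first_commit, final_commit, branch="main"):
    """Return git commands (argv lists) for Plan B steps 3 and 5.

    new_repo: path of the new, empty repository (hypothetical).
    first_commit / final_commit: githash X and githash Y from step 5.
    """
    return [
        # Step 3: create an empty bare repository and push a single branch,
        # so only commits reachable from that branch are transferred.
        ["git", "init", "--bare", new_repo],
        ["git", "push", new_repo, f"{branch}:refs/heads/{branch}"],
        # Step 5 / review: list the contributed commits in the range X..Y
        # and summarize what changed between the two endpoints.
        ["git", "-C", new_repo, "log", "--oneline",
         f"{first_commit}..{final_commit}"],
        ["git", "-C", new_repo, "diff", "--stat", first_commit, final_commit],
    ]
```

A reviewer who unpacks the tarball can then use ordinary git log and git diff 
on the X..Y range, which is what makes this form reviewable without applying 
anything.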

Other ideas are welcome (Plans C, D, etc.). The only requirements I know of are:

  1.  a single file containing the contribution, and its md5 hash
  2.  a sensible way one can review the contents of this contribution file
  3.  preserve history of derivation from the vscode debugger example.







Mike Beckerle | Principal Engineer


P +1-781-330-0412
