Re: [spdx-tech] standardLicenseHeader for EPL-1.0

2017-12-28 Thread Thanh Ha
On Thu, Dec 28, 2017 at 4:53 PM, Philippe Ombredanne wrote:

> Hi Thanh,
>
> On Thu, Dec 28, 2017 at 2:18 AM, Thanh Ha wrote:
> > I am developing a license header scanner in order to quickly scan local
> > files for license headers at the top of code files.
>
> You may want to check out ScanCode [1]. Since I use it with top Linux
> maintainers to clarify the kernel licensing and set SPDX ids, it must
> not be too shabby as a license detection engine.  It detects headers
> alright and much more, including EPL headers.
>
> PS: ScanCode is written in Python, not Go and I am the maintainer.
>
> [1] https://github.com/nexB/scancode-toolkit
> --
> Cordially
> Philippe Ombredanne
>

Hi Philippe,

Thanks for the pointer. I had a look and unfortunately it isn't the tool we
need for our use case.

The tool we need (and what I'm prototyping) is one that we can use in CI
systems: we pass it a list of valid licenses, for example "EPL-1.0,
Apache-2.0", and it searches all the code files in a project repo to make
sure that the top of every code file contains the correct license header
text (and optionally an SPDX identifier). Any file that is missing a
license header or has incorrect license header text automatically fails
the build in CI and reports a -1 vote (or blocking vote) in a code review
system like Gerrit or GitHub code reviews. The intention here is to block
developers from merging code with missing license headers in the first
place rather than find out after the fact that this has happened.
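
As a rough illustration of the CI gate described above, here is a minimal
sketch in Python. It is not the prototype's actual code: the HEADERS table,
the file suffixes, the 2048-byte limit and the function names are all
assumptions made for the example, and the header snippets are placeholders
rather than official header text.

```python
import os

# Hypothetical header snippets keyed by SPDX ID. Real header text would be
# supplied by the project (or, eventually, sourced from SPDX data).
HEADERS = {
    "EPL-1.0": "Eclipse Public License v1.0",
    "Apache-2.0": "Licensed under the Apache License, Version 2.0",
}


def find_violations(root, allowed_ids, suffixes=(".java", ".py", ".go"), limit=2048):
    """Return sorted paths under root whose first `limit` bytes contain none
    of the header snippets for the allowed license IDs."""
    wanted = [HEADERS[i] for i in allowed_ids]
    bad = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(suffixes):
                continue  # only code files are checked
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                head = fh.read(limit).decode("utf-8", errors="replace")
            if not any(text in head for text in wanted):
                bad.append(path)
    return sorted(bad)


def main(argv):
    """argv: [program, "EPL-1.0, Apache-2.0", path/to/repo]."""
    allowed = [s.strip() for s in argv[1].split(",")]
    bad = find_violations(argv[2], allowed)
    for path in bad:
        print("missing or incorrect license header:", path)
    return 1 if bad else 0  # a non-zero status is what fails the CI job
```

A CI job would pass the return value of main(sys.argv) to sys.exit(); the
failed job is then what the code review system turns into a -1 or blocking
vote.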

We've successfully done this for a few of our Java projects using
checkstyle but it's Java specific and runs quite a bit slower than we'd
like. The new tool we've been working on scans significantly more quickly
as it only reads the first few bytes of every file and all the scanning is
done locally without generating anything (it scans tens of thousands of
files in seconds). I have a work in progress here [0] in case anyone is
interested, but it currently requires us to provide an example license
header. I'd like to pull in SPDX data so that this data can be
automatically sourced from somewhere rather than depending on the projects
to provide the correct header examples.
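
To make the "read only the first few bytes and compare against an example
header" idea concrete, here is a minimal Python sketch. It assumes that
comment markers and line wrapping should be ignored when comparing, which
is a choice made for this example and not necessarily how the prototype
works; EXAMPLE is illustrative wording, not the official EPL-1.0 header,
and read_head, normalize and has_header are hypothetical names.

```python
import re

# Illustrative example header supplied by the project (not official text).
EXAMPLE = (
    "This program is made available under the\n"
    "terms of the Eclipse Public License v1.0"
)


def read_head(path, limit=1024):
    """Read at most `limit` bytes from the start of a file."""
    with open(path, "rb") as fh:
        return fh.read(limit).decode("utf-8", errors="replace")


def normalize(text):
    """Drop per-line comment markers and collapse all whitespace."""
    lines = []
    for line in text.splitlines():
        # Leading markers such as '/*', '*/', '*', '//', '#', '--'.
        line = re.sub(r"^\s*(/\*+|\*+/|\*|//+|#+|--)\s?", "", line)
        # A closing '*/' at the end of the line.
        line = re.sub(r"\s*\*+/\s*$", "", line)
        lines.append(line)
    return " ".join(" ".join(lines).split())


def has_header(head, example=EXAMPLE):
    """True if the normalized example header occurs in the normalized head."""
    return normalize(example) in normalize(head)
```

Because only a bounded prefix of each file is read, the cost per file stays
roughly constant regardless of file size, which is consistent with the
speed described above.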

Hope this explains things more clearly.

Thanks,
Thanh

[0] https://github.com/zxiiro/license-header-checker
___
Spdx-tech mailing list
Spdx-tech@lists.spdx.org
https://lists.spdx.org/mailman/listinfo/spdx-tech


Re: [spdx-tech] standardLicenseHeader for EPL-1.0

2017-12-28 Thread W. Trevor King
On Thu, Dec 28, 2017 at 10:53:02PM +0100, Philippe Ombredanne wrote:
> You may want to check out ScanCode [1]. Since I use it with top Linux
> maintainers to clarify the kernel licensing and set SPDX ids, it must
> not be too shabby as a license detection engine.  It detects headers
> alright and much more, including EPL headers.

[1] is the ScanCode rule that encodes a header similar to what Thanh
mentions (although ScanCode has some other EPL-1.0 rules as well).
I'm in favor of documenting those standard headers (when they are
recommended by the license steward) in SPDX.

The Eclipse docs at [2] currently recommend EPL-2.0 and, more
importantly for us, an SPDX-License-Identifier entry.  As long as
you're willing to trust that entry, you can skip over the rest of the
header completely.
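
A minimal Python sketch of trusting the SPDX-License-Identifier entry and
skipping the rest of the header. The regular expression and the names
spdx_id and is_allowed are assumptions for illustration, and only a single
identifier is handled, not a full license expression such as
"EPL-2.0 OR Apache-2.0".

```python
import re

# Matches e.g. "SPDX-License-Identifier: EPL-2.0" inside any comment style.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+\-]+)")


def spdx_id(head):
    """Return the first SPDX license ID declared in `head`, or None."""
    match = SPDX_TAG.search(head)
    return match.group(1) if match else None


def is_allowed(head, allowed):
    """True if the file head declares one of the allowed license IDs."""
    return spdx_id(head) in allowed
```

For a head containing " * SPDX-License-Identifier: EPL-2.0", spdx_id()
returns "EPL-2.0" without looking at any other header text.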

Cheers,
Trevor

[1]: https://github.com/nexB/scancode-toolkit/blob/v2.2.1/src/licensedcode/data/rules/epl-1.0_2.RULE
[2]: https://www.eclipse.org/projects/handbook/#ip-copyright-headers

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




Re: [spdx-tech] standardLicenseHeader for EPL-1.0

2017-12-28 Thread Philippe Ombredanne
Hi Thanh,

On Thu, Dec 28, 2017 at 2:18 AM, Thanh Ha wrote:
> I am developing a license header scanner in order to quickly scan local
> files for license headers at the top of code files.

You may want to check out ScanCode [1]. Since I use it with top Linux
maintainers to clarify the kernel licensing and set SPDX ids, it must
not be too shabby as a license detection engine.  It detects headers
alright and much more, including EPL headers.

PS: ScanCode is written in Python, not Go and I am the maintainer.

[1] https://github.com/nexB/scancode-toolkit
-- 
Cordially
Philippe Ombredanne


Re: [spdx-tech] standardLicenseHeader for EPL-1.0

2017-12-27 Thread W. Trevor King
On Wed, Dec 27, 2017 at 08:18:32PM -0500, Thanh Ha wrote:
> The other options available, licenseText and standardLicenseTemplate, seem
> to have the full license text rather than what's recommended by the
> Eclipse project to include a short header message [1] in code files.
> …
> [1] https://www.eclipse.org/projects/handbook/#ip-copyright-headers

Besides Gary's comments, another wrinkle is that we currently require
headers to be declared in the license or its appendix [1].  We are
likely to relax that limitation soon to “declared somewhere by the
license steward” [2], but for the moment, the handbook page you link
may be too far from the license for inclusion.

Cheers,
Trevor

[1]: https://spdx.org/spdx-license-list/license-list-overview#fields
[2]: https://github.com/spdx/license-list/issues/5#issuecomment-307292702

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




[spdx-tech] standardLicenseHeader for EPL-1.0

2017-12-27 Thread Thanh Ha
Hi Everyone,

I am developing a license header scanner in order to quickly scan local
files for license headers at the top of code files. I intend to add SPDX
header scanning support but have noticed that the EPL-1.0 entry is missing
the recommended header text in standardLicenseHeader, which based on my
reading here [0] should be where such things are defined.

The other options available, licenseText and standardLicenseTemplate, seem
to have the full license text rather than what's recommended by the
Eclipse project to include a short header message [1] in code files.

How can we get this included in SPDX, and is standardLicenseHeader the
right field or am I misunderstanding its usage here?
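
For context, a minimal Python sketch of how a scanner might look up the
field in question. SAMPLE is a hypothetical, abbreviated record shaped like
a per-license JSON file from license-list-data with placeholder values, and
standard_header is an illustrative helper, not part of any SPDX tooling.

```python
import json

# Hypothetical, abbreviated license record; values are placeholders.
SAMPLE = """
{
  "licenseId": "EPL-1.0",
  "name": "Eclipse Public License 1.0",
  "licenseText": "(full license text)",
  "standardLicenseTemplate": "(full license text with template markup)"
}
"""


def standard_header(record):
    """Return the standardLicenseHeader field, or None if absent or empty."""
    header = record.get("standardLicenseHeader", "")
    return header.strip() or None


record = json.loads(SAMPLE)
print(standard_header(record))  # prints None: the sample has no header field
```

When the field is missing, as in this sample, the scanner has nothing to
compare against and would have to fall back to a project-supplied example
header.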

Thanks,
Thanh

[0]
https://github.com/spdx/license-list-data/blob/master/accessingLicenses.md
[1] https://www.eclipse.org/projects/handbook/#ip-copyright-headers