Re: Package licensing part I - the approach - was Github example
On Fri, Sep 15, 2017 at 06:02:00AM +, Gisi, Mark wrote:
> How does one define “accurate and complete” when a package’s “top
> level” license does not represent all the files contained within the
> package (think license diversity). Although there was no standard
> agreement on what “accurate and complete” meant, I got the strong
> impression looking at the customer’s spreadsheet that a package’s
> top level license was not enough.

If you're going to look through the package and conclude licenses for each file (a good idea when you need this level of detail), then you'll have declared/concluded licenses for each file (or parts of files, if you use snippets). Once you've collected that, an “accurate and complete” license for the package would be the AND-ed combination of all the file/snippet licenses.

However, in many cases there will be content in those packages that does not end up in the final device (e.g. documentation, some license files, project management and policy information, …) and that someone shipping a device does not care about. Those file/snippet licenses won't matter to them (unless they are interested in pushing doc patches back upstream, or some such). So I'd recommend just providing those customers with file/snippet granularity and finding a workflow that does not bother with “package licenses”. If they can't support that level of granularity, ask them to provide you with a list of files/snippets they care about and only AND their licenses to conclude a selected-project-subset license.

If they can't provide a list of files/snippets they care about and can only accept conclusions at the package level, then they're going to get things like “GPL-3.0 AND Verbatim” for a package that includes GPL-3.0 code and the text of the GPL 3.0. But the Verbatim license contains no onerous conditions for someone shipping devices, so they probably won't mind.
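To make the AND-ing described above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name `conclude_license`, the file paths, and the choice to return NOASSERTION for an empty selection are all hypothetical, and “Verbatim” is the identifier used in this thread rather than an entry I am claiming is on the SPDX license list.

```python
from typing import Dict, Iterable, Optional


def conclude_license(file_licenses: Dict[str, str],
                     selected: Optional[Iterable[str]] = None) -> str:
    """AND together per-file license expressions (hypothetical helper).

    file_licenses maps a file path to its concluded license expression.
    selected, if given, restricts the result to the files a customer
    actually cares about (the "selected-project-subset" case).
    """
    paths = file_licenses.keys() if selected is None else selected

    # Collect distinct expressions, keeping first-seen order.
    distinct = []
    for path in paths:
        expression = file_licenses[path]
        if expression not in distinct:
            distinct.append(expression)

    if not distinct:
        # Assumption: with nothing selected we make no claim at all.
        return "NOASSERTION"
    if len(distinct) == 1:
        return distinct[0]

    # AND binds tighter than OR in license expressions, so an operand
    # containing OR must be parenthesized before it is AND-ed.
    operands = [f"({e})" if " OR " in e else e for e in distinct]
    return " AND ".join(operands)


# Hypothetical package contents, echoing the example in the message.
package = {
    "src/main.c": "GPL-3.0",
    "COPYING": "Verbatim",
    "src/util.c": "GPL-3.0",
}
whole_package = conclude_license(package)
shipped_only = conclude_license(package, ["src/main.c", "src/util.c"])
```

With these inputs the whole-package conclusion carries the license-text file along, while the subset limited to shipped sources does not, which is the difference the message is pointing at.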
> There are obviously other types of open source users who do not
> share the same compliance challenges as Stakeholder #1. Consider
> businesses that provide Software as a Service (SaaS) where the lion's
> share of the open source runs in the data center as opposed to being
> distributed on millions of devices. Think of Facebook, Netflix,
> Airbnb, and Lyft. For SaaS providers, software distribution is
> typically not a consideration (except perhaps for the apps you
> download onto your phone). The license compliance complexity and
> risk profile for a SaaS provider is very low compared to device
> makers; their need for SPDX file level licensing information tends
> to be very low, if at all.

Maybe the risk is lower, but they have the same issue. For example, the AGPL-3.0 has explicit requirements for this use case [1]. Where detailed licensing is expensive, SaaS providers end up cutting corners. But if everyone were doing things right, SaaS providers would have the same audit-trail robustness that you attribute to device shippers.

> Many of the products you use (or drive) are created by Stakeholder
> #1. If they lack sufficient file level licensing information…

And nobody is arguing for removing file/snippet granularity (just like nobody is arguing for removing LicenseRef [2]). So what problems does SPDX not address for either of these use cases?

Cheers,
Trevor

[1]: https://github.com/spdx/license-list-XML/blob/f1522b5cc61bde64d9b38af05204fdb93c02eef8/src/AGPL-3.0.xml#L480-L494
[2]: https://lists.spdx.org/pipermail/spdx-legal/2017-September/002184.html
     Subject: Re: GPLv2 - Github example
     Date: Tue, 12 Sep 2017 09:45:38 -0700
     Message-ID: <20170912164538.gg30...@valgrind.tremily.us>

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy signature.asc Description: OpenPGP digital signature ___ Spdx-legal mailing list Spdx-legal@lists.spdx.org https://lists.spdx.org/mailman/listinfo/spdx-legal
RE: Package licensing part I - the approach - was Github example
Thank you Richard, Kyle and Trevor for providing insights into what is important to Stakeholder #2 (developers). Before we proceed to the next step, I would like to provide insights into what is important to Stakeholder #1. This is helpful to understand because one of the catalysts for creating SPDX was Stakeholder #1's license compliance challenges.

Stakeholder #1
--------------

A typical Stakeholder #1 would be a device maker that embeds software into the products they manufacture. Think of the typical manufacturers of TVs, printers, network routers, cameras, smart thermostats, automobiles, industrial robots, elevators, train control systems, wind turbines, medical devices, and so forth. Also think of all the “Things” in the Internet of Things. We are talking about billions of devices[1]. Linux combined with other open source solutions typically serves as the nervous system of those devices and represents the lion's share of the software that runs on them. Each time a device is sold (distributed) it triggers a set of license obligations that are typically more complex than those of other open source use cases (e.g., SaaS).

Compliance at the file level is particularly relevant. Most device makers are committed to doing the right thing - i.e., they want to provide all the required source code, attribution notices, copies of licenses and so forth. Although not every company takes compliance to the same level, many of the device manufacturers are quite advanced. They need to comply with the licensing of the libraries and binaries that end up in their device’s runtime. To determine the licensing for each of the libraries and binaries, the device maker needs to understand the licensing of the source files from which they were constructed. Therefore the file level license is very important whereas the top level package license is much less so.

File level compliance can be challenging.
Well managed open source projects typically include a license notice in every file, but unfortunately many projects do not. Furthermore, the more successful a project is, the more it shares (and borrows) code with other projects that are potentially under different licenses. This leads to license diversity at the file level, a byproduct of successful sharing and a reality that we need to embrace. SPDX facilitates the management of license diversity while making missing license information transparent. SPDX data is a valuable input into a device maker’s compliance program.

I have come to understand the concerns of Stakeholder #1 through contract negotiations I participated in with Wind River customers (largely device makers). In the past customers used to include all kinds of language on what defines open source and how they wanted licensing information delivered (often in their own custom spreadsheet format). The words that made me pause every time were: “we need you to provide *accurate and complete* licensing information for each Linux package”. How does one define “accurate and complete” when a package’s “top level” license does not represent all the files contained within the package (think license diversity). Although there was no standard agreement on what “accurate and complete” meant, I got the strong impression looking at the customer’s spreadsheet that a package’s top level license was not enough.

SPDX played a valuable role whenever a customer tried to define what open source was and what their licensing reporting expectations were. I replaced their language with the promise to deliver licensing information using SPDX, the Linux community’s license reporting standard. All the concerns around “accurate and complete” went away. Furthermore, the time and cost savings from having to deal with just one format instead of hundreds were significant.

SPDX is Not for Everyone:
-------------------------

There are obviously other types of open source users who do not share the same compliance challenges as Stakeholder #1. Consider businesses that provide Software as a Service (SaaS) where the lion's share of the open source runs in the data center as opposed to being distributed on millions of devices. Think of Facebook, Netflix, Airbnb, and Lyft. For SaaS providers, software distribution is typically not a consideration (except perhaps for the apps you download onto your phone). The license compliance complexity and risk profile for a SaaS provider is very low compared to device makers; their need for SPDX file level licensing information tends to be very low, if at all.

Hard to Ignore Stakeholder #1:
------------------------------

Many of the products you use (or drive) are created by Stakeholder #1. If they lack sufficient file level licensing information it becomes much more difficult and costly to deliver the source code and attribution notices you deserve (and the source code authors expect). All in all, SPDX enables device makers to
RE: Package licensing part I - the approach - was Github example
Thanks to both Richard and Kyle for very helpful posts. Ditto on Kyle's recognition of the amazing work of so many people on this project. I confess that I am still thrilled to see any embrace of the SPDX license expressions by developers. I have said many times over the course of many years that SPDX Nirvana would be use of SPDX identifiers by projects. A glimpse of a galaxy far far away where there is one seamless automated path from projects to fully compliant end user products is so tantalizing. And we have talked about the two different parts of the SPDX story. One I will call SPDX output (a format for a machine and human readable expression of the contents of a specific package of software) and the other upstream use of SPDX identifiers that would make it easier to automate the production of quality SPDX output. We knew how to work on the SPDX output and we knew we could only hope that the developers would find something useful that would bring them into the conversation. We never expected to dictate to developers. I still believe both the SPDX sides come together at the top level goal: at the end of the supply chain products comply with the license obligations. I still believe SPDX output and Open Chain are important to achieving consistent end product compliance. But your notes are very helpful to my understanding of where the goals of the two parts of SPDX may diverge. The SPDX output team knows that the full potential of SPDX output will only be realized if the community norm is high quality. We only move the needle on compliance if the SPDX output is accurate. We only eliminate duplicative work if there is a reasonable expectation of SPDX reliability. We only get vendors in the supply chain to implement processes to know what OS they are using if the SPDX's are actually used and useful. So the SPDX team has focused on imposing exacting standards that will survive the scrutiny of even the most conservative lawyers. 
What I think I am hearing is that developers think that the lawyers have once again let the perfect be the enemy of the good. The developers want to devote more time to the code than they spend making lawyers happy. The highly exacting standards that add quality when the license identifier is used in the context of SPDX output feel ridiculous when they just want a convenient shorthand for a license reference.

I know it is more complicated than this, and I know I am not smart enough to know what the bridge is that keeps the conversations together. But I do know that both sides are valuable and that both sides have something to gain from the work of the other - use of OS projects in compliant end user products. And as you all know, I could go on and on about how important open source and collaborative development is to humankind.

Richard, thank you once again for the time you take to educate all of us. Your insights are invaluable. I think I am beginning to understand the disconnect. That is always the right first step.

From: spdx-legal-boun...@lists.spdx.org [mailto:spdx-legal-boun...@lists.spdx.org] On Behalf Of Kyle Mitchell
Sent: Wednesday, September 13, 2017 12:02 PM
To: Richard Fontana
Cc: SPDX-legal
Subject: Re: Package licensing part I - the approach - was Github example

My first order of business here is to reaffirm my gratitude to the stalwarts of the SPDX team. A frankly staggering amount of work and thought has gone into this and other lists over the years, and a very nice portion of that has settled its way out into various outputs---spec, license list, software, not GitHub repositories---from which others now benefit. Should be a great source of pride!

I think the distinction Mark introduced and Richard elaborated is a sound one. I could quibble with some details---or maybe some language, it may boil down to just that---but at a high level, I think I see the picture as they do.
Some see SPDX primarily as a way to describe others' work, and trade those conclusions. Some making software now see SPDX as a tool to apply themselves to their own work, directly, in ways that travel together with the code itself. The point I'd like to offer---and celebrate!---is that the _modularity_ of SPDX' approach facilitated those new and perhaps unanticipated uses. Many programming language package metadata standards picked up the License List for standardized strings to refer to specific form licenses quite early on. We also saw that with SPDX-License-Identifier in header comments. The license expression language allowed those on the self-describing side to take even _more_ work and wisdom from the SPDX spec. In the case of npm, we ran into adoption-implementation challenges right at the boundary between the expression language and the rest of the spec. How do we handle non-standard licenses? What do we do with LicenseRef-*? That's a very common experience when adopting an approach "abstracted out of" a lar
Re: Package licensing part I - the approach - was Github example
On Wed, Sep 13, 2017 at 11:07:52AM -0400, Richard Fontana wrote:
> The other SPDX is the use of something that *superficially* looks
> like SPDX-conformant license expressions to describe licensing in a
> way that is, I guess, outside the intended scope of SPDX. Examples
> of this nonconformant use of SPDX license expressions include
> developers annotating the licensing of their own software as well as
> distributors annotating the licensing of things they distribute.

That's officially supported by the spec with Appendix V (Using SPDX short identifiers in Source Files [1], new in 2.1 [2]).

> In particular I would assert that these two uses of SPDX are
> fundamentally in conflict.

Can you go into more detail about the conflict you see? License expressions seem like they're designed to express the license of a (possibly combined) work, including the declared license of a package [3] or file [4]. Both of those fields take license-expression values, with the additional values ‘NONE’ and ‘NOASSERTION’. That sounds like the exact same use case to me as the SPDX-License-Identifier use case (which also takes license-expression values [1]), with the difference being that an SPDX-License-Identifier entry in the file is the file author declaring the license, and the PackageLicenseDeclared and LicenseInfoInFile entries in an external SPDX file may have been written by someone else. But in both cases, they're trying to express a declared license for the file. The npm package.json license field [5] is just like SPDX-License-Identifier, except it's the author declaring a package-level license.

All of these use cases need a compact, machine-readable way to express the license of a work, and license expressions seem like a good fit for all of them to me.

There is the difficulty that outside of an SPDX file, there's no way to define custom licenses (LicenseRef-*).
For most projects, the licenses and exceptions they need are already in the SPDX license list, so they don't need that functionality. For projects that need a license that's not in the SPDX list in a license-expression-only context, submitting the license for SPDX inclusion is fairly straightforward. The only issue with submission is that sometimes the license is rejected (although I don't have a link I can cite for this) and sometimes it is judged sufficiently similar to an existing license (e.g. “and” vs. “and/or” in ISC-ish licenses [6,7]). But that's not an insurmountable problem. We can always add a:

  LicenseURI "${URI}"

operator to license expressions that supports RFC 3986 URIs [8] to provide folks with a way to reference not-in-the-SPDX-list licenses (with a similar ExceptionURI for exceptions?).

You're obviously not going to have all the expressive power of a full SPDX file in the license expression, but if all you're aiming for is declaring a license, I don't see why you would need more structure than license expressions.

Cheers,
Trevor

[1]: https://spdx.org/spdx-specification-21-web-version#h.twlc0ztnng3b
[2]: https://spdx.org/spdx-specification-21-web-version#h.1sh8jn1fc5zw
[3]: https://spdx.org/spdx-specification-21-web-version#h.1hmsyys
[4]: https://spdx.org/spdx-specification-21-web-version#h.111kx3o
[5]: https://docs.npmjs.com/files/package.json#license
[6]: https://lists.spdx.org/pipermail/spdx-legal/2015-April/001398.html
     Subject: New License/Exception Request
     Date: Thu Apr 30 17:56:52 UTC 2015
[7]: https://github.com/spdx/license-list-XML/pull/423/commits/233ae572d9e30cf48bd4c887e9c069626e76a93b
[8]: https://tools.ietf.org/html/rfc3986
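
As an illustration of the declared-license case discussed in this message, the following Python sketch pulls the expression out of an SPDX-License-Identifier header comment and lists any LicenseRef-* operands, which are the ones that cannot be resolved without an accompanying SPDX file. The function names, the regular expressions, and the sample source text are assumptions made for this example; it is not a full license-expression parser and does not validate identifiers against the license list.

```python
import re
from typing import List, Optional

# Matches the tag anywhere on a line, e.g. inside "//" or "/* ... */".
TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def declared_expression(source_text: str) -> Optional[str]:
    """Return the first SPDX-License-Identifier expression, if any."""
    for line in source_text.splitlines():
        match = TAG.search(line)
        if match:
            # Drop a trailing comment closer such as "*/" or "-->".
            return re.sub(r"\s*(\*/|-->)\s*$", "", match.group(1)).strip()
    return None


def custom_references(expression: str) -> List[str]:
    """List operands that point at licenses defined outside the list."""
    tokens = re.findall(r"[A-Za-z0-9.+:-]+", expression)
    return [t for t in tokens
            if t.startswith("LicenseRef-") or t.startswith("DocumentRef-")]


# Hypothetical source file header.
sample = (
    "/* SPDX-License-Identifier: (GPL-2.0 OR MIT) AND LicenseRef-Foo */\n"
    "int main(void) { return 0; }\n"
)
expression = declared_expression(sample)
unresolved = custom_references(expression) if expression else []
```

A tool consuming expression-only metadata could use a check like `custom_references` to flag declarations that need either a license-list submission or some out-of-band definition.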
Re: Package licensing part I - the approach - was Github example
My first order of business here is to reaffirm my gratitude to the stalwarts of the SPDX team. A frankly staggering amount of work and thought has gone into this and other lists over the years, and a very nice portion of that has settled its way out into various outputs---spec, license list, software, not GitHub repositories---from which others now benefit. Should be a great source of pride! I think the distinction Mark introduced and Richard elaborated is a sound one. I could quibble with some details---or maybe some language, it may boil down to just that---but at a high level, I think I see the picture as they do. Some see SPDX primarily as a way to describe others' work, and trade those conclusions. Some making software now see SPDX as a tool to apply themselves to their own work, directly, in ways that travel together with the code itself. The point I'd like to offer---and celebrate!---is that the _modularity_ of SPDX' approach facilitated those new and perhaps unanticipated uses. Many programming language package metadata standards picked up the License List for standardized strings to refer to specific form licenses quite early on. We also saw that with SPDX-License-Identifier in header comments. The license expression language allowed those on the self-describing side to take even _more_ work and wisdom from the SPDX spec. In the case of npm, we ran into adoption-implementation challenges right at the boundary between the expression language and the rest of the spec. How do we handle non-standard licenses? What do we do with LicenseRef-*? That's a very common experience when adopting an approach "abstracted out of" a larger system. The origin and the newly independent tool never quite totally separate. I'd suggest that we reinforce and celebrate the extent to which the wise organization of SPDX and its outputs facilitates diverse use. Diverse use tends to bring in more people, which makes a better project. 
Diverse use tends to reveal more about the project and the nature of good solutions, under different demands and aspirations, which also makes a better project.

-- 
Kyle Mitchell, attorney // Oakland // (510) 712 - 0933
Re: Package licensing part I - the approach - was Github example
Just a comment, which seems to resonate with some of what you are saying and expresses something I've been struggling with for a while:

As (mostly) an intentionally-not-watching-too-closely bystander wrt SPDX, for some time I've realized that SPDX means at least two different things.

There is SPDX as contemplated by those who have been most closely and actively involved in its development. This anticipates the creation of SPDX-conformant files and among other things defines an important distinction between "declared" and "concluded" licenses -- while expecting the use of an identical license expression language for both, which, as an aside, I think is conceptually very confusing.

The other SPDX is the use of something that *superficially* looks like SPDX-conformant license expressions to describe licensing in a way that is, I guess, outside the intended scope of SPDX. Examples of this nonconformant use of SPDX license expressions include developers annotating the licensing of their own software as well as distributors annotating the licensing of things they distribute.

It's the second use of SPDX that in recent years I see catching on in the real world of developers and vendors and users, and not the first. The first use of SPDX continues after many years to be relegated to an extremely small set of enthusiasts, and I think to the rest of the world seems impractical for various reasons. In particular I would assert that these two uses of SPDX are fundamentally in conflict.

From what I can see, the SPDX group counts the second development as a set of signs of "adoption of SPDX". For example, when Fedora began to consider "switching to SPDX" for RPM spec file license tags (which still is nowhere near actual adoption and implementation, for reasons relating to this post), I think the SPDX group saw that as a potential major victory. But that is not really accurate at all. What's happening is that SPDX license expressions have been hijacked to a non-contemplated use which is of much greater interest to the community than the original contemplated use for SPDX.

These seem to correspond directly to your two stakeholders, except the second stakeholder in my mind also includes non-developers who are looking for a simple, standardized way to adequately describe the licensing of things they distribute. Maybe that's a third stakeholder which sees value in the LEL that is similar to what the second stakeholder sees.

Now speaking specifically of things Red Hat is involved with, we have essentially zero interest in the first use of SPDX. Basically no one we encounter (outside of some individuals on this list) wants to create or see conformant SPDX files - not projects, not us as a company, not our customers. But we see a growing interest in the second use of SPDX from developers of Red Hat-maintained projects, in certain internal engineering efforts, and with a small number of customers, and an accompanying growing dissatisfaction with other conventions for describing licensing of software components. And one of the big challenges in this second use of SPDX is developing a set of conventions that hides sufficient complexity in license description, which I think is philosophically completely at odds with the basic direction of official SPDX. The conflict is such that I wonder whether there really shouldn't be a separate official effort around the second use of SPDX.

Richard

----- Original Message -----
From: "Mark Gisi"
To: "SPDX-legal"
Sent: Wednesday, September 13, 2017 1:13:51 AM
Subject: Package licensing part I - the approach - was Github example

How to move forward:

It appears we have not collectively agreed on what the problem is. I believe this is because there are at least two different stakeholders expressing two different sets of requirements for the License Expression Language (LEL).

Stakeholder 1 (Traditional): Linux License Compliance people who use SPDX to deliver accurate and complete licensing information for Linux packages - many of which they did not author but may have patched. Accurate and complete licensing means from the package level down to each file level (using reasonable efforts).

Stakeholder 2 (New): Developers who want to use license expressions i) in their code in place of the more traditional license notices and ii) for package level licensing designations.

There are three steps I would like to suggest we achieve before developing updates to the LEL.

1) Describe the background on how the LEL was designed to date and the process used. The hope is that we can continue using the same process.
2) Define the requirements for the two different stakeholders and perhaps identify other stakeholders or correct the ones that are identified above.
3) Use the requirements to come up with a more precise problem description.

Before we proceed - any feedback