Comments in line.

From: David Kemp <dk1...@gmail.com>
Sent: Friday, November 5, 2021 7:40 AM
To: William Bartholomew (CELA) <will...@microsoft.com>
Cc: nis...@vmware.com; Spdx-tech@lists.spdx.org
Subject: Re: [EXTERNAL] Re: [spdx-tech] Collection member Elements
On Thu, Nov 4, 2021 at 7:16 PM William Bartholomew (CELA) <will...@microsoft.com> wrote:

Collection was created to be a superclass of both ContextualCollection and Document because they have shared traits and are both containers. SBOMs do have external maps (because they indirectly inherit from Collection and external maps are attached to Collection).

The drawing you emailed in September shows Collection with subclasses ContextualCollection and Document. Package/BOM/SBOM inherit from ContextualCollection, and ExternalMap is a property of Document, unavailable to Package/BOM/SBOM. Could you mail or post the latest logical model? I'm apparently looking at an old version. Glad that's been fixed.

[WillBar] Sorry, you're correct: namespaceMap is off Collection but externalMap is off Document, and that is because of the issue discussed below. Sorry for the red herring there.

The "verifiedUsing" (today) isn't how you verify the element but the thing that the element is describing. For example, a File element's verifiedUsing is the hash of the file, not of the File element.

Element has verifiedUsing, which for an Artifact/File *might* be the hash of the data retrieved from the artifactURL. But that's inconsistent with how an Identity Element is verified: since there is no identityURL (or relationshipURL or annotationURL), the only hashable information is the properties of those Elements. I think it would be clearer to make artifactVerifiedUsing a property of Artifact, leaving the Element's verifiedUsing to always hash the Element properties regardless of the Element subclass being hashed.
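The two kinds of hash contrasted above (a hash of the artifact's content versus a hash of an Element's own properties) can be sketched as follows. This is only an illustration, not part of the SPDX model: the dictionary layout, the element values, and the use of sorted-key JSON as a stand-in for canonicalization are all assumptions, and the thread itself notes that no canonical form for elements has been defined.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def hash_element_properties(element: dict) -> str:
    """Hash an Element's own properties.

    Assumption: sorted-key, whitespace-free JSON stands in for a
    canonical form; the thread does not define one.
    """
    canonical = json.dumps(element, sort_keys=True, separators=(",", ":"))
    return sha256_hex(canonical.encode("utf-8"))


# Hypothetical content of the file that a File element describes.
file_bytes = b"hello world\n"

# Hypothetical File element carrying the proposed artifactVerifiedUsing
# property, which holds the hash of the described content.
file_element = {
    "id": "http://abcdef/file1",
    "type": "File",
    "artifactVerifiedUsing": "sha256:" + sha256_hex(file_bytes),
}

# Hypothetical Identity element: there is no content URL to fetch,
# so the only hashable information is its properties.
identity_element = {"id": "http://abcdef/id1", "type": "Identity", "name": "Gary"}

artifact_hash = sha256_hex(file_bytes)                 # hash of the thing described
file_element_hash = hash_element_properties(file_element)        # hash of the Element
identity_element_hash = hash_element_properties(identity_element)
```

Under these assumptions, `artifact_hash` and `file_element_hash` are different values computed over different inputs, which is the ambiguity a single "verifiedUsing" property would have to resolve.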
[WillBar] "verifiedUsing" on an Identity element was never intended to be a hash of the element. To be clear, hashing of individual elements has not been discussed until very recently, so nothing in the model today was intended for that purpose. "verifiedUsing" on an Identity element would likely be things like public keys, GPG key fingerprints, decentralized identity documents, etc.

We do not currently have a way (outside of a Document) to generate a hash of an element. This is why the model for referencing external elements is bound to Document: without that we have no way of a) knowing where to fetch the element from and b) verifying its integrity.

ExternalMap's verifiedUsing property would be the hash or signature value of the data referenced by the "elementURL" property, which must be elementIRI because the Element "id" property is an IRI. (If there were also an optional elementURL property, it would become a dead link as soon as the data was transferred to a non-Internet environment.)

[WillBar] Per above, this was never the intent of "verifiedUsing". We have no per-element (collection or not) integrity, only document-level integrity. There is a desire to have this capability, but it has not been defined yet and there are concerns about the practicality and the immediacy of the need for it. I think this is the key decision for us to make next week.

Defining a standalone integrity value for an individual collection member is a topic for future work; it doesn't affect the model now. Discussing it now is a red herring. For now, assume a Collection that can be verifiedUsing=sha256:mblurf29380u4...:

[WillBar] This is an assumption that is not true today: we have not defined a way to determine the integrity of a collection or of an element, only of a serialized document. That is the root of the issue here.
A: id=http://abcdef/, type=Collection, created=Gary/June
|---B: id=http://abcdef/file1, type=File
|---C: id=http://abcdef/file2, type=File
|---D: id=http://abcdef/id1, type=Identity
|---E: id=http://abcdef/a1, type=Annotation

B-E are minted as part of A and have A's creation info. There is no individual hash defined for the collection members; only A is hashed. You don't need to put an Element in a "Document" to compute the Element's hash. A generic Collection, or a Collection subclass like SBOM (Element A), has a hash, is minted as a unit, and SBOM member Elements B, C, D, E are included in A's hash. If a Relationship in a different Collection references Identity D, it cannot obtain D's property values without A (and is thus able to compute A's hash) because D is minted as part of A.

We had a discussion a couple of weeks ago about how this might be possible (canonicalization of elements and hashing of the canonicalized form), but that has its own challenges, and we didn't conclude that conversation. If we can define a way to guarantee integrity of individual elements and how to fetch them, then yes, Document becomes far less interesting. I also believe this is only an exchange problem: once you've fetched a Document and verified its integrity, the elements within it can stand alone and you can query over the graphs between them, completely ignoring Document.

Document is unnecessary now (with only Collection hashes) and in the future (if independent Collection member hashes are ever defined).

I'm unconvinced the SBOM should be a unit of exchange; the SBOM will often reference other things that you want to transfer along with the SBOM (identities, licenses, etc.). Having the unit of exchange be Document (and there's no reason you can't have a Document that only contains an SBOM if you want) gives you the flexibility of transferring multiple independent elements in a single exchange.
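The "Collection A is hashed as a unit" idea from the example above can be sketched like this. It follows the assumption made in that example rather than anything the model defines today (as noted in the replies, only serialized documents have defined integrity). The "members" key, the sorted-key JSON canonical form, and the helper names `collection_hash` and `resolve_member` are hypothetical.

```python
import hashlib
import json


def collection_hash(collection: dict) -> str:
    """Hash a whole Collection, members included, as one unit.

    Assumption: sorted-key, whitespace-free JSON stands in for a
    canonical serialization; none is defined in the thread.
    """
    canonical = json.dumps(collection, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Collection A and its members B-E, as in the example above.
collection_a = {
    "id": "http://abcdef/",
    "type": "Collection",
    "created": "Gary/June",
    "members": [  # hypothetical property name
        {"id": "http://abcdef/file1", "type": "File"},
        {"id": "http://abcdef/file2", "type": "File"},
        {"id": "http://abcdef/id1", "type": "Identity"},
        {"id": "http://abcdef/a1", "type": "Annotation"},
    ],
}


def resolve_member(collection: dict, expected_hash: str, member_id: str) -> dict:
    """Return a member Element only after the whole Collection verifies.

    There is no per-member hash: a reference to D is checked by
    verifying A, because D is minted as part of A.
    """
    if collection_hash(collection) != expected_hash:
        raise ValueError("Collection failed integrity check")
    for member in collection["members"]:
        if member["id"] == member_id:
            return member
    raise KeyError(member_id)


# A referencing party records A's hash, then resolves Identity D through A.
expected = collection_hash(collection_a)
identity_d = resolve_member(collection_a, expected, "http://abcdef/id1")
```

In this sketch, changing any member changes A's hash, so a reference to D from another Collection cannot be satisfied by a modified copy of A.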
That's unusual - I understood SBOM to be a set of Artifacts (Packages, Files, Snippets) *and* the Identities, Relationships and Annotations that apply to those Artifacts. The foundational reason for making all of these inherit from Element is so they can all be included in a Collection of Elements and be addressed by their Element IRIs. The logical diagram doesn't currently show licenses, but they will have to be modeled such that licensing info is included in Elements (presumably by adding license properties to Artifact - I assume Identities/Relationships/Annotations don't have licenses).

Bundle (little-d document, non-contextual collection) is the unit of transfer for multiple independent/unrelated Elements in a single exchange, such as moving them from the Internet to a closed/internal environment. The fact of putting a bunch of Elements into a unit of transfer does not create any logical context/relationship among them. Once they have been transferred, the unit of transfer disappears, leaving no trace of itself in the transferred set of Elements. If the unit of transfer is itself intended to become an Element with an IRI after the transfer is finished, then simply create and transfer a Collection or its subclass.

[WillBar] Document is intended to have the same purpose as Bundle. There were weeks of debate across both the 3T-SBOM and SPDX communities about whether Document should be renamed to Bundle; the decision was made not to.

Regards,
Dave

View/Reply Online (#4233): https://lists.spdx.org/g/Spdx-tech/message/4233