Comments inline.

From: David Kemp <dk1...@gmail.com>
Sent: Friday, November 5, 2021 7:40 AM
To: William Bartholomew (CELA) <will...@microsoft.com>
Cc: nis...@vmware.com; Spdx-tech@lists.spdx.org
Subject: Re: [EXTERNAL] Re: [spdx-tech] Collection member Elements

On Thu, Nov 4, 2021 at 7:16 PM William Bartholomew (CELA) 
<will...@microsoft.com> wrote:
Collection was created to be a superclass of both ContextualCollection and 
Document because they have shared traits and are both containers. SBOMs do have 
external maps (because they indirectly inherit from Collection and external 
maps are attached to Collection).

The drawing you emailed in September shows Collection with subclasses 
ContextualCollection and Document.  Package/BOM/SBOM inherit from 
ContextualCollection, and ExternalMap is a property of Document, unavailable to 
Package/BOM/SBOM.  Could you mail or post the latest logical model?  I'm 
apparently looking at an old version.  Glad that's been fixed.
[WillBar] Sorry, you're correct: namespaceMap is on Collection but externalMap 
is on Document, and that is because of the issue discussed below. Sorry for the 
red herring there.

The "verifiedUsing" (today) isn't how you verify the element but the thing that 
the element is describing. For example, a File element's verifiedUsing is the 
hash of the file, not of the File element.

Element has verifiedUsing, which for an Artifact/File *might* be the hash of 
the data retrieved from the artifactURL. But that's inconsistent with how an 
Identity Element is verified: since there is no identityURL (or relationshipURL 
or annotationURL), the only hashable information is the properties of those 
Elements.  I think it would be clearer to make artifactVerifiedUsing a property 
of Artifact, leaving the Element's verifiedUsing to always hash the Element 
properties regardless of the Element subclass being hashed.

[WillBar] "verifiedUsing" on an Identity element was never intended to be a 
hash of the element. To be clear, hashing of individual elements has not been 
discussed until very recently, so nothing in the model today was intended for 
that purpose. "verifiedUsing" on an Identity element would likely be things 
like public keys, GPG key fingerprints, decentralized identity documents, etc.

We do not currently have a way (outside of a Document) to generate a hash of an 
element. This is why the model for referencing external elements is bound to 
Document: without that we have no way of a) knowing where to fetch the 
element from and b) verifying its integrity.
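
As a rough illustration of the document-level check described above, here is a 
minimal Python sketch. It is not part of the SPDX model: the ExternalMapEntry 
class, its field names, and verify_document_bytes are all hypothetical, and it 
assumes the integrity value is a plain hex digest computed over the serialized 
Document bytes.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalMapEntry:
    """Hypothetical external map entry; field names are illustrative only."""
    element_id: str         # IRI of the externally defined element
    document_location: str  # a) where the serialized Document can be fetched
    algorithm: str          # hash algorithm name, e.g. "sha256"
    expected_digest: str    # b) hex digest of the serialized Document bytes


def verify_document_bytes(entry: ExternalMapEntry, document_bytes: bytes) -> bool:
    """Return True if the fetched Document bytes match the recorded digest.

    Integrity is checked on the serialized Document as a whole, not on the
    individual element, because only document-level integrity is assumed here.
    """
    digest = hashlib.new(entry.algorithm, document_bytes).hexdigest()
    return digest == entry.expected_digest.lower()
```

Under these assumptions, a consumer would fetch the bytes from 
document_location, call verify_document_bytes, and only then look up 
element_id inside the parsed Document.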

ExternalMap's verifiedUsing property would be the hash or signature value of 
the data referenced by the "elementURL" property, which must be elementIRI 
because the Element "id" property is an IRI.  (If there were also an optional 
elementURL property, it would become a dead link as soon as the data was 
transferred to a non-Internet environment.)
[WillBar] Per above, this was never the intent of "verifiedUsing". We have no 
per-element (collection or not) integrity, only document-level integrity. There 
is a desire to have this capability, but it has not been defined yet and there 
are concerns about the practicality and the immediacy of the need for it. I 
think this is the key decision for us to make next week.

Defining a standalone integrity value for an individual collection member is a 
topic for future work; it doesn't affect the model now.  Discussing it now is a 
red herring.

For now, assume a Collection that can be verifiedUsing=sha256:mblurf29380u4...:
[WillBar] This is an assumption that is not true today. We have not defined a 
way to determine the integrity of a collection or of an element, only of a 
serialized document. That is the root of the issue here.

A:  id=http://abcdef/,  type=Collection, created=Gary/June
|---B: id=http://abcdef/file1, type=File
|---C: id=http://abcdef/file2, type=File
|---D: id=http://abcdef/id1, type=Identity
|---E: id=http://abcdef/a1, type=Annotation

B-E are minted as part of A and have A's creation info.  There is no individual 
hash defined for the collection members, only A is hashed.  You don't need to 
put an Element in a "Document" to compute the Element's hash.  A generic 
Collection, or a Collection subclass like SBOM (Element A) has a hash, is 
minted as a unit, and SBOM member Elements B,C,D,E are included in A's hash.  
If a Relationship in a different Collection references Identity D, it cannot 
obtain D's property values without obtaining A (and is thus able to compute 
A's hash), because D is minted as part of A.
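
To make the "only A is hashed" idea concrete, here is a small Python sketch 
using the example tree above. The canonical form is purely an assumption for 
illustration (JSON with sorted keys, compact separators, and members ordered by 
id); no canonicalization has been defined for SPDX, and collection_digest and 
the "members" key are hypothetical names.

```python
import hashlib
import json


def collection_digest(collection: dict) -> str:
    """Hash a Collection together with all of its member Elements as one unit.

    Assumed canonical form (not defined by SPDX): members sorted by id, then
    JSON with sorted keys and no insignificant whitespace, encoded as UTF-8.
    """
    members = sorted(collection["members"], key=lambda m: m["id"])
    canonical = json.dumps(
        {**collection, "members": members},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


# Collection A and members B-E from the example above.
collection_a = {
    "id": "http://abcdef/",
    "type": "Collection",
    "created": "Gary/June",
    "members": [
        {"id": "http://abcdef/file1", "type": "File"},
        {"id": "http://abcdef/file2", "type": "File"},
        {"id": "http://abcdef/id1", "type": "Identity"},
        {"id": "http://abcdef/a1", "type": "Annotation"},
    ],
}

digest_a = collection_digest(collection_a)
```

In this sketch no member has a digest of its own: changing any property of B-E 
changes digest_a, while reordering members or keys does not.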
We had a discussion a couple of weeks ago about how this might be possible 
(canonicalization of elements and hashing of the canonicalized form) but that 
has its own challenges, and we didn't conclude that conversation. If we can 
define a way to guarantee the integrity of individual elements and a way to 
fetch them, then yes, Document becomes far less interesting. I also believe 
this is only an exchange problem: once you've fetched a Document and verified 
its integrity, the elements within it can stand alone and you can query over 
the graphs between them, completely ignoring Document.

Document is unnecessary now (with only Collection hashes) and in the future (if 
independent Collection member hashes are ever defined).

I'm unconvinced the SBOM should be a unit of exchange; the SBOM will often 
reference other things that you want to transfer along with the SBOM 
(identities, licenses, etc.). Having the unit of exchange be Document (and 
there's no reason you can't have a Document that only contains an SBOM if you 
want) gives you the flexibility of transferring multiple independent elements 
in a single exchange.

That's unusual - I understood SBOM to be a set of Artifacts (Packages, Files, 
Snippets) *and* the Identities, Relationships and Annotations that apply to 
those Artifacts. The foundational reason for making all of these inherit from 
Element is so they can all be included in a Collection of Elements and be 
addressed by their Element IRIs.  The logical diagram doesn't currently show 
licenses, but they will have to be modeled such that licensing info is included 
in Elements (presumably by adding license properties to Artifact - I assume 
Identities/Relationships/Annotations don't have licenses).

Bundle (little-d document, non-contextual collection) is the unit of transfer 
for multiple independent/unrelated Elements in a single exchange, such as 
moving them from the Internet to a closed/internal environment.  The fact of 
putting a bunch of Elements into a unit of transfer does not create any logical 
context/relationship among them.  Once they have been transferred the unit of 
transfer disappears, leaving no trace of itself in the transferred set of 
Elements.  If the unit of transfer is itself intended to become an Element with 
an IRI after the transfer is finished, then simply create and transfer a 
Collection or its subclass.

[WillBar] Document is intended to have the same purpose as Bundle. There were 
weeks of debate across both the 3T-SBOM and SPDX communities about whether 
Document should be renamed to Bundle; the decision was made not to.


Regards,
Dave



