[COSE] Re: [Cbor] Enveloped => Embedded CBOR signatures

Orie Thu, 17 Apr 2025 08:19:32 -0700

Hi Wolf,

Indeed, Gordian has many of these interesting and useful properties.
They can also be achieved with a combination of COSE Receipts, SD-CWT, COSE
Key Thumbprints, and c509.
Given enough time, I could reproduce them for TOML.


However, the primary point I was trying to make in my long previous post is
this:

If you are building a graph of signed data, it helps to have lots of signed
data.

Making a new signing format that is graph compatible, and hoping for
adoption and deployment that place this new format above the existing data
sources, is a difficult task.

A more direct solution to this problem would be to project data that is
understood into the graph format, with counter signatures as you go.

This is basically what SCITT / COSE Receipts enable... They are CBOR, but
they serve other data formats... and in order to serve those formats, they
MUST preserve their bytes exactly as they are.

There is an interesting open issue, which we're still debating in SCITT
regarding this:

https://github.com/ietf-wg-scitt/draft-ietf-scitt-architecture/pull/371

What happens when you want to "secure or make transparent or notarize" data
that is in an unprotected header?

We've got some options:

1. just drop the unprotected header, and any data that is inside it... it's
easy to secure the empty header.
2. keep the unprotected header as is and add it to a merkle tree, signing
the new root... this means the unprotected header bytes, as they exist at
registration time, are made transparent.
3. canonicalize the unprotected header, create a set of rules that issuers
and verifiers can use to understand what goes into the log and what is left
out.

Unsurprisingly I'm a fan of 2, because we know people put things other than
receipts in unprotected headers, such as counter signatures and
timestamps... and you would want to make those transparent, if they were
part of deciding if the original message should be transparent.
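
To make the difference between options 1 and 2 concrete, here is a minimal
Python sketch. It is not the SCITT registration algorithm: it only borrows the
RFC 9162-style leaf and node hashing, and the `SignedMessage` tuple, the
length-prefixed framing, and the byte strings are all assumptions invented for
illustration.

```python
import hashlib
from typing import NamedTuple


class SignedMessage(NamedTuple):
    """Hypothetical stand-in for a signed message, held as raw byte strings."""
    protected: bytes
    unprotected: bytes  # header bytes exactly as they exist at registration
    payload: bytes
    signature: bytes


def frame(*parts: bytes) -> bytes:
    # Length-prefix each part so the concatenation is unambiguous (assumed framing).
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def entry_option1(m: SignedMessage) -> bytes:
    # Option 1: drop the unprotected header before logging.
    return frame(m.protected, b"", m.payload, m.signature)


def entry_option2(m: SignedMessage) -> bytes:
    # Option 2: log the unprotected header bytes as-is.
    return frame(m.protected, m.unprotected, m.payload, m.signature)


def leaf_hash(entry: bytes) -> bytes:
    # RFC 9162-style leaf hash: 0x00 prefix, then the entry bytes.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 9162-style interior node hash: 0x01 prefix, then both children.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    # Split at the largest power of two strictly below len(leaves).
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaves[0]
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


msg = SignedMessage(b"prot", b"timestamp=A", b"payload", b"sig")
edited = msg._replace(unprotected=b"timestamp=B")
other = leaf_hash(b"some other log entry")

root1_a = merkle_root([other, leaf_hash(entry_option1(msg))])
root1_b = merkle_root([other, leaf_hash(entry_option1(edited))])
root2_a = merkle_root([other, leaf_hash(entry_option2(msg))])
root2_b = merkle_root([other, leaf_hash(entry_option2(edited))])

print(root1_a == root1_b)  # option 1: header edits are invisible to the log
print(root2_a == root2_b)  # option 2: header edits change the signed root
```

Under option 2 the log commits to whatever was in the unprotected header at
registration time, which is the property I care about for counter signatures
and timestamps.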

I've implemented 3, so that you can (if you follow the rules), detach the
receipt, recompute the tree entry from the signed message, and verify the
receipt... but this process is fragile when you keep attaching and detaching
data in the unprotected header; it's easy to end up with a message that
won't verify against the inclusion proof.
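
A rough Python sketch of that fragility, assuming a made-up rule set for
option 3: the entry is recomputed from the protected header, the payload, and
every unprotected entry except attached receipts. The label names, the framing,
and `tree_entry` itself are hypothetical and are not taken from the SCITT PR.

```python
import hashlib

# Hypothetical header labels, chosen only for this sketch.
RECEIPTS = "receipts"
COUNTERSIGNATURE = "countersignature"


def tree_entry(protected: bytes, unprotected: dict[str, bytes], payload: bytes) -> bytes:
    """Recompute the log entry, ignoring attached receipts (assumed rule)."""
    h = hashlib.sha256()
    for part in (protected, payload):
        h.update(len(part).to_bytes(4, "big") + part)
    # Visit the remaining header entries in sorted label order.
    for label in sorted(unprotected):
        if label == RECEIPTS:
            continue  # receipts are detached before hashing
        key = label.encode()
        value = unprotected[label]
        h.update(len(key).to_bytes(4, "big") + key)
        h.update(len(value).to_bytes(4, "big") + value)
    return h.digest()


registered = tree_entry(b"prot", {}, b"payload")

# Attaching a receipt is harmless, because the rules strip it again.
with_receipt = tree_entry(b"prot", {RECEIPTS: b"proof"}, b"payload")

# Attaching anything else after registration changes the recomputed entry.
with_countersig = tree_entry(
    b"prot", {RECEIPTS: b"proof", COUNTERSIGNATURE: b"sig2"}, b"payload"
)

print(registered == with_receipt)     # the receipt still verifies
print(registered == with_countersig)  # the receipt no longer matches
```

Any header entry the rules do not know how to strip ends up inside the hash, so
a later attachment silently invalidates the inclusion proof.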

This fragility is similar to the fragility that you get when attaching and
detaching signatures while trying to consistently produce hashes.

My preference would be to allow for implementations that can handle this
fragility too, but not require it...and certainly not forbid it... but I'm
happy to be convinced by others.

If you understand this issue of embedding and canonicalizing, and have the
cycles to review the SCITT PR, we could use the feedback.

Regards,

OS






On Thu, Apr 17, 2025 at 1:51 AM Wolf McNally <[email protected]> wrote:

> Hi Orie,
>
> Here’s the “Person → Employment → Company” fact expressed in Gordian
> Envelope, using two specific patterns we’ve defined:
>
> * ARIDs – random, implementation‑agnostic identifiers
>
> https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2022-002-arid.md
> * Signature‑with‑Metadata – four‑step recipe for binding a signature, its
> metadata, and the payload
>
> https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2024-009-signature-metadata.md
>
> ---
>
> ### 1  Observation payload
>
> ARID(<obs‑id>) [
>     'isA'  : "Observation"
>     'type' : "Employment"
>     'of'   : ARID(<alice>)
>     'date' : 2021‑01‑01
> ]
>
> ARIDs are random, used to identify objects that can mutate over time. If
> you want the subject to refer to a record that is immutable and
> content-addressable, you can use a `Digest` as the subject.
>
> The payload is wrapped so it can be signed as a whole. The braces signify
> that the *entire* envelope, including its assertions, is a single subject
> that can have additional assertions added to it that apply to it as a whole:
>
> {
>     ARID(<obs‑id>) [
>         'isA'  : "Observation"
>         'type' : "Employment"
>         'of'   : ARID(<alice>)
>         'date' : 2021‑01‑01
>     ]
> }
>
> ---
>
> ### 2  Bob’s signature + metadata
>
> {
>     Signature(<bob>) [               // sig over observation digest
>                                      // (from the wrapped envelope above)
>         'signer'    : ARID(<bob>)
>         'purpose'   : "observedBySigner"
>         'validFrom' : 2021‑01‑01
>         'validUntil': 2022‑01‑01
>     ]
> } [
>     'signed' : Signature(<bob>)      // sig over metadata digest
>                                      // (the signature + metadata envelope)
> ]
>
> ---
>
> ### 3  Attach the signature to the observation
>
> The 'signed': { ... } is a *single assertion* that is added to the
> original wrapped observation.
>
> {
>     ARID(<obs‑id>) [
>         'isA'  : "Observation"
>         'type' : "Employment"
>         'of'   : ARID(<alice>)
>         'date' : 2021‑01‑01
>     ]
> } [
>     'signed' : {
>         Signature(<bob>) [               // sig over observation digest
>             'signer'    : ARID(<bob>)
>             'purpose'   : "observedBySigner"
>             'validFrom' : 2021‑01‑01
>             'validUntil': 2022‑01‑01
>         ]
>     } [
>         'signed' : Signature(<bob>)      // sig over metadata digest
>     ]
> ]
>
> A verifier checks two signatures with Bob’s key—one on the observation,
> one on the metadata. Both must be made with the same private key.
>
> ---
>
> ### 4  Optional elision
>
> Subscribers to the stream might only see the ID of the observation and who
> authenticated it, but not the observations themselves. Envelopes can be
> elided or encrypted at any granularity, and because signatures cover the
> digest, they still verify even when the content is elided or encrypted. And
> you can always replace any `ELIDED` branch with some or all of its original
> content, enabling progressive disclosure.
>
> {
>     ARID(<obs‑id>) [
>         ELIDED (4)          // Envelope shorthand: four elided assertions
>                             // collapsed onto a single line
>     ]
> } [
>     'signed' : {
>         Signature(<bob>) [               // sig over observation digest
>             'signer'    : ARID(<bob>)
>             'purpose'   : "observedBySigner"
>             'validFrom' : 2021‑01‑01
>             'validUntil': 2022‑01‑01
>         ]
>     } [
>         'signed' : Signature(<bob>)      // sig over metadata digest
>     ]
> ]
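
The reason elision does not break the signature can be shown with a toy digest
tree in Python. This is only an illustration of the idea and not Gordian
Envelope's actual digest algorithm or encoding: the `Leaf`, `Elided`, and `Node`
classes and the sort-then-hash rule are assumptions made for this sketch.

```python
import hashlib
from dataclasses import dataclass


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class Leaf:
    value: bytes

    def digest(self) -> bytes:
        return sha256(self.value)


@dataclass(frozen=True)
class Elided:
    saved: bytes  # digest of the content that was removed

    def digest(self) -> bytes:
        return self.saved


@dataclass(frozen=True)
class Node:
    subject: object
    assertions: tuple

    def digest(self) -> bytes:
        # Sort child digests so assertion order does not matter (toy rule).
        children = sorted(a.digest() for a in self.assertions)
        return sha256(self.subject.digest() + b"".join(children))


def elide(item) -> Elided:
    # Keep only the digest; the content itself is dropped.
    return Elided(item.digest())


full = Node(
    Leaf(b"ARID(obs-id)"),
    (
        Leaf(b"isA: Observation"),
        Leaf(b"type: Employment"),
        Leaf(b"of: ARID(alice)"),
        Leaf(b"date: 2021-01-01"),
    ),
)
redacted = Node(full.subject, tuple(elide(a) for a in full.assertions))

# A signature made over full.digest() also covers the redacted form.
print(full.digest() == redacted.digest())
```

Restoring any elided branch is the reverse substitution, and the root digest
stays the same in either direction.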
>
> ---
>
> Would you agree this addresses the issues you envisioned?
>
> * No forced canonicalization – the inner JWT/CBOR/whatever stays
> byte‑for‑byte.
> * Hash‑addressed substitution / elision – any branch can be replaced by
> `ELIDED`; signatures remain valid.
> * Signer‑subject linkage – explicit `'signer'` field, and the outer
> signature binds the metadata to the attestation, while keeping it severable.
> * Any envelope can have a `'salt'` assertion added to it to decorrelate
> its contents from its digest.
> * Multiple signers – just add parallel `'signed'` assertions, each
> embedding its own metadata envelope, without touching the original event.
> * Replica reconciliation (future work) – because envelopes are immutable
> and hash‑addressed, they lend themselves to CRDT‑style set‑union sync:
> peers could exchange their replicas and make the union, with conflicting
> facts persisting side‑by‑side until resolved by a follow‑up assertion
> (e.g., 'contradicts', 'supersedes'). We haven’t built the reference code
> yet, but the data model is designed to support it.
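
As a rough idea of what set-union sync over hash-addressed records could look
like, here is a small Python sketch. It is not reference code for Gordian
Envelope (none exists yet, per the note above); the `address`, `replica`, and
`merge` helpers and the plain byte-string "envelopes" are assumptions.

```python
import hashlib


def address(envelope: bytes) -> str:
    # Hash-address an immutable record.
    return hashlib.sha256(envelope).hexdigest()


def replica(*envelopes: bytes) -> dict[str, bytes]:
    return {address(e): e for e in envelopes}


def merge(a: dict[str, bytes], b: dict[str, bytes]) -> dict[str, bytes]:
    # Set union keyed by digest: identical records collapse, others coexist.
    return {**a, **b}


alice = replica(b"alice worksAt acme", b"alice role engineer")
bob = replica(b"alice worksAt acme", b"alice role manager")

merged = merge(alice, bob)
print(len(merged))  # the shared fact is stored once; both role claims persist
```

Because the merge is a plain union, it is commutative and idempotent, and the
conflicting role claims simply sit side by side until some later assertion
resolves them.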
>
> You may also find this interesting:
>
> * Representing Graphs using Gordian Envelope:
>
> https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2024-006-envelope-graph.md
>
> ~ Wolf
>
> On Apr 16, 2025, at 8:07 PM, Orie <[email protected]> wrote:
>
> Trimming CC.
>
> I'm wearing no hats for this rant.
>
> We seem to have stumbled on one of my favorite topics.
>
> Authenticated hypergraphs.
>
> Imagine a stream of events, each event is an observation, signed by an
> observer.
>
> The stream is partially observable to subscribers.
>
> Each subscriber assembles the events into a belief network.
>
> In these systems, it's a nice feature to agree on identifiers for subjects
> and objects.
>
> It's also a nice property to be able to match the key material signing or
> decrypting events to the subjects or objects.
>
> But how do you handle conflicting information, merging and consistency
> over time?
>
> Consider this small fragment of Cypher:
>
>    MATCH (p:Person)-[r:WORKS_AT {since: 2021}]->(c:Company)
>    RETURN p.name AS Employee, c.name AS Company, r.role AS Role
>
> For a query like this, you might want to know which events contribute to
> the result, who signed them, and how long the information should be
> considered valid.
>
> You can do this with any container format.
>
> You don't need embedded signatures to do this.
>
> Another design consideration is being able to replace a large event with a
> unique identifier.
>
> In order to do that, and still have useful and interesting events people
> turn to general purpose canonicalization.
>
> Meanwhile, most of the interesting events are still JWTs, PGP, CMS, normal
> COSE, etc...
>
> Real data is often not in a canonical form.
>
> People create map keys as they need them, and they like putting "title"
> before "description" even though that's not how they sort lexicographically.
>
> Canonicalization eliminates ways that data can exist.
>
> Cryptography preserves data as it exists.
>
> Naively combining them just eliminates sources of useful information.
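
A tiny Python illustration of that tension, using sorted-key compact JSON as a
stand-in for a general-purpose canonicalization scheme (an assumption; no
specific scheme is implied):

```python
import hashlib
import json

# The bytes as the author actually wrote them: "title" before "description".
original = b'{"title": "Q1 report", "description": "Draft"}'

# Stand-in canonical form: sorted keys, no optional whitespace.
canonical = json.dumps(
    json.loads(original), sort_keys=True, separators=(",", ":")
).encode()

same_data = json.loads(original) == json.loads(canonical)
same_bytes = hashlib.sha256(original).digest() == hashlib.sha256(canonical).digest()

print(canonical)
print(same_data, same_bytes)
```

The parsed data is equal, but the bytes and therefore the hashes differ, so a
signature made over the original bytes does not verify against the canonical
form.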
>
> For those who want to add embedded signatures to any packaging format:
>
> 1. Define the parameter name.
> 2. Write down the rules for creating and validating it (canonicalization).
> 3. Describe how to embed and extract signatures.
>
> If you want to do chains (ordered counter signatures) you need a way to
> signal the order of the signatures, and rules for how to verify.
>
> The rules don't need to be complicated.
>
> If you want to embed the identifiers for the resource and make them hash
> based, that's another layer of application-specific rules.
>
> Ohh but we want redaction too, let's add salted hashes to all the
> predicates.
>
> {
>   id: (magic deterministic id)
>   subject: { predicate: object }
>   signature: [ ... ,
>     { ..., key_id, valid_from, valid_until }
>   ]
> }
>
> You basically end up with event-sourced, progressively disclosable,
> attribute-cert-derived labeled property graphs.
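
The salted-hash redaction mentioned above can be sketched in a few lines of
Python. This is a generic illustration and not any particular format's wire
encoding: `salted_digest`, `verify_disclosure`, the framing, and the example
claims are all hypothetical.

```python
import hashlib
import secrets


def salted_digest(salt: bytes, predicate: str, obj: str) -> str:
    # Length-prefix each part so the hashed input is unambiguous (assumed framing).
    parts = (salt, predicate.encode(), obj.encode())
    framed = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    return hashlib.sha256(framed).hexdigest()


# Issuer side: one random salt per predicate; only the digests get signed.
claims = {"worksAt": "acme", "role": "engineer"}
salts = {p: secrets.token_bytes(16) for p in claims}
signed_digests = sorted(salted_digest(salts[p], p, o) for p, o in claims.items())


def verify_disclosure(salt: bytes, predicate: str, obj: str, digests: list[str]) -> bool:
    # Verifier side: a disclosed claim must hash to one of the signed digests.
    return salted_digest(salt, predicate, obj) in digests


print(verify_disclosure(salts["role"], "role", "engineer", signed_digests))
print(verify_disclosure(salts["role"], "role", "manager", signed_digests))
```

The salt keeps an undisclosed predicate from being guessed by hashing likely
values, which is the whole reason it is there.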
>
> As soon as you're done making this system, somebody will want to simply
> sign data without making any changes to it, and you'll be back to enveloped
> signatures.
>
> OS
>
>
> On Wed, Apr 16, 2025, 5:31 PM Wolf McNally <[email protected]> wrote:
>
>> Vadim,
>>
>> > On Apr 16, 2025, at 1:26 PM, Vadim Goncharov <[email protected]> wrote:
>> >
>> > Essentially known-values are just another way to specify a compact encoding
>> > for long type string, like JSON-LD does for URIs, so a generic
>> > (semantic) compaction framework like CBAPT can replace many wheels reinvented
>> > again and again in protocols, leaving only thin amount for them to define.
>>
>> I think I understand you correctly, so yes.
>>
>> Many semantic systems use “triples”:
>>
>> <subject> <predicate> <object>
>> “Sam” ‘knows’ “John”
>>
>> Where predicates like ‘knows’ are common, and in some cases universal.
>>
>> Gordian Envelope is based on the pattern:
>>
>> <subject> [
>>   <predicate>: <object>
>>   <predicate>: <object>
>>   ...
>> ]
>>
>> (Note the above is “Envelope Notation”, not CBOR diagnostic notation or
>> CDDL). Gordian Envelope itself is built on dCBOR.
>>
>> Where a number of assertions (<predicate>: <object> pairs) are declared
>> on a subject, and any of these positions can be a dCBOR object, including
>> nested Gordian Envelopes, forming a tree with a unique digest at every node.
>>
>> Strings (like URLs) can work for predicates, but can take up significant
>> space, and because URLs can be written in numerous ways, they can break
>> determinism. So we created the known-value space #6.40000, which is
>> always just a tagged 64-bit integer. The length of a known-value with a 1+1
>> integer is only 5 bytes.
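
The 5-byte figure can be checked with a short Python sketch that hand-encodes
the CBOR heads for tag 40000 and an unsigned integer. The helper names
`cbor_head` and `known_value` are made up for this example, and the value 100
is just an arbitrary integer in the 1+1 range.

```python
def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus argument in its shortest form."""
    if value < 24:
        return bytes([(major << 5) | value])
    # Additional information 24..27 selects a 1, 2, 4, or 8 byte argument.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | info]) + value.to_bytes(size, "big")
    raise ValueError("value does not fit in 64 bits")


def known_value(n: int) -> bytes:
    # Major type 6 is a tag; major type 0 is an unsigned integer.
    return cbor_head(6, 40000) + cbor_head(0, n)


encoded = known_value(100)
print(encoded.hex(), len(encoded))
```

The tag head takes 3 bytes (`d9 9c 40`) and a 1+1 integer takes 2, which gives
the 5 bytes quoted above; values below 24 fit in 4 bytes total.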
>>
>> So we’re making first-come first-served assignments of code points in the
>> known-value space for anyone who would find such a value useful that is
>> fixed, public, and globally unique, and can provide a URL to a spec claiming
>> it, with no reasonable request refused.
>>
>> I need to emphasize that there is nothing about known-values that ties
>> them to dCBOR or Gordian Envelope (other than that they are compatible with
>> any kind of CBOR). You can use them as keys in CBOR maps, or anywhere you
>> want an integer value that is unique and has fixed semantics.
>>
>>
>> https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2023-002-known-value.md
>>
>> ~ Wolf
>> _______________________________________________
>> CBOR mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]
>>

_______________________________________________
COSE mailing list -- [email protected]
To unsubscribe send an email to [email protected]
