Good morning list,
I have been thinking of this for some time without really getting any good
results, and in any case [this C-Lightning mailing list
post](https://lists.ozlabs.org/pipermail/c-lightning/2019-December/000172.html)
brings up similar topic.
As well, 0 or more people have asked me opinions on related topics as well.
In any case, the goal is to have some hardware unit, using only a simple
communications channel with the software, to act as signer for all
Lightning-related things (channel funding, channel updates, reclaiming funds
after channel close).
As is typical for such Bitcoin-related hardware units, it might possess its own
user interface by which it can require confirmation of certain actions.
Also, to reduce risk of hacking, the hardware unit does not have a connection
to the netwrok, only a communications channel that is restricted to a local
direct connection (such as USB).
I will now demonstrate that such a design cannot achieve what an onchain
Bitcoin hardware wallet does, namely that the hardware need not tr\*st the
software and need not keep any state beyond the private key (which remains
constant).
For Lightning, it seems any hardware would require tr\*st in the software *and*
some state.
Cast of characters:
* The hardware for this node.
* The software for this node.
* The other node.
What I will attempt here would be to create a hardware that has as little
tr\*st in the software as possible, and reduce the state that the hardware
signer has to store.
If these two goals conflict, I will attempt to reduce the tr\*st in the
software.
Channel Updates
===============
Onchain hardware wallets do not require user interaction just to receive funds.
Thus, let us consider this goal.
On Lightning, funds are received by getting an HTLC that this node can claim
from the other node on the channel.
This incoming HTLC deducts from the funds owned by the remote node.
Thus, the hardware only needs to verify that:
* The funds owned by this node on the channel remains the same.
* A new HTLC is added that this node can claim if it knows the preimage.
This requires that the hardware be able to compare a current state with a
proposed new state.
Presumably, the software for this node provides the current state, then the
proposed new state, then if the hardware confirms that the state changes
"correctly", it will sign the proposed new state.
The problem is, how can the hardware check that the "current state" is correct?
For example, suppose a channel was funded by this node, and thus all funds are
owned by this node.
Then the software is corrupted, and then it claims to the hardware that the
current state is that all funds are owned by the other node and that the other
node is offering an HTLC.
The hardware believes this claim of what the current state is, and signs the
state where almost all the funds are owned by the other node.
Note that we cannot have the software send the entire history of the channel to
the hardware either.
The software can be corrupted into sending only a partial history, rewinding
some set of recent channel updates, then a new history can be created on top of
that, where this node is paid far less.
On the blockchain the longer and presumably "more true" history can be attested
to with proof-of-work, but the hardware does not have a network connection and
is thus under a persistent state of Sybil attack from the node software.
Thus, the hardware has to remember at least what the current state is, and
store this knowledge in persistent, secure, modifiable memory that the software
cannot touch.
It need not store the *whole* state: for example, it could store just a
commitment to the current state (and have the software store the entire state).
In fact, it is best for the hardware to store the *signature* for the
commitment transaction:
* The signature commits to the commitment transaction, which is a
representation of the current state, thus the signature itself is a sufficient
commitment to the current state.
* There is the issue of outputs with values too small to have outputs on the
commitment transactions (i.e. below the dust limit).
* We can use sign-to-contract, where the signature commits to *another*
preimage, one which contains the outputs that are too small to put on the
Bitcoin-level commitment transaction.
* In case of a communication breakdown between the hardware and the software,
the hardware can re-send the signature of the latest commitment transaction.
* This is important since the hardware and the software now have to
synchronize their state storage!
The procedure is:
* The software stores the new state in a "proposed" slot in its persistent
memory.
* The software sends the current and proposed states to the hardware for
signing the proposed state.
* The hardware validates the update, then signs the proposed state and
replaces the signature in its persistent memory.
* The hardware sends the new signature.
* The software moves the state in the proposed slot to the current slot.
* With the above, if the software ever starts up with a state in the
"proposed" slot for a channel, it can query the hardware for the most recent
signature, and determine if the most recent signed is in the "proposed" or
"current" slot.
* Thus, the hardware only requires an atomic update to its persistent
memory.
The hardware can safely resend the most recent signature, assuming it can trust
its own persistent modifiable memory.
This is because presumably it validated every update leading to the most recent
signature in the past, else this signature in its memory would not have existed
in the first place.
As each channel has two sets of commitment transactions, the hardware has to
store two commitments to commitment transactions for *each* channel.
Another point is that revocations of *own* old commitment transactions can be
issued by the hardware itself.
This requires only a shachain; the hardware can hash the concatenation of the
private key plus some channel-unique data (the full channel ID may be enough)
to form the shachain seed for the revocation sequence of *own* commitment
transactions.
Tr\*sted Eventual Synchrony of Two Commitments
==============================================
Each channel has two commitment transactions, one for each node.
The commitment transaction for a particular node identifies which of the two
nodes actually performed the unilateral close.
In particular, the BOLT spec allows both commitment transactions to drift out
of synchrony with each other.
Eventually, if the channel becomes relatively inactive (no updates going
through it), then it is expected that the two commitment transactions will
eventually converge into a single common state.
But the two commitments may remain not completely in-synch indefinitely.
In order to properly determine synchrony, the hardware has to have an honest
account of the transcript between the software for this node, and the other
node.
The only source of this information is the software.
The software cannot falsely give a signature from the other node that updates
our *own* commitment transaction, but the software can definitely lie and deny
that the remote side has done so.
Thus, the two commitment transactions may drift out of synch significantly.
There might not be any significant attacks enabled by having the two
commitments out-of-sync indefinitely.
This may prevent some HTLCs from being safely forwardable, but that only means
stalls in payment forwarding.
Non-publication of Revoked Transactions
=======================================
With Poon-Dryja, publication of old state is fatal and is an easy way for a
corrupted node software to donate money to the other node.
Even with Decker-Wattenhofer or Decker-Russell-Osuntokun, a corrupted node
software can simply publish an old state where most of the money is allocated
to the other node, and not publish the latest state, so switching to
Decker-Russell-Osuntokun ("eltoo") is not a large increase in security for a
hardware signer.
To prevent publication of our *own* commitment transaction at least, the
hardware can simply not sign *own* commitment transactions as they are created.
(the hardware still has to immediately sign the *remote* commitment
transaction, as part of the update protocol requires handing over that
signature to the other node.)
Instead of storing a signature for our "own* commitment transaction, the
hardware can store a simple hash or txid of the latest *own* commitment
transaction.
Thus:
* The hardware stores:
* The latest signature for the *remote* commitment transaction.
* The latest txid for the *own* commitment transaction.
The hardware will only sign an *own* commitment transaction that matches the
txid currently stored in it.
Further, this particular commitment must now become the *last* commitment and
the hardware should subsequently reject all attempts to further advance the
state.
When updating the *own* commitment transaction, the software must provide the
signature from the other node for the next commitment transaction, as well as
the current and proposed-next commitment transaction.
The hardware validates that the state change is valid, changes the *own*
commitment txid, and then returns the equivalent revocation key for the
just-replaced commitment transaction.
Communications breakdown between the hardware and software can still occur
between the time the hardware sends the signature for the latest (and last)
commitment transaction, and when the software receives it.
Thus, the hardware has to have a concept of a "closed" channel, where it marks
that the current txid for the *own* commitment transaction is the last one, and
it will provide the signature for that commitment transaction, but refuse to
sign any other commitment transaction and refuse to advance the state of the
channel.
To reduce the complexity of the software-hardware interface, we could also
allow the hardware to sign a mutual close transaction that matches the *own*
and *remote* commitment transactions even in this "closed" state.
Finally the channel can later be removed from the (presumably limited)
persistent storage of the hardware.
Thus the protocol for our own unilateral or mutual close would be:
* The software requests to close the channel.
* The hardware marks the channel as closed, which prevents the channel from
updating in the future.
* The software requests a signature for the final commitment or mutual close
transaction.
* For mutual closes it provides the commitment transaction and the mutual
close, and the hardware validates that the mutual close matches the commitment
transaction.
* For unilateral closes the hardware also provides the signatures to claim
the revocable outputs.
* The software receives the signature and broadcasts the commitment or mutual
close transaction onchain.
* Once the transaction is deeply confirmed, the software requests the channel
to be deleted from the hardware persistent storage.
Thus the hardware provides three APIs:
* Mark-as-close.
* Get-final-signature (unilateral or mutual close option).
* Delete-closed.
At restart, the software can query the hardware for any channels that are
currently being closed, and can check the mempool and blockchain for whether
they have completed the close or not.
Tr\*sted Revocation of Old Remote State
=======================================
Unfortunately, the hardware has to tr\*st the software to check that the other
node is not cheating.
As the hardware itself is not capable of accessing the blockchain or mempool,
it cannot do this checking directly.
Thus, revocation of old remote state is not ensurable by the hardware.
As revocation can only be done by the software anyway, the revocation secrets
for the *remote* commitment transactions can be stored by the software only.
The hardware need never learn the revocation secrets as it can do nothing with
them anyway.
Opening a Channel
=================
Of course, before we even start having updates to channel state, first we must
have opened channels.
Opening will of course allocate a slot in the hardware persistent memory for
the channel.
Further, it must respect the opening sequence.
When opening a channel where the other node is the only funder, then it is
sufficient to just propose some initial state for both commitment transactions
and have the hardware initialize a slot in its persistent memory with this
channel open.
However, when opening a channel where some of the funds come from this node,
the hardware will also later sign the funding transaction spending its own
funds and feeding into the channel to be funded.
The software provides the other node signature for the *own* initial commitment
transaction.
The hardware then validates that it is able to claim the equivalent amount in
both the *own* and *remote* commitment transactions.
If there is a difference, that implies that this node is what pays the fee, and
the hardware should probably double-confirm with the user using its UI whether
the paid fees are acceptab;e.
Tr\*sted Forwarding
===================
Forwarding should be automated and not require a confirmation from the user on
the hardware unit UI.
Unfortunately, the hardware has to trust the software to actually perform
forwarding correctly.
(indeed, it is unlikely that confirmation from the user would be able to give
users a good way to understand what is being asked when a forwarding attempt is
made)
When forwarding, the hardware has to sign off on an update that actively spends
money from funds owned by this node, on the basis of a promise that some other
HTLC is offering money of greater value to this node.
Certainly the hardware can demand that the software provide a current channel
state showing the incoming HTLC, before it signs off on an updated channel
state that instantiates the outgoing HTLC from the funds owned by this node.
However, the hardware has to keep track that an incoming HTLC only has at most
one outgoing HTLC,.
It cannot identify using the hash, incidentally: it is theoretically possible
for a route to loop through a node twice, e.g. A -> B -> C -> D -> E -> C -> F,
where C appears twice.
Further, if a payment attempt fails, then the ultimate payer might still try to
route through this node, giving a different postfix past this node, so this
node may forward the same hash twice on two or more different payment attempts.
Thus, the software and the hardware have to agree on some other identifier on
HTLCs.
In C-Lightning, for example, HTLCs are internally identified by the database ID
that they happen to get assigned to; database IDs are simply
automatically-incremented counters.
Regardless, even if we somehow, the *enforcement* of HTLCs is still controlled
by the software:
* The hardware has no access to the blockchain, thus it cannot know if the
incoming HTLC is in fact something in the deep past already.
A corrupted software can induce the hardware to create an outgoing HTLC whose
timelock is deeply in the past, then refuse to take the timelock branch of the
outgoing HTLC while letting the timelock branch of the incoming HTLC be claimed
by the other nodes.
Thus, forwarding is still trusted.
Alternately, we can also point out that the only purpose of forwarding is
actually just to improve our privacy.
(in case it is not obvious, this is hyperbole)
By forwarding, it becomes ambiguous whether we are the ultimate payee if we get
an incoming HTLC, and ambiguous whether we are the ultimate payer if we send an
outgoing HTLC.
If we never forward, then this privacy is lost.
(This is the same argument that rejects unpublished channels: nodes that only
have unpublished channels cannot forward, thus every activity on their node is
obviously arising or terminating at their node: unpublished channels delenda
est)
I will also point out that I laid out the goals for this exercise would be to
reduce trust requirements first and state requirements second: privacy was not
listed.
Thus, removing the ability to forward automatically can be removed for the
purpose of this exercise.
Retrying Payments
=================
When trying a payment once, the hardware can confirm to the user via its own UI
whether to authorize the payment.
However, when *re*trying a payment, it would be onerous if the hardware had to
keep asking the user to confirm each payment attempt.
Instead, the hardware can keep track of exactly one pending outgoing payment
that the software will try its best to deliver.
The hardware can store the hash of this payment, and a maximum sendable amount
(the nominal payment amount plus any extra that would pay for fees and any
other privacy-improving routing techniques the software may apply).
Initially the payment slot will be empty.
Then, when the software initiates a payment, the hardware can be given the
various data in the invoice, which it can display to the user in its UI, and
when the user confirms the payment, it can load the hash and maximum allowed
value into the payment slot.
Then, when the software requests a new channel state, such that a new HTLC
matching the payment slot hash is created from funds owned by this node, the
hardware can keep track of the number of such HTLCs currently instantiated.
As the HTLC has to be instantiated on both commitment transactions, the
hardware has to keep track, in persistent memory, a single counter that goes
from 0 to 2.
When such an HTLC is instantiated from the own node funds, the hardware
increments this pointer if it is less than two.
Then if such an HTLC is removed, and the funds returned to our own node funds,
the hardware decrements this pointer.
If the HTLC is removed with the funds given to the other node, then the
hardware must demand the preimage of the hash, as proof that indeed, the
payment occurred.
Then it can decrement that counter and set a flag that the payment has been
completed (and thus the outgoing HTLC cannot be added again, i.e. the counter
cannot be incremented).
Then when the counter has fallen to zero with the payment-complete flag set,
the hardware can clear the payment slot.
The hardware can support having multiple payments in parallel running by
providing more than one payment slot.
Accepting Payments
==================
This is trivial: the hardware will sign off on all state updates that retain or
increase the amount of funds owned by this node.
Payment preimages for incoming payments need not even be known by the hardware,
ever.
Thus there is no trust or storage requirements for accepting payments.
Reducing Storage Size
=====================
The hardware can, instead of storing multiple channel slots and payment slots
in its own persistent memory, can instead just store a single *commitment* to
the current state it knows to be valid.
This can be done by transforming its state into a purely-functional data
structure and merklizing it.
A purely-functional data structure is a data structure where structural nodes,
once constructed, cannot be modified.
For example, a singly-linked list can be made into a purely-functional data
structure,
If we have a list A -> B -> C -> null, then we can "mutate" this structure by
creating a new list with a different prefix, such as Q -> R -> C -> null.
The old C -> null node in the original list can be reused.
Binary search trees are also easily transformed into purely-functional data
structures.
Generally, the Okasaki book is the primary reference for purely-functional data
structures.
Then, to Merklize a purely-functional data structure, simply means that all
pointers are replaced with hashes.
Thus, a Merkle tree is nothing more than a plain binary tree, Merklized.
>From this point of view, the blockchain is nothing more than a Merklized
>singly-linked list, with the `prevBlockHash` serving as the "next" pointer.
New blocks are prepended to the front of this singly-linked list, and the
"null" terminating the singly-linked list is nothing more than genesis.
For the hardware state, the top node can contain two "pointers", one for the
channel data and one for the running-payments data.
The root is then the "pointer" to the current top node.
Those "pointers" then point to two red-black (or other balanced purely
functional) trees, one for the channel data and one for the running payments.
When channel data only is updated, the top node is replaced, but the node for
the running payments data is just copied from the previous top node.
Similarly, only the path to the channel data through the red-black tree is
affected.
Using a Merklized purely-functional data structure to store the entire hardware
state allows the hardware to only absolutely require a secure persistent memory
that fits 32 bytes, enough for a single commitment.
This can allow some leeway in storing this root data redundantly in multiple
places, and to encode it with error correction and detection, and so on.
Whenever the software makes a request to the hardware, which in the above
discussion requires the hardware to read its internal state, the software
instead serializes the internal state.
Then the hardware can validate that the software has not given it incorrect
state.
Similarly, when the hardware has to update its state, it can give the
reserialization of the modified data, and update its internal state.
The cute thing is that, just as a standard Merkle tree inclusion proof only
requires sending a small subset of the entire Merkle tree, the software need
not send the *entire* hardware state --- it can send only those parts that are
relevant to the specific channel(s) and outgoing payment attempt(s) that it is
currently asking for.
This can let the hardware be very cheap and limited, with the software handling
all of the *actual* state storage.
However, we must now be very careful to ensure that the state of both hardware
and software remain the same even if both become powered off or some
communication failure occurs.
My suggestion is for the software to store two copies of the hardware state
(both "copies" can share some data files due to he purely-functional nature of
the data, but some nodes will exist in only one copy or the other).
Then, whenever the hardware has to update the state:
* The software gives the part of the state that needs updating by the hardware.
* The hardware provides the updated state data, plus a signature of the
concatenation of the old root plus the new root as well as the new root.
* The software stores the new state data (but retains the old state data).
* The software asks the hardware to update its single root variable.
It provides the current root plus the new root, as well as the previous
signature of the concatenation of those.
* The hardware validates that the current root in its slot matches, that the
signature is indeed a valid signature of itself, then updates its root slot.
* The software deletes the parts of the old state data that were not reused in
the new state.
At any time, if power is shut off, the software and hardware can recover by
checking the current root held in the hardware.
Regards,
ZmnSCPxj
_______________________________________________
Lightning-dev mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev