Re: [bitcoin-dev] Refreshed BIP324

2022-11-12 Thread Pieter Wuille via bitcoin-dev
Another idea...

On Thursday, November 10th, 2022 at 4:23 PM, Pieter Wuille via bitcoin-dev 
 wrote:

> On Thursday, November 3rd, 2022 at 1:53 PM, Murch wrote:
> 
> > From what I understand we'll have about 35 message types on the network
> > with the addition of BIP324. 256 possible IDs sounds like plenty room to
> > grow, but perhaps we can be a bit more conservative:
> > 
> > We could use the first bit to signal a 2-byte message ID. That allows us
> > to express 128 IDs with 1 byte, but if we need more, we get a total of
> > 2^15 IDs across 2 bytes.
> 
> Yeah, effectively treating the first 1 or 2 bytes as a simple variable
> length integer is a nice way of increasing the space at low cost.

The above would really result in having two separate variable-length encodings:
* First byte 1-12 to signify length of alphabetic command
* Otherwise first bit to signify length of short command

I think we can just merge the two and have a single variable-length command 
structure that can be used for both: command encodings are 1 to 12 bytes, each 
byte's top bit indicating whether another byte follows (the top bit of byte 11 
has no special meaning).

This means:
* Every alphabetic command of L characters becomes L bytes.
* 102 non-alphabetic 1-byte commands can be assigned.
* 15708 non-alphabetic 2-byte commands can be assigned.
* etc

So this gives a uniform space which commands can be assigned from, and there is 
no strict need for thinking of the short-binary and long-alphabetic commands as 
distinct. In v2, some short ones would be treated as aliases for old 
long-alphabetic ones. But new commands could also just be introduced as short 
ones only (even in v1).

WDYT?

Cheers,

-- 
Pieter




___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Merkleize All The Things

2022-11-12 Thread Salvatore Ingala via bitcoin-dev
Hi Antoine,
It appears that my explanation of the relationship between the covenant and
the bisection protocol is still unclear; I'll do my best to clarify.

The bisection's Merkle tree never ends up on-chain, in any form. Therefore,
a bigger computation does not end up in a bigger witness size, which is key
to the scalability of the approach. Only in the case of a challenge, it
will result in a (logarithmically) longer chain of transactions to resolve
it. This chain of transactions could be mapped to a root-to-leaf path in
the Merkle tree of the computation trace.

The actual computation trace (and the corresponding Merkle tree) is only
computed by the parties when they execute the computation.
What's in the tapleaves is only the valid transitions at the current state
of the protocol; that is, the valid transitions in the Finite State Machine
(and possibly other valid exit conditions that remove the covenant
encumbrance, if any).

The bisection protocol only makes sense as a step in a larger protocol that
makes use of it.

Perhaps presenting the protocol in a less-than-general case might help to
understand it. So let's play a simpler game using a protocol that includes
a fraud proof.

Alice claims that she knows how to multiply by 256, while Bob doesn't
believe her. So they make a bet: they each commit 1 coin; then Bob choses a
number x; then Alice computes y = 256*x by doubling x eight times
(expensive multiplications were disabled in a tragic DDoS accident), and
publishes the result y. Bob independently computes 256 * x (he has a friend
who is a mathematician, he'll know how to do it). If the result is not y,
Bob will start a challenge; otherwise, Alice wins and takes the money.

(The example is of course artificial, as redoing the computation in Script
is cheaper than executing the fraud proof in this case!)

What follows is an actual execution of the protocol. In the following, each
[Si] is a UTXO corresponding to some possible FSM node, starting with the
S0, the UTXO with 1+1 = 2 coins. Each line with "-" is a possible
transition (script in the taptree), specifying what is the next FSM node
after the "==>" symbol; the encumbrance in the scripts enforce that the
state of the next UTXO is updated correctly (details omitted below), and
any other necessary conditions to ensure the integrity of the protocol.


=


[S0]: Initial UTXO
  - only Bob can spend, he must choose his number x ==> S1

[S1; state: x]:
  - only Alice can spend, she publishes her answer y ==> S2

[S2. state: x, y]:
  - after 1 day: Alice won, she can take the money // Happy case!
Usually that's the end
  - Bob disagrees with the answer, post z as his answer. ==> S3

The challenge starts here! Let's put some actual numbers. x = 2; y = 508; z
= 512.

This is what Alice computed:

  2 => 4 => 8 => 16 => 32 => 64 => 127 => 254 => 508

This is what Bob computed:

  2 => 4 => 8 => 16 => 32 => 64 => 128 => 256 => 512

At this time, we don't know who is right. They both built a tree that looks
like this (ASCII art only working in fixed-width font):

 ___H18___
/ \
   /   \
H14 H58
/ \ / \
   /   \   /   \
  / \ / \
H12 H34 H56 H78
/ \ / \ / \ / \
  H1  H2  H3  H4  H5  H6  H7  H8

Remember that each internal node commits to: the state of the computation
before the leftmost leaf in the subtree, the state after the rightmost
leaf, and the hash of sub-trace for the sub-tree. Each leaf just commits to
that intermediate computation step (and the operation, which here is always
"double the input"). For example, H4 commits to "16 => 32" according to
both Alice's and Bob's computation traces.

(From our privileged point of view, we can foresee that the earliest
disagreement is on the 6th step of the computation: "64 => 127" according
to Alice, "64 => 128" according to Bob).

Let's denote h_{A;18} (resp. h_{B;18}) all the information committed to in
the node H18, according to Alice (resp. Bob). Similarly for all the other
nodes.

[S3. state: x, y, z]: Challenge starts!
  - Alice posts the root of her computation trace h_{A;18} ==> S4

[S4. state: x, y, z, h_{A;18}]
  - Bob posts the root of her computation trace h_{B;18} ==> S5

Since they disagree, it must be the case that h_{A;18} != h_{B;18}.

[S5. state: x, y, z, h_{A;18}, h_{B;18}]
  - Alice opens the commitment h_{A;18}, by revealing H14 and H58
(according to her) ==> S6

Note that in this last transition (going to S6), the covenant enforces that
the child commitments are compatible: the transition is only valid if the
starting state of the computation according to h_{A;14} is compatible with
h_{A;18} (that is, it's equal to x); similarly the end state of the
computation in h_{A;58} must be y, and h_{A;18} can be recomputed from the
data provided (ensuring the integrity of the Merkle tree).
This is true for all the commitment openings belo

Re: [bitcoin-dev] Refreshed BIP324

2022-11-12 Thread Yuval Kogman via bitcoin-dev
On Sat, 12 Nov 2022, 11:01 Pieter Wuille via bitcoin-dev, <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> I think we can just merge the two and have a single variable-length
> command structure that can be used for both: command encodings are 1 to 12
> bytes, each byte's top bit indicating whether another byte follows (the top
> bit of byte 11 has no special meaning).
>
...

> So this gives a uniform space which commands can be assigned from, and
> there is no strict need for thinking of the short-binary and
> long-alphabetic commands as distinct.In v2, some short ones would be
> treated as aliases for old long-alphabetic ones.


This seems a bit dubious, but since command names are ASCII strings
reversing the meaning of the top bit so that 0 signifies a following byte
would allow the alphabetic commands to be reinterpreted as non-alphabetic,
so the space no longer needs to be a disjoint union of two sub-spaces which
arguably takes the "no [...] need for thinking of [them] as distinct" logic
a little bit bit farther.

The main downsides are that this does nothing to re-assign shorter codes to
existing commands, and secondly that even the non-alphabetic code path
completely supersedes the alphabetic one, the numerical values are up to 85
or 86 bits wide, which is not a reasonable word size. Perhaps this is not a
concern since as far as I know there are no collisions in the first 9 bytes
of existing commands, and constrain the non-alphabetic representation to 9
bytes would leave the last top bit free, so the space would be isomorphic
to {0,1}^64.
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev