[bitcoin-dev] Jets (Was: `OP_FOLD`: A Looping Construct For Bitcoin SCRIPT)

ZmnSCPxj via bitcoin-dev Mon, 07 Mar 2022 15:35:28 -0800

Good morning Billy,

Changed subject since this is only tangentially related to `OP_FOLD`.


> Let me organize my thoughts on this a little more clearly. There's a couple 
> possibilities I can think of for a jet-like system:
>
> A. We could implement jets now without a consensus change, and without 
> requiring all nodes to upgrade to new relay rules. Probably. This would give 
> upgraded nodes improved validation performance and many upgraded nodes relay 
> savings (transmitting/receiving fewer bytes). Transactions would be weighted 
> the same as without the use of jets tho.
> B. We could implement the above + lighter weighting by using a soft fork to 
> put the jets in a part of the blockchain hidden from unupgraded nodes, as you 
> mentioned. 
> C. We could implement the above + the jet registration idea in a soft fork. 
>
> For A:
>
> * Upgraded nodes query each connection for support of jets in general, and 
> which specific jets they support.
> * For a connection to another upgraded node that supports the jet(s) that a 
> transaction contains, the transaction is sent verbatim with the jet included 
> in the script (eg as some fake opcode line like 23 OP_JET, indicating to 
> insert standard jet 23 in its place). When validation happens, or when a 
> miner includes it in a block, the jet opcode call is replaced with the script 
> it represents so hashing happens in a way that is recognizable to unupgraded 
> nodes.
> * For a connection to a non-upgraded node that doesn't support jets, or an 
> upgraded node that doesn't support the particular jet included in the script, 
> the jet opcode call is replaced as above before sending to that node. In 
> addition, some data is added to the transaction that unupgraded nodes 
> propagate along but otherwise ignore. Maybe this is extra witness data, maybe 
> this is some kind of "annex", or something else. But that data would contain 
> the original jet opcode (in this example "23 OP_JET") so that when that 
> transaction data reaches an upgraded node that recognizes that jet again, it 
> can swap that back in, in place of the script fragment it represents. 
>
> I'm not 100% sure the required mechanism I mentioned of "extra ignored data" 
> exists, and if it doesn't, then all nodes would at least need to be upgraded 
> to support that before this mechanism could fully work.

I am not sure that can even be *made* to exist.
It seems to me a trivial way to launch a DDoS: just ask a bunch of fullnodes to 
attach 1MB of extra ignored data to a tiny 1-input-1-output transaction.
I pay only a small fee if it confirms, but the bandwidth of all fullnodes is 
wasted transmitting and then ignoring this block of data.

> But even if such a mechanism doesn't exist, a jet script could still be used, 
> but it would be clobbered by the first nonupgraded node it is relayed to, and 
> can't then be converted back (without using a potentially expensive lookup 
> table as you mentioned). 

Yes, and people still run Bitcoin Core 0.8.x.....

> > If the script does not weigh less if it uses a jet, then there is no 
> > incentive for end-users to use a jet
>
> That's a good point. However, I'd point out that nodes do lots of things that 
> there's no individual incentive for, and this might be one where people 
> either altruistically use jets to be lighter on the network, or use them in 
> the hopes that the jet is accepted as a standard, reducing the cost of their 
> scripts. But certainly a direct incentive to use them is better. Honest nodes 
> can favor connecting to those that support jets.

Since you do not want a dynamic lookup table (because of the cost of lookup), 
how do new jets get introduced?
If a new jet requires coordinated deployment over the network, then you might 
as well just softfork and be done with it.
If a new jet can just be entered into some configuration file, how do you 
coordinate those between multiple users so that there *is* some benefit for 
relay?

> >if a jet would allow SCRIPT weights to decrease, upgraded nodes need to hide 
> >them from unupgraded nodes
> > we have to do that by telling unupgraded nodes "this script will always 
> > succeed and has weight 0"
>
> Right. It doesn't have to be weight zero, but that would work fine enough. 
>
> > if everybody else has not upgraded, a user of a new jet has no security.
>
> For case A, no security is lost. For case B you're right. For case C, once 
> nodes upgrade to the initial soft fork, new registered jets can take 
> advantage of relay-cost weight savings (defined by the soft fork) without 
> requiring any nodes to do any upgrading, and nodes could be further upgraded 
> to optimize the validation of various of those registered jets, but those 
> processing savings couldn't change the weighting of transactions without an 
> additional soft fork.
>
> > Consider an attack where I feed you a SCRIPT that validates trivially but 
> > is filled with almost-but-not-quite-jettable code
>
> I agree a pattern-matching lookup table is probably not a great design. But a 
> lookup table like that is not needed for the jet registration idea. After the 
> necessary soft fork, there would be standard rules for which registered jets 
> nodes are required to keep an index of, and so the lookup table would be a 
> straightforward jet hash lookup rather than a pattern-matching lookup, which 
> wouldn't have the same DOS problems. A node would simply find a jet opcode 
> call like "ab38cd39e OP_JET" and just lookup ab38cd39e in its index. 

How does the unupgraded-to-upgraded boundary work?
A static lookup table is better, since you can pattern-match on strings of 
specific, static length.
We can take a page from `rsync` and use its "rolling checksum" idea, which 
identifies strings of a specific length at arbitrary offsets.

Say you have one jetted sequence where the original code is 42 bytes, and 
another where the original code is 54 bytes.
You would keep a 42-byte rolling checksum and a separate 54-byte rolling 
checksum, and whenever one matches, you check whether the last 42 or 54 bytes 
actually match the corresponding jetted sequence.

It does imply having a bunch of rolling checksums around, though.
Sigh.
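
As a rough illustration of the rolling-checksum idea, here is a minimal Python 
sketch. It assumes an `rsync`-style weak checksum (two running sums modulo 
2^16), one checksum per distinct jet length, and a byte-for-byte comparison 
whenever the checksum matches. The names `weak_checksum`, `roll` and 
`find_jets` are hypothetical and are not part of any existing node software.

```python
M = 1 << 16  # modulus for both running sums (an assumption of this sketch)


def weak_checksum(window):
    """Compute the (a, b) checksum pair of a byte window from scratch."""
    n = len(window)
    a = sum(window) % M
    # The first byte gets weight n, the last byte gets weight 1.
    b = sum((n - i) * x for i, x in enumerate(window)) % M
    return a, b


def roll(a, b, n, out_byte, in_byte):
    """Slide an n-byte window one byte to the right in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def find_jets(script, jets):
    """Return sorted (offset, name) pairs for each jetted sequence found.

    `jets` maps the original (un-jetted) byte sequence to a jet name.
    """
    # Group jets by length: one rolling checksum is kept per distinct length.
    by_len = {}
    for body, name in jets.items():
        table = by_len.setdefault(len(body), {})
        table.setdefault(weak_checksum(body), []).append((body, name))

    hits = []
    for n, table in by_len.items():
        if n == 0 or n > len(script):
            continue
        a, b = weak_checksum(script[:n])
        for off in range(len(script) - n + 1):
            if off > 0:
                a, b = roll(a, b, n, script[off - 1], script[off + n - 1])
            # A checksum match is only a hint; confirm with a real comparison.
            for body, name in table.get((a, b), ()):
                if script[off:off + n] == body:
                    hits.append((off, name))
    return sorted(hits)
```

The cost is one pass over the script per distinct jet length, which is the 
"bunch of rolling checksums" mentioned above.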

---

To make jets more useful, we should redesign the language so that `OP_PUSH` is 
not in the opcode stream, but instead, we have a separate table of constants 
that is attached / concatenated to the actual SCRIPT.

So for example instead of an HTLC having embedded `OP_PUSH`es like this:

   OP_IF
       OP_HASH160 <hash> OP_EQUALVERIFY OP_DUP OP_HASH160 <acceptor pkh>
   OP_ELSE
       <timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP OP_DUP OP_HASH160 <offerer pkh>
   OP_ENDIF
   OP_EQUALVERIFY
   OP_CHECKSIG

We would have:

   constants:
       h = <hash>
       a = <acceptor pkh>
       t = <timeout>
       o = <offerer pkh>
   script:
       OP_IF
           OP_HASH160 h OP_EQUALVERIFY OP_DUP OP_HASH160 a
       OP_ELSE
           t OP_CHECKLOCKTIMEVERIFY OP_DROP OP_DUP OP_HASH160 o
       OP_ENDIF
       OP_EQUALVERIFY
       OP_CHECKSIG

The above allows for more compressibility, as the entire `script` portion can 
be recognized as a jet outright.
The incompressible hashes are moved out of the main SCRIPT body.
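
To make the benefit concrete, here is a small Python sketch. It assumes a toy 
representation where opcodes are strings and pushed data are `bytes`. The 
helpers `split_constants`, `template_id` and `htlc`, as well as the `CONST_n` 
placeholder names, are hypothetical and only illustrate the idea; this is not 
a proposed serialization.

```python
import hashlib


def split_constants(tokens):
    """Separate pushed data from the opcode stream.

    `tokens` is a list whose items are either str (an opcode) or bytes
    (pushed data). Returns (constants, template), where each push in the
    template is replaced by a positional placeholder.
    """
    constants = []
    template = []
    for tok in tokens:
        if isinstance(tok, bytes):
            template.append(f"CONST_{len(constants)}")
            constants.append(tok)
        else:
            template.append(tok)
    return constants, tuple(template)


def template_id(template):
    """Hash of the constant-free script body, usable as a jet lookup key."""
    return hashlib.sha256(" ".join(template).encode()).hexdigest()


def htlc(hash_, acceptor_pkh, timeout, offerer_pkh):
    """The HTLC from the text, with pushes embedded in the opcode stream."""
    return [
        "OP_IF",
        "OP_HASH160", hash_, "OP_EQUALVERIFY",
        "OP_DUP", "OP_HASH160", acceptor_pkh,
        "OP_ELSE",
        timeout, "OP_CHECKLOCKTIMEVERIFY", "OP_DROP",
        "OP_DUP", "OP_HASH160", offerer_pkh,
        "OP_ENDIF",
        "OP_EQUALVERIFY",
        "OP_CHECKSIG",
    ]
```

Two HTLCs with different hashes, keys and timeouts produce the same 
`template_id` in this sketch, so a single exact-match lookup would recognize 
the whole body, while the constants travel separately.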

We should note as well that this makes it *easier* to create recursive 
covenants (for good or ill) out of `OP_CAT` and whatever opcode you want that 
allows recursive covenants in combination with `OP_CAT`.
Generally, recursive covenants are *much* more interesting if they can change 
some variables at each iteration, and having a separate table-of-constants 
greatly facilitates that.

Indeed, the exercise of `OP_TLUV` in [drivechains-over-recursive-covenants][] 
puts the loop variables into the front of the SCRIPT to make it easier to work 
with the SCRIPT manipulation.

[drivechains-over-recursive-covenants]: 
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-February/019976.html

---

Perhaps we can consider the general vs specific tension in 
information-theoretic terms.

A language which supports more computational power --- i.e. more general --- 
must, by necessity, have longer symbols, as a basic law of information theory.
After all, a general language can express more things.

However, we do recognize that certain sequences of things-to-say are much more 
likely than others.
That is, we expect that certain sequences "make sense" to do.
That is why "jets" are even proposed: they are shortcuts towards those.
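
A toy calculation may help here. The Python sketch below assumes we can treat 
"things to say" as symbols drawn from a probability distribution, and compares 
a fixed-length encoding against the Shannon entropy, which lower-bounds the 
average length of any uniquely decodable encoding. The function names and the 
example distributions are made up for illustration.

```python
import math


def fixed_length_bits(n_symbols):
    """Bits per symbol when every symbol gets a code of the same length."""
    return math.ceil(math.log2(n_symbols))


def entropy_bits(probs):
    """Shannon entropy in bits: the lower bound on average code length."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# A more general language has more distinct things to say, so a fixed-length
# encoding needs more bits per symbol.
restricted = fixed_length_bits(16)
general = fixed_length_bits(256)

# If some sequences are much more likely than others, the achievable average
# drops below the fixed-length cost; this gap is the room a jet exploits.
uniform = entropy_bits([0.25, 0.25, 0.25, 0.25])
skewed = entropy_bits([0.5, 0.25, 0.125, 0.125])
```

In this toy model, giving the likely sequences short encodings is what a jet 
does.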

If a general language is already deployed for Bitcoin, then a new opcode is 
effectively a jet, as it simply makes the SCRIPT shorter.

Instead of starting with a verbose (by necessity) general language, we could 
instead start with a terse but restricted language, and slowly loosen up its 
restrictions by adding new capabilities in softforks.

Regards,
ZmnSCPxj
