Hi, lightning devs

It's my first mail, so first, I'd like to say a big thanks to everyone involved 
in the development of the lightning network.

I've just finished going through this mailing list, so I'm not sure if Payment 
Route Reservation is already proposed in some other place. If there is a 
similar proposal, please point me to the right place. 

I'll first list some potential improvements with this proposal:

1)  Higher reliability of payments
2)  Lower payment latency (on average)
3)  Lower fees
4)  Increase in privacy
5)  Trampoline payment improvements
6)  Significant reduction of gossip messages
7)  Healthier, more decentralized network
8)  Simplified routing algorithm
9)  LN Wallet UX improvements
10) Fast spam prevention
11) Channel liquidity probing prevention
12) Eliminates the need for splice-in (and partially splice-out)
13) Eliminates the need for rebalancing and fee update scripts 
14) Eliminates the need for global nodes' reputation

So what is a payment route reservation?
  

Payment Route Reservation 
========================= 

The idea behind Payment Route Reservation is to split the payment into two 
steps. In the first step, we reserve the route, and in the second, we send a 
payment. In the reservation step, routing nodes will not sign a new commitment 
state. Instead, they will just reserve a specified amount of satoshis. Those 
satoshis might be used with the next payment step or not if the reservation is 
canceled. Each reservation has a timeout (let's say 1 min). After a timeout, 
the reservation is removed from the channel reservation set and considered 
canceled. Reservation can be canceled manually by any upstream node.

For example:


          add_reservation(payment_hash, amount) -->
             ┌───┐         ┌───┐         ┌───┐
     ───────►│   ├────────►│   ├────────►│   ├────────►
   S         │ A │         │ B │         │ C │          R
     ◄───────┤   │◄────────┤   │◄────────┤   │◄────────
             └───┘         └───┘         └───┘
          <- reservation_success(fees_sum, cltv_delta_sum)


S is a sender, R is a receiver, and A, B, and C are routing nodes.

S wants to send `amount` satoshis with `payment hash`[1] to R. S first creates 
a reservation onion for A->B->C->R route and then sends the onion to the first 
node in the route, using `add_reservation` call. A unwraps the onion, and if he 
has enough unreserved balance in A->B channel, he will add a new reservation 
for the `amount` request to the channel reservation set and then forward the 
onion to B. Reservation in reservation set can be addressed by `payment hash`.
If A has an insufficient channel reservation balance, A will return 
'reservation_failed' to the upstream node(sender S). B and C will do the same. 
Assuming there is enough reservation balance in all channels to reserve a 
payment, reservation flow eventually reaches R. R unwraps the onion and checks 
if he knows the preimage. If he knows, R returns `reservation_success` to 
upstream node C.

At this point, we have successfully reserved the payment route for `amount` 
satoshis from S to R. But, our route nodes haven't included the fees(or 
cltv_delta) of downstream nodes. `reservation_success` returns to upstream 
route node tuple (fees_sum, cltv_delta_sum). In this case, R will return (0, 0) 
to C because he is the last node in the route[2]. C will try to extend his 
reservation by fees from the R. If reservation extension is successful, C sends 
`reservation_success` to B, with a tuple increased by his fees[3] and his cltv 
delta. If C can't extend the reservation, he will send to B 
`reservation_failure` and to R `cancel_reservation`[4]. A will do the same. 
Eventually, if everything is ok with nodes A and B, S will receive a tuple 
(fees_sum, cltv_delta_sum). Now S knows precisely how much fees and cltv_delta 
route expects, and he can now send the payment onion. If S doesn't like the 
fees or cltv_delta, he can try some other route.

Let's compare current payment forwarding to payment with reservation forwarding:


  +-------+                               +-------+             +-------+       
                        +-------+
  |       |--(1)---- update_add_htlc ---->|       |             |       
|--(1)---- add_reservation ---->|       |
  |       |                               |       |             |       
|<-(2)---- reservation_success--|       | reservation onion
  |       |                               |       |           
──|─────────────────────────────────────────────────
  |       |--(2)--- commitment_signed --->|       |             |       
|--(3)--- commitment_signed --->|       |
  |   A   |<-(3)---- revoke_and_ack ------|   B   |             |   A   
|<-(4)---- revoke_and_ack ------|   B   |
  |       |                               |       |             |       |       
                        |       | HTLC onion
  |       |<-(4)--- commitment_signed ----|       |             |       
|<-(5)--- commitment_signed ----|       |
  |       |--(5)---- revoke_and_ack ----->|       |             |       
|--(6)---- revoke_and_ack ----->|       |
  |       |                               |       |             |       |       
                        |       |
  +-------+                               +-------+             +-------+       
                        +-------+

One `update_add_htlc` is now replaced with two calls - `add_reservation` and 
`reservation_success`. Also, we have two onions now.

When node A receives reservation_success with a tuple (fees_sum, 
cltv_delta_sum), nodes A and B have all the information to sign a new 
commitment transaction. The new commitment transaction amount would be `amount` 
+ fees_sum, and cltv_delta would be cltv_delta_sum. 

For now, we've just increased the latency. I have yet to check any 
implementation, so it is hard to guess how significant this increase is. The 
thing to note is that the reservation set is stored only in memory. No 
information needs to be stored in DB or replicated. If the node crashes with a 
live reservation, nothing bad can happen. So increase in latency would probably 
be proportional to 1 additional call between hops and how fast hops can unwrap 
the onion.

But before we tackle latency, let's first address the reliability of payments.


[1] For PTLC case, we can use points
[2] R can return some low random values to prevent C from discovering that R is 
the receiver.
[3] Fees can be ppm, base, or whatever fee function the node operator prefers.
[4] To ensure the node can always extend a reservation, we can set aside part 
of the channel reserve balance just for this case (let's say 0.5 percent).


Route Split Reservation 
=======================

If the routing node can't add a new reservation because the channel reservation 
balance is insufficient, the node can try to find some other route to reserve 
the missing amount for the next downstream node.

For instance, A needs to forward a reservation of 10 mBTC to B, but A->B 
channel has only 8 mBTC left of its channel reservation capacity[5].

           ┌───┐           ┌───┐
       10  │   │     8     │   │   10
      ────►│ A ├──────────►│ B ├───────►
           │   │           │   │
           └─┬─┘           └─▲─┘
             │               │
             │2    ┌───┐     │2
             │     │   │     │
             └────►│ C ├─────┘
                   │   │
                   └───┘


Rather than returning `reservation_failure` and losing the possibility to earn 
fees on the remaining 8 mBTC, A will try to find a route for the remaining 2 
mBTC. Let's say A picks A->C->B route. It is important to note that the new 
route should share the same payment hash as the main route(->A->B->)[6]. Now B 
will accept to forward the reservation of 10 mBTC further because he knows if a 
payment route is ever constructed, and if the preimage is revealed, he will 
receive his 10 mBTC.

There can be optimization in terms of fees. For instance, A can renounce his 
A->C channel fees. So the only fee A needs to pay for the new route is for 2 
mBTC on C->B route. Assuming network convergences to roughly the same fees on 
all nodes, cumulative route split fees effect would be the same as if the whole 
amount was routed through A->B. Although this example looks a bit too perfect, 
it's quite common on the plebnet part of the lightning network, where most 
nodes are created using Triangle Liquidity Swaps.
It is important to note that A now needs to take into account the fees and 
cltv_delta of the new route. Fees are the sum of the fees in A->B channel and 
A->C->B route. Cumulative cltv_delta would be maximum between A->B and A->C->B 
route.

Note that A can try multiple reservation routes. Also, if C doesn't have enough 
reserve balance, C can split a route through some node D to B, etc. To prevent 
infinite reservation recursion calls (in case of huge payment), the route split 
node will pass a tuple (max_fees, max_cltv_delta) that he will accept from the 
new route. Each node in the route will first decrement tuple by his fees and 
cltv_delta. If the new tuple contains any element below 0, the node will return 
`reservation_error`. If tuple elements are still positive, the reservation will 
continue, and decremented tuple will be passed to the next downstream node.

To hide the difference between the main route and route split, the payment 
sender will also send the tuple(max_fees, max_cltv_delta) that he accepts for 
the main payment route. Now, if someone is observing lightning network traffic, 
he wouldn't know if a reservation is for payment or route split. Thus we have 
an increase in privacy of payment.

Note that now we can also send payments larger than channel capacity A->B.


# Trampoline route split reservation

What if A and B don't have a direct channel, or the sender doesn't know the 
whole network topology or all the channels between A and B are suddenly 
disabled?


                          ┌───┐         .         ┌───┐
                       10 │   │---->   . .   ---->│   │ 10
                     ────►│ A │---->   . .   ---->│ B ├─────►
                          │   │---->    .    ---->│   │
                          └───┘                   └───┘

A still can try to reserve payments through different routes to B. Idea is 
similar to what is currently known as Trampoline Routing. 
The biggest benefit of using Trampoline route split reservation payment over 
Trampoline Routing payment is that a sender, before initiating the payment, 
knows exactly how much in fees he needs to pay. As well as how big cltv_delta 
will be.



How expensive is a route split reservation?
A needs to create a new reservation onion, and until that route split 
reservation propagates to B, B can't forward the reservation further downstream.
So we increased latency again. 

My guess is that reservations with route split payment will increase the 
reliability of single payments to the range of 90-95%. But can we reach 99% and 
finally lower the average latency?

[5] The outbound liquidity can be greater than 10 mBTC, but the reservation set 
has only 8 mBTC left.
[6] For PTLC A->B and C->B, payment hops need to commit to the same point.


Redundant Multi Route Payment Reservation
========================================

The idea is similar to what is now called Multi-Path Payments and Stuckless 
Payments[7].


                          ┌───┐       ┌───┐       ┌───┐
                          │   ├──────►│   ├──────►│   │
                     ────►│ R │       │ T │       │ U ├─────►
                          │   ├──────►│   │   ┌──►│   │
                          └───┘       └───┘   │   └───┘
                                              │
                             .......................

                          ┌───┐       ┌───┐  │    ┌───┐
              S           │   ├──────►│   │  └───►│   │
                     ────►│ D │       │ E │       │ F ├─────►    R
                          │   ├────┐  │   ├──┐    │   │
                          └───┘    │  └───┘  │    └───┘
                                   │         │
                          ┌───┐    │  ┌───┐  │    ┌───┐        
                          │   │    └─►│   │◄─┘    │   │
                     ────►│ A │       │ B │       │ C ├─────►
                          │   ├──────►│   ├─────► │   │
                          └───┘       └───┘       └───┘

The problem with a single route reservation is that reservation fees(or cltv) 
might be too high, and the sender will have to create a new reservation route 
hoping to get a better fee. Also, reservations might take longer than we are 
willing to wait. 

To overcome this issue, we first split the amount into N parts[8], and then we 
create 2*N[9] reservation routes each for the amount of `payment amount`/N. Now 
a sender needs to wait for just N routes out of 2N to return 
`reservation_success`, and then initiate payment. Note that multiple 
reservation routes can go through the same node.

This now creates an interesting market dynamic where the fastest route wins. 
And the fastest route is with the fastest nodes. Now nodes are incentivized to 
compete with better internet and better hardware. There would also be 
competition between lightning node developers to create the fastest lightning 
node. 

If a sender is not that interested in the speed of payment, he can wait for all 
2N routes to return reservation results and then pick the cheapest N routes.
Now nodes also need to compete in fees as well. Cheaper the fees, the higher 
probability of the sender picking the route.

All this eventually will benefit lightning network users.

It doesn't matter if the channels are balanced or not. If the node is well 
connected and has enough outbound liquidity, the reservation route split 
should(in most cases) find a route to the next node. So there is no need for 
rebalancing or fee update scripts. 

Just because nodes reserved the route doesn't mean they have to route the 
payment. There is still a probability of payment failure. Payment can fail 
because of reservation time out, node going offline, griefing node, etc. To 
further increase reliability, we can also add stuckles payments.

With Redundant Multi Route Payment Reservations, I believe the reliability of 
payments would be around the 99% percent range.
Latency, on average, should converge around a value much lower than the average 
value we have now. 

[7] 
https://lists.linuxfoundation.org/pipermail/lightning-dev/2019-June/002029.html
[8] Unless it is a small amount, for instance, below 0.5 mBTC. 
[9] Or some other multiplier


LN Wallet UX improvements
=========================

Not every payment is the same. For instance, if I'm paying a groceries bill in 
a supermarket, I value the speed of payments more than fees. On the other hand, 
if I'm paying my electricity bill at home, I value fees more than the speed of 
transactions. Also, if I'm paying a VPN subscription, I care more about privacy 
and cost over transaction speed.

With route reservation, we can present the exact user fees he is going to pay, 
so wallet developers can add the following element to UX:

Radio box: Fast Payment | Cheap Payment | MaxPrivacy

- If a user selects `Fast Payment` wallet will wait for the first N fastest 
routes and initiate payment automatically(unless fees exceed in percents some 
config value). 
- If a user selects the `Cheap Payments` node will wait for all 2N routes to 
return, pick N cheapest routes, and then present the user with fees and a send 
button. If fees are too high, the user can press the `Try again` button.
- For `MaxPrivacy` wallet will behave as in `Cheap Payment` mode, except the 
routing algorithm will prefer longer routes over smaller nodes, with preferably 
at least one Tor node.

*N routes can be N trampoline routes if a wallet doesn't have a whole network 
topology.


Gossip messages spam
====================

Currently, 97 percent or more LN gossip messages go to `channel_update` 
messages. The reason is that most nodes on the network are running some form of 
automatic fee adjustment script. If the number of nodes and channels continues 
to grow, soon, this can become unsustainable.

Reservation is not used just to reserve the route but also to inform a sender 
of fees and cltvs. Thus there is no need for nodes to publish their fees to the 
whole network. And thus, information like base_fee, ppm fee, and cltv can be 
removed from the `channel_update` message.

With route reservation, there is no need for channel rebalancing or fee update 
scripts. Node operators can still run fee update scripts without restrictions, 
but those updates would not propagate to the network.


Routing algorithm and network decentralization
=============================================

In terms of network topology, a current lightning network can be seen as a 
decentralized network. But in terms of a payment network, it is more a hub base 
network, where the majority of payments go through ~30 largest nodes. Medium 
and smaller nodes rarely forward any payment, and they are mostly used by big 
nodes for channel rebalancing. This is a big issue in terms of fairness(the 
rich are getting richer faster), privacy, reliability of the network, cost of a 
transaction, etc.

I would argue that this problem is not because the rich are rich but because of 
the routing algorithm. Currently, used routing algorithms usually calculate the 
probability of routing a payment over some channel as a channel capacity 
function. If the channel has a large capacity probability is higher, small 
channels will have a lower probability of payment routing. Because all wallets 
are effectively running in `Fast Payment` mode, the routing algorithm will 
almost always choose bigger channel routes than smaller ones and thus create 
the hub-based model we have today.

If we remove `fees` and `cltvs` from the gossip, what is left for the routing 
algorithm is `channel_capacity`, `htlc_minimum_msat` and `htlc_maximum_msat`. 
`htlc_maximum_msat` - can be a local node config. If the reservation attempts 
to reserve a greater amount than the maximum allowed for a single payment, a 
node can try to split the reservation over multiple channels.
`htlc_maximum_msat` - can remain a global config to avoid unnecessary 
reservation failures.

What about channel capacity? 

Channel capacity can be the biggest misleading factor for LN routing 
algorithms. For instance, I can create a node and then buy 100 channels with 
1BTC each of income liquidity. Then I set all channel fees to 0. Every routing 
algorithm will try to route payments through my node, but my node can't route a 
single satoshi. I would suspect that today's most popular nodes on the LN 
network probably have way more inbound than outbound liquidity.
So I think that capacity should not be used at all as a heuristic in the 
routing algorithms.
I think a better approach would be if nodes would just publish their cumulative 
outbound liquidity over all channels. If there are routing node A and routing 
node B, and they are well connected, we know that we can send(in theory) from 
node A to node B, with route split, a maximum payment of `min(A.balance, 
B.balance)` amount.
This would also signal to the rest of the network where liquidity is needed and 
where there is too much liquidity. There would be no liquidity sinks, and 
network liquidity could be better deployed. Automatic channel opening becomes 
trivial.

But would the nodes lie? 
If the node lies, the reservation will fail on his node, and if there is a lot 
of reservation failure on misleading nodes, neighbor nodes might close the 
channel with him. Also, while processing huge payment reservations, which will 
eventually fail, a node could have routed a smaller payment and earned fees. So 
nodes will be incentives to report the real outbound liquidity or lower if 
privacy is the issue.

Now routing algorithm has become very simple. We make routing decisions just on 
hop count from our node, on node reported balance, and a bit of randomness. 
Randomness is the most important part. If some route fails, we can't know which 
node to blame. It can be any node in the route. The routing algorithm should 
not track any blame information about individual nodes and would not try to 
blacklist any particular node. If the node is well connected and the routing 
algorithm is random enough, we should be able eventually to find enough routes 
to the sender. If there is a faulty, griefing, or just slow node in the 
network, we let nodes handle deputies themselves. Each node will track the 
local reputation of each of its neighbors. If there are a lot of failed 
payments from one neighbor, the node operator might contact that neighbor for 
an explanation or just close the channel to that neighbor node.
 
For privacy currencies, not publishing channel capacity is a necessity. Privacy 
currency users (like Monero) don't want to relieve any of their UTXO amounts. 
Privacy currency LN can avoid publishing their channel capacity and outbound 
node liquidity, and only make routing decisions on hop counts and randomness. 
There would probably be more reservation failures, especially for larger 
amounts, but it should be worth privacy preserved.


Fast spam prevention
====================

Fast spam is a situation on the network when malicious nodes start creating a 
lot of payments to the random nodes, which will never be resolved because the 
receiver does not know the payment preimage. The receiver can only fail the 
payment. 
This is not possible with a reservation anymore because receivers will reject a 
reservation, and there will be no payment. Now, this is a lot better because 
the reservation does not commit to a new channel state, and no DB operations 
are involved.

But how to prevent reservation spam?
If the node sees a lot of reservation errors per minute coming from one 
channel, the node can throttle reservations coming from that channel. 
Eventually, if spam continues node operator can inform the neighbor node 
operator to investigate the reason. The neighbor node will do the same. And if 
the problem persists, the node operator can just close the channel.

But what if an attacker controls both nodes(receiver and sender) or if the 
attacker creates a circular route?

The malicious receiver node will accept the reservation, and then when HTLC 
comes, it just fails the payment. This would be basically fast spam again, with 
a bit more work to be done by the attacker.
As proposed before[10], we can demand a prepayment. If a node sees a large 
amount of failed payments per minute on some of its channels, a node can start 
demanding some prepayment to route HTLC. The problem is how much to charge. If 
we charge too little, spam will continue. If we charge too much, this will 
affect regular micro-payments. 
We can extend reservation_success tuple with a prepayment fee. And we can 
adjust the fee on the fly. There is no need to propagate it to the whole 
network. A node can initially charge 0 prepayment fees. This is in their best 
interest because the routing algorithm will prefer routes without prepayment. 
The reason is that there is still a chance of payment failure. In a payment 
failure case, the sender loses the pre-payed amount unconditionally. Now if a 
node observes a large number of payment failures per minute, he can start 
aggressively increasing his prepayment fees, thus making the attack very costly.

There was also a discussion on how to send a prepayment. Wouldn't the first 
node in the route just steal all prepayment and then fail the payment?
There are two possible scenarios here:
 - prepayment is less than or equal to the fees the node is going to earn 
routing the payment. In this case, it is in the node's best interest to route 
the payment.
 - prepayment amount is greater than node payment fees. I can see two possible 
solutions:
    - let a first node steal the prepayment. This should be fine because if 
prepayment gets so big, most likely, the attacker is the one being robbed. 
Honest users will prefer routes without a prepayment, especially large 
prepayments.
    - we can also try to send multiple keysend prepayments, but there are some 
privacy issues here because the length of the route is revealed to the first 
node, and the last route node finds out who the receiver is. A slightly less 
bad option is to send prepayments for the parts of the route. For instance, one 
keysend is for the first half of the route. The second is for the second half 
of the route.

[10] 
https://lists.linuxfoundation.org/pipermail/lightning-dev/2020-February/002547.html

Channel liquidity probing
=========================

With a route reservation split, inspecting channel liquidity from other nodes 
becomes significantly more demanding. The reason is the fact that reservations 
should rarely fail. Now observer needs to deduct the channel liquidity from the 
reservation response (fee,cltv_delta). If there is a reservation split, fees 
and cltv_delta will most likely be higher than if there is no reservation 
split. But every node can change fee and cltv_delta on the fly, and by just 
randomizing those values on every return, potential channel observer jobs get 
much harder.
Also, if the attacker starts sending a lot of reservation requests, those 
requests might be seen as a DOS attempt, and the node might start throttling 
the reservation requests.


No need for splice-in(and partially splice-out)
===============================================

The idea behind the splicing is to increase the capacity of the channel, so 
that channel can route bigger payments. Also, the routing algorithm prefers a 
bigger channel so the node would route more payments.
With route split reservations, this is not necessary anymore. For example, if 
nodes A and B have N channels between them, node A  would have the option to 
reserve a route through all N channels during reservation. Thus N smaller 
channels can be seen as one big channel, not by routing algorithm but by node 
A. Routing algorithm will just care if there is a connection between A and B.

If nodes A and B have N channels between them, closing one channel basically 
becomes a splice-out operation.



Looking forward to your feedback.

Best Regards,
g0b1el




_______________________________________________
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev

Reply via email to