Indeed, this is how you typically set up a multihomed service: put the
addresses on lo and then announce them using BGP or something similar.
If you use one of the network links directly for the service and that
link's network goes down (it may not even be in your AS, so you may not
know), then the service is offline.
Use a route-map in your BGP config to set the source address of routes
to the address on lo; that works for wg :)
/Peter
On 2023-02-19 13:10, Nico Schottelius wrote:
Aside from nginx + icmp being handled correctly as a reference,
I want to further elaborate on this case to show that something is
really wrong with the current behaviour:
A typical scenario for routers is to have a lot of globally reachable IP
addresses (IPv6, IPv4) assigned to the loopback interface, such as on
this system:
[13:11] router2.place6:~# ip a sh dev lo
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:1e:a::b/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:1e:a::a/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:2:a::b/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:2:a::a/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:2:1::7/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:2:1::6/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:2:1::5/128 scope global
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
The motivation is that these IP addresses stay reachable independent of
the interface that actually routes the traffic.
Now, if wireguard selects the source IP based on the outgoing
interface, this is never going to work, as lo cannot send packets to
the outside world.
Nico Schottelius writes:
Let me rephrase the problem statement:
- ping and http calls to the multihomed machine work correctly:
I can ping 147.78.195.254 and the reply contains the same address.
I can ping 195.141.200.73 and the reply contains the same address.
I can curl 147.78.195.254 and the reply contains the same address.
I can curl 195.141.200.73 and the reply contains the same address.
- wireguard does NOT work because it changes the reply address:
A packet sent to 147.78.195.254 is answered from 195.141.200.73
In general, processes reply from the IP address that was used to contact
them, not from the address of the outgoing interface; the latter
behaviour would also break adding IP addresses to the loopback interface.
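
To illustrate the "reply from the address that was contacted" behaviour
for a UDP service, here is a minimal Python sketch for Linux using
IP_PKTINFO. It is not how wireguard is implemented; the port number and
the helper names (local_addr_from_ancdata, ancdata_for_source,
serve_echo) are made up for this example, and the fallback value 8 for
IP_PKTINFO is an assumption that only holds on Linux.

```python
import socket
import struct

# socket.IP_PKTINFO is only exposed by newer Python versions;
# 8 is the Linux value (assumption: this sketch targets Linux only).
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)
# struct in_pktinfo { int ipi_ifindex; in_addr ipi_spec_dst; in_addr ipi_addr; }
PKTINFO_FMT = "i4s4s"


def local_addr_from_ancdata(ancdata):
    """Return the IPv4 address the datagram was sent to, or None."""
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
            _ifindex, _spec_dst, dst = struct.unpack(PKTINFO_FMT, data[:12])
            return socket.inet_ntoa(dst)
    return None


def ancdata_for_source(src_ip):
    """Build ancillary data asking the kernel to send from src_ip.

    ifindex 0 leaves the choice of outgoing interface to the routing table.
    """
    pktinfo = struct.pack(PKTINFO_FMT, 0, socket.inet_aton(src_ip), bytes(4))
    return [(socket.IPPROTO_IP, IP_PKTINFO, pktinfo)]


def serve_echo(port=5555):  # hypothetical port, for illustration only
    """Echo datagrams back, using the contacted address as the source."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
    sock.bind(("0.0.0.0", port))
    while True:
        data, ancdata, _flags, peer = sock.recvmsg(2048, socket.CMSG_SPACE(12))
        local = local_addr_from_ancdata(ancdata)
        reply_anc = ancdata_for_source(local) if local else []
        sock.sendmsg([data], reply_anc, 0, peer)
```

The point of the sketch is that the source address of the reply is taken
from the received packet and the outgoing interface is left to the
routing table, so an address that only lives on lo can still be used.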
For full detail, see the IP addresses [0], the routing [1] and the
tests executed [2] below.
I believe that this is a bug in wireguard.
[2]
Let's see what it looks like in detail:
1) ping to 147.78.195.254: works
[9:14] nb3:~% ping -c2 147.78.195.254
PING 147.78.195.254 (147.78.195.254) 56(84) bytes of data.
64 bytes from 147.78.195.254: icmp_seq=1 ttl=53 time=7.27 ms
64 bytes from 147.78.195.254: icmp_seq=2 ttl=53 time=6.30 ms
--- 147.78.195.254 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 6.296/6.781/7.267/0.485 ms
/ # tcpdump -ni any host 194.5.220.43
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
08:14:48.379618 net1 In IP 194.5.220.43 > 147.78.195.254: ICMP echo request, id 89, seq 1, length 64
08:14:48.379651 net2 Out IP 147.78.195.254 > 194.5.220.43: ICMP echo reply, id 89, seq 1, length 64
08:14:49.380340 net1 In IP 194.5.220.43 > 147.78.195.254: ICMP echo request, id 89, seq 2, length 64
08:14:49.380392 net2 Out IP 147.78.195.254 > 194.5.220.43: ICMP echo reply, id 89, seq 2, length 64
2) ping to 195.141.200.73
[9:14] nb3:~% ping -c2 195.141.200.73
PING 195.141.200.73 (195.141.200.73) 56(84) bytes of data.
64 bytes from 195.141.200.73: icmp_seq=1 ttl=53 time=11.3 ms
64 bytes from 195.141.200.73: icmp_seq=2 ttl=53 time=6.81 ms
--- 195.141.200.73 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 6.813/9.057/11.301/2.244 ms
[9:15] nb3:~%
/ # tcpdump -ni any host 194.5.220.43
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
08:16:19.257697 net2 In IP 194.5.220.43 > 195.141.200.73: ICMP echo request, id 91, seq 1, length 64
08:16:19.257730 net2 Out IP 195.141.200.73 > 194.5.220.43: ICMP echo reply, id
91,