Hi,

Just wondered if anyone had any ideas on this?

Thanks,

Ian

On Mon, Apr 18, 2022 at 9:52 AM Ian Chilton <[email protected]> wrote:

> Hi,
>
> I think this is state related, as that's when I've seen this symptom
> before - traffic being dropped even though there are rules to permit it. In
> this case, I can't see why though.
>
> I have two OpenBSD hosts, both doing BGP. Let's call them gw1 and gw2
> here. I've replaced my real loopback IP with 172.16.0.x. When running
> rpki-client on them, I found one had double the ROAs of the other. After
> some investigation, I found that RIPE is missing from gw2 because it can't
> talk to rpki.ripe.net.
>
> I'm using: "!route sourceaddr -ifp lo1" on both hosts, so outgoing traffic
> from the hosts themselves originates from the loopback address.
>
> If I ping with a source address of the loopback, it works from gw1, but
> not from gw2:
>
> ichilton@gw1:~$ ping -I 172.16.0.90 -v rpki.ripe.net
> PING rpki.ripe.net (172.16.0.90 --> 193.0.6.138): 56 data bytes
> 64 bytes from 193.0.6.138: icmp_seq=0 ttl=252 time=7.671 ms
> 64 bytes from 193.0.6.138: icmp_seq=1 ttl=252 time=7.580 ms
>
> ichilton@gw2:~$ ping -I 172.16.0.91 -v rpki.ripe.net
> PING rpki.ripe.net (172.16.0.91 --> 193.0.6.138): 56 data bytes
> ^C
> --- rpki.ripe.net ping statistics ---
> 2 packets transmitted, 0 packets received, 100.0% packet loss
>
> The outbound path for that is out of a connected transit interface and the
> inbound path is a transit interface on gw1.
>
> When pinging from gw2, I see the echo request go out the correct interface
> on gw2:
>
> root@gw2:~# tcpdump -i vlan367 -n host 193.0.6.138
> tcpdump: listening on vlan367, link-type EN10MB
> 09:32:18.837929 172.16.0.91 > 193.0.6.138: icmp: echo request
> 09:32:19.837919 172.16.0.91 > 193.0.6.138: icmp: echo request
>
> I see the reply come in on the transit interface on gw1:
>
> root@gw1:~# tcpdump -i vlan313 -n host 193.0.6.138
> tcpdump: listening on vlan313, link-type EN10MB
> 09:33:00.835072 193.0.6.138 > 172.16.0.91: icmp: echo reply (DF)
> 09:33:01.835271 193.0.6.138 > 172.16.0.91: icmp: echo reply (DF)
>
> Then it should route over the linknet interface, vlan409. However, the
> replies are not there. They are dropped somewhere between the interfaces.
>
> This is where it's interesting.
>
> The relevant parts of my ruleset are:
>
> set skip on lo
> block all
> pass out quick on linknet from (self)
> pass out quick on { admin, external, linknet } proto { tcp, udp }
> pass quick proto { icmp, icmp6 }
>
> So ICMP is allowed full stop, as is outgoing traffic on the linknet.
>
> If I disable pf or comment out the 'block all', then it instantly starts
> working - I see the echo replies start flowing on the linknet (vlan409)
> interface and pings succeed. As soon as I re-instate 'block all', it stops
> again. If I put 'pass all' above or below 'block all', it doesn't help, nor
> does 'pass quick all', 'pass quick on vlan409' or anything that should
> otherwise pass the traffic (which is unsurprising as I've already got 'pass
> quick proto icmp').
>
> If I add 'log' to the block all, I can see on pflog0 that it's the 'block
> all' rule which is blocking it.
>
> I've seen similar behaviour before I set up pfsync, but in this case
> pfsync seems to be working fine and both hosts have a state entry which is
> created when I start pings:
>
> root@gw2:~# pfctl -ss |grep 193.0.6.138
> all icmp 172.16.0.91:8123 -> 193.0.6.138:8 0:0
>
> root@gw1:~# pfctl -ss |grep 193.0.6.138
> all icmp 172.16.0.91:8123 -> 193.0.6.138:8 0:0
>
> Is anyone able to shed any light on what's going on?
>
> Thanks,
>
> Ian
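
For anyone wanting to reproduce the logging step described in the quoted message, a minimal sketch of the same ruleset with logging enabled might look like the following. The 'log' on 'block all' is what the message describes. The 'log' on the ICMP pass rule is an assumption added purely for comparison and is not part of the quoted config. The names admin, external and linknet are copied as they appear in the quoted ruleset. The commands in the comments are the usual pflog0 and pfctl invocations, shown only as a suggested way to see which rule, direction and interface each logged packet hit.

```
# Sketch only: the quoted ruleset with logging added.
set skip on lo
block log all                          # 'log' as described in the quoted message
pass out quick on linknet from (self)
pass out quick on { admin, external, linknet } proto { tcp, udp }
pass log quick proto { icmp, icmp6 }   # 'log' here is an added assumption, for comparison

# Watch the log on gw1 while pinging from gw2; -e prints the pflog header
# (rule number, action, direction, interface) for each logged packet:
#   tcpdump -n -e -ttt -i pflog0 host 193.0.6.138
#
# Map the rule number shown there back to a rule:
#   pfctl -vvsr
```

Comparing the direction and interface in the pflog0 output against the state entries from 'pfctl -ss' should at least show whether the drop happens inbound on vlan313 or outbound on vlan409.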
