Re: Packets passed by pf don't make it out?

2020-10-11 Thread J David
On Sun, Oct 11, 2020 at 12:46 PM Andreas Longwitz wrote:
> Please look at the output of "pfctl -vsn" on fb2 during your test.
> With "netstat -ss | grep drop" you can check for packets dropped by the
> kernel for whatever reason.

Here's the complete diff of the output from netstat -ss from before to
after running the test:

--- nss.pre 2020-10-11 17:10:19.932002000 +
+++ nss.post 2020-10-11 17:10:21.999823000 +
@@ -48,9 +48,9 @@
  Packet drop statistics:
  Timeouts:
 ip:
- 66578 total packets received
+ 66582 total packets received
  66531 packets for this host
- 16 packets forwarded
+ 17 packets forwarded
  1 packet not forwardable
  31675 packets sent from this host
  10 packets sent with fabricated ip header

No drops of any kind (nor anything else) were recorded during the test.

4 packets in, 1 packet forwarded, which exactly matches the observed
behavior of only one packet reaching the server.
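
For anyone repeating this comparison, the before/after arithmetic can be
done mechanically. The following is only a rough sketch, assuming each
snapshot is the saved text of "netstat -ss" and that counter lines have
the form "<number> <description>" under a "<section>:" header; the
function names are made up for illustration.

```python
import re

# A counter line looks like "  66578 total packets received".
COUNTER = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")
# A section header looks like "ip:" (assumed: no leading digit, trailing colon).
SECTION = re.compile(r"^\s*([^\d\s].*):\s*$")


def parse_counters(text):
    """Return {(section, description): value} for one netstat -ss snapshot."""
    counters = {}
    section = None
    for line in text.splitlines():
        m = COUNTER.match(line)
        if m:
            counters[(section, m.group(2))] = int(m.group(1))
            continue
        m = SECTION.match(line)
        if m:
            section = m.group(1)
    return counters


def counter_deltas(before, after):
    """Return only the counters whose value changed between two snapshots."""
    pre = parse_counters(before)
    post = parse_counters(after)
    return {
        key: post[key] - pre[key]
        for key in post
        if key in pre and post[key] != pre[key]
    }
```

Fed the two snapshots diffed above, this should report +4 for "total
packets received" and +1 for "packets forwarded" in the ip section. One
limitation: netstat switches between singular and plural wording
("1 packet" vs. "2 packets"), so a counter that crosses that boundary
would be missed by this simple key match.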

The results of "pfctl -vsn" are a bit more interesting, and also inconsistent.

Before, after a full flush to zero states and counters:

rdr inet proto udp from any to 172.16.0.0/12 port = 12345 -> 10.255.255.3
  [ Evaluations: 0 Packets: 0 Bytes: 0   States: 0 ]
  [ Inserted: uid 0 pid 1044 State Creations: 0 ]

After:

rdr inet proto udp from any to 172.16.0.0/12 port = 12345 -> 10.255.255.3
  [ Evaluations: 4 Packets: 1 Bytes: 44  States: 1 ]
  [ Inserted: uid 0 pid 1044 State Creations: 4 ]

So it says it created four states, but only matched one packet out of
the four it evaluated.  And it didn't create 4 states, either. "pfctl
-s state" shows only 1:

all udp 10.255.255.3:12345 (172.16.0.1:12345) <- 10.0.0.1:23456
NO_TRAFFIC:SINGLE

and pflog0 reported all four packets as matching the pass rule which,
importantly, is based on the destination address after redirection:

17:23:39.039641 rule 0/0(match): pass in on em1: 10.0.0.1.23456 >
10.255.255.3.12345: UDP, length 16
17:23:39.039751 rule 0/0(match): pass in on em1: 10.0.0.1.23456 >
10.255.255.3.12345: UDP, length 16
17:23:39.039769 rule 0/0(match): pass in on em1: 10.0.0.1.23456 >
10.255.255.3.12345: UDP, length 16
17:23:39.039780 rule 0/0(match): pass in on em1: 10.0.0.1.23456 >
10.255.255.3.12345: UDP, length 16

If I repeat the test, this happens:

rdr inet proto udp from any to 172.16.0.0/12 port = 12345 -> 10.255.255.3
  [ Evaluations: 7 Packets: 2 Bytes: 88  States: 1 ]
  [ Inserted: uid 0 pid 1044 State Creations: 7 ]

But still just the one state:

all udp 10.255.255.3:12345 (172.16.0.1:12345) <- 10.0.0.1:23456
NO_TRAFFIC:SINGLE

But only three passes appear in pflog0:

17:29:19.857174 rule 0/0(match): pass in on em1: 10.0.0.1.23456 >
10.255.255.3.12345: UDP, length 16
17:29:19.857193 rule 0/0(match): pass in on em1: 10.0.0.1.23456 >
10.255.255.3.12345: UDP, length 16
17:29:19.857226 rule 0/0(match): pass in on em1: 10.0.0.1.23456 >
10.255.255.3.12345: UDP, length 16

And, to confirm, the first packet from each of these tests did reach
the server; the remaining three from each test did not.
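
The rdr counters quoted above can also be compared run against run. This
is a small sketch, assuming the two bracketed lines that "pfctl -vsn"
prints under the rule are passed in as text; the helper names are
hypothetical and nothing here talks to pf itself.

```python
import re

# Field names as they appear in the bracketed counter lines of "pfctl -vsn".
FIELD = re.compile(
    r"(Evaluations|Packets|Bytes|States|State Creations):\s*(\d+)"
)


def parse_rule_counters(text):
    """Extract the numeric counters from one rule's bracketed lines."""
    return {name: int(value) for name, value in FIELD.findall(text)}


def run_delta(earlier, later):
    """Return how much each counter grew between two samples of one rule."""
    a = parse_rule_counters(earlier)
    b = parse_rule_counters(later)
    return {name: b[name] - a[name] for name in b if name in a}


first_run = """
  [ Evaluations: 4 Packets: 1 Bytes: 44  States: 1 ]
  [ Inserted: uid 0 pid 1044 State Creations: 4 ]
"""
second_run = """
  [ Evaluations: 7 Packets: 2 Bytes: 88  States: 1 ]
  [ Inserted: uid 0 pid 1044 State Creations: 7 ]
"""
growth = run_delta(first_run, second_run)
```

For the two samples above, growth shows Evaluations and State Creations
each rising by 3 while Packets rises by 1 and States stays put, which
lines up with the three pflog0 entries seen in the second run.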

> A similar setup works for me without any problems, so there may be
> something special in your environment.

This has been tested on fresh installs of both FreeBSD 12.1 and 11.4
on both physical hardware and virtual machines including both Xeons
and AMD Epyc.  So it seems like most environmental factors have been
controlled for.

> It seems your routing table on fb2 is empty, please try to set a
> defaultroute, e.g.: "route add default 10.0.0.NN" with any NN.

fb2 does have a default route; it is obtained from DHCP on the first
interface.  But that is not relevant; the client machine (fb1) is
directly connected to fb2's second interface, and the server (fb3) is
directly connected to fb2's third interface.

No additional routes are necessary for this test, and the default
route is never consulted.

Perhaps I have omitted some detail that would make the scenario
clearer?

Thanks!
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: Packets passed by pf don't make it out?

2020-10-09 Thread J David
To investigate this issue, I've created a greatly simplified and
reproducible test case.  The code is available at:

https://github.com/jdavidlists/pfudpbug

It includes the pf.conf, the source code for client and server, and
the rc.conf from all three machines.

The test uses three FreeBSD systems (client, gateway, and server) to
demonstrate that if a client with a bound UDP source port sends UDP
packets to multiple server addresses that a gateway running pf
redirects to one backend server, only the first such packet will be
delivered.  The remaining packets never leave the gateway; they get
lost somewhere after being logged as passing pf via pflog0 but before
a tcpdump running on the gateway's server-facing interface.  They also
do not increase the outbound packet count on the server-facing
interface.

(More detail is available in the repository's README.md.)
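
For readers who want the shape of the client without opening the
repository, here is a minimal sketch. It is not the repository's client:
the function names are invented, and every destination address other
than 172.16.0.1 is a placeholder. The ports (source 23456, destination
12345) and the 16-byte payload follow the pflog0 output elsewhere in this
thread.

```python
import socket


def make_bound_socket(src_port, bind_addr="0.0.0.0"):
    """Create a UDP socket with a fixed source port (the key to the test)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, src_port))
    return sock


def send_probes(sock, destinations, dst_port, payload):
    """Send one datagram to each destination from the same bound socket.

    Returns the number of datagrams handed to the kernel.
    """
    count = 0
    for addr in destinations:
        sock.sendto(payload, (addr, dst_port))
        count += 1
    return count


# Example use on the client machine (addresses beyond 172.16.0.1 are
# placeholders, not taken from the repository):
#
#   sock = make_bound_socket(23456)
#   send_probes(sock,
#               ["172.16.0.1", "172.16.0.2", "172.16.0.3", "172.16.0.4"],
#               12345, b"0123456789abcdef")
```

Because every datagram leaves from the same source address and port, the
packets differ only in destination address until the gateway rewrites
it.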

This may indicate a bug, but I'm not sure whether it is happening
inside pf or farther along the output path.  Nor do I know how to
explore this further.

Is anyone able to point me in the right direction here?

Thanks!

On Fri, Oct 2, 2020 at 1:35 PM J David wrote:
>
> We have pf running on a FreeBSD 11.4 system acting as a load balancer,
> mapping a set of 8 external DNS service IP addresses to a set of
> internal DNS servers, any of which can handle those requests.
>
> When UDP packets from one source IP/port arrive for multiple external
> IPs in a short period of time, pf claims they all pass, but only the
> ones for the first IP actually make it out the outbound interface.
>
> Redirect rule:
>
> rdr inet proto udp from any to { 172.17.53.1, 172.17.53.2,
> 172.17.53.3, 172.17.53.4, 172.17.53.5, 172.17.53.6, 172.17.53.7,
> 172.17.53.8 } port 53 -> { 10.53.0.1, 10.53.0.2, 10.53.0.3 }
> round-robin sticky-address
>
> Pass rule:
>
> pass in log quick proto udp to { 10.53.0.1, 10.53.0.2, 10.53.0.3 } port 53
>
> (The pass rule isn't technically necessary, it's only there to log the
> packets to debug this issue.)
>
> With tcpdumps running simultaneously, all packets show up on ix1, the
> inbound interface:
>
> 16:32:39.183168 IP 149.20.1.48.56246 > 172.17.53.1.53: 3215 SOA?
> example.com. (29)
> 16:32:39.183761 IP 149.20.1.48.56246 > 172.17.53.2.53: 2934 SOA?
> example.com. (29)
> 16:32:39.184368 IP 149.20.1.48.56246 > 172.17.53.3.53: 52875 SOA?
> example.com. (29)
> 16:32:39.185618 IP 149.20.1.48.56246 > 172.17.53.4.53: 36289 SOA?
> example.com. (29)
> 16:32:39.186067 IP 149.20.1.48.56246 > 172.17.53.5.53: 44049 SOA?
> example.com. (29)
> 16:32:39.186422 IP 149.20.1.48.56246 > 172.17.53.6.53: 34410 SOA?
> example.com. (29)
> 16:32:39.186494 IP 149.20.1.48.56246 > 172.17.53.7.53: 30923 SOA?
> example.com. (29)
> 16:32:39.188541 IP 149.20.1.48.56246 > 172.17.53.8.53: 48814 SOA?
> example.com. (29)
>
> and on pflog0:
>
> 16:32:39.183189 rule 16/0(match): pass in on ix1: 149.20.1.48.56246 >
> 10.53.0.1.53: 3215 SOA? example.com. (29)
> 16:32:39.183780 rule 16/0(match): pass in on ix1: 149.20.1.48.56246 >
> 10.53.0.1.53: 2934 SOA? example.com. (29)
> 16:32:39.184375 rule 16/0(match): pass in on ix1: 149.20.1.48.56246 >
> 10.53.0.1.53: 52875 SOA? example.com. (29)
> 16:32:39.185625 rule 16/0(match): pass in on ix1: 149.20.1.48.56246 >
> 10.53.0.1.53: 36289 SOA? example.com. (29)
> 16:32:39.186074 rule 16/0(match): pass in on ix1: 149.20.1.48.56246 >
> 10.53.0.1.53: 44049 SOA? example.com. (29)
> 16:32:39.186425 rule 16/0(match): pass in on ix1: 149.20.1.48.56246 >
> 10.53.0.1.53: 34410 SOA? example.com. (29)
> 16:32:39.186499 rule 16/0(match): pass in on ix1: 149.20.1.48.56246 >
> 10.53.0.1.53: 30923 SOA? example.com. (29)
> 16:32:39.188548 rule 16/0(match): pass in on ix1: 149.20.1.48.56246 >
> 10.53.0.1.53: 48814 SOA? example.com. (29)
>
> but only the first one appears on ix0, the outbound interface:
>
> 16:32:39.183211 IP 149.20.1.48.56246 > 10.53.0.1.53: 3215 SOA? example.com. 
> (29)
>
> The actual query order is random, so if the test is repeated a minute
> later, then 172.17.53.3 might be hit first, and then that one will
> make it through and the rest will disappear.  So it is not specific to
> any destination IP.
>
> It also only appears to occur when the UDP source port is the same
> across the connections.  (This is probably why TCP is not affected.)
>
> It does not appear related to state entries; "no state" doesn't help.
>
> If "sticky-address" is removed from the rdr, then one packet will make
> it through for each backend IP, instead of one total.
>
> What could be causing this?  It seems somehow related to 5-tuple
> non-uniqueness after the rdr, but that shouldn't be an issue for UDP;
> it should be treated as two connectionless packets from the same
> source for the same destination.
>
> (The query test source, 149.20.1.48 is an EDNS checker found at
> https://dnsflagday.net/2020/ .)
>
> Thanks for any advice!
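
The 5-tuple non-uniqueness suspected in the quoted message can be
illustrated with a toy model. This is only a sketch: it does not
reproduce pf's internals, and the redirect() function and its
round-robin/sticky logic are simplified assumptions. It just counts how
many distinct (proto, src, sport, dst, dport) tuples remain after the
destination address is rewritten.

```python
from itertools import cycle


def redirect(packets, backends, sticky):
    """Rewrite each packet's destination address to a backend.

    Toy model only: backends are handed out round-robin, and with
    sticky=True a source address keeps the first backend it was given.
    """
    pool = cycle(backends)
    sticky_map = {}
    rewritten = []
    for proto, src, sport, dst, dport in packets:
        if sticky:
            if src not in sticky_map:
                sticky_map[src] = next(pool)
            new_dst = sticky_map[src]
        else:
            new_dst = next(pool)
        rewritten.append((proto, src, sport, new_dst, dport))
    return rewritten


# The eight queries from the tcpdump in the quoted message.
packets = [
    ("udp", "149.20.1.48", 56246, "172.17.53.%d" % i, 53)
    for i in range(1, 9)
]
backends = ["10.53.0.1", "10.53.0.2", "10.53.0.3"]

distinct_before = len(set(packets))
distinct_sticky = len(set(redirect(packets, backends, sticky=True)))
distinct_plain = len(set(redirect(packets, backends, sticky=False)))
```

In this model the eight packets are distinct before the rewrite,
collapse to one tuple with sticky-address, and to three without it. That
matches the observed counts of delivered packets (one total with
sticky-address, one per backend without), though it says nothing about
where in the kernel the others are lost.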