Hi list!
We're playing around with two 4.6 boxes, running carp and relayd. We
successfully got a basic DSR setup running, and it seems to be working
fine! However, established connections break when we fail over to the
secondary box.
In normal operation, all inbound packets go nicely through the box, and
return packets go from the Linux server directly back to the router.
Long-running sessions don't seem to be a problem either.
The current setup consists of a clean PF ruleset, with only the anchor
rules for relayd, and the following relayd.conf:
fle_vip="10.0.0.40"
fle2="10.0.0.42"
table <fle> { $fle2 }
redirect fle {
        listen on $fle_vip port 443 interface vlan412
        route to <fle> check tcp interface vlan413
}
As simple as it gets! Anyway, as I said, that part works fine. The
state I see after a connection has been established is the following:
all tcp 10.0.0.40:443 <- 192.168.0.1:50786 ESTABLISHED:ESTABLISHED
[0 + 1] [1035366774 + 2]
age 00:00:09, expires in 00:09:59, 514:0 pkts, 30678:0 bytes,
anchor 0, rule 0, sloppy
I see this state on both boxes, so pfsync is working properly.
When I demote the master, and the backup takes over, the TCP
connection gets terminated immediately. Looking at the state, it goes
into TIME_WAIT on both boxes:
all tcp 10.0.0.40:443 <- 192.168.0.1:50786 TIME_WAIT:TIME_WAIT
[0 + 1] [1035366774 + 2]
age 00:00:18, expires in 00:02:59, 1221:0 pkts, 67459:0 bytes,
anchor 0, rule 0, sloppy
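
For anyone who wants to compare the entries from both boxes
mechanically rather than by eye, here is a minimal sketch in Python.
It only parses the first line of each state entry as pasted above; the
regex and the field names (left, right, left_state, right_state) are
made up for illustration and are not pf terminology.

```python
import re

# Hypothetical parser for the first line of a pasted pf state entry.
# The pattern is derived only from the two samples in this mail.
STATE_RE = re.compile(
    r"^(?P<iface>\S+)\s+(?P<proto>\S+)\s+"
    r"(?P<left>\S+)\s+(?P<direction><-|->)\s+(?P<right>\S+)\s+"
    r"(?P<left_state>[A-Z_0-9]+):(?P<right_state>[A-Z_0-9]+)$"
)


def parse_state(line):
    """Split one state line into a dict of named fields."""
    match = STATE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised state line: {line!r}")
    return match.groupdict()


def same_connection(a, b):
    """True if two parsed entries describe the same protocol and endpoints."""
    keys = ("proto", "left", "direction", "right")
    return all(a[k] == b[k] for k in keys)


before = parse_state(
    "all tcp 10.0.0.40:443 <- 192.168.0.1:50786 ESTABLISHED:ESTABLISHED"
)
after = parse_state(
    "all tcp 10.0.0.40:443 <- 192.168.0.1:50786 TIME_WAIT:TIME_WAIT"
)

if same_connection(before, after):
    # Prints: ESTABLISHED -> TIME_WAIT
    print(before["left_state"], "->", after["left_state"])
```

Feeding it the state lines from the master and the backup makes it
easy to confirm that both boxes track the same connection and to see
which side of the state changed.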
Looking at the packets, I see the following on the incoming interface
on the master before I fail over:
09:33:04.941171 192.168.0.1.50786 > 10.0.0.40.443: . ack 3071549 win
33124 <nop,nop,timestamp 501213487 164423091>
09:33:04.942591 192.168.0.1.50786 > 10.0.0.40.443: . ack 3072750 win
32523 <nop,nop,timestamp 501213487 164423091>
Those were the last packets seen before failover, and immediately
after failover this is what I see on the slave:
09:33:05.601850 192.168.0.1.50786 > 10.0.0.40.443: . ack 4221828731
win 32448 <nop,nop,timestamp 501213494 164423759>
09:33:05.601865 10.0.0.40.443 > 192.168.0.1.50786: R
4221828731:4221828731(0) win 0 (DF)
This is followed by a bunch more of those: ACKs answered with RSTs.
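
One thing that can be checked from the capture is whether each RST
carries the acknowledgment number of the ACK it answers as its own
sequence number. RFC 793 describes that pattern for a reset sent in
reply to an ACK when the responder has no matching connection. The
sketch below is only an illustration in Python: the two lines are
re-typed (unwrapped) from the capture, the helper name is invented,
and a match is a hint about who generated the RST, not proof.

```python
import re

# Re-typed, unwrapped tcpdump lines from the slave after failover.
# Only the flag and number fields are inspected, not the addresses.
CAPTURE = [
    "09:33:05.601850 192.168.0.1.50786 > 10.0.0.40.443: . ack 4221828731 "
    "win 32448 <nop,nop,timestamp 501213494 164423759>",
    "09:33:05.601865 10.0.0.40.443 > 192.168.0.1.50786: R "
    "4221828731:4221828731(0) win 0 (DF)",
]

# A pure ACK as printed by tcpdump: ": . ack <number> "
ACK_RE = re.compile(r": \. ack (\d+) ")
# An empty RST as printed by tcpdump: ": R <seq>:<seq>(0)"
RST_RE = re.compile(r": R (\d+):(\d+)\(0\)")


def rst_answers_ack(ack_line, rst_line):
    """True if the RST's sequence number equals the ACK's ack number."""
    ack = ACK_RE.search(ack_line)
    rst = RST_RE.search(rst_line)
    if ack is None or rst is None:
        return False
    return int(rst.group(1)) == int(ack.group(1))


# Prints: True
print(rst_answers_ack(CAPTURE[0], CAPTURE[1]))
```

For the pair shown here the numbers match (4221828731 in both), so the
same check could be run over the rest of the capture to see whether
every RST follows that pattern.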
As there are no other rules in pf, there shouldn't be anything
explicitly dropped, at least. I suspect something fishy is going on
with the states. I've tried pfctl -x loud, but it doesn't report
anything.
Does anyone have any clues about what the problem could be? Googling
doesn't turn up many hits on the subject, except for the undeadly
article and the original commits, so I suspect there aren't that many
users/experimenters of this yet. :)
Thanks for any input!
Best regards
Johan