Hi all,

I have a setup with two OpenBSD boxes. Both run a BGP peering with our upstream internet provider, and both provide redundancy for our internal LANs with CARP and pfsync.
The setup looks like the following:

     | $ext_if             | $ext_if
     | (with BGP)          | (with BGP)
 .........             .........
 |  r0a  |.............|  r0b  |
 |_______|  $pfsync_if |_______|
   |   |                 |   |
   |a  |b                |a  |b

(a and b are connected to our two LAN segments and are CARPed. They are called $dmz_if and $lan_if later on.)

So far this setup works fine. r0a is master and usually all traffic is routed over this machine. In case r0a goes down, r0b takes over. The BGP peering to our upstream on r0a has a "depend on carp0" and a "depend on carp1" in its configuration. In addition, the two machines have a BGP peering with each other over $pfsync_if.

I verified my setup as follows:

a.) Unplugging either $dmz_if or $lan_if or both on r0a makes r0b take over. Everything works fine; traffic then gets routed from the LAN to r0b and on to the internet.

b.) Switching off r0a makes r0b take over. Everything works fine; traffic then gets routed from the LAN to r0b and on to the internet.

In one case the fail-over does not work well: if the BGP peering from r0a to the upstream goes down, all traffic is routed from r0a via $pfsync_if to r0b and from there on to the upstream. SSH and browsing web pages over HTTP work that way, but downloads over HTTP or FTP do not. As long as traffic gets routed from the LAN via r0a to r0b, every large download stalls after a few kbytes. With tcpdump I found out that the first few kbytes make it through and that ICMP host-unreachable messages are generated afterwards. If I do a "pfctl -F rules" on both machines, everything works fine.

At first I suspected a blocking rule in PF, so I put a "log" option on every "block" rule I have on both machines. I made sure that pflogd is running, attached tcpdump to pflog0 and repeated my fail-over tests. Apparently no packets get blocked on either machine.

For now I wonder why all downloads stall in the specific fail-over situation mentioned above.
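For reference, the logging check described above comes down to something like the following. This is a sketch rather than a transcript of what was run; the exact options shown are assumptions, and the commands need root on a box with pf enabled.

```sh
# is pflogd running? (the [p] keeps grep from matching itself)
ps ax | grep '[p]flogd'

# list the loaded rules and confirm that the block rules carry "log"
pfctl -sr | grep block

# watch logged packets live on the pflog interface during a fail-over test
tcpdump -n -e -ttt -i pflog0

# the workaround mentioned above: flush the filter rules
pfctl -F rules
```

With -e, tcpdump on pflog0 also prints the rule number and action for each logged packet, which would show which block rule matched if any did.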
Rather small transfers (browsing web pages, using SSH) work fine, and I can't see any reason why downloads should stall.

I attached my pf.conf below. I replaced all IP addresses with (empty). I added the altq stuff only recently; it wasn't there while I did the fail-over tests mentioned above, so I guess that altq has no impact on my problem. The pf.conf contains only a few rules because the setup is not yet in production and only a few machines are using it so far.

I also attached a dmesg of r0a. Both my OpenBSD boxes are the same machine type, so one dmesg should be enough.

I would appreciate some hints on how I could debug this problem further. I have already spent some hours on it and so far I haven't been able to get any closer to solving it.

With best regards,
Thomas.

./.

###
## macros

# interfaces
ext_if="fxp0"
pfsync_if="fxp1"
lan_if="em0"
lan_carp_if="carp0"
dmz_if="fxp2"
dmz_carp_if="carp1"

# ranges
UNIVERSE="0.0.0.0/0"
EXT="(empty)" # transit range
EXT_CLR="(empty)" # upstream router
PFSYNC="(empty)" # pfsync range
LAN="(empty)"
DMZ="(empty)"
VPN="(empty)"
HOUSING="(empty)"

# machines
R1="(empty)" # router 1
R1_HOU="(empty)"
TEMP_HOU="(empty)"

# housings
MATTENWEG="(empty)"
CRAWFISH="(empty)"
AWAY="(empty)"

# services
sVPN="{ 1194 }"
sVNC="{ 5900, 5800 }"
sRDP="{ 3389 }"
sRETRO="{ 497 }"
sFR="{ 8088 }"

###
## options
set skip on lo0 # skip loopback
set skip on $pfsync_if # skip pfsync
set block-policy return # always return on block

###
## normalization
scrub in all
scrub out random-id

###
## Traffic Shaping
altq on $ext_if bandwidth 100Mb cbq queue { lan_dparent, dmz_parent, prem_services }
queue lan_dparent on $ext_if bandwidth 5Mb cbq { lan_out, lan_aout, voip_out }
queue lan_out on $ext_if bandwidth 2.5Mb cbq
queue lan_aout on $ext_if bandwidth 500Kb priority 6 cbq
queue voip_out on $ext_if bandwidth 2Mb priority 7 cbq
queue dmz_parent on $ext_if bandwidth 10Mb cbq {dmz_out, dmz_ack, hou_out}
queue dmz_out on $ext_if bandwidth 7.5Mb cbq(default, borrow)
queue dmz_ack on $ext_if bandwidth 500Kb priority 6 cbq
queue hou_out on $ext_if bandwidth 2Mb cbq
queue prem_services on $ext_if bandwidth 20Mb cbq { pp_out, tt_out }
queue pp_out on $ext_if bandwidth 10Mb cbq(borrow)
queue tt_out on $ext_if bandwidth 10Mb cbq

altq on $lan_if bandwidth 100Mb cbq queue { lan_parent }
queue lan_parent on $lan_if bandwidth 5Mb cbq { lan_std, lan_ack, voip_out }
queue lan_std on $lan_if bandwidth 2.5Mb cbq(default, borrow)
queue lan_ack on $lan_if bandwidth 500Kb priority 6 cbq
queue voip_out on $lan_if bandwidth 2Mb priority 7 cbq

altq on $dmz_if bandwidth 100Mb cbq queue { dmz_in, hou_in, pp_in, tt_in }
queue dmz_in on $dmz_if bandwidth 10Mb cbq(default)
queue hou_in on $dmz_if bandwidth 2Mb cbq
queue pp_in on $dmz_if bandwidth 20Mb cbq
queue tt_in on $dmz_if bandwidth 10Mb cbq

###
## NAT
nat on $ext_if inet from $LAN to $UNIVERSE -> $dmz_carp_if
nat on $lan_if inet from $VPN to $LAN -> carp0

###
## rules - general

# deny all traffic incoming and outgoing
block return in all
block return out all

# antispoof
antispoof for { $dmz_if $lan_if $ext_if } inet

# allow CARP and pfsync
pass quick on $lan_if proto carp keep state
pass quick on $dmz_if proto carp keep state

# allow ICMP
pass in on { $ext_if, $dmz_if, $lan_if } inet proto icmp icmp-type 8 code 0 keep state

###
## rules - communication on ext. interface

# allow us to reach the outside
pass out quick on $ext_if from $LAN to $UNIVERSE modulate state queue ( lan_out, lan_aout )
pass out quick on $ext_if from $DMZ to $UNIVERSE modulate state queue ( dmz_out, dmz_ack )
pass out quick on $ext_if from $HOUSING to $UNIVERSE modulate state queue ( hou_out, dmz_ack )
pass out quick on $ext_if to $UNIVERSE modulate state

# allow BGP peering
pass in quick on $ext_if proto tcp from $EXT_CLR to $ext_if port bgp

###
## rules - communication on LAN interface

# allow traffic to LAN
pass out quick on $lan_if to $LAN keep state

# allow LAN to talk to the outside, but not to pfsync
block in quick on $lan_if from $LAN to $PFSYNC
pass in quick on $lan_if from $LAN to $UNIVERSE keep state queue (lan_std, lan_ack)

# allow LAN to use ssh and NTP
pass in quick on $lan_if proto tcp from $LAN to $lan_if port ssh keep state
pass in quick on $lan_if proto udp from $LAN to $lan_if port ntp keep state

###
## rules - communication on DMZ interface

# allow traffic to DMZ
pass out quick on $dmz_if to $DMZ keep state

# allow DMZ to talk to the outside, but not to pfsync and LAN
block in quick on $dmz_if from $DMZ to $PFSYNC
block in quick on $dmz_if from $DMZ to $LAN
pass in quick on $dmz_if from $DMZ to $UNIVERSE keep state queue (dmz_out, dmz_ack)

# allow DMZ to use NTP
pass in quick on $dmz_if proto udp from $DMZ to $dmz_if port ntp keep state

# allow R1 to use BGP
pass in quick on $dmz_if proto tcp from $R1 to $dmz_if port bgp keep state

###
## rules - vpn

# allow VPN to reach LAN
pass in quick on $dmz_if from $VPN to $LAN keep state
pass out quick on $dmz_if from $LAN to $VPN keep state
pass out quick on $dmz_if from $DMZ to $VPN keep state

###
## rules - housings

# deny reaching our LAN and PFSYNC
block in quick on $dmz_if from $HOUSING to $LAN
block in quick on $dmz_if from $HOUSING to $PFSYNC
pass in quick on $dmz_if from $R1_HOU to $UNIVERSE keep state
pass out quick on $dmz_if from $UNIVERSE to $R1_HOU keep state

# allow mattenweg to connect to the outside
pass in quick on $dmz_if from $MATTENWEG to $UNIVERSE keep state
pass out quick on $dmz_if from $UNIVERSE to $MATTENWEG keep state

# allow crawfish to connect to the outside
pass in quick on $dmz_if from $CRAWFISH to $UNIVERSE keep state
pass out quick on $dmz_if from $UNIVERSE to $CRAWFISH keep state

# allow away to connect to the outside
pass in quick on $dmz_if from $AWAY to $UNIVERSE keep state
pass out quick on $dmz_if from $UNIVERSE to $AWAY keep state

# allow TEMP_HOU to connect to the outside
pass in quick on $dmz_if from $TEMP_HOU to $UNIVERSE keep state
pass out quick on $dmz_if from $UNIVERSE to $TEMP_HOU keep state

###
## rules - allowed services on servers

# R1
pass in quick on $ext_if proto {tcp, udp} from $UNIVERSE to $R1 port $sVPN flags S/SA keep state

# TEMP_HOU
pass in quick on $ext_if proto tcp from $UNIVERSE to (empty) port 80 keep state

# MATTENWEG
mattenweg_tcp="{ ftp, ftp-data, 32000:32999, http, https, imap, pop3, smtp, 5800, 5900, 497, 3389, mysql, 4711, 28900, 14667, 90, pptp }"
mattenweg_udp="{ 497, 14567:14570, 22000, 23000:23009, 27900 } "
pass in quick on $ext_if proto tcp from $UNIVERSE to $MATTENWEG port $mattenweg_tcp flags S/SA keep state queue hou_in
pass in quick on $ext_if proto udp from $UNIVERSE to $MATTENWEG port $mattenweg_udp keep state queue hou_in

# CRAWFISH
crawfish_tcp="{ ftp, ftp-data, 32000:32999, http, https, 497, mysql, 3389, 5800, 5900, 8088 }"
crawfish_udp="{ 497 }"
pass in quick on $ext_if proto tcp from $UNIVERSE to $CRAWFISH port $crawfish_tcp flags S/SA keep state
pass in quick on $ext_if proto udp from $UNIVERSE to $CRAWFISH port $crawfish_udp keep state

# AWAY
away_tcp="{ domain, ftp, ftp-data, 32000:32999, http, https, imap, ntp, pop3, smtp, 5800, 5900, 3389, 8888 8383 } "
away_udp="{ domain ntp 8888 } "
pass in quick on $ext_if proto tcp from $UNIVERSE to $AWAY port $away_tcp flags S/SA keep state
pass in quick on $ext_if proto udp from $UNIVERSE to $AWAY port $away_udp keep state

./.

OpenBSD 3.9-stable (COMMELL-LE564) #0: Sun May 7 15:42:38 CEST 2006
    [EMAIL PROTECTED]:/usr/local/src/flashboot-0.9beta1/obj/COMMELL-LE564
cpu0: VIA Samuel 2 ("CentaurHauls" 686-class) 533 MHz
cpu0: FPU,DE,TSC,MSR,MTRR,PGE,MMX
real mem = 131706880 (128620K)
avail mem = 84967424 (82976K)
using 1633 buffers containing 6688768 bytes (6532K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(51) BIOS, date 11/24/03, BIOS32 rev. 0 @ 0xfb590
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
apm0: flags 70102 dobusy 1 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xf0000/0xdef4
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfde50/160 (8 entries)
pcibios0: PCI Exclusive IRQs: 5 10 11 12
pcibios0: PCI Interrupt Router at 000:07:0 ("VIA VT82C596A ISA" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc0000/0xc000 0xcc000/0x4000!
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "VIA VT8601 PCI" rev 0x05
ppb0 at pci0 dev 1 function 0 "VIA VT82C601 AGP" rev 0x00
pci1 at ppb0 bus 1
vga0 at pci1 dev 0 function 0 "Trident CyberBlade i1" rev 0x6a
wsdisplay0 at vga0 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pcib0 at pci0 dev 7 function 0 "VIA VT82C686 ISA" rev 0x40
pciide0 at pci0 dev 7 function 1 "VIA VT82C571 IDE" rev 0x06: ATA100, channel 0 configured to compatibility, channel 1 configured to compatibility
wd0 at pciide0 channel 0 drive 0: <Industrial CF Card>
wd0: 1-sector PIO, LBA, 250MB, 512000 sectors
wd0(pciide0:0:0): using PIO mode 0
pciide0: channel 1 disabled (no drives)
uhci0 at pci0 dev 7 function 2 "VIA VT83C572 USB" rev 0x1a: irq 10
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 7 function 3 "VIA VT83C572 USB" rev 0x1a: irq 10
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
viaenv0 at pci0 dev 7 function 4 "VIA VT82C686 SMBus" rev 0x40
em0 at pci0 dev 16 function 0 "Intel PRO/1000MT (82540EM)" rev 0x02: irq 5, address 00:03:1d:02:5c:a7
fxp0 at pci0 dev 17 function 0 "Intel 8255x" rev 0x10, i82551: irq 12, address 00:03:1d:02:5c:a8
inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
fxp1 at pci0 dev 18 function 0 "Intel 8255x" rev 0x10, i82551: irq 10, address 00:03:1d:02:5c:a9
inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4
fxp2 at pci0 dev 19 function 0 "Intel 8255x" rev 0x10, i82551: irq 11, address 00:03:1d:02:5c:aa
inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4
isa0 at pcib0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
midi0 at pcppi0: <PC speaker>
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
biomask e7c5 netmask ffe5 ttymask ffe7
rd0: fixed, 61440 blocks
pctr: user-level cycle counter enabled
dkcsum: wd0 matches BIOS drive 0x80
root on rd0a
rootdev=0x1100 rrootdev=0x2f00 rawdev=0x2f02