On 14-3-2012 10:43, Kapetanakis Giannis wrote:
>> While heavily demoted, it still assumes the master role. I guess it's
>> not seeing the carp announcements from firewall-2 at all. Do you use
>> spanning tree in the network?
>
> Yes. The latest change which I did on the switch where the firewalls are
> connected is adding:
> spanning-tree portfast trunk
> spanning-tree bpdufilter enable
> in order to startup the port faster. Don't know if this is causing the
> problem, cause now the ports are coming up really fast. They used to
> come up after 1 minute.

Fast is good.

>> How many states do you typically have? The bulk pfsync is taking a
>> really long time here... 4 minutes. Any errors on the pfsync interface?
>> What speed is it?
> I usually have around 90k states (pfctl -ss |wc -l)
> On both firewalls it's 1Gbps
> media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
> media: Ethernet autoselect (1000baseT full-duplex,master,rxpause,txpause)
>
> # netstat -id
>
> Name             Mtu  Network Address           Ipkts     Ierrs Opkts     Oerrs Colls Drop
> em2(sync_if_f1)  1500 <Link>  00:19:99:98:e4:ea 682406    225   255969304 0     0     0
> bge1(sync_if_f2) 1500 <Link>  00:0a:e4:80:73:3d 387753797 461   1152887   0     0     0

Hmm, 225 errors on 682406 packets is low, but a bit higher than I would
expect to see.

>
> f1# netstat -s
> carp:
>     12 packets received (IPv4)
>     0 packets received (IPv6)
>     0 packets discarded for bad interface
>     0 packets discarded for wrong TTL
>     0 packets shorter than header
>     0 discarded for bad checksums
>     0 discarded packets with a bad version
>     0 discarded because packet too short
>     0 discarded for bad authentication
>     0 discarded for unknown vhid
>     0 discarded because of a bad address list
>     1586084 packets sent (IPv4)
>     0 packets sent (IPv6)
>     0 send failed due to mbuf memory error
>     8 transitions to master
> pfsync:
>     682381 packets received (IPv4)
>     0 packets received (IPv6)
>     0 packets discarded for bad interface
>     0 packets discarded for bad ttl
>     0 packets shorter than header
>     0 packets discarded for bad version
>     0 packets discarded for bad HMAC
>     0 packets discarded for bad action
>     0 packets discarded for short packet
>     0 states discarded for bad values
>     88 stale states
>     809627 failed state lookup/inserts
>     256080550 packets sent (IPv4)
>     0 packets sent (IPv6)
>     0 send failed due to mbuf memory error
>     0 send error

This is not from just after the reboot, right? The "failed state
lookup/inserts" might be interesting just after the firewalls have
stabilized.

> f2# netstat -s
> carp:
>     2236176 packets received (IPv4)
>     0 packets received (IPv6)
>     0 packets discarded for bad interface
>     0 packets discarded for wrong TTL
>     0 packets shorter than header
>     0 discarded for bad checksums
>     0 discarded packets with a bad version
>     0 discarded because packet too short
>     0 discarded for bad authentication
>     0 discarded for unknown vhid
>     0 discarded because of a bad address list
>     460 packets sent (IPv4)
>     0 packets sent (IPv6)
>     0 send failed due to mbuf memory error
>     12 transitions to master
> pfsync:
>     387828563 packets received (IPv4)
>     0 packets received (IPv6)
>     0 packets discarded for bad interface
>     0 packets discarded for bad ttl
>     0 packets shorter than header
>     0 packets discarded for bad version
>     0 packets discarded for bad HMAC
>     0 packets discarded for bad action
>     0 packets discarded for short packet
>     0 states discarded for bad values
>     435 stale states
>     1173653 failed state lookup/inserts
>     1152819 packets sent (IPv4)
>     0 packets sent (IPv6)
>     0 send failed due to mbuf memory error
>     0 send error
>
>
>> What does your ifstated.conf look like?
>>
>
> ifstated runs only on primary firewall.
> Primary firewall runs with advbase 1 advskew 10
> secondary firewall runs with advbase 1 advskew 100
>
> carp_up = "carp0.link.up && carp1.link.up && carp2.link.up && carp3.link.up"
> carp_down = "!carp0.link.up && !carp1.link.up && !carp2.link.up && !carp3.link.up"
> carp_sync = "carp0.link.up && carp1.link.up && carp2.link.up && carp3.link.up || \
>     !carp0.link.up && !carp1.link.up && !carp2.link.up && !carp3.link.up"
>
> # check remote gateways
> net = '( "ping -q -c 1 -w 1 aaa.aaa.aaa.aaa > /dev/null" every 10 && \
>     "ping -q -c 1 -w 1 bbb.bbb.bbb.bbb > /dev/null" every 10 && \
>     "ping -q -c 1 -w 1 ccc.ccc.ccc.ccc > /dev/null" every 10 && \
>     "ping -q -c 1 -w 1 ddd.ddd.ddd.ddd > /dev/null" every 10)'
>
> # check firewall-2
> peer = '( "ping -q -c 1 -w 1 eee.eee.eee.eee > /dev/null" every 10 )'
>
> state auto {
>     if $carp_up
>         set-state primary
>     if $carp_down
>         set-state backup
> }
>
> state primary {
>     init {
>         run "ifconfig carp0 advskew 10"
>         run "ifconfig carp1 advskew 10"
>         run "ifconfig carp2 advskew 10"
>         run "ifconfig carp3 advskew 10"
>     }
>     if ! $net
>         set-state demoted
> }
>
> state demoted {
>     init {
>         run "ifconfig carp0 advskew 200"
>         run "ifconfig carp1 advskew 200"
>         run "ifconfig carp2 advskew 200"
>         run "ifconfig carp3 advskew 200"
>     }
>     if $net
>         set-state primary
> }
>
> state promoted {
>     init {
>         run "ifconfig carp0 advskew 101"
>         run "ifconfig carp1 advskew 101"
>         run "ifconfig carp2 advskew 101"
>         run "ifconfig carp3 advskew 101"
>     }
>     if $net
>         set-state primary
>     if ! $net && $peer
>         set-state backup
> }
>
> state backup {
>     init {
>         run "ifconfig carp0 advskew 254"
>         run "ifconfig carp1 advskew 254"
>         run "ifconfig carp2 advskew 254"
>         run "ifconfig carp3 advskew 254"
>     }
>     # The "sleep 5" below is a hack to dampen the $carp_sync when we come
>     # out of promoted state. Thinking about the correct fix...
>     if ! $carp_sync && $net && "sleep 5" every 10
>         if ! $carp_sync && $net
>             set-state promoted
> }

I would not muck with the advskew like that anymore.
The demotion based on link state works automatically now. If you really
want to keep the ping test, just use "ifconfig -g carp" for the demotion
and promotion.
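
Something along these lines might do, as a rough and untested sketch. It
assumes all four carp interfaces are in the default "carp" interface
group and reuses the $net macro from your config (shortened to one ping
here). The two-state layout is only an illustration, not a drop-in
replacement for your file:

```
# Sketch only: demote the whole carp group when the ping test fails,
# instead of setting advskew on every interface.
net = '( "ping -q -c 1 -w 1 aaa.aaa.aaa.aaa > /dev/null" every 10 )'

init-state primary

state primary {
	if ! $net
		set-state demoted
}

state demoted {
	init {
		# raise the demotion counter of the carp group by 1
		run "ifconfig -g carp carpdemote"
	}
	if $net {
		# lower it again before going back to primary
		run "ifconfig -g carp -carpdemote"
		set-state primary
	}
}
```

The demotion counter is shared by the whole group, so one command covers
carp0 through carp3, and advskew stays at the values set in the
hostname.carpN files.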