Re: carp master <-> backup problem
Hello,

I noticed that my netstat -s -p carp shows:

        1068 discarded for bad authentication

My carp works okay otherwise, but should I worry about it? How do I debug it?

Bryan Irvine wrote:
> VVV
> 372 discarded for unknown vhid
>
> I know someone else already pointed it out but this is worth drawing
> your attention to as well.
>
> -B
Re: carp master <-> backup problem
VVV
> 372 discarded for unknown vhid

I know someone else already pointed it out but this is worth drawing your attention to as well.

-B
Re: carp master <-> backup problem
Bryan Irvine wrote:
> I do believe preempt should be 1 on both servers. Let the advskew handle
> which one is primary.
>
> What do you see for output of 'netstat -s -p carp' and 'netstat -s -p pfsync'
>
> -B

I tried it with both servers set to preempt=1, with the same results, but to double check I did it again. The results are identical to everything I posted previously, except (on the secondary server):

$ sysctl net.inet.carp
net.inet.carp.allow=1
net.inet.carp.preempt=1
net.inet.carp.log=2

Per your request:

(on the primary:)

$ netstat -s -p carp
carp:
        226 packets received (IPv4)
        0 packets received (IPv6)
                0 packets discarded for bad interface
                0 packets discarded for wrong TTL
                0 packets shorter than header
                0 discarded for bad checksums
                0 discarded packets with a bad version
                0 discarded because packet too short
                0 discarded for bad authentication
                226 discarded for unknown vhid
                0 discarded because of a bad address list
        387 packets sent (IPv4)
        0 packets sent (IPv6)
                0 send failed due to mbuf memory error
        1 transition to master

(on the secondary:)

$ netstat -s -p carp
carp:
        335 packets received (IPv4)
        0 packets received (IPv6)
                0 packets discarded for bad interface
                0 packets discarded for wrong TTL
                0 packets shorter than header
                0 discarded for bad checksums
                0 discarded packets with a bad version
                0 discarded because packet too short
                0 discarded for bad authentication
                335 discarded for unknown vhid
                0 discarded because of a bad address list
        236 packets sent (IPv4)
        0 packets sent (IPv6)
                0 send failed due to mbuf memory error
        1 transition to master

This was done after a clean reboot (both) and my accessing the site from an external shell account I have (using lynx). The secondary still responds first, and when it is taken offline (halt -p), the primary does not take over (no answer). The primary only takes over normal duties when the hostname.carp0 file has been renamed on the secondary, the secondary has actually been rebooted and sh /etc/netstart has been run on the primary.
After the secondary was taken offline, and sh /etc/netstart run on the primary, I accessed the site again (the primary is then the only carp node), and did this:

(from the primary)

$ netstat -s -p carp
carp:
        372 packets received (IPv4)
        0 packets received (IPv6)
                0 packets discarded for bad interface
                0 packets discarded for wrong TTL
                0 packets shorter than header
                0 discarded for bad checksums
                0 discarded packets with a bad version
                0 discarded because packet too short
                0 discarded for bad authentication
                372 discarded for unknown vhid
                0 discarded because of a bad address list
        704 packets sent (IPv4)
        0 packets sent (IPv6)
                0 send failed due to mbuf memory error
        1 transition to master

As for output regarding pfsync, all values are zero because I do not use pfsync. It is a single firewall with two web servers internally, not a redundant firewall situation. No changes have been made to the firewall at all.

I'm at my wits' end for why this doesn't work. It *must* be something wrong with my config, as I just don't believe it's a "bug" in carp. This config is practically straight out of the FAQ so I'm at a total loss. :(

FWIW, the pf.conf on the firewall uses these values (which normally work fine):

(...)
gw_ext=$ext_ip4         <-- my external IP addy for that web site, I have 5 IPs
gw_int="192.168.0.9"    <-- the carp node, or when not using carp, the primary web server
#gw_int="192.168.0.19"  <-- for when I manually switch to the secondary server
gw_ports="{ 80, 443 }"
int0_if="xl0"
tcp_flags="flags S/SA modulate state"
(...)
not_private="{ \
    !0.0.0.0/8, \
    !10.0.0.0/8, \
    !127.0.0.0/8, \
    !169.254.0.0/16, \
    !172.16.0.0/12, \
    !192.8.2.0/24, \
    !192.168.0.0/16, \
    !240.0.0.0/4, \
    !255.255.255.255/32 \
}"
(...)
rdr on $ext_if proto tcp from $not_private to $gw_ext port \
    $gw_ports -> $gw_int
(...)
pass in log quick on $ext_if inet proto tcp from $not_private to $gw_int \
    port $gw_ports flags S/SA synproxy state
(...)
pass out quick on $int0_if proto tcp from $not_private to $gw_int \
    port $gw_ports $tcp_flags

The firewall config has worked fine and hasn't been changed in ages, but I can't help wondering if something there is screwing up carp. Redoing and simplifying the fw rules (using tags) is next on my todo list, but I figured I'd get carp working first before changing a "known good" fw config and adding another change to the mix.

--
-RSM
http://www.erratic.ca
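The counters in this message already carry the hint Bryan points at: on each node the "discarded for unknown vhid" count equals the IPv4 "packets received" count, so every CARP packet from the peer is being dropped. As a rough sketch only, the check could be scripted as below. The function name check_carp_stats is made up for illustration and is not an OpenBSD tool; it only assumes the two counter lines look the way they do in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenBSD): summarise `netstat -s -p carp`
# text read from stdin and flag the case where every received packet was
# discarded for an unknown vhid.
check_carp_stats() {
    awk '
        BEGIN { rx = 0; bad = 0 }
        /packets received \(IPv4\)/  { rx = $1 }
        /discarded for unknown vhid/ { bad = $1 }
        END {
            if (rx > 0 && bad == rx)
                print "all " rx " CARP packets discarded for unknown vhid"
            else
                print rx " received, " bad " discarded for unknown vhid"
        }'
}

# Counters copied from the primary's output in this thread.
check_carp_stats <<'EOF'
carp:
        226 packets received (IPv4)
        0 packets received (IPv6)
        226 discarded for unknown vhid
EOF
```

On a live host the same function would be fed with `netstat -s -p carp | check_carp_stats`. A result where both numbers match suggests looking at the vhid on each node before anything else.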
Re: carp master <-> backup problem
Peter Hessler wrote:
> On 2009 Oct 28 (Wed) at 01:55:40 -0400 (-0400), Scott wrote:
> :$ cat /etc/hostname.carp0:
> :inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0
>
> -snip-
>
> :$ cat /etc/hostname.carp0
> :inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew
> :100 carpdev xl0
>
> The vhids need to be identical.

And therein lies the solution. I misunderstood the documents and thought that each carp node had a unique vhid. I've since tested with both online, the master offline, then put back, etc. and all works *perfectly* fine now!

I knew it was my bad. Thank you very much for pointing out my error, and to the others that helped out. I'm sorry for the noise.

BTW: I forgot to mention this, but thanks to all the folks involved with 4.6. The CDs arrived just outside of Toronto on 19 Oct (Monday last week.) :) :)

--
-RSM
http://www.erratic.ca
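For anyone hitting the same thing, here is a small sketch of the comparison Peter did by eye. The helper names carp_vhid and compare_vhid are hypothetical, and the "fixed" secondary line assumes vhid 1 was the value chosen; the thread does not say which number Scott settled on, only that both nodes must use the same one.

```shell
#!/bin/sh
# Hypothetical helper: print the number that follows the "vhid" keyword
# in a hostname.carp0 line.
carp_vhid() {
    printf '%s\n' "$1" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "vhid") { print $(i + 1); exit } }'
}

# Hypothetical helper: compare the vhid of two hostname.carp0 lines.
compare_vhid() {
    a=$(carp_vhid "$1")
    b=$(carp_vhid "$2")
    if [ "$a" = "$b" ]; then
        echo "vhid $a on both nodes"
    else
        echo "vhid mismatch: $a vs $b"
    fi
}

# Lines as posted in the thread.
primary='inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0'
secondary='inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew 100 carpdev xl0'
# Assumption: the secondary was changed to vhid 1 to match the primary.
secondary_fixed='inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 advbase 1 advskew 100 carpdev xl0'

compare_vhid "$primary" "$secondary"
compare_vhid "$primary" "$secondary_fixed"
```

With the lines above this reports a mismatch (1 vs 2) for the posted config and a match for the assumed fix. The advskew 100 on the secondary is left alone, since that is what keeps it as the backup once both nodes share a vhid.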
Re: carp master <-> backup problem
On 2009 Oct 28 (Wed) at 01:55:40 -0400 (-0400), Scott wrote:
:$ cat /etc/hostname.carp0:
:inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0

-snip-

:$ cat /etc/hostname.carp0
:inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew
:100 carpdev xl0

The vhids need to be identical.

--
Legalize free-enterprise murder: why should governments have all the fun?
Re: carp master <-> backup problem
On 01:55, Wed 28 Oct 09, Scott wrote:
> I must be missing something in my config, and I'd appreciate it if my
> blunder could be pointed out to me.
>
[snip]

Do you have pf enabled? If so, make sure you allow carp traffic on the physical interface that runs carp.

--
Michiel van Baak
mich...@vanbaak.eu
http://michiel.vanbaak.eu
GnuPG key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x71C946BD

"Why is it drug addicts and computer aficionados are both called users?"
Re: carp master <-> backup problem
On Tue, Oct 27, 2009 at 10:55 PM, Scott wrote:
> I must be missing something in my config, and I'd appreciate it if my
> blunder could be pointed out to me.
>
> I have two web servers behind a firewall (all machines are running
> 4.6-stable, generic kernel). The firewall has rdr & pass rules to both web
> servers, with one commented out at a time. I change it manually when I want
> to switch them. This same setup has been working fine since 4.4.
> Generally, pf routes web traffic to the primary web server (192.168.0.9)
> but sometimes I use its twin at 192.168.0.19.
>
> Today I decided to try using carp to *not* load balance, but use the
> primary and have the secondary kick in when I have the primary offline
> for maintenance instead of me changing the pf rule by hand. Simple
> enough. I read the man pages for carp and ifconfig, and read the
> example in the FAQ. (This will eventually be load balanced in the
> future if I can get MySQL clustering to work on OpenBSD... haven't tried
> that yet.)
>
> The problem is that when I access my site from an external account, my
> primary never gets used, the secondary takes all connections, and to make it
> worse, if the secondary (which is being used first) is taken offline, the
> primary doesn't even get touched. I have to delete the carp i/f on the
> secondary and reboot the primary for web access to go back to normal.
>
> On the primary web server:
>
> $ sysctl net.inet.carp
> net.inet.carp.allow=1
> net.inet.carp.preempt=1
> net.inet.carp.log=2
>
> $ cat /etc/hostname.carp0:
> inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0
>
> $ cat /etc/hostname.fxp0
> inet 192.168.0.2 255.255.255.0 NONE media 100baseTX mediaopt full-duplex
> inet alias 192.168.0.9 255.255.255.0
> inet alias 192.168.0.10 255.255.255.0
> inet alias 192.168.0.11 255.255.255.0
> inet alias 192.168.0.12 255.255.255.0
> inet alias 192.168.0.13 255.255.255.0
>
> $ ifconfig carp0
> carp0: flags=8843 mtu 1500
>         lladdr 00:00:5e:00:01:01
>         priority: 0
>         carp: MASTER carpdev fxp0 vhid 1 advbase 1 advskew 0
>         groups: carp
>         inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0x5
>         inet 192.168.0.9 netmask 0xff00 broadcast 192.168.0.255
>
>
> On the secondary web server:
>
> $ sysctl net.inet.carp
> net.inet.carp.allow=1
> net.inet.carp.preempt=0
> net.inet.carp.log=2
>
> $ cat /etc/hostname.carp0
> inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew
> 100 carpdev xl0
>
> $ cat /etc/hostname.xl0
> inet 192.168.0.3 255.255.255.0 NONE media 100baseTX mediaopt full-duplex
> inet alias 192.168.0.20 255.255.255.0
> inet alias 192.168.0.21 255.255.255.0
> inet alias 192.168.0.22 255.255.255.0
> inet alias 192.168.0.23 255.255.255.0
>
> $ ifconfig carp0
> carp0: flags=8843 mtu 1500
>         lladdr 00:00:5e:00:01:02
>         priority: 0
>         carp: MASTER carpdev xl0 vhid 2 advbase 1 advskew 100
>         groups: carp
>         inet6 fe80::200:5eff:fe00:102%carp0 prefixlen 64 scopeid 0x5
>         inet 192.168.0.9 netmask 0xff00 broadcast 192.168.0.255
>
>
> I have tried making slight changes to the hostname files, such as
> including "advbase 1 advskew 1" to the primary, adding and removing the
> alias for .9 on the master, changing preempt=1 on the secondary, and none of
> it makes any difference. I continually see what (I think) should be the
> backup on the secondary server shown as a master (above), and it takes all
> the web server connections. Other than my carp experiments, everything
> works perfectly fine. I must be missing
> something, somewhere, but I'm out of clues. Any pointers in the right
> direction would be appreciated,
> Thanks.
>
> --
>
> -RSM
>

I do believe preempt should be 1 on both servers. Let the advskew handle which one is primary.

What do you see for output of 'netstat -s -p carp' and 'netstat -s -p pfsync'

-B
Re: carp master <-> backup problem
Marco Pfatschbacher wrote:
> Hi,
>
> I actually didn't read your entire mail.. but:
> Having 192.168.0.9 on both the physical and the carp interface cannot
> really work.

Thanks for trying! Unfortunately, I tried that as well (and double checked it again after your reply) where the carp IP is not assigned anywhere else. Still the problem remains: the backup (secondary server) insists on being the master, and it is given priority when the firewall sends web traffic to the 192.168.0.9 address.

Unfortunately, the ifconfig output with both machines reading "MASTER" remains 100% identical to those in my original message, so I've ruled out that it's somehow a problem with the addresses being aliases.

I still have to mv the /etc/hostname.carp0 file to anything else and reboot for web traffic to flow to the primary server. Grr.

--
-RSM
http://www.erratic.ca
carp master <-> backup problem
I must be missing something in my config, and I'd appreciate it if my blunder could be pointed out to me.

I have two web servers behind a firewall (all machines are running 4.6-stable, generic kernel). The firewall has rdr & pass rules to both web servers, with one commented out at a time. I change it manually when I want to switch them. This same setup has been working fine since 4.4. Generally, pf routes web traffic to the primary web server (192.168.0.9) but sometimes I use its twin at 192.168.0.19.

Today I decided to try using carp to *not* load balance, but use the primary and have the secondary kick in when I have the primary offline for maintenance instead of me changing the pf rule by hand. Simple enough. I read the man pages for carp and ifconfig, and read the example in the FAQ. (This will eventually be load balanced in the future if I can get MySQL clustering to work on OpenBSD... haven't tried that yet.)

The problem is that when I access my site from an external account, my primary never gets used, the secondary takes all connections, and to make it worse, if the secondary (which is being used first) is taken offline, the primary doesn't even get touched. I have to delete the carp i/f on the secondary and reboot the primary for web access to go back to normal.
On the primary web server:

$ sysctl net.inet.carp
net.inet.carp.allow=1
net.inet.carp.preempt=1
net.inet.carp.log=2

$ cat /etc/hostname.carp0:
inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0

$ cat /etc/hostname.fxp0
inet 192.168.0.2 255.255.255.0 NONE media 100baseTX mediaopt full-duplex
inet alias 192.168.0.9 255.255.255.0
inet alias 192.168.0.10 255.255.255.0
inet alias 192.168.0.11 255.255.255.0
inet alias 192.168.0.12 255.255.255.0
inet alias 192.168.0.13 255.255.255.0

$ ifconfig carp0
carp0: flags=8843 mtu 1500
        lladdr 00:00:5e:00:01:01
        priority: 0
        carp: MASTER carpdev fxp0 vhid 1 advbase 1 advskew 0
        groups: carp
        inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0x5
        inet 192.168.0.9 netmask 0xff00 broadcast 192.168.0.255


On the secondary web server:

$ sysctl net.inet.carp
net.inet.carp.allow=1
net.inet.carp.preempt=0
net.inet.carp.log=2

$ cat /etc/hostname.carp0
inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew 100 carpdev xl0

$ cat /etc/hostname.xl0
inet 192.168.0.3 255.255.255.0 NONE media 100baseTX mediaopt full-duplex
inet alias 192.168.0.20 255.255.255.0
inet alias 192.168.0.21 255.255.255.0
inet alias 192.168.0.22 255.255.255.0
inet alias 192.168.0.23 255.255.255.0

$ ifconfig carp0
carp0: flags=8843 mtu 1500
        lladdr 00:00:5e:00:01:02
        priority: 0
        carp: MASTER carpdev xl0 vhid 2 advbase 1 advskew 100
        groups: carp
        inet6 fe80::200:5eff:fe00:102%carp0 prefixlen 64 scopeid 0x5
        inet 192.168.0.9 netmask 0xff00 broadcast 192.168.0.255


I have tried making slight changes to the hostname files, such as including "advbase 1 advskew 1" to the primary, adding and removing the alias for .9 on the master, changing preempt=1 on the secondary, and none of it makes any difference. I continually see what (I think) should be the backup on the secondary server shown as a master (above), and it takes all the web server connections. Other than my carp experiments, everything works perfectly fine. I must be missing something, somewhere, but I'm out of clues. Any pointers in the right direction would be appreciated,
Thanks.

--
-RSM