Graeme Fowler wrote:
> Gerry
>
> On Thu, 2007-08-02 at 14:49 -0400, Gerry Reno wrote:
>> I would like to know how to make LVS reliable even when taking servers
>> down for maintenance.
>
> I think you need to back up a bit and take stock.
>
> Firstly, keepalived is not LVS. It's a combined VRRP implementation,
> healthcheck subsystem and comprehensive LVS configuration system. It has
> its own mailing list, the details of which you'll find at
> http://www.keepalived.org/ - several of your questions have wider remit
> than just LVS and although the two lists overlap, the union of the two
> areas is not completely inclusive of both.
>
> I think you need to understand a bit about L2 networks before proceeding
> (and pardon me if you do already). When you restart keepalived and it
> becomes MASTER for a given vrrp_instance, it will send gratuitous ARP
> packets out on the local LAN which say, in effect, "$VIP has MAC address
> so-and-so". Any systems listening which honour GARP will flush their ARP
> cache and put the relevant MAC/IP pair in there.
>
> If you stop the master, the backup *should* transition to MASTER state
> and send out GARPs for $VIP. The same thing should happen.
>
> In your case, this does not appear to be true. Do you have the same
> firewall rules in place on both master and backup directors? Does the
> backup make the transition properly (see the logs)? In the state where
> the master is down and backup is MASTER (IYSWIM), can you see traffic on
> the external interface on the backup? What does your router's ARP cache
> contain at that moment?
>
> For now, that'll do. We'll move onto LVS when you have keepalived/VRRP
> behaving as you want it to.
>
> Graeme

Hi Graeme,

This is all LVS-DR, and I admit I am no network expert, but I do think I
understand the basic concepts of how LVS functions. Here is some basic
information about my setup:

FIREWALLS: both MASTER and BACKUP are identical:

[EMAIL PROTECTED] keepalived]# service iptables status
Table: filter
Chain INPUT (policy ACCEPT)
num  target               prot opt source       destination
1    RH-Firewall-1-INPUT  0    --  0.0.0.0/0    0.0.0.0/0

Chain FORWARD (policy ACCEPT)
num  target  prot opt source       destination
1    REJECT  0    --  0.0.0.0/0    0.0.0.0/0    reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
num  target  prot opt source       destination

Chain RH-Firewall-1-INPUT (1 references)
num  target  prot opt source       destination
1    ACCEPT  0    --  0.0.0.0/0    224.0.0.18
2    ACCEPT  0    --  0.0.0.0/0    0.0.0.0/0
3    ACCEPT  icmp --  0.0.0.0/0    0.0.0.0/0    icmp type 255
4    ACCEPT  esp  --  0.0.0.0/0    0.0.0.0/0
5    ACCEPT  ah   --  0.0.0.0/0    0.0.0.0/0
6    ACCEPT  udp  --  0.0.0.0/0    224.0.0.251  udp dpt:5353
7    ACCEPT  udp  --  0.0.0.0/0    0.0.0.0/0    udp dpt:631
8    ACCEPT  tcp  --  0.0.0.0/0    0.0.0.0/0    tcp dpt:631
9    ACCEPT  0    --  0.0.0.0/0    0.0.0.0/0    state RELATED,ESTABLISHED
10   ACCEPT  tcp  --  0.0.0.0/0    0.0.0.0/0    state NEW tcp dpt:22
11   ACCEPT  tcp  --  0.0.0.0/0    0.0.0.0/0    state NEW tcp dpt:443
12   ACCEPT  tcp  --  0.0.0.0/0    0.0.0.0/0    state NEW tcp dpt:80
13   ACCEPT  tcp  --  0.0.0.0/0    0.0.0.0/0    state NEW tcp dpts:1010:1023
14   ACCEPT  tcp  --  0.0.0.0/0    0.0.0.0/0    state NEW tcp dpt:904
15   REJECT  0    --  0.0.0.0/0    0.0.0.0/0    reject-with icmp-host-prohibited

CONFIGS:

vrrp_instance VI_1 {
    state MASTER
    # state BACKUP
    interface eth0
    track_interface {
        eth0
    }
    lvs_sync_daemon_interface eth0
    virtual_router_id 25
    priority 150 # MASTER
    # priority 100 # BACKUP
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass tps
    }
    virtual_ipaddress {
        192.168.1.240/24
    }
    notify_master "/etc/keepalived/manage_ip_lvs_dr del"
    notify_backup "/etc/keepalived/manage_ip_lvs_dr add"
    notify_fault "/etc/keepalived/manage_ip_lvs_dr add"
}

virtual_server 192.168.1.240 22 {
    ...
    real_server 192.168.1.200 22 {
        ...
        TCP_CHECK
    }
    real_server 192.168.1.201 22 {
        ...
        TCP_CHECK
    }
}

virtual_server 192.168.1.240 80 {
    ...
    real_server 192.168.1.200 80 {
        ...
        TCP_CHECK
    }
    real_server 192.168.1.201 80 {
        ...
        TCP_CHECK
    }
}

virtual_server 192.168.1.240 443 {
    ...
    real_server 192.168.1.200 443 {
        ...
        TCP_CHECK
    }
    real_server 192.168.1.201 443 {
        ...
        TCP_CHECK
    }
}

NOTIFY SCRIPT ACTIONS:

case del:
    rsh ALL_RS ip addr add 192.168.1.240/32 dev lo brd + scope host
    rsh ALL_RS echo "1" > /proc/sys/net/ipv4/conf/eth0/arp_ignore
    rsh ALL_RS echo "2" > /proc/sys/net/ipv4/conf/eth0/arp_announce
    rsh ALL_RS route del default
    rsh ALL_RS route add default gw 192.168.1.1
case add:
    ip addr add 192.168.1.240/32 dev lo brd + scope host

TESTCASES:

1. manual failover to backup:

on MASTER: service keepalived stop

MASTER log:
Aug 2 16:08:16 grp-01-00-50 Keepalived: Terminating on signal
Aug 2 16:08:16 grp-01-00-50 Keepalived_vrrp: Terminating VRRP child process on signal
Aug 2 16:08:16 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) removing protocol VIPs.
Aug 2 16:08:16 grp-01-00-50 Keepalived_healthcheckers: Netlink reflector reports IP 192.168.1.240 removed
Aug 2 16:08:16 grp-01-00-50 kernel: IPVS: stopping sync thread 11674 ...
Aug 2 16:08:16 grp-01-00-50 Keepalived_healthcheckers: Terminating Healthchecker child process on signal
Aug 2 16:08:16 grp-01-00-50 Keepalived: Stopping Keepalived v1.1.13 (03/26,2007)
Aug 2 16:08:16 grp-01-00-50 kernel: IPVS: sync thread stopped!

BACKUP log:
Aug 2 16:08:17 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Aug 2 16:08:19 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Aug 2 16:08:19 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) setting protocol VIPs.
Aug 2 16:08:19 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.1.240
Aug 2 16:08:19 grp-01-00-51 Keepalived_healthcheckers: Netlink reflector reports IP 192.168.1.240 added
Aug 2 16:08:19 grp-01-00-51 Keepalived_vrrp: Netlink: skipping nl_cmd msg...
Aug 2 16:08:19 grp-01-00-51 avahi-daemon[2053]: Registering new address record for 192.168.1.240 on eth0.IPv4.
Aug 2 16:08:19 grp-01-00-51 kernel: IPVS: stopping sync thread 2689 ...
Aug 2 16:08:19 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.200 /sbin/ip addr add 192.168.1.240/32 dev lo brd + scope host
Aug 2 16:08:19 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): RTNETLINK answers: File exists
Aug 2 16:08:19 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.200 echo "1" > /proc/sys/net/ipv4/conf/eth0/arp_ignore
Aug 2 16:08:20 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.200 echo "2" > /proc/sys/net/ipv4/conf/eth0/arp_announce
Aug 2 16:08:20 grp-01-00-51 kernel: IPVS: sync thread stopped!
Aug 2 16:08:24 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.1.240
Aug 2 16:08:24 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.200 "/sbin/route del default; /sbin/route add default gw 192.168.1.1"
Aug 2 16:08:24 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): bash: /sbin/route del default; /sbin/route add default gw 192.168.1.1: No such file or directory
Aug 2 16:08:25 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.201 /sbin/ip addr add 192.168.1.240/32 dev lo brd + scope host
Aug 2 16:08:25 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): RTNETLINK answers: File exists
Aug 2 16:08:25 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.201 echo "1" > /proc/sys/net/ipv4/conf/eth0/arp_ignore
Aug 2 16:08:25 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.201 echo "2" > /proc/sys/net/ipv4/conf/eth0/arp_announce
Aug 2 16:08:25 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.201 "/sbin/route del default; /sbin/route add default gw 192.168.1.1"
Aug 2 16:08:31 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): bash: /sbin/route del default; /sbin/route add default gw 192.168.1.1: No such file or directory
Aug 2 16:08:31 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): ip addr del 192.168.1.240/32 dev lo
Aug 2 16:08:31 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): RTNETLINK answers: Cannot assign requested address
Aug 2 16:08:47 grp-01-00-51 ntpd[1839]: time reset -0.469433 s
Aug 2 16:09:26 grp-01-00-51 ntpd[1839]: Listening on interface #7 eth0, 192.168.1.240#123 Enabled

RESULT: SUCCESS

2. restart MASTER

on MASTER: service keepalived start

MASTER log:
Aug 2 16:11:50 grp-01-00-50 Keepalived: Starting Keepalived v1.1.13 (03/26,2007)
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Using MII-BMSR NIC polling thread...
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Netlink reflector reports IP 192.168.1.150 added
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Registering Kernel netlink reflector
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Registering Kernel netlink command channel
Aug 2 16:11:50 grp-01-00-50 Keepalived: Starting Healthcheck child process, pid=11934
Aug 2 16:11:50 grp-01-00-50 Keepalived_vrrp: Using MII-BMSR NIC polling thread...
Aug 2 16:11:50 grp-01-00-50 Keepalived_vrrp: Netlink reflector reports IP 192.168.1.150 added
Aug 2 16:11:50 grp-01-00-50 Keepalived_vrrp: Registering Kernel netlink reflector
Aug 2 16:11:50 grp-01-00-50 Keepalived_vrrp: Registering Kernel netlink command channel
Aug 2 16:11:50 grp-01-00-50 Keepalived_vrrp: Registering gratutious ARP shared channel
Aug 2 16:11:50 grp-01-00-50 Keepalived_vrrp: Configuration is using : 35690 Bytes
Aug 2 16:11:50 grp-01-00-50 Keepalived: Starting VRRP child process, pid=11935
Aug 2 16:11:50 grp-01-00-50 kernel: IPVS: sync thread started: state = MASTER, mcast_ifn = eth0, syncid = 25
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Configuration is using : 20835 Bytes
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Activating healtchecker for service [192.168.1.200:22]
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Activating healtchecker for service [192.168.1.201:22]
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Activating healtchecker for service [192.168.1.200:80]
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Activating healtchecker for service [192.168.1.201:80]
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Activating healtchecker for service [192.168.1.200:443]
Aug 2 16:11:50 grp-01-00-50 Keepalived_healthcheckers: Activating healtchecker for service [192.168.1.201:443]
Aug 2 16:11:50 grp-01-00-50 Keepalived_vrrp: VRRP sockpool: [ifindex(2), proto(112), fd(8,9)]
Aug 2 16:11:51 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Aug 2 16:11:53 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Aug 2 16:11:53 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) setting protocol VIPs.
Aug 2 16:11:53 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.1.240
Aug 2 16:11:53 grp-01-00-50 Keepalived_vrrp: Netlink: skipping nl_cmd msg...
Aug 2 16:11:53 grp-01-00-50 Keepalived_healthcheckers: Netlink reflector reports IP 192.168.1.240 added
Aug 2 16:11:53 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.200 /sbin/ip addr add 192.168.1.240/32 dev lo brd + scope host
Aug 2 16:11:53 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): RTNETLINK answers: File exists
Aug 2 16:11:53 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.200 echo "1" > /proc/sys/net/ipv4/conf/eth0/arp_ignore
Aug 2 16:11:54 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.200 echo "2" > /proc/sys/net/ipv4/conf/eth0/arp_announce
Aug 2 16:11:54 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.200 "/sbin/route del default; /sbin/route add default gw 192.168.1.1"
Aug 2 16:11:58 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.1.240
Aug 2 16:11:59 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): bash: /sbin/route del default; /sbin/route add default gw 192.168.1.1: No such file or directory
Aug 2 16:11:59 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.201 /sbin/ip addr add 192.168.1.240/32 dev lo brd + scope host
Aug 2 16:12:00 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): RTNETLINK answers: File exists
Aug 2 16:12:00 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.201 echo "1" > /proc/sys/net/ipv4/conf/eth0/arp_ignore
Aug 2 16:12:00 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.201 echo "2" > /proc/sys/net/ipv4/conf/eth0/arp_announce
Aug 2 16:12:00 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): rsh 192.168.1.201 "/sbin/route del default; /sbin/route add default gw 192.168.1.1"
Aug 2 16:12:06 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): bash: /sbin/route del default; /sbin/route add default gw 192.168.1.1: No such file or directory
Aug 2 16:12:06 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): ip addr del 192.168.1.240/32 dev lo
Aug 2 16:12:06 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): RTNETLINK answers: Cannot assign requested address

BACKUP log:
Aug 2 16:11:51 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) Received higher prio advert
Aug 2 16:11:51 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
Aug 2 16:11:51 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) removing protocol VIPs.
Aug 2 16:11:51 grp-01-00-51 Keepalived_vrrp: Netlink: skipping nl_cmd msg...
Aug 2 16:11:51 grp-01-00-51 avahi-daemon[2053]: Withdrawing address record for 192.168.1.240 on eth0.
Aug 2 16:11:51 grp-01-00-51 Keepalived_healthcheckers: Netlink reflector reports IP 192.168.1.240 removed
Aug 2 16:11:51 grp-01-00-51 kernel: IPVS: stopping sync thread 2691 ...
Aug 2 16:11:51 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr (caller: keepalived): ip addr add 192.168.1.240/32 dev lo brd + scope host
Aug 2 16:11:51 grp-01-00-51 kernel: IPVS: sync thread stopped!
Aug 2 16:11:52 grp-01-00-51 kernel: IPVS: sync thread started: state = BACKUP, mcast_ifn = eth0, syncid = 25

RESULT: SUCCESS

Note: when I check the browser connection by clicking a link in an existing
webapp session, the first click yields an error, but a second click returns
the page.
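
A side note on the notify script lines in the logs above, offered as a guess rather than a diagnosis. An unquoted `>` in `rsh HOST echo "1" > /proc/...` is handled by the shell on the director, so the value may never reach the real server. The "No such file or directory" message is also consistent with the route command reaching the remote shell as a single quoted word. The sketch below reproduces both effects locally. `fake_rsh`, the `rs1` host name and the scratch directories are hypothetical stand-ins invented for illustration; no real rsh or /proc file is touched.

```shell
#!/bin/sh
# Sketch only: fake_rsh is a hypothetical local stand-in for rsh. Like rsh, it
# joins its arguments with spaces and hands the result to a shell, but it runs
# that shell in a scratch "realserver" directory instead of on another host.
WORK=$(mktemp -d)
mkdir "$WORK/director" "$WORK/realserver"

fake_rsh() {
    shift                                    # drop the host name argument
    ( cd "$WORK/realserver" && sh -c "$*" )  # the "remote" shell parses the joined string
}

cd "$WORK/director" || exit 1

# 1. Unquoted redirection: the calling (director) shell opens the file,
#    so the value lands on the director side.
fake_rsh rs1 echo 1 > arp_ignore

# 2. Whole command quoted: the redirection travels to the "remote" shell.
fake_rsh rs1 'echo 2 > arp_announce'

# 3. Literal quotes inside the argument: the "remote" shell sees one word and
#    tries to run it as a single program name, which fails with status 127.
fake_rsh rs1 '"/bin/true; /bin/true"' 2>/dev/null
status=$?

echo "director side:   $(ls "$WORK/director")"
echo "realserver side: $(ls "$WORK/realserver")"
echo "status of the quoted-word call: $status"
# The scratch directory in $WORK is left in place for inspection.
```

If the same thing is happening in the real script, passing each remote command as one single-quoted string with no extra embedded quotes, for example `rsh HOST 'echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_ignore'`, would be the first thing to try.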

So here is what ipvsadm -l shows:

on MASTER:

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port     Forward Weight ActiveConn InActConn
TCP  192.168.1.240:https rr persistent 600
  -> 192.168.1.201:https    Route   1      0          0
  -> 192.168.1.200:https    Route   1      0          0
TCP  10.3.0.3:http wlc persistent 600
TCP  192.168.1.240:http rr persistent 600
  -> 192.168.1.201:http     Route   1      0          0
  -> 192.168.1.200:http     Route   1      0          0
TCP  192.168.1.240:ssh rr persistent 600
  -> 192.168.1.201:ssh      Route   1      0          0
  -> 192.168.1.200:ssh      Route   1      0          0

[EMAIL PROTECTED] keepalived]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:a7:c7:33 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.150/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.240/24 scope global secondary eth0
    inet6 fe80::20c:29ff:fea7:c733/64 scope link
       valid_lft forever preferred_lft forever

on BACKUP:

[EMAIL PROTECTED] ~]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port     Forward Weight ActiveConn InActConn
TCP  192.168.1.240:https rr persistent 600
  -> 192.168.1.201:https    Route   1      0          0
  -> 192.168.1.200:https    Route   1      0          0
TCP  192.168.1.240:http rr persistent 600
  -> 192.168.1.201:http     Route   1      3          0
  -> 192.168.1.200:http     Route   1      0          0
TCP  192.168.1.240:ssh rr persistent 600
  -> 192.168.1.201:ssh      Route   1      0          0
  -> 192.168.1.200:ssh      Route   1      0          0

[EMAIL PROTECTED] ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet 192.168.1.240/32 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:54:ef:09 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.151/24 brd 192.168.1.255 scope global eth0
    inet6 fe80::20c:29ff:fe54:ef09/64 scope link
       valid_lft forever preferred_lft forever

Please notice that the connections show up on the BACKUP (ActiveConn is 3 for
192.168.1.201:http) even though the VIP/24 is on the MASTER's eth0 interface.
This is what I do not understand. How is this possible?

Anyway, is this enough information? Please let me know what else I can
provide.

Thanks,
Gerry

_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
Send requests to [EMAIL PROTECTED]
or go to http://lists.graemef.net/mailman/listinfo/lvs-users