On Sun, Mar 28, 2010 at 12:31:20PM +0200, svensven wrote:
> In short: ip_vs_conn_in_get() does not match on fwmark, so incoming
> packets to the backup LVS that were forwarded from the master LVS will
> match a synchronized connection and thus be sent through ipvs on the
> backup LVS, which is also the destination realserver. ipvs will loop
> the packet, causing the node to hang. Without conn sync, the nodes
> work fine (though of course breaking existing connections when failing
> over). Tested on Linux 2.6.33.
>
> Here's my setup:
>
>      client ----+
>      10.0.0.3   | vip: 10.0.0.10
>                / \
>               /   \
>  +------------+   +------------+
>  | LVS A (mst)|   | LVS B (bkp)|
>  |Realserver A|   |Realserver B|
>  |  10.0.0.5  |   |  10.0.0.6  |
>  +------------+   +------------+
>
> Both nodes are set up with the vip on lo:10, an iptables rule to set
> the fwmark if the request does not come from the other LVS and
> arp_ignore=1, arp_announce=2 on all interfaces. See net/iptables/
> sysctl config for LVS master [3] and backup [4]. The realservers run
> lighttpd on port 9999 and bind to 0.0.0.0.
>
> Both nodes have an identical keepalived.conf, except for the priority.
> See full keepalived.conf for LVS A [5]. The important parts of it are
> shown below:
>
> virtual_server fwmark 10 {
>     lb_algo rr
>     lb_kind DR
>     real_server 10.0.0.5 9999 {...}
>     real_server 10.0.0.6 9999 {...}
> }
>
> The config includes notify_master/notify_backup scripts that
> start/stop the ipvs connection synchronization daemon. For testing
> purposes, the sync threshold is tweaked to sync after the TCP 3-way
> handshake is done (2 incoming packets seen: SYN and ACK):
>
> net.ipv4.vs.sync_threshold="2 10"
>
> The debug kernel output in [1] shows how the connection fails when the
> client queries the vip, LVS A is master, and the connection is
> forwarded to realserver B.
>
> The debug kernel output in [2] shows how the connection works when the
> client queries the vip, LVS B is the master, and the connection is
> forwarded to realserver B (itself), i.e. with no connection
> synchronization.
>
>
> Questions:
> 1. Should the ip_vs_conn_in_get() function also take fwmark into
> consideration when matching incoming packets to its list of
> established ipvs connections?

I suspect not, as the connection table doesn't include fwmark
information. And I think that there ought to be a simpler resolution
to your problem than refactoring connection table entries.

> 2. Is this the right way of setting up a two-node LVS setup with
> localnodes and connection synchronization on a modern kernel?
> (Assuming the conn sync would not break)

I think that you could get around this problem by only activating
the LVS rules on the master node. Or is that already the case?

> thanks!
> S.
>
> ***
>
> [1]: Example of fail
> LVS A is master, balances to realserver B.
> The output below is from LVS B / realserver B kern.log after:
> * adding LOG entries to iptables -t filter, chain INPUT and OUTPUT
> * setting net.ipv4.vs.debug_level to 13 (max)
> * stripping away some crud, cleaning timestamps, etc
> * adding <notes> on progress
>
> Interesting lines: 11, 21, 28
>
> 1 <Connection from client to VIP>
> 2 [52.351] filter-INPUT : IN=eth0 OUT= MAC=lvsB_mac:lvsA_mac:08:00
> SRC=10.0.0.3 DST=10.0.0.10 SPT=54590 DPT=9999 SYN
> 3 [52.351] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
> 4 [52.351] IPVS: lookup/out TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
> 5 [52.351] IPVS: lookup service: fwm 0 TCP 10.0.0.10:9999 not hit
> 6 [52.351] filter-OUTPUT: IN= OUT=eth0 SRC=10.0.0.10 DST=10.0.0.3
> SPT=9999 DPT=54590 ACK SYN
> 7 [52.457] filter-INPUT : IN=eth0 OUT= MAC=lvsB_mac:lvsA_mac:08:00
> SRC=10.0.0.3 DST=10.0.0.10 SPT=54590 DPT=9999 ACK
> 8 [52.457] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
> 9 [52.457] IPVS: lookup/out TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
> 10 <TCP handshake complete>
> 11 <IPVS state is synchronized from MASTER to BACKUP>
> 12 [52.869] IPVS: packet type=2 proto=17 daddr=224.0.0.81 ignored
> 13 [52.869] IPVS: Enter: ip_vs_receive, net/netfilter/ipvs/ip_vs_sync.c
> line 722
> 14 [52.869] IPVS: Leave: ip_vs_receive, net/netfilter/ipvs/ip_vs_sync.c
> line 733
> 15 [52.869] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
> 16 [52.869] IPVS: lookup service: fwm 0 TCP 10.0.0.10:9999 not hit
> 17 [53.353] IPVS: packet type=5 proto=2 daddr=224.0.0.81 ignored
> 18 <One line of data sent from client to VIP>
> 19 [60.906] filter-INPUT : IN=eth0 OUT= MAC=lvsB_mac:lvsA_mac:08:00
> SRC=10.0.0.3 DST=10.0.0.10 SPT=54590 DPT=9999 ACK PSH
> 20 <Packet matches synchronized state>
> 21 [60.906] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 hit
> 22 [60.906] IPVS: Enter: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c
> line 756
> 23 <IPVS forwards the packet to the local interface>
> 24 [60.906] filter-OUTPUT: IN= OUT=lo SRC=10.0.0.3 DST=10.0.0.10
> SPT=54590 DPT=9999 ACK PSH
> 25 [60.906] IPVS: Leave: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c
> line 789
> 26 [61.011] filter-INPUT : IN=lo OUT=
> MAC=00:00:00:00:00:00:00:00:00:00:00:00:08:00 SRC=10.0.0.3 DST=10.0.0.10
> SPT=54590 DPT=9999 ACK PSH
> 27 <Packet matches synchronized state again ...>
> 28 [61.019] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 hit
> 29 [61.019] IPVS: Enter: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c
> line 756

I think that this is critical to the problem. That is, ip_vs_dr_xmit()
is being called, which causes a loop. I suspect that ip_vs_null_xmit()
should be called instead, in which case the loop wouldn't occur.

Could you post the output of "ipvsadm -Ln"?

I'm also wondering if this relates to a recent report of local
forwarding not working since 2.6.28.
http://marc.info/?l=linux-virtual-server&m=126943987132679&w=2

> 30 <IPVS repeats the forwarding in a loop, machine stops responding>
> 31 [61.030] filter-OUTPUT: IN= OUT=lo SRC=10.0.0.3 DST=10.0.0.10
> SPT=54590 DPT=9999 ACK PSH
> 32 [61.041] IPVS: Leave: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c
> line 789
> 33 [61.074] filter-INPUT : IN=lo OUT=
> MAC=00:00:00:00:00:00:00:00:00:00:00:00:08:00 SRC=10.0.0.3 DST=10.0.0.10
> SPT=54590 DPT=9999 ACK PSH
> 34 [61.083] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 hit
> 35 [61.084] IPVS: Enter: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c
> line 756
> 36 <etc, etc>
>
> Note that the incoming packet is not fwmarked, and that the ipvs
> lookup/in check does not try to match on fwmark.

[snip]
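
For illustration only, here is a toy Python model of the behaviour discussed in this thread. It is not kernel code. It assumes, as described above, that the connection lookup is keyed on protocol, addresses and ports and does not look at the fwmark. The names Packet, Conn, conn_key, conn_in_get and deliver are hypothetical, and the fwd_local flag is merely a stand-in for the suspicion that a local (null) transmit rather than a DR transmit would avoid the loop.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Packet(NamedTuple):
    proto: str
    caddr: str
    cport: int
    vaddr: str
    vport: int
    fwmark: int  # 0 means the packet carries no fwmark


class Conn(NamedTuple):
    dest: str         # real server chosen when the connection was created
    fwd_local: bool   # hypothetical flag: True = hand to the local stack


Key = Tuple[str, str, int, str, int]


def conn_key(pkt: Packet) -> Key:
    # Assumption from the thread: the fwmark is NOT part of the lookup key.
    return (pkt.proto, pkt.caddr, pkt.cport, pkt.vaddr, pkt.vport)


def conn_in_get(table: Dict[Key, Conn], pkt: Packet) -> Optional[Conn]:
    # Toy stand-in for the inbound connection lookup.
    return table.get(conn_key(pkt))


def deliver(table: Dict[Key, Conn], pkt: Packet, max_passes: int = 5) -> str:
    """Return "accepted" if the packet reaches the local socket,
    or "loop" if it is still being re-forwarded after max_passes."""
    for _ in range(max_passes):
        conn = conn_in_get(table, pkt)
        if conn is None or conn.fwd_local:
            return "accepted"
        # DR-style transmit towards the VIP on lo: the unchanged packet
        # re-enters the input path and is looked up again.
    return "loop"


pkt = Packet("TCP", "10.0.0.3", 54590, "10.0.0.10", 9999, fwmark=0)
synced = {conn_key(pkt): Conn(dest="10.0.0.6", fwd_local=False)}

print(deliver({}, pkt))                         # no synced entry
print(deliver(synced, pkt))                     # synced entry, DR transmit
print(deliver(synced, pkt._replace(fwmark=10))) # fwmark changes nothing
```

In this model the first call returns "accepted" and the other two return "loop": once the entry has been synchronized, every pass hits it again, with or without a fwmark, which mirrors log lines 21, 28 and 34 above. Setting fwd_local=True on the entry makes deliver() return "accepted".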