Re: Startup delay problem
Hi guys,

Here's what I tried. I moved the stunnel startup script to S60stunnel and haproxy to S61haproxy. Heartbeat is S75heartbeat. That did not help.

So then I created an S62sleep script ;-) I know it's ugly, but it was getting late. The script sleeps for 20 seconds, the time it takes haproxy to see the first server up. Then heartbeat came up, and as soon as the IP came into service, guess what: it took another 20 seconds for the servers to come up in haproxy.

So then (just as a test) I put a 120-second sleep in my S62sleep script. Guess what: after my reboot it took 120 seconds for heartbeat to start, but when traffic came into the node it took another 20 seconds before haproxy saw my servers up!

So maybe, since heartbeat is in the loop, I have an ARP delay, as Willy said.

I will put a wget in my S62sleep script to test haproxy and maybe force haproxy to wake up ;-) then do my 20-second sleep before heartbeat starts, and I'll try to tcpdump there.

Also, with heartbeat, could I use a single MAC address for my service IP on both nodes? Would that solve this issue, which I only see at boot time on my 2 nodes?

Thanks for any advice ;-)

Mike Hoffs wrote:
> > Wait a minute, I did not notice you were running heartbeat. It changes a lot of things. It's taking the IP over, and depending on whether it's announcing gratuitous ARPs on fail-over and whether other equipment accepts them, it is possible that you have to wait for a cache to expire somewhere. Tcpdump will show that a lot better (please get the full captures, not just screen dumps, as we'll have to dig into the MAC addresses and correlate them with ARP traffic).
>
> Depending on the network topology, you could ping the routers from that IP after heartbeat has taken it over. That has sometimes solved a problem with long-lived ARP caches for us.

--
Guillaume Bourque, B.Sc.,
consultant, open-source technology infrastructure
Logisoft Technologies inc. http://www.logisoftech.com
514 576-7638, http://ca.linkedin.com/in/GuillaumeBourque/fr
Re: Startup delay problem
Hi Guillaume,

First, thank you for the feedback. I have one question below:

On Sun, Jan 09, 2011 at 12:57:28PM -0500, Guillaume Bourque wrote:
> Hi guys,
>
> Here's what I tried. I moved the stunnel startup script to S60stunnel and haproxy to S61haproxy. Heartbeat is S75heartbeat. That did not help.
>
> So then I created an S62sleep script ;-) I know it's ugly, but it was getting late. The script sleeps for 20 seconds, the time it takes haproxy to see the first server up. Then heartbeat came up, and as soon as the IP came into service, guess what: it took another 20 seconds for the servers to come up in haproxy.

Are you sure that the switch port to which your LB is connected is not in blocking mode after the interface comes up? You can check for that by pinging anything from the machine. On Cisco switches, for instance, you have to use the portfast option so that the ports forward immediately. Otherwise the switch blocks for some time, checking for possible spanning tree frames.

Regards,
Willy
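The ping check Willy describes can be timed with a small helper. This is only a sketch: the function name `seconds_until_ok` is made up for illustration, and it relies on `date +%s`, which GNU and BSD `date` provide but strict POSIX does not.

```shell
#!/bin/sh
# Sketch only: report how many seconds pass before a probe command first succeeds.
# seconds_until_ok MAX_SECONDS COMMAND [ARGS...]
seconds_until_ok() {
    max=$1
    shift
    start=$(date +%s)
    while :; do
        # Run the probe quietly; on success print the elapsed time and stop.
        if "$@" >/dev/null 2>&1; then
            echo $(( $(date +%s) - start ))
            return 0
        fi
        now=$(date +%s)
        # Give up (printing nothing) once the time budget is used up.
        if [ $((now - start)) -ge "$max" ]; then
            return 1
        fi
        sleep 1
    done
}
```

Run right after the interface comes up, for example as `seconds_until_ok 60 ping -c 1 -W 1 192.168.1.254` (the gateway address is a placeholder), a result of a few tens of seconds would be consistent with a port that is not yet forwarding, while a result near zero would point elsewhere.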
Re: Startup delay problem
Hooray ;-)

I found it, and as I suspected haproxy had nothing to do with it.

The backend servers are on a different subnet and need a special route. That route was only added by /etc/rc.local, which is executed after all the startup scripts, so until /etc/rc.local added the route, haproxy could not connect to those backends.

Now I have added those routes in /etc/sysconfig/network-scripts/route-eth1, and when haproxy starts all the backends are there ;-)

Thanks for your support, and sorry for this non-issue!

Bye

Willy Tarreau wrote:
> Hi Guillaume,
>
> First, thank you for the feedback. I have one question below:
>
> On Sun, Jan 09, 2011 at 12:57:28PM -0500, Guillaume Bourque wrote:
> > Hi guys,
> >
> > Here's what I tried. I moved the stunnel startup script to S60stunnel and haproxy to S61haproxy. Heartbeat is S75heartbeat. That did not help.
> >
> > So then I created an S62sleep script ;-) I know it's ugly, but it was getting late. The script sleeps for 20 seconds, the time it takes haproxy to see the first server up. Then heartbeat came up, and as soon as the IP came into service, guess what: it took another 20 seconds for the servers to come up in haproxy.
>
> Are you sure that the switch port to which your LB is connected is not in blocking mode after the interface comes up? You can check for that by pinging anything from the machine. On Cisco switches, for instance, you have to use the portfast option so that the ports forward immediately. Otherwise the switch blocks for some time, checking for possible spanning tree frames.
>
> Regards,
> Willy

--
Guillaume Bourque
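A startup script could also guard against this ordering problem by checking that the kernel routes a backend address through the expected gateway before haproxy is started. The sketch below is an illustration, not part of the original setup: the function names, the backend address 10.0.2.10 and the gateway 192.168.1.254 are all assumptions.

```shell
#!/bin/sh
# Sketch only: decide whether "ip route get" output goes through the expected gateway.
# route_goes_via ROUTE_OUTPUT EXPECTED_GATEWAY
route_goes_via() {
    case " $1 " in
        *" via $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Poll once per second until the route to the backend uses the gateway.
# wait_for_route BACKEND_IP EXPECTED_GATEWAY MAX_TRIES
wait_for_route() {
    tries=0
    while [ "$tries" -lt "$3" ]; do
        if route_goes_via "$(ip route get "$1" 2>/dev/null)" "$2"; then
            return 0
        fi
        tries=$((tries + 1))
        sleep 1
    done
    return 1
}
```

An init script might call `wait_for_route 10.0.2.10 192.168.1.254 30` before launching haproxy and log a warning if it returns non-zero. Comparing against the expected gateway matters because `ip route get` normally still succeeds through the default route when the special route is missing.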
Re: Startup delay problem
Thanks Willy,

But since we have no traffic yet, maybe 5-10 sessions, I doubt any queues are full.

Is there anything else I should look at? The servers are in a different subnet behind a Juniper firewall. Could that have any effect?

But I will look at the maxconn settings.

If someone has seen similar behavior, your insight would be much appreciated.

Guillaume

Sent from my phone

On 2011-01-08 at 17:07, Willy Tarreau w...@1wt.eu wrote:
> On Sat, Jan 08, 2011 at 11:29:00AM -0500, Guillaume.Bourque wrote:
> > Hi
> >
> > To make this simple, is there any haproxy parameter I can set so that when haproxy starts all servers are marked as up by default?
>
> No, there is no such thing unfortunately. In fact they are already marked up, but just for one last check. That means that they have failed the first check. What you could do would be to increase the check interval.
>
> One reason I would suspect is that the maxconn you're using is too close to the server's maxclient. Because of that, the server's backlog is full, and the new process's checks can't get a connection and then fail. If you lower the maxconn values a bit (after ensuring that they really are below the server's maxclient setting), the problem will probably disappear completely.
>
> > I know it's not perfect but that could be fine for now.
> >
> > Any reason why there are those delays only at boot time?
>
> It could be that the maxconn is so close to the server's limit that the server can't accept one more check.
>
> Regards,
> Willy
Re: Startup delay problem
On Sat, Jan 08, 2011 at 05:15:48PM -0500, Guillaume.Bourque wrote:
> Thanks Willy,
>
> But since we have no traffic yet, maybe 5-10 sessions, I doubt any queues are full.

Interesting case then!

> Is there anything else I should look at?

Tcpdump on the faulty node!

> The servers are in a different subnet behind a Juniper firewall. Could that have any effect?

Yes it could, firewalls are rarely fully transparent. For instance, it's conceivable that when the second node has not been used for some time, the firewall does not have any ARP entry for it and the first SYN packets do not get a response until ARP is resolved. That could be enough to time out and fail the check. Increasing the check interval could definitely help in this situation. A SYN retransmit usually happens 3s after the first one, so a check interval of 5s should cover that.

Wait a minute, I did not notice you were running heartbeat. It changes a lot of things. It's taking the IP over, and depending on whether it's announcing gratuitous ARPs on fail-over and whether other equipment accepts them, it is possible that you have to wait for a cache to expire somewhere. Tcpdump will show that a lot better (please get the full captures, not just screen dumps, as we'll have to dig into the MAC addresses and correlate them with ARP traffic).

Regards,
Willy
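For the full capture requested above, a wrapper along these lines could be started before haproxy and heartbeat come up. It is a sketch under assumptions: the function name `capture_checks`, the overridable `TCPDUMP` variable, and the interface, address and file name in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch only: write full packets for ARP plus one backend's traffic to a pcap file.
# TCPDUMP can be overridden, for example with a stub, for a dry run.
TCPDUMP=${TCPDUMP:-tcpdump}

# capture_checks INTERFACE BACKEND_IP OUTPUT_FILE
capture_checks() {
    # -n: no name resolution; -s 0: keep whole packets; -w: write a raw capture file.
    "$TCPDUMP" -n -i "$1" -s 0 -w "$3" "arp or host $2"
}
```

Calling `capture_checks eth1 10.0.2.10 /tmp/boot.pcap` keeps both the ARP exchanges and the health-check SYNs in one file. Reading it back with `tcpdump -e -n -r /tmp/boot.pcap` prints the link-level headers, so MAC addresses can be correlated with the ARP traffic.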
RE: Startup delay problem
> Wait a minute, I did not notice you were running heartbeat. It changes a lot of things. It's taking the IP over, and depending on whether it's announcing gratuitous ARPs on fail-over and whether other equipment accepts them, it is possible that you have to wait for a cache to expire somewhere. Tcpdump will show that a lot better (please get the full captures, not just screen dumps, as we'll have to dig into the MAC addresses and correlate them with ARP traffic).

Depending on the network topology, you could ping the routers from that IP after heartbeat has taken it over. That has sometimes solved a problem with long-lived ARP caches for us.
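The ping-after-takeover idea might look like the following when hooked into whatever runs once heartbeat has brought the service IP up. This is a sketch, not Mike's actual script: the function name `refresh_neighbours`, the overridable `PING` variable and the addresses in the usage note are assumptions.

```shell
#!/bin/sh
# Sketch only: ping each router from the service IP after a takeover.
# PING can be overridden, for example with a stub, for a dry run.
PING=${PING:-ping}

# refresh_neighbours SERVICE_IP ROUTER_IP [ROUTER_IP...]
refresh_neighbours() {
    service_ip=$1
    shift
    failed=0
    for router in "$@"; do
        # -I sets the source address, so the router sees traffic from the service IP.
        "$PING" -c 2 -W 1 -I "$service_ip" "$router" || failed=1
    done
    return "$failed"
}
```

For example, `refresh_neighbours 192.0.2.10 192.0.2.1` sends two pings to one router from the service address. Whether that is enough to refresh a stale entry depends on the equipment; iputils `arping -U -I eth0 192.0.2.10` is another option, since it sends unsolicited ARP announcements directly.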
Startup delay problem
Hi all,

This is a new install going into pilot on Monday. It's late, I know, but we just found out about this boot delay issue with haproxy on this setup.

Our setup: 2 LBs running CentOS 5.5 64-bit with
- stunnel
- heartbeat-2.1.4-11.el5
- haproxy-1.4.9-1.el5

At boot time everything starts well, except that all servers for all backends stay down for 20 seconds before going UP (online).

But if I stop haproxy manually, wait 10 seconds and restart it, then all my servers come online within 1-2 seconds, as I have always seen with haproxy-1.3.x.

So each time I reboot lb1 or lb2, when the server comes back to life there are 20 seconds during which I get 503 Service Unavailable for all those sites.

Here is the log from haproxy. This is the first message seen at boot time for the first client coming in, which is a local wget:

    Jan 7 22:20:26 localhost haproxy[3093]: 127.0.0.1:49884 [07/Jan/2011:22:20:26.705] DISPATCH-lb1 DomaineClient-PPROD-SSL/NOSRV 25/-1/-1/-1/25 503 212 - - SC-- 0/0/0/0/0 0/0 {...site name.. .||} GET /webapp/user/login/;jsessionid=F08069967657D3A1 HTTP/1.1tart

I see that there is a string, "tart", at the end of the log line. Is that normal?
Here is the config file general section and for 1 backend.

Haproxy config file:

    #----------
    # Global settings
    #----------
    global
        daemon
        log 127.0.0.1 local6 info
        log 127.0.0.1 local1 notice
        chroot /var/lib/haproxy
        pidfile /var/run/haproxy.pid
        stats socket /var/run/haproxy-socket-stats mode 600
        maxconn 2000    # count about 1 Gb per 2 connections
        user haproxy
        group haproxy

    #----------
    # common defaults that all the 'listen' and 'backend' sections will
    # use if not designated in their block
    #----------
    defaults
        mode http
        maxconn 1950    # should be slightly smaller than global maxconn
        timeout connect 4s    # default 4s second time out if a backend is not found
        timeout client 90s
        timeout server 90s
        timeout queue 90s
        timeout http-request 5s
        timeout http-request 5s
        log global
        option dontlognull
        option httpclose    # disable keep-alive
        option abortonclose
        option httplog
        option forwardfor except 127.0.0.1
        option redispatch
        retries 3

    frontend DISPATCH-lb2
        bind :80,:8001,:8002,:8003,:8881,:8882,:8883

        acl url_static path_beg -i /download
        acl url_secure hdr_dom(host) -i secure.DomaineClient.com
        acl url_pprod hdr_dom(host) -i pprod.DomaineClient.com
        acl url_mobile hdr_dom(host) -i m.DomaineClient.com
        acl url_pprod-ssl hdr_dom(host) -i pprod-ssl.DomaineClient.com
        acl url_qa hdr_dom(host) -i qa.DomaineClient.com wp.DomaineClient.com

        acl normal_port dst_port 80
        acl secure_port dst_port 8001
        acl secure_port-pprod dst_port 8002
        acl secure_port-mobile dst_port 8003
        acl secure_port-ct-qa dst_port 8881
        acl secure_port-ct-pprod dst_port 8882
        acl secure_port-ct-prod dst_port 8883

        redirect prefix http://www.DomaineClient.com if !url_qa !url_pprod !url_pprod-ssl !url_secure !url_mobile secure_port
        redirect prefix https://secure.DomaineClient.com if url_secure !secure_port
        redirect prefix https://m.DomaineClient.com if url_mobile !secure_port-mobile
        redirect prefix http://pprod.DomaineClient.com if url_pprod secure_port-pprod
        redirect prefix https://pprod-ssl.DomaineClient.com if url_pprod-ssl !secure_port-pprod

        use_backend DomaineClient-SSL if url_secure secure_port
        use_backend DomaineClient-MOBILE if url_mobile secure_port-mobile
        use_backend DomaineClient-PPROD if url_pprod normal_port
        use_backend DomaineClient-PPROD-SSL if url_pprod-ssl secure_port-pprod
        use_backend DomaineClient-QA if url_qa
        use_backend DomaineClient-CT-PROD if secure_port-ct-prod
        use_backend DomaineClient-CT-PPROD if secure_port-ct-pprod
        use_backend DomaineClient-CT-QA if secure_port-ct-qa
        default_backend DomaineClient-PROD

        capture cookie ASPSESSION len 32
        # log the name of the virtual
Delay problem
Hello everybody...

I have a small problem with haproxy: it works fine in transparent mode (with tproxy enabled), but sometimes (NOT on every reload), when I try to load a page, there is a delay: I have to wait a few seconds for the page to complete. All pages are in Java (jsp extension).

Here's my haproxy.cfg:

    listen MAIN PUBLIC_IP:80
        mode http
        option forwardfor
        acl x_ACL hdr_dom(host) www.x.it
        acl y_ACL hdr_dom(host) www.y.it
        source 192.168.0.133 usesrc clientip
        stats enable
        stats uri /haproxy
        stats auth admin:sbereu208
        use_backend X if dnshosting_ACL
        use_backend Y if joomlahost_ACL
        option redispatch

    backend BACKEND1 PUBLIC_IP:80
        mode http
        balance roundrobin
        option forwardfor
        acl indirizzo_dnshst path_end /
        source 192.168.0.133 usesrc clientip
        redirect location /dnshst/index.jsp if indirizzo_dnshst
        cookie SERVERID insert nocache
        # cookie JSESSIONID prefix
        server resin1.x.it 192.168.0.132 cookie resin1 check port 80 inter 3 rise 2 fall 5 maxconn 300
        server resin2.y.it 192.168.0.141 cookie resin2 check port 80 inter 3 rise 2 fall 5 maxconn 300
        option redispatch

    backend BACKEND2 PUBLIC_IP:80
        mode http
        balance roundrobin
        acl indirizzo_jhst path_end /
        source 192.168.0.133 usesrc clientip
        redirect location /dnshst/jm/index.jsp if indirizzo_jhst
        cookie SERVERID insert nocache
        # cookie JSESSIONID prefix
        server resin1.x.it 192.168.0.132 cookie resin1 check port 80 inter 3 rise 2 fall 5 maxconn 300
        server resin2.y.it 192.168.0.141 cookie resin2 check port 80 inter 3 rise 2 fall 5 maxconn 300
        option redispatch

And here are my iptables rules on the haproxy server:

    echo 1 > /proc/sys/net/ipv4/ip_forward
    /usr/local/sbin/iptables -t mangle -N DIVERT
    /usr/local/sbin/iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
    /usr/local/sbin/iptables -t mangle -A DIVERT -j MARK --set-mark 1
    /usr/local/sbin/iptables -t mangle -A DIVERT -j ACCEPT
    ip rule add fwmark 1 lookup 100
    ip route add local 0.0.0.0/0 dev lo table 100
    iptables --table nat --append POSTROUTING --out-interface eth0 -j MASQUERADE
    iptables --append FORWARD --in-interface eth1 -j ACCEPT

Eth0 is my public interface, eth1 the private one. Routing from my two webservers works fine for both public and private IPs.

Thanks!

Carlo
R: Delay problem
If I load the web pages without haproxy I have no problem... I'm trying to tune haproxy first.

Thanks for your help.

Carlo

-----Original Message-----
From: John Lauro [mailto:john.la...@covenanteyes.com]
Sent: Monday, 29 June 2009 14:25
To: 'Carlo Granisso'
Subject: RE: Delay problem

> Reloading just haproxy, or reloading the jsp server? If jsp, I think it's to be expected, although you might be able to do some tuning in haproxy to minimize the amount of traffic to a server coming online. JSP has to compile pages as it encounters them the first time after a reload, making each page slow at first.
R: Delay problem
Ok, it seems that the problem was in:

contimeout
clitimeout

I've reduced these parameters and now everything seems to work fine.

I've read the haproxy documentation, but I can't completely understand the meaning of "Set the maximum inactivity time on the client side". Does this mean that after the page has been completely downloaded, haproxy leaves the connection open until either:

1) the client does some operation, or
2) the timeout is reached?

Probably my problem was the second point: the page was loaded correctly and haproxy waited for other activity.

Is that correct?

Thanks for your help.

Carlo

-----Original Message-----
From: Carlo Granisso [mailto:c.grani...@dnshosting.it]
Sent: Monday, 29 June 2009 14:34
To: 'John Lauro'
Cc: haproxy@formilux.org
Subject: R: Delay problem

> If I load the web pages without haproxy I have no problem... I'm trying to tune haproxy first.
>
> Thanks for your help.
>
> Carlo
>
> -----Original Message-----
> From: John Lauro [mailto:john.la...@covenanteyes.com]
> Sent: Monday, 29 June 2009 14:25
> To: 'Carlo Granisso'
> Subject: RE: Delay problem
>
> > Reloading just haproxy, or reloading the jsp server? If jsp, I think it's to be expected, although you might be able to do some tuning in haproxy to minimize the amount of traffic to a server coming online. JSP has to compile pages as it encounters them the first time after a reload, making each page slow at first.
Re: R: Delay problem
Hello,

On Mon, Jun 29, 2009 at 04:44:13PM +0200, Carlo Granisso wrote:
> Ok, it seems that the problem was in:
>
> contimeout
> clitimeout
>
> I've reduced these parameters and now everything seems to work fine.
>
> I've read the haproxy documentation, but I can't completely understand the meaning of "Set the maximum inactivity time on the client side". Does this mean that after the page has been completely downloaded, haproxy leaves the connection open until either:
>
> 1) the client does some operation, or
> 2) the timeout is reached?
>
> Probably my problem was the second point: the page was loaded correctly and haproxy waited for other activity.
>
> Is that correct?

I think it is even simpler than that. You have maxconn 300 on your servers, and you don't have option httpclose, which means that clients can keep a keep-alive connection open and unused after they retrieve an object. By reducing timeout client, you are forcing those connections to die faster, but it's still not the right way to do this.

Please simply add option httpclose and I'm sure the problem will definitely vanish.

Regards,
Willy