Re: Startup delay problem

2011-01-09 Thread Guillaume Bourque

Hi Guys,

Here's what I tried.

I moved the stunnel startup script to S60stunnel and haproxy to S61haproxy.

Heartbeat is S75heartbeat.

That did not help.

So then I created an S62sleep script ;-) I know it's ugly, but it was
getting late.


The script sleeps for 20 sec, the time it takes haproxy to see the
first server up.


Then heartbeat came up, and as soon as the IP came into service, well, guess
what: it took another 20 seconds for the servers to come up in haproxy.


So then (just as a test) I put a sleep of 120 sec in my S62sleep script.
Guess what: after my reboot it took 120 seconds to start heartbeat, but
then when traffic came in to the node it took another 20 sec before haproxy
saw my servers up!


So maybe, since heartbeat is in the loop, I have an ARP delay, as Willy said.

I will put a wget in my S62sleep script to test haproxy and maybe force
haproxy to wake up ;-) then do my 20-second sleep before heartbeat
starts, and I'll try to tcpdump there.


Also, with heartbeat, could I use a single MAC address for my service IP on
both nodes? Would that solve this issue, which I only see at boot time on my
2 nodes?


Thanks for any advice ;-)

--
Guillaume Bourque, B.Sc.,
Consultant, open-source technology infrastructure
Logisoft Technologies inc.  http://www.logisoftech.com
514 576-7638, http://ca.linkedin.com/in/GuillaumeBourque/fr




Re: Startup delay problem

2011-01-09 Thread Willy Tarreau
Hi Guillaume,

First, thank you for the feedback. I have one question below :

On Sun, Jan 09, 2011 at 12:57:28PM -0500, Guillaume Bourque wrote:
 Hi Guys,
 
 Here's what I tried.
 
 I moved the stunnel startup script to S60stunnel and haproxy to S61haproxy.
 
 Heartbeat is S75heartbeat.
 
 That did not help.
 
 So then I created an S62sleep script ;-) I know it's ugly, but it was
 getting late.
 
 The script sleeps for 20 sec, the time it takes haproxy to see the
 first server up.
 
 Then heartbeat came up, and as soon as the IP came into service, well, guess
 what: it took another 20 seconds for the servers to come up in haproxy.

Are you sure that the switch port your LB is connected to is not
in blocking mode after the interface comes up? You can check for that
by pinging anything from the machine. On Cisco switches, for instance,
you have to use the portfast option so that the ports start forwarding
immediately. Otherwise the switch blocks the port for some time while it
checks for possible spanning tree frames.

Regards,
Willy




Re: Startup delay problem

2011-01-09 Thread Guillaume Bourque

Hurray ;-)

I found it, and as I suspected haproxy had nothing to do with it.

The backend servers are on a different subnet and need a special route.

That route was only added by /etc/rc.local, which is executed after all
the startup scripts, so until /etc/rc.local added the route,
haproxy could not connect to those backends.


Now I have added those routes in /etc/sysconfig/network-scripts/route-eth1

And now when haproxy starts all backends are there ;-)

Thanks for your support and sorry for this non-issue!

Bye
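
For anyone hitting the same ordering problem, a fixed sleep could be replaced by a guard that waits until a backend actually accepts TCP connections. The following is only a sketch under assumptions: the function name, the timings and the address in the usage note are hypothetical and not taken from this setup, and Python is used purely for illustration.

```python
import socket
import time


def wait_for_tcp(host, port, deadline_s=30.0, attempt_timeout_s=1.0, pause_s=0.5):
    """Return True as soon as host:port accepts a TCP connection.

    Returns False if no attempt succeeded before the deadline. A missing
    route, an unresolved ARP entry or a refused connection all show up
    here as an OSError, so they are retried in the same way.
    """
    end = time.monotonic() + deadline_s
    while True:
        try:
            with socket.create_connection((host, port), timeout=attempt_timeout_s):
                return True
        except OSError:
            # Give up if another pause would take us past the deadline.
            if time.monotonic() + pause_s >= end:
                return False
            time.sleep(pause_s)
```

A startup script could call something like wait_for_tcp("10.20.0.11", 80) (a made-up backend address) and only start haproxy once it returns True. This does not fix a missing route; it only makes the symptom visible before haproxy starts its checks.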





Re: Startup delay problem

2011-01-08 Thread Guillaume.Bourque

Thanks Willy

But since we have no traffic yet, maybe 5-10 sessions, I doubt any queues
are full.


Anything else I should look at?

The servers are in a different subnet behind a Juniper firewall. Could
that have any effect?


But I will look at the maxconn settings.

If someone has seen similar behavior, your insight would be very
appreciated.


Guillaume


On 2011-01-08 at 17:07, Willy Tarreau w...@1wt.eu wrote:

On Sat, Jan 08, 2011 at 11:29:00AM -0500, Guillaume.Bourque wrote:
 Hi
 
 To make this simple, is there any haproxy parameter I can set so that
 when haproxy starts all servers are marked as up by default?

No, there is no such thing unfortunately. In fact they are already marked
up, but just for one last check. That means that they have failed the
first check. What you could do would be to increase the check interval.

One reason I would suspect is that the maxconn you're using is too close
to the server's maxclient. Due to that, the server's backlog is full,
and the new process' checks can't get a connection and then fail. Probably
if you lower the maxconn values a bit (after ensuring that they
really are below the server's maxclient setting), the problem will
completely disappear.

 I know it's not perfect but that could be fine for now
 
 Any reason why there are those delays only at boot time?

It could be that the maxconn is so close to the server's limit that
the server can't accept one more check.

Regards,
Willy
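
The remark above that servers start "up but just for one last check" can be pictured with a small model. This is a simplified sketch and not haproxy's actual code: the function name is hypothetical, and rise=2 / fall=3 are used because they are haproxy's documented defaults. It shows why a single failed check right after startup is enough to mark a server DOWN, after which it needs `rise` consecutive good checks to come back.

```python
def simulate_checks(results, rise=2, fall=3):
    """Replay health-check results (True = passed) and return the state after each.

    Simplified model: the server starts UP but is only one failed check
    away from DOWN. Once DOWN, it needs `rise` consecutive passes to go UP;
    once confirmed UP, it tolerates `fall` consecutive failures.
    """
    up = True
    failures_left = 1   # at startup a single failure is enough
    pass_streak = 0     # consecutive passes while DOWN
    states = []
    for passed in results:
        if up:
            if passed:
                failures_left = fall
            else:
                failures_left -= 1
                if failures_left == 0:
                    up, pass_streak = False, 0
        else:
            if passed:
                pass_streak += 1
                if pass_streak == rise:
                    up, failures_left = True, fall
            else:
                pass_streak = 0
        states.append("UP" if up else "DOWN")
    return states


# First check after boot fails, the next two succeed.
print(simulate_checks([False, True, True]))   # ['DOWN', 'DOWN', 'UP']
```

In this model the time spent DOWN is roughly the time until checks start passing plus the spacing of the `rise` passing checks, which is why the check interval matters so much at boot.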





Re: Startup delay problem

2011-01-08 Thread Willy Tarreau
On Sat, Jan 08, 2011 at 05:15:48PM -0500, Guillaume.Bourque wrote:
 Thanks Willy
 
 But since we have no traffic yet, maybe 5-10 sessions, I doubt any queues
 are full.

Interesting case then!

 Anything else I should look at?

Tcpdump on the faulty node!

 The servers are in a different subnet behind a Juniper firewall. Could
 that have any effect?

Yes it could; firewalls are rarely fully transparent. For instance, it's
conceivable that when the second node has not been used for some time,
the firewall does not have any ARP entry for it, and that the first SYN
packets do not get a response until ARP is resolved. That could be
enough to time out and fail the check. Increasing the check interval could
definitely help in this situation. A SYN retransmit usually occurs 3s
after the first one, so a check interval of 5s should cover that.
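
The arithmetic behind the 5s suggestion can be sketched as follows. The 3s initial retransmission delay is the figure quoted above; the doubling after each retransmit is standard TCP exponential backoff. The function names are hypothetical, and the sketch assumes the check's connect timeout is as long as the check interval, which may not hold for every configuration.

```python
def syn_send_times(initial_rto_s=3.0, retries=3):
    """Offsets in seconds at which the first SYN and its retransmits are sent.

    Assumes the retransmission delay doubles after every attempt.
    """
    times = [0.0]
    elapsed, rto = 0.0, initial_rto_s
    for _ in range(retries):
        elapsed += rto
        times.append(elapsed)
        rto *= 2
    return times


def syns_within(timeout_s, initial_rto_s=3.0, retries=3):
    """SYNs that are sent before a connect attempt with this timeout gives up."""
    return [t for t in syn_send_times(initial_rto_s, retries) if t < timeout_s]


print(syn_send_times())    # [0.0, 3.0, 9.0, 21.0]
print(syns_within(2.0))    # [0.0]       only the first SYN, lost if ARP is unresolved
print(syns_within(5.0))    # [0.0, 3.0]  the retransmit gets a second chance
```

With a timeout shorter than 3s, a check whose first SYN is dropped never gets a second attempt; with 5s, the retransmit at 3s still has 2s to complete the handshake.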

Wait a minute, I did not notice you were running heartbeat. It changes
a lot of things. It's taking the IP over and, depending on whether it's
announcing gratuitous ARPs on fail-over and whether other equipment accepts
them, it is possible that you have to wait for a cache to expire somewhere.
Tcpdump will show that a lot better (please get the full captures, not just
screen dumps, as we'll have to dig into the MAC addresses and correlate
them with ARP traffic).

Regards,
Willy




RE: Startup delay problem

2011-01-08 Thread Mike Hoffs
 Wait a minute, I did not notice you were running heartbeat. It changes
 a lot of things. It's taking the IP over and, depending on whether it's
 announcing gratuitous ARPs on fail-over and whether other equipment accepts
 them, it is possible that you have to wait for a cache to expire somewhere.
 Tcpdump will show that a lot better (please get the full captures, not just
 screen dumps, as we'll have to dig into the MAC addresses and correlate
 them with ARP traffic).

Depending on the network topology, you could ping the routers from that IP
once heartbeat has taken it over. That has sometimes solved a problem
with long-lived ARP caches for us.
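
The ping suggested above could be scripted roughly like this. It is a sketch only: the function names and the addresses in the usage note are hypothetical, and it assumes the iputils `ping` found on Linux, whose `-I` option accepts a source address as well as an interface name.

```python
import subprocess


def ping_from_vip_cmd(vip, router, count=3):
    """Build a ping command whose packets are sourced from the service IP."""
    return ["ping", "-c", str(count), "-I", vip, router]


def refresh_router_arp(vip, routers, count=3):
    """Ping each router from the service IP and return those that answered.

    Traffic sourced from the service IP gives the router a chance to
    refresh a stale ARP entry for it after the takeover.
    """
    answered = []
    for router in routers:
        result = subprocess.run(
            ping_from_vip_cmd(vip, router, count),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == 0:
            answered.append(router)
    return answered
```

A hook run after heartbeat has brought the address up could call refresh_router_arp("192.0.2.10", ["192.0.2.1"]) (documentation addresses, not the ones from this thread). Whether the router honours this depends on the equipment, so a tcpdump is still the way to confirm it.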




Startup delay problem

2011-01-07 Thread Guillaume Bourque

Hi all

This is a new install going into pilot on Monday, and it's late, I know, but
we just found out about this boot delay issue with haproxy on this setup.


Our setup

2 LBs running CentOS 5.5 64-bit with
- stunnel
- heartbeat-2.1.4-11.el5
- haproxy-1.4.9-1.el5

At boot time everything starts well, except that all servers for all
backends stay down for 20 seconds before going UP (Online).


But if I stop haproxy manually, wait 10 sec and restart it, then all
my servers come Online within 1-2 sec, as I have always seen with
haproxy-1.3.x.


So each time I reboot lb1 or lb2, when the server comes back to life
there are 20 sec during which I get 503 Service Unavailable for all those sites.


Here is the log from haproxy. This is the first message seen at boot
time for the first client coming in, which is a local wget.



Jan  7 22:20:26 localhost haproxy[3093]: 127.0.0.1:49884 [07/Jan/2011:22:20:26.705] 
DISPATCH-lb1 DomaineClient-PPROD-SSL/NOSRV 25/-1/-1/-1/25 503 212 - - SC-- 0/0/0/0/0 
0/0 {...site name.. .||} GET /webapp/user/login/;jsessionid=F08069967657D3A1 
HTTP/1.1tart


I see that there is a stray string, "tart", at the end of the log line. Is that normal?
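
The timer field of that log line (25/-1/-1/-1/25) can be decoded with a few lines of code. This is a sketch: the function name is hypothetical, while the timer names Tq/Tw/Tc/Tr/Tt and the meaning of -1 follow the HTTP log format described in the haproxy 1.4 documentation.

```python
TIMER_NAMES = ("Tq", "Tw", "Tc", "Tr", "Tt")


def parse_timers(field):
    """Map the five slash-separated HTTP-log timers to their names (milliseconds).

    Tq: time to receive the request, Tw: time spent in queues,
    Tc: time to connect to the server, Tr: server response time,
    Tt: total session time. A value of -1 means the phase was never reached.
    """
    values = [int(v) for v in field.split("/")]
    if len(values) != len(TIMER_NAMES):
        raise ValueError("expected 5 timers, got %d" % len(values))
    return dict(zip(TIMER_NAMES, values))


timers = parse_timers("25/-1/-1/-1/25")
never_reached = [name for name, ms in timers.items() if ms == -1]
print(never_reached)   # ['Tw', 'Tc', 'Tr']
```

Here the request was read in 25 ms and the session ended without ever being queued or connected to a server, which fits the NOSRV field and the 503 status on the same line.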


Here is the config file: the general section and one frontend

Haproxy config file

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
   daemon
   log 127.0.0.1 local6 info
   log 127.0.0.1 local1 notice
   chroot  /var/lib/haproxy
   pidfile /var/run/haproxy.pid
   stats socket /var/run/haproxy-socket-stats mode 600
   maxconn 2000            # count about 1 Gb per 2 connections
   user    haproxy
   group   haproxy

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
   mode    http
   maxconn 1950            # should be slightly smaller than global maxconn
   timeout connect 4s      # default 4 second time out if a backend is not found
   timeout client 90s
   timeout server 90s
   timeout queue  90s
   timeout http-request  5s
   timeout http-request  5s
   log global
   option  dontlognull
   option  httpclose       # disable keep-alive
   option  abortonclose
   option  httplog
   option  forwardfor except 127.0.0.1
   option  redispatch
   retries 3

frontend DISPATCH-lb2
   bind :80,:8001,:8002,:8003,:8881,:8882,:8883

   acl url_static           path_beg        -i /download
   acl url_secure           hdr_dom(host)   -i secure.DomaineClient.com
   acl url_pprod            hdr_dom(host)   -i pprod.DomaineClient.com
   acl url_mobile           hdr_dom(host)   -i m.DomaineClient.com
   acl url_pprod-ssl        hdr_dom(host)   -i pprod-ssl.DomaineClient.com
   acl url_qa               hdr_dom(host)   -i qa.DomaineClient.com wp.DomaineClient.com

   acl normal_port          dst_port 80
   acl secure_port          dst_port 8001
   acl secure_port-pprod    dst_port 8002
   acl secure_port-mobile   dst_port 8003
   acl secure_port-ct-qa    dst_port 8881
   acl secure_port-ct-pprod dst_port 8882
   acl secure_port-ct-prod  dst_port 8883

   redirect prefix http://www.DomaineClient.com        if !url_qa !url_pprod !url_pprod-ssl !url_secure !url_mobile secure_port
   redirect prefix https://secure.DomaineClient.com    if url_secure !secure_port
   redirect prefix https://m.DomaineClient.com         if url_mobile !secure_port-mobile
   redirect prefix http://pprod.DomaineClient.com      if url_pprod secure_port-pprod
   redirect prefix https://pprod-ssl.DomaineClient.com if url_pprod-ssl !secure_port-pprod

   use_backend DomaineClient-SSL       if url_secure secure_port
   use_backend DomaineClient-MOBILE    if url_mobile secure_port-mobile
   use_backend DomaineClient-PPROD     if url_pprod normal_port
   use_backend DomaineClient-PPROD-SSL if url_pprod-ssl secure_port-pprod
   use_backend DomaineClient-QA        if url_qa
   use_backend DomaineClient-CT-PROD   if secure_port-ct-prod
   use_backend DomaineClient-CT-PPROD  if secure_port-ct-pprod
   use_backend DomaineClient-CT-QA     if secure_port-ct-qa
   default_backend DomaineClient-PROD

   capture cookie ASPSESSION len 32

   # log the name of the virtual 

Delay problem

2009-06-29 Thread Carlo Granisso
Hello everybody...
 
I have a little problem with haproxy: it's working fine in transparent mode
(with tproxy enabled) but sometimes (NOT on every reload), when I try to
load a page (all pages are JSP), there is a delay: I have to wait a few
seconds for the page to complete.


All pages are Java (.jsp extension).
 
Here's my haproxy.cfg:
 
 
listen  MAIN PUBLIC_IP:80
    mode    http
    option  forwardfor
    acl x_ACL hdr_dom(host) www.x.it
    acl y_ACL hdr_dom(host) www.y.it
    source  192.168.0.133 usesrc clientip
    stats enable
    stats uri /haproxy
    stats auth  admin:sbereu208
    use_backend X if dnshosting_ACL
    use_backend Y if joomlahost_ACL
    option redispatch
 
backend  BACKEND1 PUBLIC_IP:80
    mode    http
    balance roundrobin
    option forwardfor
    acl indirizzo_dnshst  path_end /
    source  192.168.0.133 usesrc clientip
    redirect location /dnshst/index.jsp if indirizzo_dnshst
    cookie  SERVERID insert nocache
    #   cookie JSESSIONID prefix
    server resin1.x.it 192.168.0.132 cookie resin1 check port 80 inter 3 rise 2 fall 5 maxconn 300
    server resin2.y.it 192.168.0.141 cookie resin2 check port 80 inter 3 rise 2 fall 5 maxconn 300
    option redispatch
 
backend  BACKEND2 PUBLIC_IP:80
    mode    http
    balance roundrobin
    acl indirizzo_jhst path_end /
    source  192.168.0.133 usesrc clientip
    redirect location /dnshst/jm/index.jsp if indirizzo_jhst
    cookie  SERVERID insert nocache
    #   cookie JSESSIONID prefix
    server resin1.x.it 192.168.0.132 cookie resin1 check port 80 inter 3 rise 2 fall 5 maxconn 300
    server resin2.y.it 192.168.0.141 cookie resin2 check port 80 inter 3 rise 2 fall 5 maxconn 300
    option redispatch


And here are my iptables rules on the haproxy server:

echo 1 > /proc/sys/net/ipv4/ip_forward
/usr/local/sbin/iptables -t mangle -N DIVERT
/usr/local/sbin/iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
/usr/local/sbin/iptables -t mangle -A DIVERT -j MARK --set-mark 1
/usr/local/sbin/iptables -t mangle -A DIVERT -j ACCEPT

ip rule add fwmark 1 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100
iptables --table nat --append POSTROUTING --out-interface eth0 -j MASQUERADE
iptables --append FORWARD --in-interface eth1 -j ACCEPT

eth0 is my public interface
eth1 is the private one

Routing from my two web servers is working fine for both public and private
IPs.


Thanks!



Carlo




R: Delay problem

2009-06-29 Thread Carlo Granisso
If I view the web pages without haproxy I have no problem... I'm trying to
tune haproxy first.

Thanks for your help.


Carlo

-----Original Message-----
From: John Lauro [mailto:john.la...@covenanteyes.com]
Sent: Monday, June 29, 2009 14:25
To: 'Carlo Granisso'
Subject: RE: Delay problem

Reloading just haproxy, or reloading the jsp server?  If jsp, I think it's
to be expected, although you might be able to do some tuning in haproxy to
minimize the amount of traffic to a server coming online.  JSP has to
compile pages as it encounters them the first time after a reload, making
each page slow at first.




R: Delay problem

2009-06-29 Thread Carlo Granisso
OK, it seems that the problem was in:

   contimeout
   clitimeout

I've reduced these parameters and now everything seems to be working fine.
I've read the haproxy documentation but I can't completely understand the
meaning of "Set the maximum inactivity time on the client side". Does this
mean that after the page has been completely downloaded, haproxy leaves the
connection open until one of the following happens?


1) The client performs some operation
2) The timeout is reached


Probably my problem was the second point: the page was correctly loaded and
haproxy was waiting for further activity.

Is that correct?

Thanks for your help.


Carlo





Re: R: Delay problem

2009-06-29 Thread Willy Tarreau
Hello,

On Mon, Jun 29, 2009 at 04:44:13PM +0200, Carlo Granisso wrote:
 OK, it seems that the problem was in:
 
    contimeout
    clitimeout
 
 I've reduced these parameters and now everything seems to be working fine.
 I've read the haproxy documentation but I can't completely understand the
 meaning of "Set the maximum inactivity time on the client side". Does this
 mean that after the page has been completely downloaded, haproxy leaves the
 connection open until one of the following happens?
 
 
 1) The client performs some operation
 2) The timeout is reached
 
 
 Probably my problem was the second point: the page was correctly loaded and
 haproxy was waiting for further activity.
 
 Is that correct?

I think it is even simpler than that. You have maxconn 300 on your servers,
and you don't have option httpclose, which means that clients can keep
a keep-alive connection open and unused after they retrieve an object. By
reducing timeout client, you are forcing those connections to die faster,
but it's still not the right way to do this. Please simply add option httpclose
and I'm sure the problem will vanish.

Regards,
Willy
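
The effect described above can be estimated with Little's law (average connections held open = arrival rate x time each one is held). The numbers below are made up for illustration and the function name is hypothetical; only the maxconn of 300 comes from the configuration in this thread.

```python
def slots_in_use(new_connections_per_s, hold_time_ms):
    """Average number of server connection slots occupied at once (Little's law)."""
    return new_connections_per_s * hold_time_ms / 1000


SERVER_MAXCONN = 300          # from the posted configuration
RATE = 10                     # hypothetical: new client connections per second
RESPONSE_MS = 200             # hypothetical: time to serve one object
IDLE_MS = 50000               # hypothetical: idle time until the client timeout fires

# Without httpclose: each connection stays open, idle, until the timeout.
keepalive = slots_in_use(RATE, RESPONSE_MS + IDLE_MS)
# With httpclose: each connection is closed right after the response.
closed = slots_in_use(RATE, RESPONSE_MS)

print(keepalive, keepalive > SERVER_MAXCONN)   # 502.0 True
print(closed, closed > SERVER_MAXCONN)         # 2.0 False
```

Under these assumed figures the idle connections alone exceed the 300 slots, so new requests have to wait for a slot, while closing after each response keeps usage far below the limit. Lowering the client timeout shrinks the first number but does not remove the cause.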