Re: "option httpchk" is reporting servers as down when they're not
Hi Thomas,

On Wed, Mar 25, 2009 at 12:57:41PM -0400, Allen, Thomas wrote:
> Hi Willy,
>
> We now have HAProxy running over our freshly released website:
> http://www.infrastructurereportcard.org/

thanks for the heads up !

> Thanks for this great piece of software and all the help! Only two
> connection errors in 3 connections thus far, one of which was due to
> me cancelling a long-running page load in the admin.

fine ! anyway, you should expect to get some error requests due to
such activities from your clients. In general, various sites report
request error rates ranging from 0.1 to 0.6%, so what you observe is
almost perfect :-)

Cheers,
Willy
RE: "option httpchk" is reporting servers as down when they're not
Hi Willy,

We now have HAProxy running over our freshly released website:
http://www.infrastructurereportcard.org/

Thanks for this great piece of software and all the help! Only two
connection errors in 3 connections thus far, one of which was due to
me cancelling a long-running page load in the admin.

Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Monday, March 09, 2009 5:26 PM
To: Allen, Thomas
Cc: Jeffrey 'jf' Lim; haproxy@formilux.org
Subject: Re: "option httpchk" is reporting servers as down when they're not

Hi Thomas,

On Mon, Mar 09, 2009 at 05:20:49PM -0400, Allen, Thomas wrote:
> Hi Willy,
>
> Hm, changing to "60s" for each gave me 100% 504 errors, I removed all
> three. Bad idea, I know, but at least it works then.

then use "6", that's the old way of doing it :-)

> I'm running 1.2.18 because the HAProxy homepage calls it the Latest
> version.

Ah OK, version 1.2 did not have the time units. Well, in fact it's not
exactly marked as the only latest version, it's the latest version of
branch 1.2, and 1.2 is the only one not tainted by development I admit.

> I've removed all cookies from this IP, cleared my cache, and still it
> seems that only one server is being hit. But the stats page reports an
> equal distribution, so it's anybody's guess. What would be a simple way
> to log the distribution? I find it difficult to determine this even in
> debug mode (I'm running the proxy in daemon mode, of course).

it is in the logs, you have the server's name (assuming you're logging
with "option httplog").

Something is possible if you're playing with only one client. If the
number of objects on a page is a multiple of the number of servers and
you're in round-robin mode, then each time you'll fetch a page, you'll
alternatively fetch objects from both servers and come back to the
first one for the next click. Of course that does not happen as soon as
you have at least another client.

And since I saw 20 sessions on your stats after my access, I'm tempted
to think that it could be related.

Regards,
Willy
Re: "option httpchk" is reporting servers as down when they're not
Hi Thomas,

On Mon, Mar 09, 2009 at 05:20:49PM -0400, Allen, Thomas wrote:
> Hi Willy,
>
> Hm, changing to "60s" for each gave me 100% 504 errors, I removed all
> three. Bad idea, I know, but at least it works then.

then use "6", that's the old way of doing it :-)

> I'm running 1.2.18 because the HAProxy homepage calls it the Latest
> version.

Ah OK, version 1.2 did not have the time units. Well, in fact it's not
exactly marked as the only latest version, it's the latest version of
branch 1.2, and 1.2 is the only one not tainted by development I admit.

> I've removed all cookies from this IP, cleared my cache, and still it
> seems that only one server is being hit. But the stats page reports an
> equal distribution, so it's anybody's guess. What would be a simple way
> to log the distribution? I find it difficult to determine this even in
> debug mode (I'm running the proxy in daemon mode, of course).

it is in the logs, you have the server's name (assuming you're logging
with "option httplog").

Something is possible if you're playing with only one client. If the
number of objects on a page is a multiple of the number of servers and
you're in round-robin mode, then each time you'll fetch a page, you'll
alternatively fetch objects from both servers and come back to the
first one for the next click. Of course that does not happen as soon as
you have at least another client.

And since I saw 20 sessions on your stats after my access, I'm tempted
to think that it could be related.

Regards,
Willy
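A minimal sketch of the logging setup this reply assumes, so that every
logged request carries the name of the server that handled it. The
syslog address 127.0.0.1, the facility local0 and the 192.0.2.x server
addresses are assumptions made for illustration; the local syslog
daemon also has to be configured to accept UDP messages.

```haproxy
global
    # Assumption: a syslog daemon listening on UDP 127.0.0.1:514
    log 127.0.0.1 local0
    daemon

listen http_proxy :80
    mode http
    log global              # reuse the log target declared in "global"
    option httplog          # HTTP log format, includes the server name
    balance roundrobin
    # Placeholder addresses, not taken from the thread
    server webA 192.0.2.11:80 check
    server webB 192.0.2.12:80 check
```

With this in place, counting how many log lines mention webA and how
many mention webB should give a rough picture of the distribution.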
RE: "option httpchk" is reporting servers as down when they're not
Hi Willy,

Hm, changing to "60s" for each gave me 100% 504 errors, I removed all
three. Bad idea, I know, but at least it works then.

I'm running 1.2.18 because the HAProxy homepage calls it the Latest
version.

I've removed all cookies from this IP, cleared my cache, and still it
seems that only one server is being hit. But the stats page reports an
equal distribution, so it's anybody's guess. What would be a simple way
to log the distribution? I find it difficult to determine this even in
debug mode (I'm running the proxy in daemon mode, of course).

Thanks,
Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Monday, March 09, 2009 4:58 PM
To: Allen, Thomas
Cc: Jeffrey 'jf' Lim; haproxy@formilux.org
Subject: Re: "option httpchk" is reporting servers as down when they're not

On Mon, Mar 09, 2009 at 04:15:34PM -0400, Allen, Thomas wrote:
> I used the unit 'S' for my timeouts, as in
>
> clitimeout 60S
> contimeout 60S
> srvtimeout 60S
>
> Is that to be avoided? I assumed it meant "seconds."

OK it's just a minor problem. You have to use a lower-case "s" : 60s.
It's stupid that the parser did not catch this mistake. I should
improve it. By default, it ignores unknown chars, so you clearly had
60 ms here.

BTW, there's no use in setting large contimeouts. You should usually
stay with lower values such as 5-10s.

Oh BTW, what version are you running ? Your stats page looks old. The
time units were introduced in 1.3.14, so I hope you're at least at this
level.

> I'm using roundrobin and adding the httpclose option. I've been using
> cookie stickiness (which will be important for this website), but after
> disabling this stickiness, I get the same results. I tried clearing out
> the server cookie before and opening the page in multiple browsers, and
> still got these results.

Then it is possible that haproxy could not manage to connect to your
server in 60ms, then immediately retried on the other one, and stuck
to that one.

Regards,
Willy
Re: "option httpchk" is reporting servers as down when they're not
On Mon, Mar 09, 2009 at 04:15:34PM -0400, Allen, Thomas wrote:
> I used the unit 'S' for my timeouts, as in
>
> clitimeout 60S
> contimeout 60S
> srvtimeout 60S
>
> Is that to be avoided? I assumed it meant "seconds."

OK it's just a minor problem. You have to use a lower-case "s" : 60s.
It's stupid that the parser did not catch this mistake. I should
improve it. By default, it ignores unknown chars, so you clearly had
60 ms here.

BTW, there's no use in setting large contimeouts. You should usually
stay with lower values such as 5-10s.

Oh BTW, what version are you running ? Your stats page looks old. The
time units were introduced in 1.3.14, so I hope you're at least at this
level.

> I'm using roundrobin and adding the httpclose option. I've been using
> cookie stickiness (which will be important for this website), but after
> disabling this stickiness, I get the same results. I tried clearing out
> the server cookie before and opening the page in multiple browsers, and
> still got these results.

Then it is possible that haproxy could not manage to connect to your
server in 60ms, then immediately retried on the other one, and stuck
to that one.

Regards,
Willy
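For reference, a sketch of how the three timeouts from this exchange
could be written. The values are illustrative only: 5s for contimeout
follows the 5-10s advice above, and the 60s values are carried over
from the original post. The commented-out variant assumes a version
without time units, where bare numbers are read as milliseconds.

```haproxy
# HAProxy 1.3.14 and later: time units are accepted, and the suffix
# must be lower-case ("60s", not "60S").
contimeout 5s
clitimeout 60s
srvtimeout 60s

# Versions without time units (such as the 1.2 branch): plain numbers,
# interpreted as milliseconds. 60 seconds is therefore 60000.
# contimeout 5000
# clitimeout 60000
# srvtimeout 60000
```

Writing the millisecond form works on both branches, which may be the
simpler choice while it is unclear which version is deployed.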
RE: "option httpchk" is reporting servers as down when they're not
I used the unit 'S' for my timeouts, as in

clitimeout 60S
contimeout 60S
srvtimeout 60S

Is that to be avoided? I assumed it meant "seconds."

I'm using roundrobin and adding the httpclose option. I've been using
cookie stickiness (which will be important for this website), but after
disabling this stickiness, I get the same results. I tried clearing out
the server cookie before and opening the page in multiple browsers, and
still got these results.

Thanks,
Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Monday, March 09, 2009 4:09 PM
To: Allen, Thomas
Cc: Jeffrey 'jf' Lim; haproxy@formilux.org
Subject: Re: "option httpchk" is reporting servers as down when they're not

Hi Thomas,

just replying quick, as I'm in a hurry.

On Mon, Mar 09, 2009 at 04:01:29PM -0400, Allen, Thomas wrote:
> That, along with specifying HTTP1.1, did it, so thanks! What should I
> load into "Host:" ? It seems to work fine with "www", but I'd prefer to
> use something I understand. Please keep in mind that none of this is yet
> associated with a domain, so www.mydomain.com would be inaccurate.

Of course, www.mydomain.com was an example. Often web servers are fine
with just "www" but normally you should use the same host name that
your server will respond to. Sometimes you can also put the server's IP
address. Some servers also accept an empty header (so just "Host:" and
nothing else).

> Beginning very recently, I get a 504 Gateway Timeout for about 30% of
> all requests. What could be causing this?

responses taking too much time. Are you sure that your "timeout server"
is properly set ? Maybe you have put times in milliseconds there
thinking they were in seconds ?

> More importantly, I'm not
> convinced that HAProxy is successfully forwarding requests to both
> servers, although I could be wrong. As you can see on the two app
> instances, each reports a separate internal IP to help diagnose. It
> appears that only SAMP1 receives requests, although both pass health
> checks now.

I see both servers receiving 20 sessions, so that seems fine. Among
possible reasons for what you observe :

  - ensure you're using "balance roundrobin" and not any sort of hash
    or source-based algorithm

  - ensure that you have not enabled cookie stickiness, or that you
    close your browser before retrying.

  - ensure that you have "option httpclose" and that your browser is
    not simply pushing all requests in the same session tunnelled to
    the first server haproxy connected to.

Regards,
Willy
Re: "option httpchk" is reporting servers as down when they're not
Hi Thomas,

just replying quick, as I'm in a hurry.

On Mon, Mar 09, 2009 at 04:01:29PM -0400, Allen, Thomas wrote:
> That, along with specifying HTTP1.1, did it, so thanks! What should I
> load into "Host:" ? It seems to work fine with "www", but I'd prefer to
> use something I understand. Please keep in mind that none of this is yet
> associated with a domain, so www.mydomain.com would be inaccurate.

Of course, www.mydomain.com was an example. Often web servers are fine
with just "www" but normally you should use the same host name that
your server will respond to. Sometimes you can also put the server's IP
address. Some servers also accept an empty header (so just "Host:" and
nothing else).

> Beginning very recently, I get a 504 Gateway Timeout for about 30% of
> all requests. What could be causing this?

responses taking too much time. Are you sure that your "timeout server"
is properly set ? Maybe you have put times in milliseconds there
thinking they were in seconds ?

> More importantly, I'm not
> convinced that HAProxy is successfully forwarding requests to both
> servers, although I could be wrong. As you can see on the two app
> instances, each reports a separate internal IP to help diagnose. It
> appears that only SAMP1 receives requests, although both pass health
> checks now.

I see both servers receiving 20 sessions, so that seems fine. Among
possible reasons for what you observe :

  - ensure you're using "balance roundrobin" and not any sort of hash
    or source-based algorithm

  - ensure that you have not enabled cookie stickiness, or that you
    close your browser before retrying.

  - ensure that you have "option httpclose" and that your browser is
    not simply pushing all requests in the same session tunnelled to
    the first server haproxy connected to.

Regards,
Willy
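The three points in the checklist above could look like this in a
listen section. This is only a sketch for testing the distribution:
the 192.0.2.x addresses are placeholders, and cookie stickiness is
deliberately left out here, whereas the original configuration uses it.

```haproxy
listen http_proxy :80
    mode http
    # 1. Round-robin, not a hash or source-based algorithm
    balance roundrobin
    # 2. No "cookie" directive and no per-server cookie values while
    #    testing, so a browser is not pinned to one server
    # 3. Close the connection after each request, so that a keep-alive
    #    session is not tunnelled to the first server picked
    option httpclose
    # Placeholder addresses, not taken from the thread
    server webA 192.0.2.11:80 check
    server webB 192.0.2.12:80 check
```

Once the distribution looks right, stickiness can be added back and
tested separately.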
RE: "option httpchk" is reporting servers as down when they're not
That, along with specifying HTTP1.1, did it, so thanks! What should I
load into "Host:" ? It seems to work fine with "www", but I'd prefer to
use something I understand. Please keep in mind that none of this is yet
associated with a domain, so www.mydomain.com would be inaccurate.

Beginning very recently, I get a 504 Gateway Timeout for about 30% of
all requests. What could be causing this? More importantly, I'm not
convinced that HAProxy is successfully forwarding requests to both
servers, although I could be wrong. As you can see on the two app
instances, each reports a separate internal IP to help diagnose. It
appears that only SAMP1 receives requests, although both pass health
checks now.

Load balancer: http://174.129.240.119/
and stats (temporarily unblocked) http://174.129.240.119/status/lb
SAMP1: http://174.129.251.234/
SAMP2: http://174.129.244.252/

Thanks,
Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Friday, March 06, 2009 1:39 PM
To: Allen, Thomas
Cc: Jeffrey 'jf' Lim; haproxy@formilux.org
Subject: Re: "option httpchk" is reporting servers as down when they're not

Hi Thomas,

On Thu, Mar 05, 2009 at 08:45:20AM -0500, Allen, Thomas wrote:
> Hi Jeff,
>
> The thing is that if I don't include the health check, the load balancer works fine and each server receives equal distribution. I have no idea why the servers would be reported as "down" but still work when unchecked.

It is possible that your servers expect the "Host:" header to be set
during the checks. There's a trick to do it right now (don't forget to
escape spaces) :

        option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com

Also, you should check the server's logs to see why it is reporting the
service as down. And as a last resort, a tcpdump of the traffic between
haproxy and a failed server will show you both the request and the
complete error from the server.

Regards,
Willy
Re: "option httpchk" is reporting servers as down when they're not
On Sat, Mar 7, 2009 at 2:38 AM, Willy Tarreau wrote:
> Hi Thomas,
>
> On Thu, Mar 05, 2009 at 08:45:20AM -0500, Allen, Thomas wrote:
>> Hi Jeff,
>>
>> The thing is that if I don't include the health check, the load balancer
>> works fine and each server receives equal distribution. I have no idea why
>> the servers would be reported as "down" but still work when unchecked.
>
> It is possible that your servers expect the "Host:" header to
> be set during the checks. There's a trick to do it right now
> (don't forget to escape spaces) :
>
>        option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com
>

you know Thomas, Willy may be very right here. And I just realized as
well - u say u're using 'option httpchk /index.php'? - without
specifying the 'GET' verb?

-jf

> Also, you should check the server's logs to see why it is reporting
> the service as down. And as a last resort, a tcpdump of the traffic
> between haproxy and a failed server will show you both the request
> and the complete error from the server.
>
> Regards,
> Willy
>
>
RE: "option httpchk" is reporting servers as down when they're not
Thanks, once I figure out logging I'll let you guys know what I
discover :^)

Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Friday, March 06, 2009 1:39 PM
To: Allen, Thomas
Cc: Jeffrey 'jf' Lim; haproxy@formilux.org
Subject: Re: "option httpchk" is reporting servers as down when they're not

Hi Thomas,

On Thu, Mar 05, 2009 at 08:45:20AM -0500, Allen, Thomas wrote:
> Hi Jeff,
>
> The thing is that if I don't include the health check, the load balancer works fine and each server receives equal distribution. I have no idea why the servers would be reported as "down" but still work when unchecked.

It is possible that your servers expect the "Host:" header to be set
during the checks. There's a trick to do it right now (don't forget to
escape spaces) :

        option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com

Also, you should check the server's logs to see why it is reporting the
service as down. And as a last resort, a tcpdump of the traffic between
haproxy and a failed server will show you both the request and the
complete error from the server.

Regards,
Willy
Re: "option httpchk" is reporting servers as down when they're not
Hi Thomas,

On Thu, Mar 05, 2009 at 08:45:20AM -0500, Allen, Thomas wrote:
> Hi Jeff,
>
> The thing is that if I don't include the health check, the load balancer
> works fine and each server receives equal distribution. I have no idea why
> the servers would be reported as "down" but still work when unchecked.

It is possible that your servers expect the "Host:" header to be set
during the checks. There's a trick to do it right now (don't forget to
escape spaces) :

        option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com

Also, you should check the server's logs to see why it is reporting the
service as down. And as a last resort, a tcpdump of the traffic between
haproxy and a failed server will show you both the request and the
complete error from the server.

Regards,
Willy
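Placed in context, the trick above could look like the following
sketch. The host name www.mydomain.com is the example value from the
message, and the 192.0.2.x addresses are placeholders. As far as the
HAProxy documentation describes it, "option httpchk" without arguments
sends an "OPTIONS /" request, which some web servers or applications
may reject; spelling out the method, URI and version avoids relying on
that default.

```haproxy
listen http_proxy :80
    mode http
    balance roundrobin
    # Health check: explicit method, URI and version, followed by a
    # Host header. "\r\n" ends the request line, and every space inside
    # the argument has to be escaped with a backslash.
    option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com
    # Placeholder addresses; "check" enables the health check per server
    server webA 192.0.2.11:80 check
    server webB 192.0.2.12:80 check
```

If the servers are still reported as down with this check, the web
server's access and error logs should show which status code the check
request actually received.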
RE: "option httpchk" is reporting servers as down when they're not
Hi Jeff,

The thing is that if I don't include the health check, the load
balancer works fine and each server receives equal distribution. I have
no idea why the servers would be reported as "down" but still work when
unchecked.

Thanks,
Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Jeffrey 'jf' Lim [mailto:jfs.wo...@gmail.com]
Sent: Wednesday, March 04, 2009 8:11 PM
To: Allen, Thomas
Cc: haproxy@formilux.org
Subject: Re: "option httpchk" is reporting servers as down when they're not

well, looks like ur servers are actually down then. Do a curl from your
haproxy machine to both servers. What do you get?

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228

On Wed, Mar 4, 2009 at 9:40 PM, Allen, Thomas wrote:
> Never mind, I got it going. My stats page simply says that both servers are
> down. What else should I be looking for?
>
> Thomas Allen
> Web Developer, ASCE
> 703.295.6355
>
> -----Original Message-----
> From: Jeffrey 'jf' Lim [mailto:jfs.wo...@gmail.com]
> Sent: Wednesday, March 04, 2009 2:22 AM
> To: Allen, Thomas
> Cc: haproxy@formilux.org
> Subject: Re: "option httpchk" is reporting servers as down when they're not
>
> On Wed, Mar 4, 2009 at 4:05 AM, Allen, Thomas wrote:
>> Hi,
>>
>> I like the idea of having HAProxy check server health, but for some reason,
>> it reports all of my servers as down. Here's my full config:
>>
>> listen http_proxy :80
>>         mode http
>>         balance roundrobin
>>         option httpchk
>>         server webA {IP} cookie A check
>>         server webB {IP} cookie B check
>>
>> I tried "option httpchk /index.php" just to be sure, and got the same
>> result. If I remove the httpchk option, HAProxy has no problem proxying
>> these servers. What am I doing wrong?
>>
>
> what's listed under "Status" for these servers when viewing your
> haproxy status page?
>
> -jf
>
> --
> In the meantime, here is your PSA:
> "It's so hard to write a graphics driver that open-sourcing it would not
> help."
> -- Andrew Fear, Software Product Manager, NVIDIA Corporation
> http://kerneltrap.org/node/7228
>
Re: "option httpchk" is reporting servers as down when they're not
well, looks like ur servers are actually down then. Do a curl from your
haproxy machine to both servers. What do you get?

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228

On Wed, Mar 4, 2009 at 9:40 PM, Allen, Thomas wrote:
> Never mind, I got it going. My stats page simply says that both servers are
> down. What else should I be looking for?
>
> Thomas Allen
> Web Developer, ASCE
> 703.295.6355
>
> -----Original Message-----
> From: Jeffrey 'jf' Lim [mailto:jfs.wo...@gmail.com]
> Sent: Wednesday, March 04, 2009 2:22 AM
> To: Allen, Thomas
> Cc: haproxy@formilux.org
> Subject: Re: "option httpchk" is reporting servers as down when they're not
>
> On Wed, Mar 4, 2009 at 4:05 AM, Allen, Thomas wrote:
>> Hi,
>>
>> I like the idea of having HAProxy check server health, but for some reason,
>> it reports all of my servers as down. Here's my full config:
>>
>> listen http_proxy :80
>>         mode http
>>         balance roundrobin
>>         option httpchk
>>         server webA {IP} cookie A check
>>         server webB {IP} cookie B check
>>
>> I tried "option httpchk /index.php" just to be sure, and got the same
>> result. If I remove the httpchk option, HAProxy has no problem proxying
>> these servers. What am I doing wrong?
>>
>
> what's listed under "Status" for these servers when viewing your
> haproxy status page?
>
> -jf
>
> --
> In the meantime, here is your PSA:
> "It's so hard to write a graphics driver that open-sourcing it would not
> help."
> -- Andrew Fear, Software Product Manager, NVIDIA Corporation
> http://kerneltrap.org/node/7228
>
RE: "option httpchk" is reporting servers as down when they're not
Never mind, I got it going. My stats page simply says that both servers
are down. What else should I be looking for?

Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Jeffrey 'jf' Lim [mailto:jfs.wo...@gmail.com]
Sent: Wednesday, March 04, 2009 2:22 AM
To: Allen, Thomas
Cc: haproxy@formilux.org
Subject: Re: "option httpchk" is reporting servers as down when they're not

On Wed, Mar 4, 2009 at 4:05 AM, Allen, Thomas wrote:
> Hi,
>
> I like the idea of having HAProxy check server health, but for some reason,
> it reports all of my servers as down. Here's my full config:
>
> listen http_proxy :80
>         mode http
>         balance roundrobin
>         option httpchk
>         server webA {IP} cookie A check
>         server webB {IP} cookie B check
>
> I tried "option httpchk /index.php" just to be sure, and got the same
> result. If I remove the httpchk option, HAProxy has no problem proxying
> these servers. What am I doing wrong?
>

what's listed under "Status" for these servers when viewing your
haproxy status page?

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228
RE: "option httpchk" is reporting servers as down when they're not
What's a status page? I don't recall anything in the docs about where
this is located or how to configure it, but I probably glossed over
something.

Thanks,
Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Jeffrey 'jf' Lim [mailto:jfs.wo...@gmail.com]
Sent: Wednesday, March 04, 2009 2:22 AM
To: Allen, Thomas
Cc: haproxy@formilux.org
Subject: Re: "option httpchk" is reporting servers as down when they're not

On Wed, Mar 4, 2009 at 4:05 AM, Allen, Thomas wrote:
> Hi,
>
> I like the idea of having HAProxy check server health, but for some reason,
> it reports all of my servers as down. Here's my full config:
>
> listen http_proxy :80
>         mode http
>         balance roundrobin
>         option httpchk
>         server webA {IP} cookie A check
>         server webB {IP} cookie B check
>
> I tried "option httpchk /index.php" just to be sure, and got the same
> result. If I remove the httpchk option, HAProxy has no problem proxying
> these servers. What am I doing wrong?
>

what's listed under "Status" for these servers when viewing your
haproxy status page?

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228
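For readers with the same question, a hedged sketch of how a status
(stats) page is typically switched on inside a listen section. The
path /status/lb mirrors the stats URL mentioned elsewhere in this
thread; the realm text, the admin:changeme credentials and the
192.0.2.x addresses are made up for illustration. The exact set of
"stats" keywords depends on the HAProxy version, so the documentation
shipped with the running version should be checked.

```haproxy
listen http_proxy :80
    mode http
    balance roundrobin
    # Serve the statistics page on this URI instead of proxying it
    stats uri /status/lb
    # Optional basic authentication (hypothetical values)
    stats realm HAProxy\ statistics
    stats auth admin:changeme
    # Placeholder addresses, not taken from the thread
    server webA 192.0.2.11:80 check
    server webB 192.0.2.12:80 check
```

The page then lists every server with a status column, which is the
"Status" value asked about in the quoted message.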
Re: "option httpchk" is reporting servers as down when they're not
On Wed, Mar 4, 2009 at 4:05 AM, Allen, Thomas wrote:
> Hi,
>
> I like the idea of having HAProxy check server health, but for some reason,
> it reports all of my servers as down. Here's my full config:
>
> listen http_proxy :80
>         mode http
>         balance roundrobin
>         option httpchk
>         server webA {IP} cookie A check
>         server webB {IP} cookie B check
>
> I tried "option httpchk /index.php" just to be sure, and got the same
> result. If I remove the httpchk option, HAProxy has no problem proxying
> these servers. What am I doing wrong?
>

what's listed under "Status" for these servers when viewing your
haproxy status page?

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228