RE: Doesn't work for a very few visitors

2009-12-20 Thread Joe Torsitano
This was definitely not caused by too many connections or the conntrack
table being full.  In my test setup there were only five people trying to
connect... two people who were having the problem and three of us that it
worked fine for.  Still the strange thing is the exact same iptables setup
works fine for everyone involved connecting directly to the Apache server
but not going through HAProxy.  I also tried a couple other load balancing
solutions and got the same result (Apache fine, load balancing solution
fail).  The only load balancing solution that worked was Apache
mod_proxy_balancer, but it is too basic for my needs.

Anyway, on the load balancer I don't need a very sophisticated iptables.  As
long as I can do basic protection and run HAProxy everything will be fine.

Thanks!


--
 Joe Torsitano


-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu] 
Sent: Saturday, December 19, 2009 9:58 PM
To: John Lauro
Cc: 'Joe Torsitano'; haproxy@formilux.org
Subject: Re: Doesn't work for a very few visitors

On Sat, Dec 19, 2009 at 05:14:42PM -0500, John Lauro wrote:
 Are you using connection tracking with iptables?  If so, you might want to
 consider using a more basic configuration without connection tracking.

Indeed!

most likely you have a rule somewhere which does a REJECT on
INVALID packets and those poor users are running a buggy TCP
stack which breaks window scaling, SACKs or things like this,
regularly causing some INVALID packets to be detected by the
conntrack code.

Once I even found a user who was doing all of his browsing
using the same TCP source port ! You bet the conntrack has
good reasons to complain.

The other common issue with conntrack as shipped in common
distros is that it's tuned for a desktop system (ie not tuned).
And the table fills very fast when you use that on a server.
You can easily detect this by messages in kernel logs :
Conntrack table is full.

Regards,
Willy
 

 




RE: Doesn't work for a very few visitors

2009-12-19 Thread John Lauro
Are you using connection tracking with iptables?  If so, you might want to
consider using a more basic configuration without connection tracking.
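
A conntrack-free rule set along those lines might look like the sketch below. It is only an illustration, not Joe's actual configuration: the management network 192.0.2.0/24, the open ports (80 and 22) and the function name stateless_rules are all assumptions. The function only prints rules in iptables-restore format so they can be reviewed before anything is loaded.

```shell
#!/bin/sh
# Hypothetical helper: print a stateless filter table (no state or
# conntrack matches) in iptables-restore format.
# Assumed values: HTTP on port 80, SSH on port 22 from 192.0.2.0/24 only.
stateless_rules() {
    cat <<'EOF'
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
# Local traffic and ICMP (needed for path MTU discovery).
-A INPUT -i lo -j ACCEPT
-A INPUT -p icmp -j ACCEPT
# Public HTTP service handled by the proxy.
-A INPUT -p tcp --dport 80 -j ACCEPT
# SSH from the assumed management network only.
-A INPUT -s 192.0.2.0/24 -p tcp --dport 22 -j ACCEPT
# Replies to TCP connections opened by this host (e.g. to the backends).
-A INPUT -p tcp ! --syn -j ACCEPT
# DNS replies.
-A INPUT -p udp --sport 53 -j ACCEPT
COMMIT
EOF
}
```

After reviewing the output, it could be loaded as root with `stateless_rules | iptables-restore`. The `! --syn` rule is looser than a stateful match, which is the trade-off of doing without connection tracking.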

 

What does your iptables configuration look like?

From: Joe Torsitano [mailto:jtorsit...@weatherforyou.com] 
Sent: Saturday, December 19, 2009 4:25 PM
To: Willy Tarreau
Cc: haproxy@formilux.org
Subject: Re: Doesn't work for a very few visitors

 

Hi Willy,

I have been using iptables on the HAProxy servers.  Luckily I found a couple
of willing test subjects who were having the problem and shutting off
iptables seemed to correct it (they could then see the sites).  I use a
pretty basic iptables configuration just to restrict access to SSH and close
off all unused ports.  What is it about iptables that HAProxy doesn't get
along with?  Is there an iptables or other firewall configuration that will
work with HAProxy or do I just have to pretty much leave the server HAProxy
is running on wide open?

Thanks for the information.


-- 
Joe Torsitano




On Fri, Dec 18, 2009 at 11:04 PM, Willy Tarreau w...@1wt.eu wrote:

On Fri, Dec 18, 2009 at 05:00:38PM -0800, Joe Torsitano wrote:
 Hi Willy,

 What's strange is traffic still appears normal, and is, for probably at
 least 99% of the visitors.  Logged traffic remains about normal (hundreds
 of thousands of visitors a day).  I just get a few e-mails asking why the
 site has been down for days or when it will be back.  But I cannot recreate
 the problem.  And I know there are probably people who just don't e-mail
 and, unfortunately, don't come back.

yes, very possible unfortunately.

 Here is the config file with the IP addresses changed, pretty much the
 default that comes with it...

A few questions that come to mind :
- What version are you running by the way (haproxy -vv) ?
 Several cases of truncated responses were observed between
 1.3.16 and 1.3.18, and sometimes a 502 response could be
 sent if the server closed too fast before 1.3.19. So please
 ensure you're on 1.3.22. More info here about the bugs in
 your version :

   http://haproxy.1wt.eu/knownbugs-1.3.html

- Have you tried to look for client errors in the logs ?

- Have you tried to look in the logs if you could find some of
 the complainers' traces ? Most often, you can check for the
 same class-B or class-C addresses as the IP that posted the
 mail, and try to isolate the accesses by taking the access
 time into account.

- are you sure that 2000 concurrent connections are enough ?
 You may check that in the logs too, as there is a field
 with connection counts.

- I'm seeing there is no option httpclose below. Could you
 try to add it in the defaults section and see if it changes
 anything ? Before doing that, please check that you don't
 have iptables enabled on your haproxy machine.
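
The log and configuration checks in this list can be scripted roughly as follows. This is a sketch only: the function names, the file names in the usage note and the example prefix 203.0.113 are assumptions, and the log pattern simply looks for a client address followed by a colon and a port, as it appears in haproxy's log lines.

```shell
#!/bin/sh
# Hypothetical helper: print log lines whose client address falls in the
# given /24 ("class C"), e.g. the network a complaint e-mail came from.
# $1 = first three octets (such as 203.0.113), $2 = log file.
hits_from_class_c() {
    # Escape the dots so they match literally in the regular expression.
    prefix=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -E "(^|[^0-9.])${prefix}\.[0-9]+:[0-9]+ " "$2"
}

# Hypothetical helper: succeed if the configuration contains an
# uncommented "option httpclose" line. $1 = haproxy configuration file.
has_httpclose() {
    grep -Eq '^[[:space:]]*option[[:space:]]+httpclose([[:space:]]|$)' "$1"
}
```

For example, `hits_from_class_c 203.0.113 haproxy.log` lists the requests from that network, and `has_httpclose haproxy.cfg || echo "option httpclose is missing"` reports whether the option is set.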

I'm also thinking about something else. You said that when
you don't go through haproxy you don't get any complaint.
Are your systems configured similarly ? I mean, the very
low rate of problems could very well be caused by some TCP
settings which are incompatible with a minority of users
running behind a buggy router/firewall.

In order to check this, you could run the following command
on each server (including the one with haproxy) :

   $ sysctl -a | fgrep net.ipv4.tcp

Please verify if tcp_ecn and tcp_window_scaling are at the
same values. If not, start by setting tcp_ecn to 0 on
the haproxy server. Then later you can try to similarly
disable tcp_window_scaling, though this one is far less
likely because it's enabled almost everywhere.
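
One way to do that comparison is to save the output of the command above on each machine and look only at the two settings in question. The sketch below assumes the dumps were saved to files (the file names in the usage note are placeholders) and that each line has the usual `name = value` form printed by sysctl; the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print only the tcp_ecn and tcp_window_scaling
# lines from a saved "sysctl -a" dump. $1 = dump file.
pick_tcp_flags() {
    grep -E '^net\.ipv4\.tcp_(ecn|window_scaling) = ' "$1"
}

# Hypothetical helper: succeed if two dumps agree on both settings.
# $1 and $2 = dump files taken on two different machines.
same_tcp_flags() {
    [ "$(pick_tcp_flags "$1")" = "$(pick_tcp_flags "$2")" ]
}
```

For example, after `sysctl -a > haproxy.txt` on the proxy and `sysctl -a > apache1.txt` on a backend, `same_tcp_flags haproxy.txt apache1.txt || echo "settings differ"`. If they differ, `sysctl -w net.ipv4.tcp_ecn=0` (as root) applies the first change suggested above until the next reboot.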

Also check with ip route and ip address on all servers
if you don't see a different MTU value on the default
route. It's possible that a small part of your clients
are still running a misconfigured PPPoE ADSL line and
can't send/receive full packets. There are still some
large sites who deal with that by setting their MTU to
1492 or even 1452 on the external interface. But this
is less likely.
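
To compare MTU values quickly, the output of `ip -o link show` can be reduced to one "interface mtu" pair per line. This is a sketch; the function name is made up, and eth0 in the usage note is an assumed interface name.

```shell
#!/bin/sh
# Hypothetical helper: read "ip -o link show" output on stdin and print
# "<interface> <mtu>" for each line that carries an mtu field.
list_mtus() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "mtu") {
                name = $2
                sub(/:$/, "", name)   # "eth0:" -> "eth0"
                print name, $(i + 1)
            }
    }'
}
```

Run `ip -o link show | list_mtus` on each server and compare the results. To try the lower value mentioned above, `ip link set dev eth0 mtu 1492` (as root) changes the MTU of that interface until it is reset.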

Regards,
Willy








Re: Doesn't work for a very few visitors

2009-12-19 Thread Willy Tarreau
On Sat, Dec 19, 2009 at 05:14:42PM -0500, John Lauro wrote:
 Are you using connection tracking with iptables?  If so, you might want to
 consider using a more basic configuration without connection tracking.

Indeed!

most likely you have a rule somewhere which does a REJECT on
INVALID packets and those poor users are running a buggy TCP
stack which breaks window scaling, SACKs or things like this,
regularly causing some INVALID packets to be detected by the
conntrack code.
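
A quick way to look for such a rule is to filter the output of `iptables-save` for matches on the INVALID state. The sketch below is illustrative only; the function name is made up, and it recognizes just the `--state` and `--ctstate` spellings.

```shell
#!/bin/sh
# Hypothetical helper: read iptables-save output on stdin and print the
# rules that match packets in the INVALID conntrack state.
find_invalid_rules() {
    grep -E -- '--(ct)?state[[:space:]]+([A-Z]+,)*INVALID'
}
```

As root: `iptables-save | find_invalid_rules`. Any printed rule ending in `-j REJECT` or `-j DROP` is a candidate for the behaviour described above.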

Once I even found a user who was doing all of his browsing
using the same TCP source port ! You bet the conntrack has
good reasons to complain.

The other common issue with conntrack as shipped in common
distros is that it's tuned for a desktop system (ie not tuned).
And the table fills very fast when you use that on a server.
You can easily detect this by messages in kernel logs :
Conntrack table is full.
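
The exact wording depends on the kernel: the message is logged as "ip_conntrack: table full, dropping packet" on older kernels and "nf_conntrack: table full, dropping packet" on newer ones. A small filter such as the following sketch (the function name is made up) catches both.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and succeed if it
# contains a "table full" message from either conntrack module name.
conntrack_table_full() {
    grep -Eq '(ip|nf)_conntrack: table full'
}
```

For example: `dmesg | conntrack_table_full && echo "conntrack table has overflowed"`. On kernels that use nf_conntrack, `sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max` shows current usage against the limit.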

Regards,
Willy




Re: Doesn't work for a very few visitors

2009-12-17 Thread Willy Tarreau
Hi,

On Thu, Dec 17, 2009 at 02:07:41PM -0800, Joe Torsitano wrote:
 Whenever I turn on HAProxy everything appears to be working great.  However
 I always get two or three e-mails from people who ask when the site is going
 to be back up.  They say the site can no longer be found in their browser,
 even people who have it bookmarked.  They say it's like the server is down.
 As soon as I switch off HAProxy and have the requests delivered directly
 they say everything is fine again.  Unfortunately I've never been able to
 recreate it on almost a dozen computers I try.  I'm attempting to use
 HAProxy in HTTP mode with Apache servers.  Using the Apache load balancer
 works.  Any ideas?

not at all, sounds very strange. Are you sure you don't have
too short client-side timeouts, which would be compatible with
your local computers but not with remote clients (eg: 10 ms) ?
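
To see which client-side timeout a configuration sets, the relevant lines can be pulled out with grep. The sketch below is only an illustration with a made-up function name; it looks for both the `timeout client` keyword and the older `clitimeout` spelling. A bare number in either one is in milliseconds.

```shell
#!/bin/sh
# Hypothetical helper: print the client-side timeout lines (with line
# numbers) from a haproxy configuration file. $1 = configuration file.
client_timeouts() {
    grep -nE '^[[:space:]]*(timeout[[:space:]]+client|clitimeout)[[:space:]]' "$1"
}
```

For example, `client_timeouts haproxy.cfg` (the file name is a placeholder) would show a line such as `clitimeout 10`, which is the kind of too-short value asked about above.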

Otherwise, please post your config (you can mask your IPs if
you want).

Willy