Re: potential corruption in request body [1.3.15.7]

2009-04-19 Thread Arash Ferdowsi
another piece of (potentially useful) information: I have haproxy load
balancing to 3 different machines running appservers (one of them
being the same machine that runs haproxy). the corruption only
occurred when balancing to the 2 appservers on separate machines and
never seemed to affect the local appserver.

On Sun, Apr 19, 2009 at 9:03 PM, Arash Ferdowsi  wrote:
> hi all,
> I've been happily using 1.3.15.7 (sitting behind nginx) with no real
> problems to report. I tried out 1.3.17 and ran into a fairly strange
> and serious problem.
>
> I had the new version running in production for about an hour (it
> served about 800k requests in this period). after a few minutes of
> production load, requests from several clients were failing some
> application-level assertions (the POST variable in question is
> base64-decoded and then zlib-decompressed). it's difficult for me to
> tell exactly what is being corrupted in the request body, but I have
> noticed that it generally seems to happen twice per IP (the client
> generally retries 30 seconds after a failure) and the same assertion
> triggers the next time around. the bodies from back-to-back requests
> are supposed to be identical; they match in length but not in content.
>
> have any similar issues been reported on 1.3.17? what can I do to help
> debug this? thanks in advance :-).
>
>
> --
> Arash Ferdowsi
> CTO, Dropbox
> 913.707.5875 (m)
>



-- 
Arash Ferdowsi
CTO, Dropbox
913.707.5875 (m)



potential corruption in request body [1.3.15.7]

2009-04-19 Thread Arash Ferdowsi
hi all,
I've been happily using 1.3.15.7 (sitting behind nginx) with no real
problems to report. I tried out 1.3.17 and ran into a fairly strange
and serious problem.

I had the new version running in production for about an hour (it
served about 800k requests in this period). after a few minutes of
production load, requests from several clients were failing some
application-level assertions (the POST variable in question is
base64-decoded and then zlib-decompressed). it's difficult for me to
tell exactly what is being corrupted in the request body, but I have
noticed that it generally seems to happen twice per IP (the client
generally retries 30 seconds after a failure) and the same assertion
triggers the next time around. the bodies from back-to-back requests
are supposed to be identical; they match in length but not in content.

have any similar issues been reported on 1.3.17? what can I do to help
debug this? thanks in advance :-).


-- 
Arash Ferdowsi
CTO, Dropbox
913.707.5875 (m)



option splice-auto

2009-04-19 Thread Robert Simmons
I am trying to configure HAProxy to use connection splicing, however
1.3.17 does not seem to accept this option. I've tried splice-auto
along with splice-request and splice-response without success.
According to the configuration manual, placing it in the listen
section is acceptable, but it returns "unknown option" on a
configuration test.


Any ideas?

Robert.

Config:

listen rps-test 10.210.2.30:80
    mode http
    stats enable
    stats uri /stats
    option splice-request
    option httpclose
    balance roundrobin
    clitimeout 10
    srvtimeout 3
    contimeout 4000
    maxconn 4
    server Web1 10.210.2.20:80
    server Web2 10.210.2.21:80
    server Web3 10.210.2.22:80

global
    ulimit-n 1024576
    maxconn 4
    stats socket /var/run/haproxy.socket
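
For reference, the splicing keywords in 1.3.16 and later are "option
splice-auto", "option splice-request" and "option splice-response", and
a listen section is a documented place for them. A minimal sketch
follows. It assumes, and this is only an assumption not confirmed in
this thread, that haproxy was built with splice support (e.g.
make TARGET=linux26 USE_LINUX_SPLICE=1) and runs on a kernel with TCP
splicing such as 2.6.27; without such a build the keywords may simply
be rejected as unknown. Addresses are reused from the config above.

# Sketch only: assumes a splice-enabled build of haproxy 1.3.16+.
listen rps-test 10.210.2.30:80
    mode http
    option splice-auto        # let haproxy decide when to splice
    # or restrict splicing to a single direction:
    # option splice-request
    # option splice-response
    balance roundrobin
    server Web1 10.210.2.20:80
    server Web2 10.210.2.21:80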




Re: Simple TCP with backup config

2009-04-19 Thread Willy Tarreau
Hi Michael,

On Fri, Apr 17, 2009 at 04:47:38PM +0100, Michael Miller wrote:
> Hi,
> 
> I am doing some initial testing with HAProxy and have come across a
> problem I don't seem to be able to resolve.
> 
> A summary of what I am initially trying to achieve follows. I am trying
> to use HAProxy to provide a VIP that passes on a tcp (SMTP as it
> happens) stream to a backend server. If that server is down, I would
> like the connection forwarded to a backup server.
> 
> Doing some testing and watching the status page reveals that if both
> servers are configured as normal, rather than backup, servers the tcp
> connection is rerouted when the initial attempt to connect fails.
> However, when one server is configured as backup, the connection never
> gets to the backup server.
> 
> The config I am using is:
> global
> log 127.0.0.1   local0
> log 127.0.0.1   local1 notice
> maxconn 4096
> pidfile /var/run/haproxy.pid
> ##chroot /usr/share/haproxy
> user haproxy
> group haproxy
> daemon
> #debug
> #quiet
> spread-checks 10
> 
> defaults default_settings
> log global
> mode http
> option  httplog
> option  dontlognull
> option  abortonclose
> ##  option  allbackups
> option  clitcpka
> option  srvtcpka
> option  forwardfor
> retries 10
> option  redispatch
> maxconn 2000
> backlog 256
> timeout connect 5000
> timeout client 5
> timeout server 1
> 
> listen www-health
> bind 0.0.0.0:8080
> mode http
> monitor-uri /haproxy
> stats enable
> stats uri /stats
> 
> listen smtp
> log global
> bind 0.0.0.0:25
> mode tcp
> #option smtpchk HELO haproxy.local
> option tcplog
> balance roundrobin
> rate-limit sessions 10
> timeout connect 1
> timeout client 6
> timeout server 6
> 
> server smtp01 10.1.1.5:25
> server smtp02 10.1.1.6:25 backup
> 
> 
> 
> 
> Note that I am trying to avoid using active health checks and am hoping
> that, when a tcp connection to the primary fails, the connection will
> fall back to the backup server. This works as expected when both servers
> are configured as "active" rather than "backup" servers. Looking at the
> status page when one is down, the 10 retries against the "down" server
> are shown and then the tcp connection succeeds to the second server.
> 
> Is this a bug that the tcp connection is not forwarded to the backup
> server, or am I missing some "obvious" configuration settings?

Neither :-)
It is designed to work like this, though I agree that it is not
necessarily obvious. As documented, a backup server is only activated
when all other servers are down. Here, since you are not checking the
active server, it is never considered down. It's as simple as that.
May I ask why you don't want to enable health checks? That's a rather
strange choice, as it means you don't care about the server's status
but still hope that a failure will be detected fast enough for a
redispatch to work. You might lose a lot of traffic acting like this.

Also, there is an "smtpchk" option which is able to check that your
server responds on port 25. You should really use it. You don't
necessarily need to check every second; for SMTP, checking once a
minute is generally enough for small setups.
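
For example, a minimal sketch of that setup, with illustrative timings
(60000 ms means one check per minute) and the server addresses from the
config quoted above:

listen smtp
    bind 0.0.0.0:25
    mode tcp
    option tcplog
    option smtpchk HELO haproxy.local
    balance roundrobin
    # with checks enabled, smtp02 only receives traffic once smtp01
    # has been marked down by the smtpchk health check
    server smtp01 10.1.1.5:25 check inter 60000
    server smtp02 10.1.1.6:25 check inter 60000 backup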

Regards,
Willy




Re: A patch for haproxy-1.3.17 which add X-Original-Dst header.

2009-04-19 Thread Willy Tarreau
Hi Maik,

On Fri, Apr 17, 2009 at 06:53:21PM +0200, Maik Broemme wrote:
> okay attached is now the final version, there were no new features or
> enhancements. I only renamed it from 'X-Original-Dst' to 'X-Original-To'
> because it is a common practice to name it so. For example Postfix does
> it in a mail header with the same name.

You seem to have done very clean work. I'm queuing the patch for inclusion.

Thanks !
Willy




Re: HAProxy running at 10 Gigabit/s

2009-04-19 Thread Willy Tarreau
On Sun, Apr 19, 2009 at 12:54:51PM -0700, Michael Fortson wrote:
> This is really impressive, Willy.
> 
> It would be great to be able to recreate an optimized installation
> like the one used in this test. How would one go about finding or
> building a similarly-configured linux distro?

In fact I would like to be able to make 2.6.27.X run on CentOS 5,
which seems to be very widely deployed. But its mkinitrd is completely
buggy and is unable to boot from the device specified on the kernel
command line. And since device names have changed between 2.6.18 and
2.6.27, I don't know how to sort that out. Maybe someone who knows
this distro well could give it a try. After that I could indicate
which patches to add to this kernel (including the ability to rebind
to a port and the transparent proxy backport). This kernel will be
maintained for a few years, so it would constitute a good platform.
Keepalived is not easy to find packaged for it either, BTW.

The rest is "just" kernel parameter tuning. I'm thinking about
writing a tuning guide for 2.6 kernels. I was contacted once again
this weekend by a big site that was dying under load because its
sysctls had not been tuned, and that's a shame :-/

Regards,
Willy




Re: HAProxy running at 10 Gigabit/s

2009-04-19 Thread Michael Fortson
This is really impressive, Willy.

It would be great to be able to recreate an optimized installation
like the one used in this test. How would one go about finding or
building a similarly-configured linux distro?

On Sun, Apr 19, 2009 at 12:37 PM, Willy Tarreau  wrote:
> Hi all,
>
> I've wanted to redo those benchmarks at 10 Gbps for quite some time now,
> in fact since the release of 1.3.16 which brought splicing support and the
> new I/O layer. Now that I have found a few hours to re-run them, the
> results have been posted here:
>
>      http://haproxy.1wt.eu/10g.html
>
> In short, raw data throughput excels thanks to Linux kernel 2.6.27's TCP
> splicing and the LRO implemented in the Myri-10G NIC: haproxy is now
> capable of proxying 10 Gbps with less than 20% CPU used on a Core2 Duo
> at 2.66 GHz. The peak session rate has also improved significantly with
> the I/O rework. We now reach 38000 hits/s on the same hardware, and can
> get as high as 105000 connections/s if requests are not forwarded to a
> server (e.g. with blocking ACLs).
>
> Regards,
> Willy
>
>
>



HAProxy running at 10 Gigabit/s

2009-04-19 Thread Willy Tarreau
Hi all,

I've wanted to redo those benchmarks at 10 Gbps for quite some time now,
in fact since the release of 1.3.16 which brought splicing support and the
new I/O layer. Now that I have found a few hours to re-run them, the
results have been posted here:

  http://haproxy.1wt.eu/10g.html

In short, raw data throughput excels thanks to Linux kernel 2.6.27's TCP
splicing and the LRO implemented in the Myri-10G NIC: haproxy is now
capable of proxying 10 Gbps with less than 20% CPU used on a Core2 Duo
at 2.66 GHz. The peak session rate has also improved significantly with
the I/O rework. We now reach 38000 hits/s on the same hardware, and can
get as high as 105000 connections/s if requests are not forwarded to a
server (e.g. with blocking ACLs).
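
To illustrate what is meant by blocking ACLs, here is a purely
hypothetical minimal example (the ACL, names and addresses are
illustrative, not the ones used in the benchmark) in which matching
requests are rejected by haproxy itself and never reach a server:

listen web 0.0.0.0:80
    mode http
    # requests whose path starts with /forbidden are refused by haproxy
    acl is_blocked path_beg /forbidden
    block if is_blocked
    balance roundrobin
    server app1 10.0.0.1:80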

Regards,
Willy