Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-22 Thread Timh Bergström
The problem with nginx is that it doesn't support chunked encoding. Since
that is what we rely on, we can't use nginx until it supports it, or until
we can get rid of chunked encoding. So posting about how well it is working
for you doesn't really help with our issue. Thanks though.

BR,
Timh



2010/3/19 duncan hall dun...@viator.com

 Throw me in as a fourth on this one.  I use nginx 0.8.34 for gzip compression,
 RAM caching of static content, and SSL offload. All very simple to configure,
 with low overhead.  All requests, HTTP and HTTPS, go to nginx and are then
 forwarded to HAProxy 1.4.2 as HTTP.
 Regards,

 Duncan



 Harvey Yau wrote:

 I can third this - nginx + haproxy works extremely well.  I wish haproxy
 supported SSL directly.  I realize it's not within the design goals of
 haproxy, but the need for this is out there.  Good thing nginx + haproxy
 works well enough.

 -- Harvey

 On 3/18/10 3:29 PM, Nicholas Hadaway wrote:

 I can second this comment and say that it works extremely well...  nginx
 operates very nicely as an SSL offloading device.

 I am right now using nginx 0.8.33 (soon to bump up to 0.8.34) and HAProxy
 1.4.2 in production and things work very well for me.

 -Nick


 Maybe it's worth a try for you to use nginx as a stunnel replacement?
 Its performance is quite good, and the config for just accepting SSL
 traffic and directing it to haproxy can be kept very short.

 kind regards,
 Malte











-- 
Timh Bergström
System Operations Manager
Diino AB - www.diino.com
:wq


Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-22 Thread Nicholas Hadaway

Chunked encoding support...

http://github.com/agentzh/chunkin-nginx-module
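
For anyone curious, a minimal sketch of enabling it (directive names as
documented in the module's README; the server layout and proxy address are
assumptions):

    http {
        chunkin on;     # decode chunked request bodies instead of rejecting them

        server {
            listen 80;

            # the module's documented workaround for requests nginx has
            # already answered with 411 Length Required:
            error_page 411 = @chunkin_411;
            location @chunkin_411 {
                chunkin_resume;
            }

            location / {
                proxy_pass http://127.0.0.1:8080;   # assumed haproxy frontend
            }
        }
    }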

-nick

On 3/22/2010 4:57 AM, Timh Bergström wrote:
The problem with nginx is that it doesn't support chunked encoding. 
Since that is what we rely on, we can't use nginx until it supports 
it, or until we can get rid of chunked encoding. So posting about how 
well it is working for you doesn't really help with our issue. Thanks though.


BR,
Timh



2010/3/19 duncan hall dun...@viator.com

Throw me in as a fourth on this one.  I use nginx 0.8.34 for gzip
compression, RAM caching of static content, and SSL offload. All
very simple to configure, with low overhead.  All requests, HTTP and
HTTPS, go to nginx and are then forwarded to HAProxy 1.4.2 as HTTP.
Regards,

Duncan



Harvey Yau wrote:

I can third this - nginx + haproxy works extremely well.  I
wish haproxy supported SSL directly.  I realize it's not
within the design goals of haproxy, but the need for this is
out there.  Good thing nginx + haproxy works well enough.

-- Harvey

On 3/18/10 3:29 PM, Nicholas Hadaway wrote:

I can second this comment and say that it works extremely
well...  nginx operates very nicely as an SSL offloading
device.

I am right now using nginx 0.8.33 (soon to bump up to
0.8.34) and HAProxy 1.4.2 in production and things work
very well for me.

-Nick


Maybe it's worth a try for you to use nginx as a stunnel
replacement?
Its performance is quite good, and the config for just
accepting SSL traffic and directing it to haproxy can be
kept very short.

kind regards,
Malte











--
Timh Bergström
System Operations Manager
Diino AB - www.diino.com
:wq




Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-18 Thread Erik Gulliksson
Hi,

  Did you observe anything special about the CPU usage ? Was it lower
  than with 1.3 ? If so, it would indicate some additional delay somewhere.
  If it was higher, it could indicate that the Transfer-encoding parser
  takes too many cycles but my preliminary tests proved it to be quite
  efficient.

 I did not notice anything special about CPU usage. It seems to be
 around 2-4% with both versions. When checking munin graphs this
 morning, I did however notice that the counter "connection resets
 received" from netstat -s was increasing a lot more with 1.4.

 This led me to look at the log more closely, and there seem to be a
 lot of new errors that look something like this:
 w.x.y.z:4004 [15/Mar/2010:09:50:51.190] fe_xxx be_yyy/upload-srvX
 0/0/0/-1/62 502 391 - PR-- 9/6/6/3/0 0/0 "PUT /dav/filename.ext
 HTTP/1.1"

 Interesting! It looks like haproxy has aborted because the server
 returned an invalid response. You can check that using socat on the
 stats socket. For instance:

   echo "show errors" | socat stdio unix-connect:/var/run/haproxy.stat

 If you don't get anything, then it's something else :-/


Unfortunately the "show errors" output came back empty, so I guess it was
something else. The good news is that I gave haproxy 1.4.2 a try today,
and the 502/PR errors with PUT/TE:chunked requests have now vanished.
So thanks for solving this. I'm not sure which of the bugs I was
hitting, but it does not really matter since it now seems to be fixed.

Now that I have a working haproxy 1.4, I went on to try out
"option http-server-close", but I hit a problem with our stunnel
instances (patched with stunnel-4.22-xforwarded-for.diff). The patch does
not support keep-alive, so only the first HTTP request in a
keep-alive session gets the X-Forwarded-For header added (insert Homer
"doh!" here :). On reflection, I guess this is the expected behaviour
given what stunnel is actually supposed to do. So, for now I'll stick
with "option httpclose" for a while longer...
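
A minimal haproxy 1.4 sketch of the two modes in question (purely
illustrative; section contents are assumptions):

    defaults
        mode http
        option httpclose            # add "Connection: close" in both directions,
                                    # one request per connection (works with the
                                    # stunnel X-Forwarded-For patch)
        # option http-server-close  # keep-alive towards the client, close towards
        #                           # the server -- breaks the patch, since only the
        #                           # first request on the connection gets the header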

Keep up the good work!

Best regards
Erik

-- 
Erik Gulliksson, erik.gulliks...@diino.net
System Administrator, Diino AB
http://www.diino.com



Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-18 Thread Malte Geierhos
Hello,

 Unfortunately the "show errors" output came back empty, so I guess
 it was something else. The good news is that I gave haproxy 1.4.2 a
 try today, and the 502/PR errors with PUT/TE:chunked requests have
 now vanished. So thanks for solving this. I'm not sure which of
 the bugs I was hitting, but it does not really matter since it now
 seems to be fixed.

 Now that I have a working haproxy 1.4, I went on to try out
 "option http-server-close", but I hit a problem with our
 stunnel instances (patched with stunnel-4.22-xforwarded-for.diff).
 The patch does not support keep-alive, so only the first HTTP request
 in a keep-alive session gets the X-Forwarded-For header added (insert
 Homer "doh!" here :). On reflection, I guess this is the expected
 behaviour given what stunnel is actually supposed to do. So, for now
 I'll stick with "option httpclose" for a while longer...

 Keep up the good work!

 Best regards Erik


Maybe it's worth a try for you to use nginx as a stunnel replacement?
Its performance is quite good, and the config for just accepting SSL
traffic and directing it to haproxy can be kept very short.
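
A sketch of what that could look like with an nginx of that era
(certificate paths and the haproxy address are assumptions):

    server {
        listen              443;
        ssl                 on;
        ssl_certificate     /etc/nginx/ssl/site.crt;
        ssl_certificate_key /etc/nginx/ssl/site.key;

        location / {
            proxy_pass       http://127.0.0.1:8080;   # assumed haproxy frontend
            proxy_set_header X-Forwarded-For $remote_addr;
        }
    }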

kind regards,
Malte




Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-18 Thread Willy Tarreau
Hi Erik,

On Thu, Mar 18, 2010 at 02:29:46PM +0100, Erik Gulliksson wrote:
 Unfortunately the "show errors" output came back empty, so I guess it was
 something else. The good news is that I gave haproxy 1.4.2 a try today,
 and the 502/PR errors with PUT/TE:chunked requests have now vanished.
 So thanks for solving this. I'm not sure which of the bugs I was
 hitting, but it does not really matter since it now seems to be fixed.

OK so very likely it's the same problem I fixed yesterday using Bernhard's
captures.

 Now that I have a working haproxy 1.4, I went on to try out
 "option http-server-close", but I hit a problem with our stunnel
 instances (patched with stunnel-4.22-xforwarded-for.diff). The patch does
 not support keep-alive, so only the first HTTP request in a
 keep-alive session gets the X-Forwarded-For header added (insert Homer
 "doh!" here :). On reflection, I guess this is the expected
 behaviour given what stunnel is actually supposed to do.

Yes indeed, it's expected. Stunnel is not designed to manipulate application
data, and the patch only adds the header to the first request of a connection.
Maybe we should implement some XCLIENT-like protocol between stunnel and
haproxy, to report the address of the client of the TCP connection.
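
To illustrate the idea (purely hypothetical, nothing like this is
implemented): stunnel would prepend a single line carrying the client
address to each forwarded connection, and haproxy would consume it before
parsing HTTP, e.g.:

    CLIENT 192.0.2.15:51234           <- hypothetical one-line preamble per connection
    PUT /dav/filename.ext HTTP/1.1    <- unmodified HTTP stream follows
    ...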

 So,
 for now I'll stick with option httpclose for a while longer...

You may get better results with "option forceclose" now, as it will release
the server connection earlier.
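
For reference, a sketch of what that would look like (1.4 syntax):

    defaults
        mode http
        option forceclose   # like httpclose, but actively shuts the server-side
                            # connection down as soon as the response is done,
                            # instead of waiting for the server to close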

Regards,
Willy




Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-18 Thread Erik Gulliksson
Hi Malte,

Now that I have a working haproxy 1.4, I went on to try out
"option http-server-close", but I hit a problem with our
stunnel instances (patched with stunnel-4.22-xforwarded-for.diff).
The patch does not support keep-alive, so only the first HTTP request
in a keep-alive session gets the X-Forwarded-For header added (insert
Homer "doh!" here :). On reflection, I guess this is the expected
behaviour given what stunnel is actually supposed to do.
So, for now I'll stick with "option httpclose" for a while
longer...


 Maybe it's worth a try for you to use nginx as a stunnel replacement?
 Its performance is quite good, and the config for just accepting SSL
 traffic and directing it to haproxy can be kept very short.

Thanks for the suggestion. I did give nginx a try in a lab setup, but
for our application it did not work out with the "Transfer-Encoding:
chunked" header, as nginx returns a "411 Length Required" error for
such requests. I also tried Pound, but got a similar error. There
may be other products out there I have not yet tried, however. What I
am looking for in an SSL-decoding solution is support for TE:chunked,
HTTP keep-alive, an option to set the SSL engine (for h/w acceleration),
soft reconfiguration (something like haproxy's -sf), HTTP header
manipulation, open source, free, robust and efficient. This is
beginning to sound like haproxy with SSL support :)

Best regards
Erik Gulliksson

-- 
Erik Gulliksson, erik.gulliks...@diino.net
System Administrator, Diino AB
http://www.diino.com



Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-18 Thread XANi


 Now that I have a working haproxy 1.4, I went on to try out
 "option http-server-close", but I hit a problem with our stunnel
 instances (patched with stunnel-4.22-xforwarded-for.diff). The patch does
 not support keep-alive, so only the first HTTP request in a
 keep-alive session gets the X-Forwarded-For header added (insert Homer
 "doh!" here :). On reflection, I guess this is the expected
 behaviour given what stunnel is actually supposed to do. So,
 for now I'll stick with "option httpclose" for a while longer...

Maybe try using a light web server like nginx or lighttpd as an SSL
proxy instead?

-- 
Mariusz Gronczewski (XANi) xani...@gmail.com
GnuPG: 0xEA8ACE64
http://devrandom.pl




Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-18 Thread Erik Gulliksson
Hi Willy

On Thu, Mar 18, 2010 at 3:08 PM, Willy Tarreau w...@1wt.eu wrote:

 OK so very likely it's the same problem I fixed yesterday using Bernhard's
 captures.

Great! Thanks to Bernhard as well then, for providing you with the captures.

 Yes indeed, it's expected. Stunnel is not designed to manipulate application
 data, and the patch only adds the header to the first request of a connection.
 Maybe we should implement some XCLIENT-like protocol between stunnel and
 haproxy, to report the address of the client of the TCP connection.

I would love to see a feature that makes this work. I know too little
about stunnel and haproxy internals to have an opinion on what would
be the best/simplest way to implement it.

 So,
 for now I'll stick with option httpclose for a while longer...

 You may get better results with option forceclose now, as it will release
 the server connection earlier.

OK, I will try to enable option forceclose as well.

Again, thanks for all the help.

Best regards
Erik

-- 
Erik Gulliksson, erik.gulliks...@diino.net
System Administrator, Diino AB
http://www.diino.com



Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-15 Thread Willy Tarreau
Hi Erik,

On Mon, Mar 15, 2010 at 10:27:38AM +0100, Erik Gulliksson wrote:
 Hi Willy,
 
 Thanks for your elaborative answer.
 
  Did you observe anything special about the CPU usage ? Was it lower
  than with 1.3 ? If so, it would indicate some additional delay somewhere.
  If it was higher, it could indicate that the Transfer-encoding parser
  takes too many cycles but my preliminary tests proved it to be quite
  efficient.
 
 I did not notice anything special about CPU usage. It seems to be
 around 2-4% with both versions. When checking munin graphs this
 morning, I did however notice that the counter "connection resets
 received" from netstat -s was increasing a lot more with 1.4.
 
 This led me to look at the log more closely, and there seem to be a
 lot of new errors that look something like this:
 w.x.y.z:4004 [15/Mar/2010:09:50:51.190] fe_xxx be_yyy/upload-srvX
 0/0/0/-1/62 502 391 - PR-- 9/6/6/3/0 0/0 "PUT /dav/filename.ext
 HTTP/1.1"

Interesting! It looks like haproxy has aborted because the server
returned an invalid response. You can check that using socat on the
stats socket. For instance:

   echo "show errors" | socat stdio unix-connect:/var/run/haproxy.stat

If you don't get anything, then it's something else :-/

 This is only for a few of the PUT requests; most requests seem to get
 proxied successfully. I will try to reproduce this in a more
 controlled lab setup where I can sniff HTTP headers to see what is
 actually sent in the request.

That would obviously help too :-)

 No, I've run POST requests (very similar to PUT), except that there
 was no Transfer-Encoding in the requests. It's interesting that you're
 doing that in the request, because Apache removed support for TE:chunked
 a few years ago because there were no users. Also, most of my POST tests
 were not performance related.
 
 Interesting. We do use Apache for parts of this application on the
 backend side, although PUT requests are handled by an in-house
 developed Erlang application.

OK.

 A big part has changed: in previous versions, haproxy did not care
 about the payload at all; it only saw headers. Now, with keepalive
 support, it has to find request/response bounds and as such must
 parse the Transfer-Encoding and Content-Length headers. However,
 chunked transfer encoding is cheap for a component such as haproxy:
 it reads a chunk size (one line), then forwards that many bytes,
 then reads a new chunk size, and so on. So this is really a cheap
 operation. My tests have shown no issue at gigabit speeds with just
 a few bytes per chunk.
 
 I suspect that the application tries to use the chunked encoding
 to simulate bidirectional access. In this case, it might be
 waiting for data pending in the kernel buffers which were sent by
 haproxy with the MSG_MORE flag, indicating that more data is
 following (and so you should observe a low CPU usage).
 
  Could you please do a small test : in src/stream_sock.c, please
  comment out line 616 :
 
    615                          /* this flag has precedence over the rest */
    616                     //     if (b->flags & BF_SEND_DONTWAIT)
    617                                  send_flag &= ~MSG_MORE;
 
  It will unconditionally disable use of MSG_MORE. If this fixes the
  issue for you, I'll probably have to add an option to disable this
  packet merging for very specific applications.
 
 I tried to comment out the line above as instructed, but it made no
 noticeable change. As stated above, I will try to reproduce the problem
 in a lab setup. This may be an issue with our application rather than
 haproxy.

OK, thanks for testing!

Best regards,
Willy




Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

2010-03-12 Thread Willy Tarreau
Hi Erik,

On Fri, Mar 12, 2010 at 11:08:08AM +0100, Erik Gulliksson wrote:
 Hi!
 
 First, I'd like to thank Willy and the other haproxy contributors for
 bringing this wonderful piece of software into the world :)

Thanks !

 For the last two years we have been running haproxy 1.3 successfully
 to load balance our frontend applications and storage services. The
 requests passing through our haproxy instances are mainly WebDAV
 commands. Since some sought-after new features were announced in the
 new stable 1.4 branch, yesterday we gave it a go and upgraded haproxy
 from 1.3.22 to 1.4.1 in our production environment (we simply
 replaced the active binary using the -sf switch). After the new
 version was deployed, our incoming traffic slowly dropped from
 approximately 150 Mbps to 80 Mbps (as ongoing requests were still
 being processed by 1.3.22). The configuration file was not changed
 between the two versions, so we have not yet started to use any of
 the new config options for 1.4 (http-server-close etc). Because of
 the drop in throughput we have now rolled back to 1.3.22 (and traffic
 levels are back to normal).

Did you observe anything special about the CPU usage ? Was it lower
than with 1.3 ? If so, it would indicate some additional delay somewhere.
If it was higher, it could indicate that the Transfer-encoding parser
takes too many cycles but my preliminary tests proved it to be quite
efficient.

 What differentiates our service from most other online services is
 that we are more of a content consumer than a content provider. The
 requests that generate our traffic volume are mostly large and
 small PUT requests with "Transfer-Encoding: chunked". Is this type of
 request included in any of your tests or benchmarks?

No, I've run POST requests (very similar to PUT), except that there
was no Transfer-Encoding in the requests. It's interesting that you're
doing that in the request, because Apache removed support for TE:chunked
a few years ago because there were no users. Also, most of my POST tests
were not performance related.

 Do you have a
 clue about what might have changed in the code base to cause this
 behaviour? Any suggestion for where to go from here (other than
 sticking with 1.3 :) is greatly appreciated.

A big part has changed: in previous versions, haproxy did not care
about the payload at all; it only saw headers. Now, with keepalive
support, it has to find request/response bounds and as such must
parse the Transfer-Encoding and Content-Length headers. However,
chunked transfer encoding is cheap for a component such as haproxy:
it reads a chunk size (one line), then forwards that many bytes,
then reads a new chunk size, and so on. So this is really a cheap
operation. My tests have shown no issue at gigabit speeds with just
a few bytes per chunk.
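
For illustration, the chunked framing being parsed looks like this on the
wire (payload bytes made up; every size line and chunk is CRLF-terminated):

    PUT /dav/upload.bin HTTP/1.1
    Transfer-Encoding: chunked

    5                 <- chunk size in hex, on a line of its own
    hello             <- exactly that many payload bytes
    6
    world!
    0                 <- a zero-size chunk marks the end of the body
                      <- final empty line (optional trailers would go before it)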

I suspect that the application tries to use the chunked encoding
to simulate bidirectional access. In this case, it might be
waiting for data pending in the kernel buffers which were sent by
haproxy with the MSG_MORE flag, indicating that more data is
following (and so you should observe a low CPU usage).

Could you please do a small test : in src/stream_sock.c, please
comment out line 616 :

   615  /* this flag has precedence over the rest */
   616 // if (b->flags & BF_SEND_DONTWAIT)
   617  send_flag &= ~MSG_MORE;

It will unconditionally disable use of MSG_MORE. If this fixes the
issue for you, I'll probably have to add an option to disable this
packet merging for very specific applications.
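
For context, a minimal sketch of the send() behaviour in question (Linux
send(2) with MSG_MORE; the function and buffer names here are made up):

    #include <sys/socket.h>
    #include <sys/types.h>

    /* With MSG_MORE the kernel may hold back a partial TCP segment,
     * expecting more data to merge with it (similar to TCP_CORK).
     * A peer blocking on those first bytes then sees extra latency. */
    static void send_chunk(int fd, const char *hdr, size_t hlen,
                           const char *data, size_t dlen)
    {
        send(fd, hdr, hlen, MSG_MORE);   /* may be delayed by the kernel... */
        send(fd, data, dlen, 0);         /* ...and flushed together with this */
    }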

Regards,
Willy