Hi Erik,

On Fri, Mar 12, 2010 at 11:08:08AM +0100, Erik Gulliksson wrote:
> Hi!
> 
> First, I'd like to thank Willy and the other haproxy contributors for
> bringing this wonderful piece of software into the world :)

Thanks !

> For the last 2 years now we have been running haproxy 1.3 successfully
> to load balance our frontend applications and storage services. Mainly
> the requests passing through our haproxy instances are WebDAV
> commands. Since there were some new sought-after features announced in
> the new stable 1.4 branch, yesterday we gave it a go and upgraded
> haproxy from 1.3.22 to 1.4.1 in our production environment (we simply
> replaced the active binary using the -sf switch). After the new version
> was deployed, our incoming traffic slowly dropped from approximately
> 150Mbps to 80Mbps (as ongoing requests were still processed by
> 1.3.22). The configuration file was not changed between the two
> versions, so we have not yet started using any of the new config options
> for 1.4 (http-server-close etc.). Because of the drop in throughput we
> have now rolled back to 1.3.22 (and traffic levels are back to
> normal).

Did you observe anything special about the CPU usage? Was it lower
than with 1.3? If so, it would indicate some additional delay somewhere.
If it was higher, it could indicate that the Transfer-Encoding parser
takes too many cycles, but my preliminary tests showed it to be quite
efficient.

> What sets our service apart from most other online services is that we
> are more of a "content consumer" than a content provider. The
> requests that generate our traffic volume are mostly large and
> small PUT requests with Transfer-Encoding: chunked. Is this type of
> request included in any of your tests or benchmarks?

No, I've run POST requests (very similar to PUT), except that there
was no Transfer-Encoding on the requests. It's interesting that you're
using chunked encoding in requests, as Apache dropped support for
TE:chunked a few years ago for lack of users. Also, most of my POST
tests were not performance-related.

> Do you have a
> clue of what might have changed in the code base to cause this
> behavior? Any suggestions for where to go from here (other than
> sticking with 1.3 :) are greatly appreciated.

A big part has changed. In previous versions, haproxy did not care
at all about the payload; it only saw headers. Now, with keepalive
support, it has to find the request/response boundaries and as such
must parse the Transfer-Encoding and Content-Length headers. However,
chunked transfer encoding is nice to components such as haproxy
because it's very cheap to handle: haproxy reads a chunk size (one
line), then forwards that many bytes, then reads a new chunk size,
and so on. So this is really a cheap operation. My tests have shown
no issue at gigabit speeds with just a few bytes per chunk.
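
For illustration, here is a standalone sketch of that loop (this is
not haproxy's code, just the principle; forward_chunked and the stdio
plumbing are made up for the example):

    /* Sketch of forwarding "Transfer-Encoding: chunked" data:
     * parse a hex chunk-size line, then copy that many bytes
     * verbatim without inspecting them.  fmemopen() is POSIX.1-2008.
     */
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <stdlib.h>

    /* Forward one chunked-encoded body from <in> to <out>.
     * Returns 0 on success, -1 on a malformed stream.
     */
    static int forward_chunked(FILE *in, FILE *out)
    {
        char line[64];
        char buf[4096];

        for (;;) {
            /* read the chunk-size line (hex, CRLF-terminated) */
            if (!fgets(line, sizeof(line), in))
                return -1;
            unsigned long size = strtoul(line, NULL, 16);
            if (size == 0)
                break; /* last chunk; trailers (if any) follow */

            /* forward <size> bytes without looking at them */
            while (size > 0) {
                size_t n = size < sizeof(buf) ? size : sizeof(buf);
                size_t r = fread(buf, 1, n, in);
                if (r == 0)
                    return -1;
                fwrite(buf, 1, r, out);
                size -= r;
            }

            /* consume the CRLF that follows the chunk data */
            if (fgetc(in) != '\r' || fgetc(in) != '\n')
                return -1;
        }
        return 0;
    }

    int main(void)
    {
        /* "Wiki" sent as two 2-byte chunks, then the last chunk */
        const char msg[] = "2\r\nWi\r\n2\r\nki\r\n0\r\n\r\n";
        FILE *in = fmemopen((void *)msg, sizeof(msg) - 1, "r");
        return in ? forward_chunked(in, stdout) : 1;
    }

As you can see, the per-chunk work is just one short line parse plus
a plain copy, which is why the overhead stays negligible even with
very small chunks.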

I suspect that the application tries to use chunked encoding to
simulate bidirectional access. In that case, it might be waiting
for data pending in the kernel buffers, which haproxy sent with
the MSG_MORE flag to indicate that more data follows (and then
you should observe a low CPU usage).
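
For reference, here is roughly what that looks like on the sending
side (a sketch only; forward_bytes is a made-up helper, but MSG_MORE
itself is the real Linux flag):

    #include <stddef.h>
    #include <sys/socket.h>

    /* With MSG_MORE, the Linux TCP stack is allowed to hold back a
     * partial segment and merge it with the next write.  A peer that
     * needs those bytes to make progress stalls until the kernel
     * finally pushes the data out.
     */
    static void forward_bytes(int fd, const char *buf, size_t len,
                              int more_expected)
    {
        (void)send(fd, buf, len, more_expected ? MSG_MORE : 0);
    }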

Could you please run a small test: in src/stream_sock.c, comment
out line 616:

   615                          /* this flag has precedence over the rest */
   616                  //      if (b->flags & BF_SEND_DONTWAIT)
   617                                  send_flag &= ~MSG_MORE;

It will unconditionally disable use of MSG_MORE. If this fixes the
issue for you, I'll probably have to add an option to disable this
packet merging for very specific applications.

Regards,
Willy

