Hi Erik,

On Mon, Mar 15, 2010 at 10:27:38AM +0100, Erik Gulliksson wrote:
> Hi Willy,
>
> Thanks for your elaborate answer.
>
> > Did you observe anything special about the CPU usage ? Was it lower
> > than with 1.3 ? If so, it would indicate some additional delay somewhere.
> > If it was higher, it could indicate that the Transfer-encoding parser
> > takes too many cycles, but my preliminary tests proved it to be quite
> > efficient.
>
> I did not notice anything special about CPU usage. It seems to be
> around 2-4% with both versions. When checking munin graphs this
> morning, I did however notice that the counter "connection resets
> received" from "netstat -s" was increasing a lot more with 1.4.
>
> This led me to look at the log more closely, and there seem to be a
> lot of new errors that look something like this:
>
>   w.x.y.z:4004 [15/Mar/2010:09:50:51.190] fe_xxx be_yyy/upload-srvX
>   0/0/0/-1/62 502 391 - PR-- 9/6/6/3/0 0/0 "PUT /dav/filename.ext
>   HTTP/1.1"
Interesting ! It looks like haproxy has aborted because the server
returned an invalid response. You can check that using socat on the
stats socket. For instance :

    echo "show errors" | socat stdio unix-connect:/var/run/haproxy.stat

If you don't get anything, then it's something else :-/

> This is only for a few of the PUT requests; most requests seem to get
> proxied successfully. I will try to reproduce this in a more
> controlled lab setup where I can sniff HTTP headers to see what is
> actually sent in the request.

That would obviously help too :-)

> > No, I've run POST requests (very similar to PUT), except that there
> > was no Transfer-Encoding in the requests. It's interesting that you're
> > doing that in the request, because Apache removed support for TE:chunked
> > a few years ago because there were no users. Also, most of my POST tests
> > were not performance related.
>
> Interesting. We do use Apache for parts of this application on the
> backend side, although PUT requests are handled by an in-house
> developed Erlang application.

OK.

> > A big part has changed: in the previous version, haproxy did not care
> > at all about the payload. It only saw headers. Now with keepalive
> > support, it has to find request/response bounds and as such must
> > parse the transfer-encoding and content-lengths. However, transfer
> > encoding is nice to components such as haproxy because it's very
> > cheap. Haproxy reads a chunk size (one line), then forwards that
> > many bytes, then reads a new chunk size, etc... So this is really
> > a cheap operation. My tests have shown no issue at gigabit/s speeds
> > with just a few bytes per chunk.
> >
> > I suspect that the application tries to use the chunked encoding
> > to simulate a bidirectional access. In this case, it might be
> > waiting for data pending in the kernel buffers which were sent by
> > haproxy with the MSG_MORE flag, indicating that more data are
> > following (and so you should observe a low CPU usage).
> >
> > Could you please do a small test : in src/stream_sock.c, please
> > comment out line 616 :
> >
> >   615                 /* this flag has precedence over the rest */
> >   616 //              if (b->flags & BF_SEND_DONTWAIT)
> >   617                         send_flag &= ~MSG_MORE;
> >
> > It will unconditionally disable use of MSG_MORE. If this fixes the
> > issue for you, I'll probably have to add an option to disable this
> > packet merging for very specific applications.
>
> I tried to comment out the line above as instructed, but it made no
> noticeable change. As stated above, I will try to reproduce the problem
> in a lab setup. This may be an issue with our application rather than
> haproxy.

OK, thanks for testing !

Best regards,
Willy