Greetings,
Having paged through the logs, I see a lot of requests where the first
four timers (Tq, Tw, Tc, Tr) are fairly small, indicating that everything
from the request through the response headers finished quickly, but where
the overall time (Tt) is in the realm of five minutes.
This would indicate that HAProxy gets the request from the client (Tq),
the request gets through the queues (Tw), a TCP connection to the
backend is established (Tc), and the backend sends the response headers
(Tr), all within a few hundred ms to a couple of seconds; most of the
remaining time is then spent transferring the body data.
Before we move on, does that sound reasonable as a potential issue
location? If not, I can try running some math on the columns to get a
better idea (I just looked at a random sampling of slow requests to
compare to what I've seen as the baseline).
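For reference, the "math on the columns" I have in mind is roughly the following: a minimal Python sketch, assuming the default HTTP log format where the timers appear as one Tq/Tw/Tc/Tr/Tt field. The function names, the 60-second threshold, and the sample line are made up for illustration, not taken from your logs.

```python
import re

# Timers field of the HTTP log format: Tq/Tw/Tc/Tr/Tt, in milliseconds.
# A stage that was never reached is logged as -1, and Tt may carry a
# leading "+" when "option logasap" is in use.
TIMERS = re.compile(r"\s(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)/\+?(\d+)\s")

def data_time(line):
    """Return Tt minus the completed setup stages, or None if no timers are found."""
    m = TIMERS.search(line)  # first match is the timers field, not the connection counts
    if m is None:
        return None
    tq, tw, tc, tr, tt = (int(x) for x in m.groups())
    # Exclude -1 values (stage not reached) from the sum.
    setup = sum(t for t in (tq, tw, tc, tr) if t >= 0)
    return tt - setup

def slow_transfers(lines, threshold_ms=60_000):
    """Yield (data_time_ms, line) for lines whose transfer phase exceeds the threshold."""
    for line in lines:
        td = data_time(line)
        if td is not None and td >= threshold_ms:
            yield td, line

# Made-up example line, only to show the shape of the input.
sample = ('haproxy[1234]: 10.0.0.1:5000 [10/Mar/2016:08:00:00.000] fe be/srv1 '
          '10/0/30/69/300109 200 2750 - - --VN 1/1/1/1/0 0/0 "GET / HTTP/1.1"')
print(data_time(sample))
```

If most of the slow requests come out with a large value here and small setup timers, that would back up the reading above.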
Another interesting thing here is the set of termination states (I
usually look at them as they give an idea of why connections are
failing; definitions are at
https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.5):
7 CHVN
9 SDVN
10 cDVN
12 LR--
33 CDVN
50 SHDN
92 --NI
113 sHVN
186 SHVN
2115 --DI
13896 --VN
The first two chars show the state at termination, and the second two
describe the persistence cookie (useful for seeing if first-time
clients are failing, etc.).
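To make that split concrete, here is a small Python sketch that breaks a four-character code into its session part and its cookie part. The descriptions are my own paraphrase of section 8.5 of the documentation linked above and only cover the more common flags; the function and table names are hypothetical.

```python
# First character: what event ended the session (paraphrased, partial list).
FIRST_EVENT = {
    "-": "normal completion",
    "C": "aborted by the client",
    "S": "aborted or refused by the server",
    "P": "aborted by the proxy",
    "L": "handled locally by haproxy",
    "c": "client-side timeout",
    "s": "server-side timeout",
}

# Second character: what the session was doing when it ended (paraphrased).
SESSION_STATE = {
    "-": "normal completion",
    "R": "waiting for the request",
    "Q": "waiting in the queue",
    "C": "connecting to the server",
    "H": "waiting for response headers",
    "D": "transferring data",
}

def decode(code):
    """Split a code such as 'sHVN' into (event, state, cookie flags)."""
    if len(code) != 4:
        raise ValueError("expected a four-character termination state")
    event = FIRST_EVENT.get(code[0], "see documentation")
    state = SESSION_STATE.get(code[1], "see documentation")
    return event, state, code[2:]

print(decode("sHVN"))
```

The cookie flags are returned as-is, since their meaning depends on how persistence is configured.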
The ones starting with -- indicate they were successful, so I'm ignoring
them here. Other than that, we have a bunch starting with SH, indicating
that the backend aborted the connection before sending its response
headers, and sH, indicating that the server timeout struck while
HAProxy was waiting for the response headers.
The numbers are fairly small there in terms of failures vs successes, so
I'd say that isn't likely to be the primary issue (unless we get to
talking about individual connections).
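To put a number on "fairly small", here is a quick Python sketch using the counts from the tally above. It treats every state that does not start with -- as a failure, which is the same simplification I made above; the function name is made up.

```python
# Counts copied from the tally above (termination state -> sessions).
counts = {
    "CHVN": 7, "SDVN": 9, "cDVN": 10, "LR--": 12, "CDVN": 33, "SHDN": 50,
    "--NI": 92, "sHVN": 113, "SHVN": 186, "--DI": 2115, "--VN": 13896,
}

def failure_share(counts):
    """Return (failed, total, failed/total), counting non '--' states as failures."""
    total = sum(counts.values())
    failed = sum(n for state, n in counts.items() if not state.startswith("--"))
    return failed, total, failed / total

failed, total, share = failure_share(counts)
print(f"{failed} of {total} sessions ({share:.1%}) did not end cleanly")
```

That works out to 420 of 16523 sessions, or about 2.5%.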
If that's the case, the next step would be to figure out why the body
data takes so long, which is outside of what HAProxy can cleanly help
with. Do the backends have logs which would indicate what they are
doing? If not, the next thing I'd try would be making a capture file
with tcpdump to view in Wireshark to see what is going on between
HAProxy and the backends (how to do that is outside the scope of what
makes any sense to describe here, though).
- Chad
On 03/10/2016 08:06 AM, matt wrote:
I have the log, but a lot of the data is confidential.
Can I send you by email in order for you to take a look?
We can post an edited version later in order to help others
debug the same issue.
Thanks in advance