Hi Alexey,

On Wed, Jul 13, 2011 at 09:50:25PM +0400, Alexey Vlasov wrote:
> On Wed, Jul 13, 2011 at 07:57:05AM +0200, Willy Tarreau wrote:
> > > 
> > > I've got such a scheme on the shared hosting:
> > >                       +- apache_pool1
> > >                       |
> > > apache_fe -> haproxy -|- apache_pool2
> > >                       |
> > >                       +- apache_pool3
> > >              ...
> >
> > you should have at least "option http-server-close" in your config,
> 
> I don't know whether it is important or not but for some reasons
> keep-alive is switched off everywhere in Apache.

OK, this is very common anyway. I just wanted to ensure we were not missing
something. Still, "option http-server-close" makes haproxy track the data
exchanged on the connection and close the server connection as soon as it
has received the whole response, which substantially reduces the number of
concurrent connections on each server. But that's an optimization, not
something you need to solve this issue, so let's leave it aside for now.

> > > 2. haproxy access.log:
> > > Jul 12 22:28:04 l19 haproxy_aux2_pools[4944]: 111.111.111.111:42001 
> > > [12/Jul/2011:22:28:02.281] backend_pool1 backend_pool1/pool1 
> > > 0/0/0/-1/2084 502 204 - - SH-- 24/6/6/6/0 0/0 {clientvhost.com:9099} "GET 
> > > /?option=com_sobi2&sobi2Task=sobi2Details&sobi2Id=80&default=80&Itemid=7 
> > > HTTP/1.1"
> > > 
> >
> > The "SH" flags indicate that the server has reset the connection while
> > responding. Looking closer, the server waited 2 seconds before doing
> > that. Do you know if it is possible that the log was emitted just before
> > a process crashed ? Since Apache automatically restarts missing processes,
> > it's quite common to see application bugs causing silent crashes.
> 
> Today I once again looked closely the logs, and understood that I was
> wrong.
> 
> Apache_pool does not process the request (item 3 of my previous letter),
> and returns nothing, neither 200-th code, nor any other. I just made a
> mistake.

OK, thanks for the clarification. Don't worry, I too often report an
erroneous diagnosis after a first look, because it's very easy to mismatch a
log with a request or a network trace :-)

> In moments of 502-th errors there's nothing going on with Apache, in any
> case I have found nothing strange , no falling, no restarts. But after
> some moments the same queries are normally performed.

Indeed that's very strange.

> > Alternatively, something between haproxy and the application might reset
> > the connection once in a while, without the application being aware of
> > it. The application finally responds and logs, but the connection's
> > already dead.
> > Do you have anything in the path which might NAT the traffic, or do you
> > have any shared IP address on the network which might randomly jump for
> > a short period ?
> 
> I have nothing in common of all these, such a usual LAMP server for a
> shared hosting.

Fine.

> All traffic between the haproxy <-> apache_pools goes through the lo
> interface, so I just exclude the impact of iptables, and I don't have
> anything more.

If you have iptables loaded, it will affect the loopback just like any other
interface. The issue with iptables is that it is often shipped with low
settings for conntrack, and that above a few hundred connections per
second (even on the loopback), the table fills and no new connection can be
established until some entries expire. When properly tuned this problem
doesn't happen, but usually it's easier to disable it than to tune it.
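
A quick way to see whether the conntrack table is getting close to its
limit is to compare the current number of entries with the maximum. Here is
a minimal Python sketch of that check. The /proc paths are an assumption:
they are the usual ones when the nf_conntrack module is loaded, and older
kernels may expose ip_conntrack names instead. The helper names are made up
for illustration.

```python
from pathlib import Path

# Assumed locations; they only exist when the nf_conntrack module is loaded.
COUNT_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_count")
MAX_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def conntrack_usage(count, maximum):
    """Return the fraction of the conntrack table currently in use."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return count / maximum


def read_int(path):
    """Read one integer from a /proc file, or None if it is unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    count, maximum = read_int(COUNT_PATH), read_int(MAX_PATH)
    if count is None or maximum is None:
        print("conntrack counters not found (module not loaded?)")
    else:
        usage = conntrack_usage(count, maximum)
        print(f"{count}/{maximum} entries in use ({usage:.0%})")
```

If the ratio stays close to 100% during the moments where the 502s show up,
the table is a likely suspect; if it stays low, conntrack can probably be
ruled out.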

> > Otherwise you might have to start tcpdump so that we find out what's
> > precisely happening
> 
> I managed to catch this moment, tcpdumps in an attachment.
> 
> The first file, this is a session between apache_fe and haproxy, and to
> mind it's ok with it. And the second dump has really something strange
> to show, look, may be it can tell something to you.

Ah what you captured is excellent ! Look :

apache_fe              haproxy           apache_pool

12:05:02.543
      ----> SYN
            SYN/ACK <----
      ----> ACK
      ----> GET
                          ----> SYN
                                SYN/ACK  <----
                          ----> ACK
                          ----> REQ
                                ACK <----
            ACK     <----

12:05:04.463
                                FIN <----
                          ----> RST
            502     <----
      ----> FIN
            FIN     <----
      ----> ACK


So what this means is: apache_fe sends a complete, correct request to
haproxy, which forwards it to the apache pool. Nothing happens for
about 1.9 seconds. Then the apache server closes the connection without
sending any response on it, and haproxy returns the 502 to apache_fe.
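
The haproxy log line quoted earlier tells the same story as the trace: the
timers "0/0/0/-1/2084" and the flags "SH--" can be decoded field by field.
Below is a small Python sketch of that decoding, assuming the standard HTTP
log format (timers in the order Tq/Tw/Tc/Tr/Tt, in milliseconds, -1 meaning
the step never completed). The function names are hypothetical and the flag
tables are deliberately partial, so check the haproxy documentation for the
full list.

```python
TIMER_NAMES = ("Tq", "Tw", "Tc", "Tr", "Tt")

# Partial tables: only a few codes, enough for this illustration.
FIRST_FLAG = {
    "-": "normal completion",
    "C": "session aborted by the client",
    "S": "session aborted or refused by the server",
}
SECOND_FLAG = {
    "-": "normal completion",
    "C": "waiting for the connection to the server to establish",
    "H": "waiting for response headers from the server",
    "D": "data transfer phase",
}


def decode_timers(field):
    """Split a 'Tq/Tw/Tc/Tr/Tt' field into a dict of integers (ms)."""
    values = [int(v) for v in field.split("/")]
    if len(values) != len(TIMER_NAMES):
        raise ValueError("expected 5 timers, got %d" % len(values))
    return dict(zip(TIMER_NAMES, values))


def decode_termination(state):
    """Describe the first two characters of the termination state."""
    unknown = "not in this partial table"
    return (FIRST_FLAG.get(state[0], unknown),
            SECOND_FLAG.get(state[1], unknown))


if __name__ == "__main__":
    timers = decode_timers("0/0/0/-1/2084")
    who, when = decode_termination("SH--")
    print(timers)
    print(who, "/", when)
```

Here Tr is -1 because no response headers were ever received, and Tt shows
that the whole session lasted about 2 seconds, which matches the gap seen
in the capture.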

In my opinion there is no reason for an Apache server to close a
connection without saying anything. So either the process simply
dies, or there is a module on it doing nasty things and forcing
the connection to close without doing anything. Just a hint, could
you check if there's an updated version for it ? Maybe this is just
a known bug that has recently been fixed ?

> > BTW, what version are you running ?
> 
> 1.4.8

OK. If this is a distro backport with all fixes, it's fine. Otherwise
you should consider updating it since a number of issues with cookies
and chunked-encoding have been fixed since. That's unrelated to your
current issue though, so there's no urgency.

Regards,
Willy

