Hi,
I am running solaris 8 with apache 2.0.53 over openssl 0.9.7 as a reverse
proxy which terminates and reestablishes SSL to the backend. I run the worker
mpm with 100 threads per process. The server has minimal load and uses a
crypto card. The build is pretty minimal and runs in a gateway environment.
I have had alot of issues with processes going into a high cpu state. These
are always associated with close_wait state connections and nearly all the
time all the connections are close_wait. Would this suggest something shared
is affecting all connections/threads in the process?
After trolling through 50Gb of snoop dumps I have found that the latest appear
associated with an outbound SSL session that does not fully establish. These
outbound connections receive the server certificate and should send the
client key/cipher exchange but nothing follows. The client to proxy session
is normal and the request has been received by the proxy, the proxy does not
establish SSL properly to the server and as such the proxy does not forward
the request.
In the mean time the client and server ends time out (120s approx) and send a
fin which the proxy acks. In both instances the proxy does not return a fin,
nor does it timeout. At some point the process goes high CPU. This is very
repeatable on a server with a load of around 10-15 connections per second and
may occur on average 1-2 times a day. Unfortunately this thing is killing us
on a highly political service.
While it appears SSL is the cause, apache does not seem to be handling the
situation very well. It looks like apache knows it has data to send to the
server and data to return to the client so it does not close it's side of the
socket. At some point it must go into some sort of loop trying to recover I
guess.
I can't recall truss returning anything on these reverse proxy runaway
processes but I managed to get a truss of a forward proxy using heaps of
memory. It looks like apache is not checking the return value on it's write.
From memory in this instance there was no longer an inside connection but
apache was still trying to write to one. The forward proxy servers also
suffer from close_waits when using huge memory but do not suffer from high
cpu. I am thinking the same kind of issue may exist with the reverse proxy
where it loops to read/write a closed socket without checking the return
value resulting in the high cpu.
I have now turned on debug logging for the SSL diag but does anyone have any
idea what may be going on here with apache and the high cpu? When does the
connection timer start, before the connection is attempted or after it is
established? Is there any more info you need?
Thanks,
Jon
Forward Proxy stuff
###
16786: writev(128, 0x7440C860, 6) Err#32 EPIPE
16786: write(6, 0x7440A510, 99)= 99
16786: [ T u e A p r 1 9 0 8 : 2 7 : 3 1 2 0 0 5 ] [ i n f o
16786: ] ( 3 2 ) B r o k e n p i p e : c o r e _ o u t p u t _ f
16786: i l t e r : w r i t i n g d a t a t o t h e n e t w o
16786: r k\n
16786: lwp_sigredirect(0, SIGPIPE, 0x) = 0
16786: poll(0x7440C3C0, 1, 30) = 1
16786: read(131, "C9 nF216 R 08DF303CFE7 F".., 8000) = 1360
16786: poll(0x7440C3C0, 1, 30) = 1
16786: read(131, "11 1 # ; GF9FD + T04 e R".., 8000) = 1360
16786: poll(0x7440C3C0, 1, 30) = 1
16786: read(131, "10 91C f94 I93D2A6CD15 {".., 8000) = 1360
16786: poll(0x7440C3C0, 1, 30) = 1
16786: read(131, "AEE8C78F9504B8 VF607918C".., 8000) = 1360
16786: poll(0x7440C3C0, 1, 30) = 1
16786: read(131, " 0 r K\0 UB0A0\fF3C8 l93".., 8000) = 1360
16786: poll(0x7440C3C0, 1, 30) = 1
16786: read(131, " \ `1EA4D51BCF % /A7 .F3".., 8000) = 1360
16786: signotifywait() = 13
16786: writev(128, 0x7440C860, 6) Err#32 EPIPE
16786: write(6, 0x7440A510, 99)= 99
16786: [ T u e A p r 1 9 0 8 : 2 7 : 3 2 2 0 0 5 ] [ i n f o
16786: ] ( 3 2 ) B r o k e n p i p e : c o r e _ o u t p u t _ f
16786: i l t e r : w r i t i n g d a t a t o t h e n e t w o
16786: r k\n
16786: lwp_sigredirect(0, SIGPIPE, 0x) = 0
16786: poll(0x6DA063C0, 1, 30) = 1
16786: read(45, "11BE K1501 w01C1 = + Z X".., 8000)= 4080
16786: signotifywait() = 13
16786: writev(70, 0x6DA06860, 5) Err#32 EPIPE
16786: write(6, 0x6DA04510, 99)= 99
16786: [ T u e A p r 1 9 0 8 : 2 7 : 3 3 2 0 0 5 ] [ i n f o
16786: ] ( 3 2 ) B r o k e n p i p e : c o r e _ o u t p u t _ f
16786: i l t e r : w r i t i n g d a t a t o