John,
Sometimes it's difficult to see what the error is because you can't see
the request (it never gets logged).
To get round this, add:
* a trans handler (PerlTransHandler) which writes a tag (e.g. ST), the
request and the PID to the error log
* a cleanup handler (PerlCleanupHandler) which does the same, with a
different tag (e.g. FI)
You can then get a better idea of what is causing the error, as the
request that causes the seg fault will have an ST just before the seg
fault but no FI. You will also have a history of all the requests
handled by that PID (in case the problem is cumulative).
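As a rough illustration of reading such a log afterwards, here is a
minimal Python sketch that scans error-log lines and reports, per PID,
any request that logged ST but never FI. The line layout
("ST <pid> <request>"), the regular expression and the function name
unfinished_requests are assumptions made for the example; adjust them to
whatever your handlers actually write.

```python
import re

# Assumed (hypothetical) shape of the tagged entries inside each log line:
#   "... ST <pid> <request line>"  and  "... FI <pid> <request line>"
TAG_RE = re.compile(r"\b(ST|FI) (\d+) (.*)$")


def unfinished_requests(lines):
    """Return {pid: request} for requests that logged ST but no later FI."""
    open_requests = {}
    for line in lines:
        match = TAG_RE.search(line.rstrip("\n"))
        if not match:
            continue  # not one of our tagged lines
        tag, pid, request = match.groups()
        if tag == "ST":
            # A child handles one request at a time, so the latest ST wins.
            open_requests[pid] = request
        else:
            # FI: the request finished cleanly, forget it.
            open_requests.pop(pid, None)
    return open_requests


if __name__ == "__main__":
    import sys

    # Usage: python find_unfinished.py /path/to/error.log
    with open(sys.argv[1], errors="replace") as log:
        for pid, request in unfinished_requests(log).items():
            print(pid, request)
```

A request that was still in flight when the log was read will also show
up here, so cross-check the reported PIDs against the "child pid ... exit
signal" lines before drawing conclusions.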
Some time ago (about 12 years) we were having errors with apparently
random requests (including static images). Doing this, we discovered that
the request which died was the request after a request which talked to a
particular Oracle database.
On the live site we just killed the child at the end of these
requests, and then went back to diagnose the error.
James
On 03/09/2015 22:21, John Dunlap wrote:
Ever since upgrading from Debian 7 - which shipped with Apache 2.2 -
to Debian 8 - which shipped with Apache 2.4 - my user base has been
reporting that their browsers randomly tell them "No data received".
To date, they have not been able to identify any kind of pattern which
triggers it. I've been sifting through the server logs looking for
problems and I'm seeing a lot of errors similar to the following:
[Thu Sep 03 21:12:52.382357 2015] [core:notice] [pid 13199:tid
140364918835072] AH00052: child pid 2088 exit signal Segmentation
fault (11)
[Thu Sep 03 21:13:03.406215 2015] [core:notice] [pid 13199:tid
140364918835072] AH00052: child pid 2121 exit signal Segmentation
fault (11)
[Thu Sep 03 21:13:05.417909 2015] [core:notice] [pid 13199:tid
140364918835072] AH00052: child pid 2165 exit signal Segmentation
fault (11)
[Thu Sep 03 21:13:08.433829 2015] [core:notice] [pid 13199:tid
140364918835072] AH00052: child pid 2232 exit signal Segmentation
fault (11)
[Thu Sep 03 21:15:53.614351 2015] [core:notice] [pid 13199:tid
140364918835072] AH00052: child pid 2264 exit signal Segmentation
fault (11)
[Thu Sep 03 21:16:03.637236 2015] [core:notice] [pid 13199:tid
140364918835072] AH00052: child pid 2539 exit signal Segmentation
fault (11)
Can someone give me some tips on how to proceed with troubleshooting
this and, possibly, fixing it?
--
John Dunlap
CTO | Lariat