On 15/11/11 01:26, Torsten Förtsch wrote:
> On Monday, 14 November 2011 04:42:16 Max Barry wrote:
>> Here is the result:
>>
>> http://pastebin.com/YDbmq84w
>>
>> This shows me:
>> * running the Apache benchmarking utility to generate lots of requests
>> * identifying a process hung in 'futex_wait' (11447)
>> * killing it with SEGV
>> * obtaining a stack trace
> 
> Thanks Max. It really seems to be a modperl problem. I think there is 
> either something fishy with modperl_tipool_putback_base() or someone 
> writes to a location that it doesn't own.
> 
> Many of your threads block in modperl_tipool_pop() waiting for an 
> interpreter to become available:
> 
>         /* block until an item becomes available */
>         modperl_tipool_wait(tipool);
> 
> In src/modules/perl/modperl_tipool.c in function 
> modperl_tipool_putback_base() you find these lines:
> 
>     if (!listp) {
>         /* XXX: Attempt to putback something that was never there */
>         modperl_tipool_unlock(tipool);
>         return;
>     }
> 
> I think the code should not return here but call abort() and dump core 
> because if it enters the if-block it tries to push back an interpreter 
> that was not taken from the pool. But why would someone call 
> modperl_tipool_putback_base() if not to release an interpreter? Hence the 
> interpreter is lost. The other part of the function seems quite 
> reasonable. So, I think modperl_tipool_putback_base() is sometimes called 
> with a wrong data pointer and thus leaks interpreters.
> 
> Can you install the symbol tables for your modperl and perhaps check the 
> values of *tipool in the core? I think it is
> 
>   tipool->size == tipool->in_use == tipool->cfg->max
> 
> That would explain the behavior.
> 
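
To check that I follow the theory, here is a toy C sketch of the leak as
I understand it. It is not the mod_perl code: toy_pool, its fields, its
functions and TOY_POOL_MAX are all names I made up for illustration, and
it returns NULL where the real pool would block.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_MAX 2 /* stands in for tipool->cfg->max */

/* Hypothetical toy pool, NOT the real modperl_tipool_t. */
typedef struct {
    int slots[TOY_POOL_MAX]; /* the pooled items ("interpreters") */
    int taken[TOY_POOL_MAX]; /* 1 while the matching slot is handed out */
    int in_use;              /* number of slots handed out */
    int dropped;             /* putbacks ignored: pointer was unknown */
} toy_pool;

static void toy_pool_init(toy_pool *p)
{
    size_t i;
    assert(p != NULL);
    for (i = 0; i < TOY_POOL_MAX; i++) {
        p->slots[i] = 0;
        p->taken[i] = 0;
    }
    p->in_use = 0;
    p->dropped = 0;
}

/* Hand out a free slot, or NULL when all are in use.
 * The real pool would block here rather than return NULL. */
static int *toy_pool_pop(toy_pool *p)
{
    size_t i;
    for (i = 0; i < TOY_POOL_MAX; i++) {
        if (!p->taken[i]) {
            p->taken[i] = 1;
            p->in_use++;
            return &p->slots[i];
        }
    }
    return NULL;
}

/* Give a slot back. A pointer that was never handed out is silently
 * ignored, mirroring the early return quoted above. */
static void toy_pool_putback(toy_pool *p, int *item)
{
    size_t i;
    for (i = 0; i < TOY_POOL_MAX; i++) {
        if (item == &p->slots[i] && p->taken[i]) {
            p->taken[i] = 0;
            p->in_use--;
            return;
        }
    }
    p->dropped++; /* the slot the caller meant to release stays taken */
}
```

If both slots are popped and the putbacks then arrive with a wrong
pointer, in_use stays at TOY_POOL_MAX and the next pop finds nothing. I
assume that is the state my hung threads are waiting in.
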
> BTW, there are IMHO many points about the tipool implementation that can 
> be improved. Why do we use these lists? Wouldn't it be better to 
> allocate an array of tipool->cfg->max pointers? Or perhaps an apr_hash_t 
> in pconf?
> 
> Torsten Förtsch

Hi Torsten,

I'm afraid that installing debugging symbols is beyond me, but I have
confirmed that the problem is reproducible in a clean Ubuntu Server install.

Here I go from a brand-new Ubuntu Server install to a futex_wait
hang in a few easy steps:

http://pastebin.com/ahDtAeAS

To reproduce:

1. Download an ISO of Ubuntu Server 11.10 64-bit. (I got it from a local
mirror:
http://mirror.aarnet.edu.au/pub/ubuntu/releases/11.10/ubuntu-11.10-server-amd64.iso).

2. Install as a virtual machine. (I installed inside VirtualBox
4.1.4-r74291, accepting all defaults and installing no additional packages.)

3. Install mod_perl2, configure the 'default' site to use it, and lower
MaxRequestsPerChild.

4. Smash the server with requests.

I hope this is sufficient to let you find the problem. Please let me
know if I can help further.

Max.

