On 2013-06-12 10:46, Alex Bligh wrote:
I think I've finally figured out what's going wrong in my module but am unsure
what to do about it.
The module runs on apache 2.2.22 with mpm prefork. Occasionally I am seeing
corruption of the output bucket brigade, primarily the ring pointers (link->next
and link->prev) ending up with strange values.
Having spent some time reading the source, I believe that apache provides no
protection to these pointers, and it's inherently unsafe for a bucket brigade
to be used by more than one thread (even if you are careful with allocators),
unless all callers provide their own mutex protection. As apache itself uses
the output bucket brigade without mutex protection, the output bucket brigade
can never be written to by other threads, and therefore ap_fwrite (to this
brigade) can never be safely called by any thread other than the main thread. First
question: is this correct?
My module is currently structured as follows.
The main thread creates another thread for each request (the requests are long
running websocket connections).
The main thread does the following:
while (!done)
{
    /* Blocking read */
    apr_brigade_create;
    ap_get_brigade;
    apr_brigade_flatten;
    /* Do stuff with the data */
    blocking_socket_write;
}
The spawned thread does the following:
while (!done)
{
    blocking_socket_read;
    /* do stuff with the data */
    ap_fwrite(output_bucket_brigade);
}
Now, what I believe is happening is as follows. The blocking read in the main
thread at some point calls select(), and does not only do a read, but also
a write of the data in the output bucket brigade. This removes a bucket from
the ring. If this happens at the same time as the ap_fwrite in the spawned
thread adds something to the output ring, two threads will be accessing the
ring pointers at once.
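To make the suspected race concrete, here is a minimal standalone sketch of a circular doubly linked ring. It is not APR's code and does not use the APR ring macros; struct node, ring_init, ring_insert_tail, ring_remove and ring_check are hypothetical names used only for illustration. The point is that an insert and a remove each consist of several separate pointer stores, so if two threads interleave them without a lock, next and prev can end up disagreeing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket: one node in a circular ring. */
struct node {
    struct node *next;
    struct node *prev;
    int id;
};

/* An empty ring is a sentinel whose links point back at itself. */
static void ring_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

/* Append at the tail. Four separate pointer stores; nothing makes
 * them atomic as a group, so a concurrent remove can interleave. */
static void ring_insert_tail(struct node *head, struct node *n)
{
    assert(head->prev->next == head);  /* ring must be consistent on entry */
    n->next = head;
    n->prev = head->prev;
    head->prev->next = n;  /* another thread may have changed head->prev by now */
    head->prev = n;
}

/* Unlink a node. Two pointer stores on the neighbours, again with
 * no protection against a concurrent insert next to this node. */
static void ring_remove(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = NULL;
    n->prev = NULL;
}

/* Walk the ring and verify that next and prev agree everywhere.
 * Returns the number of nodes, or -1 if the links are inconsistent. */
static int ring_check(const struct node *head)
{
    int count = 0;
    const struct node *p;

    if (head->next->prev != head || head->prev->next != head)
        return -1;
    for (p = head->next; p != head; p = p->next) {
        if (p->next->prev != p || p->prev->next != p)
            return -1;
        count++;
    }
    return count;
}
```

Used from a single thread, the ring stays consistent after every call. The sketch deliberately contains no locking: if one thread runs ring_insert_tail while another runs ring_remove on the tail node, the stores can interleave and leave exactly the kind of strange next/prev values described above.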
What I can't figure out is how to fix this.
I can't put in a mutex to protect the ring pointers, because the access to the
ring pointers by apache is outside of my module.
I can't hold a mutex across the blocking read in the main thread, because
otherwise my module won't be able to write data to the output bucket brigade
whilst there is no input from the apache client; as the apache client may be
waiting for data to be sent to it, this could cause deadlock.
And I can't see an obvious way to do the read in a non-blocking way.
Any ideas?
If I understand correctly, the main thread belongs to your module, i.e.
it is not concise pseudo-code for the request processing in apache's own code.
I don't see where the output brigade appears in the main thread. I think
this is critical, as the output_bucket_brigade is the data item shared
between the two threads. ap_get_brigade triggers the execution of the
chain of input filters. One of these input filters writes to the output
brigade?
Sorin