Jeff Trawick <[EMAIL PROTECTED]> writes:

> Jeff Trawick <[EMAIL PROTECTED]> writes:
>
> > Throughout today I've been seeing very intermittent regression
> > failures on AIX.  The segfault happens when trying to get the IP
> > address string from a socket addr.
> >
> > core_create_conn() calls apr_socket_addr_get(), which returns
> > APR_SUCCESS.  But somehow we have NULL for the returned socket address
> > so apr_sockaddr_ip_get() bombs.
>
> The immediate cause of the problem is that ap_queue_pop() is returning
> EINVAL and worker_thread() didn't react to that and instead tried to
> process the would-be socket.
>
> I suspect that the EINVAL from ap_queue_pop() is from trying to use an
> invalid (cleaned up?) pthread mutex.  AIX tends to notice errors on
> mutexes and fail the call rather than venturing into unpredictable
> behavior.

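To make the immediate cause concrete, here is a minimal sketch (not the
actual httpd code) of a worker loop that checks the status returned by a
queue pop before touching the result.  demo_queue, demo_queue_pop() and
demo_worker() are hypothetical names made up for illustration; the only
things taken from the discussion are that a failed pop returns EINVAL and
that there is then no valid socket to process.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the worker queue; not the real httpd queue. */
typedef struct {
    int *items[4];   /* pending "sockets" (plain ints in this sketch) */
    size_t count;    /* number of items still queued */
    int torn_down;   /* set once the owner has cleaned the queue up */
} demo_queue;

/* Pop one item.  Returns 0 on success, or EINVAL if the queue has been
 * torn down or is empty; *out is set to NULL on failure. */
int demo_queue_pop(demo_queue *q, int **out)
{
    assert(q != NULL && out != NULL);
    *out = NULL;
    if (q->torn_down || q->count == 0) {
        return EINVAL;
    }
    *out = q->items[--q->count];
    return 0;
}

/* Worker loop: handle items until a pop fails, and never touch the item
 * after a failed pop.  Returns the number of items handled. */
int demo_worker(demo_queue *q)
{
    int handled = 0;
    for (;;) {
        int *sock = NULL;
        int rv = demo_queue_pop(q, &sock);
        if (rv != 0) {
            break;   /* queue unusable: stop instead of processing garbage */
        }
        *sock += 1;  /* stand-in for "process the connection" */
        handled++;
    }
    return handled;
}
```

With a check like this, a pop that fails because the queue is gone ends
the loop; without it, the worker would go on to dereference whatever
happened to be in the output pointer.
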
Yep, the mutex has already been cleaned up.  It is the mutex unlock
operation that fails.

This happens during ungraceful termination.  We don't wait for worker
threads to terminate; sometimes the main thread has cleaned up pchild
and bailed by the time the worker threads get dispatched from the
interrupt-all and then release the mutex.

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...