On Wed, Sep 02, 2020 at 10:11:39AM +0200, Ruediger Pluem wrote: > > > On 9/2/20 9:10 AM, Joe Orton wrote: > > On Tue, Sep 01, 2020 at 03:11:59PM +0200, Ruediger Pluem wrote: > >> Your point is that there is no case where multiple threads work on the > >> same apr_memcache_conn_t at the same time, correct? > >> If yes, I agree that this is the case and that I missed this and hence my > >> point is mood. > > > > I think that's right. I don't want to disappear too far down the rabbit > > hole for this, given that the *only* symptom we are seeing is the APR > > pool check failing. > > > > Is there any way in which APR is going to guarantee that the internal > > pool state is always consistent across threads if one thread is > > simultaneously inside e.g. apr_palloc()/apr_palloc_debug() and another > > calls apr_pool_find()? I can't see it. > > I can't see it as well. Looks like that apr_pool_find / apr_pool_walk_tree > uses a mutex, > but this mutex is not used by apr_palloc_debug. But I am not sure how many > races we would > have with apr_palloc_debug. I think it could race when we add a new node to > the top of the linked list, but > I think the worst thing that could happen in this case is that it does not > find a pool for a piece of memory even if it was > allocated by a pool.
In this case: https://svn.apache.org/viewvc/apr/apr/trunk/memory/unix/apr_pools.c?annotate=1878340#l1798 node->beginp[node->index] = mem; node->endp[node->index] = (char *)mem + size; node->index++; without any memory ordering/consistency guarantee, is it not possible for another thread to see node->endp[node->index] = (char *)mem + size; node->index++; without the ->beginp update, and hence the node would appear to "contain" memory in the range [0, endp) since the node struct is zero-initialized. ....or am I overthinking this? We could add an "yes this is consistent" atomic int to the debug_node_t and make pool-debug mode *even slower* to avoid possible races!?