>Can you help me/us understand the following a little better?  If it is possible
>for two threads to simultaneously to have the same obj pointer and try to free
>the object, what prevents a seg fault* due to:

>- thread #1 gets a pointer to obj A, then looses its time slice due to the OS
>scheduler before doing anything to the refcount or OBJ_CLEANUP_BIT
>- thread #2 gets a pointer to obj A, refcount + OBJ_CLEANUP_BIT looks good, the
>memory for obj A gets freed
>- thread #1 wakes up and tries to access/update the refcount + OBJ_CLEANUP_BIT
>in the freed memory

Thank you for looking.
Yes, you are right I forgot protecting one spot in remove_url(). I think all the other
ones are clean. Please jump in if you still see abnormalities or
have unanswered questions.
 
Let's look at al the cases where the cleanup function is called or the object is accessed:
The cleanup bit could only be set under the protection of the global mutex. This is critical.
 
create_entity()
  no worries, the refcount is created, set to 1, and the cleanup function is registered
open_entity()
  mutex is locking access to object. The object is still in cache which implies that
  the cleanup bit is not set, refcount is then incremented preventing cleanup,
  and pool cleanup is registered before releasing the lock.
remove_entity()
 mutex is locking access to object. This one is safe because the refcount must be
 at least 1, because open_entity() previously incremented it in the same thread
 The cleanup bit cannot be set before the lock is released.
store_body()
 This is a streamed response, the cleanup bit was set because of an ejection,
 but the refcount is at least one because of a previous access in the current
 thread. No worries about decrement_refcount() bringing it to 0. We hold
 the mutex and then can reset the cleanup bit on the current object and get
 rid of the tmp object.
remove_url() - this one has a potential hole
 mutex is locking access to object. We have an object in the cache, it means that
 it was not previously removed which implies that the cleanup bit is not set and
 cannot be set on a different thread because of the lock. In this case, the later
 call to cleanup_cache_object() could be a hole, I suggest doing that:
@@ -626,26 +621,13 @@
     obj = cache_find(sconf->cache_cache, key);      
     if (obj) {
-        mem_cache_object_t *mobj;
-        cache_remove(sconf->cache_cache, obj);
-        mobj = (mem_cache_object_t *) obj->vobj;
-
-        /* Refcount increment in this case MUST be made under
-         * protection of the lock
-         */
-        apr_atomic_inc(&obj->refcount);
-        if (obj) {
-            obj->cleanup = 1;
-        }
+        cache_remove(sconf->cache_cache, obj);       
+        memcache_cache_free(obj);
     }
     if (sconf->lock) {
         apr_thread_mutex_unlock(sconf->lock);
     }
-    if (obj) {
-        if (!apr_atomic_dec(&obj->refcount)) {
-            cleanup_cache_object(obj);
-        }
-    }    
     return OK;
 }
 
Am I the only one having an increase in brain pulsations?

Attachment: mod_mem_cache_race9.patch
Description: Binary data

Reply via email to