Re: [PATCH2] Re: Seg fault: race conditions in mod_mem_cache.c

Jean-Jacques Clar Tue, 14 Sep 2004 12:14:40 -0700

>Can you help me/us understand the following a little better? If it is possible
>for two threads to simultaneously to have the same obj pointer and try to free
>the object, what prevents a seg fault* due to:

>- thread #1 gets a pointer to obj A, then looses its time slice due to the OS
>scheduler before doing anything to the refcount or OBJ_CLEANUP_BIT
>- thread #2 gets a pointer to obj A, refcount + OBJ_CLEANUP_BIT looks good, the
>memory for obj A gets freed
>- thread #1 wakes up and tries to access/update the refcount + OBJ_CLEANUP_BIT
>in the freed memory

Thank you for looking.

Yes, you are right I forgot protecting one spot in remove_url(). I think all the other

ones are clean. Please jump in if you still see abnormalities or

have unanswered questions.

Let's look at al the cases where the cleanup function is called or the object is accessed:

The cleanup bit could only be set under the protection of the global mutex. This is critical.

create_entity()

no worries, the refcount is created, set to 1, and the cleanup function is registered

open_entity()

mutex is locking access to object. The object is still in cache which implies that

the cleanup bit is not set, refcount is then incremented preventing cleanup,

and pool cleanup is registered before releasing the lock.

remove_entity()

mutex is locking access to object. This one is safe because the refcount must be

at least 1, because open_entity() previously incremented it in the same thread

The cleanup bit cannot be set before the lock is released.

store_body()

This is a streamed response, the cleanup bit was set because of an ejection,

but the refcount is at least one because of a previous access in the current

thread. No worries about decrement_refcount() bringing it to 0. We hold

the mutex and then can reset the cleanup bit on the current object and get

rid of the tmp object.

remove_url() - this one has a potential hole

mutex is locking access to object. We have an object in the cache, it means that

it was not previously removed which implies that the cleanup bit is not set and

cannot be set on a different thread because of the lock. In this case, the later

call to cleanup_cache_object() could be a hole, I suggest doing that:

@@ -626,26 +621,13 @@

     obj = cache_find(sconf->cache_cache, key);
     if (obj) {
-        mem_cache_object_t *mobj;
-        cache_remove(sconf->cache_cache, obj);
-        mobj = (mem_cache_object_t *) obj->vobj;
-
-        /* Refcount increment in this case MUST be made under
-         * protection of the lock
-         */
-        apr_atomic_inc(&obj->refcount);
-        if (obj) {
-            obj->cleanup = 1;
-        }
+        cache_remove(sconf->cache_cache, obj);
+        memcache_cache_free(obj);
     }
     if (sconf->lock) {
         apr_thread_mutex_unlock(sconf->lock);
     }
-    if (obj) {
-        if (!apr_atomic_dec(&obj->refcount)) {
-            cleanup_cache_object(obj);
-        }
-    }
     return OK;
}

Am I the only one having an increase in brain pulsations?

mod_mem_cache_race9.patch
Description: Binary data

Re: [PATCH2] Re: Seg fault: race conditions in mod_mem_cache.c

Reply via email to