On 10/18/2012 04:28 AM, Steven Rostedt wrote:
> On Wed, Oct 17, 2012 at 11:07:21AM -0400, Mikulas Patocka wrote:
>>>
>>> Even if the previous patch is applied, percpu_down_read() still
>>> needs an mb() to pair with it.
>>
>> percpu_down_read uses rcu_read_lock, which should guarantee that memory 
>> accesses can't be moved in front of an RCU-protected section.
> 
> You do realize that rcu_read_lock() does nothing more than a barrier(),
> right?
> 
> Paul worked really hard to get rcu_read_lock() to not use HW barriers.
> 
>>
>> If rcu_read_unlock has only an unlock barrier and not a full barrier, 
>> memory accesses could be moved in front of rcu_read_unlock and reordered 
>> with this_cpu_inc(*p->counters), but it doesn't matter because 
>> percpu_down_write does synchronize_rcu(), so it never sees these accesses 
>> halfway through.
> 
> Looking at the patch, you are correct. The read side doesn't need the
> memory barrier, as the worst thing that will happen is that it sees
> locked == false and will just grab the mutex unnecessarily.

---------------------
A memory barrier should only be added when these two things are known:
        1) which accesses it keeps from being reordered with which other accesses.
        2) which corresponding mb() it pairs with.

You tried to add an mb() in percpu_up_write(). OK, I can see that it prevents
the writes to the protected data from being reordered with the statement
"p->locked = false", but I can't find the corresponding mb() that it pairs with.

percpu_down_read()                                      writes to the data
        the CPU caches/prefetches the data,             writes to the data
        which is garbage at this point                  writes to the data
                                                        percpu_up_write()
                                                                mb()
                                                                p->locked = false;
        unlikely(p->locked)
                the CPU sees p->locked == false, so it
                does not discard the cached/prefetched data
        this_cpu_inc(*p->counters);
        the read-side code accesses the data
        ****and we use the garbage data****

So you need to add an mb() after "unlikely(p->locked)".
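
To make the pairing explicit, here is a rough sketch with plain smp_mb() on
both sides just to mark where the two ordering constraints sit (the function
names writer_unlock()/reader_lock() are made up for the example; the patch
below expresses the same pair as A and D via light_mb()/heavy_mb()):

/* writer side, as in percpu_up_write() */
static void writer_unlock(struct percpu_rw_semaphore *p)
{
        /* ... writes to the protected data happened before this point ... */
        smp_mb();               /* D: keep the data writes before the store below */
        p->locked = false;
}

/* reader side, as in the percpu_down_read() fast path */
static void reader_lock(struct percpu_rw_semaphore *p)
{
        if (unlikely(p->locked))
                return;         /* slow path: fall back to the mutex (not shown) */
        smp_mb();               /* A: keep the data reads below from being
                                 * hoisted above the p->locked read */
        this_cpu_inc(*p->counters);
        /* ... reads of the protected data follow ... */
}

Without A, nothing prevents the CPU from satisfying the later data reads
before the p->locked read, which is exactly the scenario above.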

-------------------------

The RCU you use doesn't protect any data. It protects the code of the fast path:
        unlikely(p->locked);
        this_cpu_inc(*p->counters);

and synchronize_rcu() ensures that all previously started fast paths have fully
finished "this_cpu_inc(*p->counters);".

It doesn't protect any other code or data. If you want to protect other code or
other data, please add more synchronization or mb()s.
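
To make that boundary concrete, here is roughly what the grace period does and
does not cover (the function names reader_fast_path()/writer_lock() are made up
for the example; the mutex handling and the "p->locked = true" store live in
the parts of the real code the hunks below don't show):

/* the reader fast path as it stands, with nothing added yet */
static void reader_fast_path(struct percpu_rw_semaphore *p)
{
        rcu_read_lock();
        if (unlikely(p->locked))
                return;         /* slow path: fall back to the mutex (not shown) */
        this_cpu_inc(*p->counters);
        rcu_read_unlock();      /* <- the protected "code" ends here */
        /* the reads of the protected data all happen after this point */
}

/* the writer, as in percpu_down_write() */
static void writer_lock(struct percpu_rw_semaphore *p)
{
        p->locked = true;
        synchronize_rcu();      /* waits until every fast path that had already
                                 * entered its RCU section has finished its
                                 * this_cpu_inc(); it says nothing about the
                                 * data reads that follow rcu_read_unlock() */
        while (__percpu_count(p->counters))
                msleep(1);
}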

---------------

I really hate synchronization that protects code instead of data,
but sometimes I also have to do it.

---------------

A very rough example of paired mb()s is here:


diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index cf80f7e..84a93c0 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -12,6 +12,14 @@ struct percpu_rw_semaphore {
        struct mutex mtx;
 };
 
+#if 1
+#define light_mb() barrier()
+#define heavy_mb() synchronize_sched()
+#else
+#define light_mb() smp_mb()
+#define heavy_mb() smp_mb()
+#endif
+
 static inline void percpu_down_read(struct percpu_rw_semaphore *p)
 {
        rcu_read_lock();
@@ -24,22 +32,12 @@ static inline void percpu_down_read(struct percpu_rw_semaphore *p)
        }
        this_cpu_inc(*p->counters);
        rcu_read_unlock();
+       light_mb(); /* A, between read of p->locked and read of data, paired with D */
 }
 
 static inline void percpu_up_read(struct percpu_rw_semaphore *p)
 {
-       /*
-        * On X86, write operation in this_cpu_dec serves as a memory unlock
-        * barrier (i.e. memory accesses may be moved before the write, but
-        * no memory accesses are moved past the write).
-        * On other architectures this may not be the case, so we need smp_mb()
-        * there.
-        */
-#if defined(CONFIG_X86) && (!defined(CONFIG_X86_PPRO_FENCE) && !defined(CONFIG_X86_OOSTORE))
-       barrier();
-#else
-       smp_mb();
-#endif
+       light_mb(); /* B, between read of the data and write to p->counters, paired with C */
        this_cpu_dec(*p->counters);
 }
 
@@ -61,11 +59,12 @@ static inline void percpu_down_write(struct percpu_rw_semaphore *p)
        synchronize_rcu();
        while (__percpu_count(p->counters))
                msleep(1);
-       smp_rmb(); /* paired with smp_mb() in percpu_sem_up_read() */
+       heavy_mb(); /* C, between read of p->counters and write to data, paired with B */
 }
 
 static inline void percpu_up_write(struct percpu_rw_semaphore *p)
 {
+       heavy_mb(); /* D, between write to data and write to p->locked, paired with A */
        p->locked = false;
        mutex_unlock(&p->mtx);
 }
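
For context, here is a sketch of how this primitive is meant to be used, which
is what all of the barrier discussion above is protecting (struct item, items,
and process() are made-up placeholders for the example, not from any real
caller; initialization is omitted):

struct item {
        struct list_head node;
        /* payload */
};

static struct percpu_rw_semaphore sem;
static LIST_HEAD(items);

/* frequent readers take the cheap side */
static void reader(void)
{
        struct item *it;

        percpu_down_read(&sem);
        list_for_each_entry(it, &items, node)
                process(it);                    /* process() is a placeholder */
        percpu_up_read(&sem);
}

/* rare writers take the expensive side */
static void writer(struct item *it)
{
        percpu_down_write(&sem);
        list_add(&it->node, &items);
        percpu_up_write(&sem);
}

The point of the A/B/C/D pairs is that the reader's accesses to "items" cannot
leak outside its down_read/up_read window, and that the writer's list_add() is
fully visible before it drops the write lock.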