On Thu, Feb 27, 2014 at 11:47:08AM -0800, Linus Torvalds wrote:
> On Thu, Feb 27, 2014 at 11:06 AM, Paul E. McKenney
> <paul...@linux.vnet.ibm.com> wrote:
> >
> > 3.      The comparison was against another RCU-protected pointer,
> >         where that other pointer was properly fetched using one
> >         of the RCU primitives.  Here it doesn't matter which pointer
> >         you use.  At least as long as the rcu_assign_pointer() for
> >         that other pointer happened after the last update to the
> >         pointed-to structure.
> >
> > I am a bit nervous about #3.  Any thoughts on it?
>
> I think that it might be worth pointing out as an example, and saying
> that code like
>
>    p = atomic_read(consume);
>    X;
>    q = atomic_read(consume);
>    Y;
>    if (p == q)
>         data = p->val;
>
> then the access of "p->val" is constrained to be data-dependent on
> *either* p or q, but you can't really tell which, since the compiler
> can decide that the values are interchangeable.
>
> I cannot for the life of me come up with a situation where this would
> matter, though. If "X" contains a fence, then that fence will be a
> stronger ordering than anything the consume through "p" would
> guarantee anyway. And if "X" does *not* contain a fence, then the
> atomic reads of p and q are unordered *anyway*, so then whether the
> ordering to the access through "p" is through p or q is kind of
> irrelevant. No?

I can make a contrived litmus test for it, but you are right, the only
time you can see it happen is when X has no barriers, in which case
you don't have any ordering anyway -- both the compiler and the CPU can
reorder the loads into p and q, and the read from p->val can, as you say,
come from either pointer.

For whatever it is worth, here is the litmus test:

T1:     p = kmalloc(...);
        if (p == NULL)
                deal_with_it();
        p->a = 42;  /* Each field in its own cache line. */
        p->b = 43;
        p->c = 44;
        atomic_store_explicit(&gp1, p, memory_order_release);
        p->b = 143;
        p->c = 144;
        atomic_store_explicit(&gp2, p, memory_order_release);

T2:     p = atomic_load_explicit(&gp2, memory_order_consume);
        r1 = p->b;  /* Guaranteed to get 143. */
        q = atomic_load_explicit(&gp1, memory_order_consume);
        if (p == q) {
                /* The compiler decides that q->c is same as p->c. */
                r2 = p->c; /* Could get 44 on weakly ordered system. */
        }

The loads from gp1 and gp2 are, as you say, unordered, so you get what
you get.

And publishing a structure via one RCU-protected pointer, updating it,
then publishing it via another pointer seems to me to be asking for
trouble anyway.  If you really want to do something like that and still
see consistency across all the fields in the structure, please put a lock
in the structure and use it to guard updates and accesses to those fields.

                                                        Thanx, Paul
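
For illustration only, the "put a lock in the structure" suggestion above
might look roughly like the userspace sketch below.  Everything in it is
an assumption made for the example: struct foo and the foo_*() helpers
are hypothetical names, and a C11 atomic_flag spinlock stands in for
whatever lock the real code would use (in the kernel, most likely a
spinlock_t).

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical structure: the lock lives next to the fields it guards. */
struct foo {
    atomic_flag lock;   /* simple spinlock, stand-in for a real lock type */
    int a;
    int b;
    int c;
};

static void foo_lock(struct foo *p)
{
    /* Acquire: spin until the previous holder clears the flag. */
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;
}

static void foo_unlock(struct foo *p)
{
    /* Release: updates made under the lock are visible to the next holder. */
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Allocate and initialize before the structure is published anywhere. */
struct foo *foo_alloc(int a, int b, int c)
{
    struct foo *p = malloc(sizeof(*p));

    if (p == NULL)
        return NULL;
    atomic_flag_clear(&p->lock);
    p->a = a;
    p->b = b;
    p->c = c;
    return p;
}

/* Post-publication updates to ->b and ->c happen only under the lock. */
void foo_update_bc(struct foo *p, int b, int c)
{
    foo_lock(p);
    p->b = b;
    p->c = c;
    foo_unlock(p);
}

/* Readers take the same lock, so they see both old values or both new. */
void foo_read_bc(struct foo *p, int *b, int *c)
{
    foo_lock(p);
    *b = p->b;
    *c = p->c;
    foo_unlock(p);
}
```

With this arrangement, a reader that obtained the pointer through either
gp1 or gp2 would call foo_read_bc() rather than dereferencing ->b and ->c
directly, so it no longer matters which pointer the compiler decides to
carry the dependency through.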