On Wed, May 15, 2013 at 4:22 PM, Paolo Bonzini <pbonz...@redhat.com> wrote: > Il 15/05/2013 03:29, liu ping fan ha scritto: >>>>> Pointers are quite expensive here. With RCU we can fetch a consistent >>>>> root/table pair like this: >>>>> >>>>> rcu_read_lock(); >>>>> do { >>>>> pgtbl = d->cur_pgtbl; >>>>> smp_rmb(); >>>>> root = d->cur_root; >>>>> >>>>> /* RCU ensures that d->cur_pgtbl remains alive, thus it cannot >>>>> * be recycled while this loop is running. If >>>>> * d->cur_pgtbl == pgtbl, the root is the right one for this >>>>> * pgtable. >>>>> */ >>>>> smp_rmb(); >>>>> } while (d->cur_pgtbl == pgtbl); >>> >>> Ouch, != of course. >>> >>>>> ... >>>>> rcu_read_unlock(); >>>>> >>>> It seems to break the semantic of rcu_dereference() and rcu_assign(). >>> >>> It doesn't. In fact it is even stronger, I'm using a "full" rmb instead >>> of read_barrier_depends. >>> >> rcu_dereference()/rcu_assign() ensure the switch from prev to next >> version, based on atomic-ops. > > rcu_dereference()/rcu_assign() are not magic, they are simply > read+read_barrier_depends and wmb+write. > Yes. >> I think your method _does_ work based on >> read+check+barrier skill, but it is not the standard RCU method, and >> export some concept (barrier) outside RCU. > > It is a standard method to load 2 words and ensure it is consistent. If
Yes, it is. > you want to use rcu_dereference(&d->cur_pgtbl) and > rcu_dereference(&d->cur_root), that's fine. But you still need the read > barrier. > Ok. >>>> If pointers are expensive, how about this: >>>> if (unlikely(d->prev_map!=d->cur_map)) { >>>> d->root = d->cur_map->root; >>>> d->pgtbl = d->cur_map->root; >>>> d->prev_map = d->cur_map; >>>> } >>>> So usually, we use cache value. >>> >> rcu_read_lock(); >> map = rcu_derefenrence(d->cur_map) >> if (unlikely(d->prev_map!=map) { >> d->root = map->root; >> d->pgtbl = map->pgtbl; ------------------------------------> d->prev_map = map // As you mentioned ahead, with RCU, ABA can not occur Regards, Pingfan >> } >> ...... >> rcu_read_unlock(); >> >> Then it can avoid ABA problem. > > I don't see the assignment of prev_map, which is where the ABA problem > arises. > > Paolo