On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote:
> >>> BTW I do not see
> >>> rcu_assign_pointer()/rcu_dereference() in your patches which hints on
> >>
> >> IIUC, we cannot use rcu_assign_pointer() directly: it does something like
> >> p = v, assigning a pointer to a pointer. But in our case, we need:
> >>    *pte_list = (unsigned long)desc | 1;
> > From Documentation/RCU/whatisRCU.txt:
> > 
> > The updater uses this function to assign a new value to an RCU-protected 
> > pointer.
> > 
> > This is what we do, no? (assuming slot->arch.rmap[] is what rcu protects 
> > here)
> > The fact that the value is not a correct pointer should not matter.
> > 
> 
> Okay. Will change that code to:
> 
> +
> +#define rcu_assign_head_desc(pte_list_p, value)        \
> +       rcu_assign_pointer(*(unsigned long __rcu **)(pte_list_p), (unsigned long *)(value))
> +
>  /*
>   * Pte mapping structures:
>   *
> @@ -1006,14 +1010,7 @@ static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte,
>                 desc->sptes[1] = spte;
>                 desc_mark_nulls(pte_list, desc);
> 
> -               /*
> -                * Esure the old spte has been updated into desc, so
> -                * that the another side can not get the desc from pte_list
> -                * but miss the old spte.
> -                */
> -               smp_wmb();
> -
> -               *pte_list = (unsigned long)desc | 1;
> +               rcu_assign_head_desc(pte_list, (unsigned long)desc | 1);
> 
> >>
> >> So I added the smp_wmb() myself:
> >>            /*
> >>             * Esure the old spte has been updated into desc, so
> >>             * that the another side can not get the desc from pte_list
> >>             * but miss the old spte.
> >>             */
> >>            smp_wmb();
> >>
> >>            *pte_list = (unsigned long)desc | 1;
> >>
> >> But I missed it when inserting an empty desc. In that case we need the
> >> barrier too, since we should make desc->more visible before assigning it
> >> to pte_list, so that the lookup side does not see invalid "nulls".
> >>
> >> I also use my own code instead of rcu_dereference():
> >> pte_list_walk_lockless():
> >>    pte_list_value = ACCESS_ONCE(*pte_list);
> >>    if (!pte_list_value)
> >>            return;
> >>
> >>    if (!(pte_list_value & 1))
> >>            return fn((u64 *)pte_list_value);
> >>
> >>    /*
> >>     * fetch pte_list before read sptes in the desc, see the comments
> >>     * in pte_list_add().
> >>     *
> >>     * There is the data dependence since the desc is got from pte_list.
> >>     */
> >>    smp_read_barrier_depends();
> >>
> >> That part can be replaced by rcu_dereference().
> >>
> > Yes please, also see commit c87a124a5d5e8cf8e21c4363c3372bcaf53ea190 for the
> > kind of scary bugs we can get here.
> 
> Right, it is likely trigger-able in our case, will fix it.
> 
> > 
> >>> incorrect usage of RCU. I think any access to slab pointers will need to
> >>> use those.
> >>
> >> I think it is not necessary when removing a desc, since we do not mind
> >> seeing the old info. (hlist_nulls_del_rcu() does not use rcu_dereference()
> >> either.)
> >>
> > May be a bug. I also noticed that rculist_nulls uses rcu_dereference()
> 
> But list_del_rcu() does not use rcu_assign_pointer() either.
> 
This is also suspicious.

> > to access ->next, but it does not use rcu_assign_pointer() to
> > assign it.
> 
> You mean rcu_dereference() is used in hlist_nulls_for_each_entry_rcu()? I think
> it's because we should validate the prefetched data before entry->next is
> accessed; it is paired with the barrier in rcu_assign_pointer() when adding a
> new entry to the list. rcu_assign_pointer() makes the other fields in the entry
> visible before the entry is linked into the list. Otherwise, the lookup can
> access that entry but get invalid fields.
> 
> After more thinking, I still think rcu_assign_pointer() is unneeded when an
> entry is removed. The remove API does not care about the order between
> unlinking the entry and the changes to its fields. That is the caller's
> responsibility:
> - In the case of rcuhlist, the caller uses call_rcu()/synchronize_rcu(), etc.
>   to ensure that all lookups have exited, so that later changes to that entry
>   are invisible to the lookups.
> 
> - In the case of rculist_nulls, it seems a refcounter is used to guarantee the
>   order (see the example from Documentation/RCU/rculist_nulls.txt).
> 
> - In our case, we allow the lookup to see the deleted desc even if it is in
>   the slab cache, or it is initialized, or it is re-added.
> 
> Your thoughts?
>

As Documentation/RCU/whatisRCU.txt says:
 
        As with rcu_assign_pointer(), an important function of
        rcu_dereference() is to document which pointers are protected by
        RCU, in particular, flagging a pointer that is subject to changing
        at any time, including immediately after the rcu_dereference().
        And, again like rcu_assign_pointer(), rcu_dereference() is
        typically used indirectly, via the _rcu list-manipulation
        primitives, such as list_for_each_entry_rcu().

The documentation aspect of rcu_assign_pointer()/rcu_dereference() is
important. The code is complicated, so self-documentation will not hurt.
I want to see what is actually protected by RCU here. Freeing shadow
pages with call_rcu() further complicates matters: does it mean that
shadow pages are also protected by RCU?
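
To make the pairing concrete, here is a minimal userspace sketch of the
publish/read pattern discussed above. It is not the kvm code: C11 atomics
stand in for the kernel primitives as an assumption, with a release store
where the patch would use rcu_assign_pointer() and an acquire load where it
would use rcu_dereference() (acquire is stronger than the dependency ordering
rcu_dereference() relies on, so it is a conservative stand-in). The names
desc_sketch, head_sketch_t, publish_desc() and walk_head() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_TAG   ((uintptr_t)1)  /* low bit set: head holds a desc pointer */
#define DESC_SLOTS 2

/* Hypothetical stand-in for a pte_list desc: a small array of spte pointers. */
struct desc_sketch {
	uint64_t *sptes[DESC_SLOTS];
};

/* Head word: 0 = empty, low bit clear = one spte, low bit set = tagged desc. */
typedef _Atomic uintptr_t head_sketch_t;

/*
 * Publish a fully initialized desc. The release store orders the earlier
 * writes to desc->sptes[] before the head update, which is the job the
 * open-coded smp_wmb() did in the patch.
 */
static void publish_desc(head_sketch_t *head, struct desc_sketch *desc)
{
	assert(((uintptr_t)desc & DESC_TAG) == 0); /* tag bit must be free */
	atomic_store_explicit(head, (uintptr_t)desc | DESC_TAG,
			      memory_order_release);
}

/*
 * Lockless reader: load the head once, then decode it. Copies up to max
 * spte pointers into out[] and returns how many were found.
 */
static size_t walk_head(head_sketch_t *head, uint64_t **out, size_t max)
{
	uintptr_t v = atomic_load_explicit(head, memory_order_acquire);
	size_t n = 0;

	if (v == 0)
		return 0;

	if (!(v & DESC_TAG)) {		/* head is a single spte pointer */
		if (max > 0)
			out[n++] = (uint64_t *)v;
		return n;
	}

	const struct desc_sketch *desc =
		(const struct desc_sketch *)(v & ~DESC_TAG);
	for (size_t i = 0; i < DESC_SLOTS && n < max; i++)
		if (desc->sptes[i])
			out[n++] = desc->sptes[i];
	return n;
}
```

Funnelling every head update through one helper is what makes it visible
which word is the protected one. The sketch says nothing about reclamation,
so it does not answer the shadow page question.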

--
                        Gleb.