Re: [Xen-devel] [PATCH v2 1/2] xen: credit1: fix a race when picking initial pCPU for a vCPU

2016-08-19 Thread George Dunlap
On 18/08/16 11:00, Dario Faggioli wrote:
> In the Credit1 hunk of 9f358ddd69463 ("xen: Have
> schedulers revise initial placement") csched_cpu_pick()
> is called without taking the runqueue lock of the
> (temporary) pCPU that the vCPU has been assigned to
> (e.g., in XEN_DOMCTL_max_vcpus).
> 
> However, although 'hidden' in the IS_RUNQ_IDLE() macro,
> that function does access the runq (for doing load
> balancing calculations). Two scenarios are possible:
>  1) we are on cpu X, and IS_RUNQ_IDLE() peeks at cpu's
> X own runq;
>  2) we are on cpu X, but IS_RUNQ_IDLE() peeks at some
> other cpu's runq.
> 
> Scenario 2) absolutely requies that the appropriate
> runq lock is taken. Scenario 1) works even without
> taking the cpu's own runq lock, and this is important
> for the case when _csched_pick_cpu() is called from
> csched_vcpu_acct() which in turn is called by
> csched_tick().


> 
> Races have been observed and reported (by both XenServer
> own testing and OSSTest [1]), in the form of
> IS_RUNQ_IDLE() falling over LIST_POISON, because we're
> not currently holding the proper lock, in
> csched_vcpu_insert(), when scenario 1) occurs.
> 
> Since this is all very tricky, in addition to fix things,
> add both an ASSERT() and a comment in IS_RUNQ_IDLE() (which
> is also becoming static inline instead of macro).
> 
> [1] https://lists.xen.org/archives/html/xen-devel/2016-08/msg02144.html
> 
> Reported-by: Andrew Cooper 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: George Dunlap 
> Cc: Andrew Cooper 
> Cc: Jan Beulich 
> ---
> Changes from v1:
>  - macro IS_RUNQ_IDLE() to static inline is_runq_idle(), as suggested
>during review;
>  - add an ASSERT() and a comment, as suggested during review;
>  - take into account what's described in the changelog as "scenario 1)",
>which wasn't being considered in v1.
> ---
>  xen/common/sched_credit.c |   38 +-
>  1 file changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 220ff0d..daace81 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -84,9 +84,6 @@
>  #define CSCHED_VCPU(_vcpu)  ((struct csched_vcpu *) (_vcpu)->sched_priv)
>  #define CSCHED_DOM(_dom)((struct csched_dom *) (_dom)->sched_priv)
>  #define RUNQ(_cpu)  (&(CSCHED_PCPU(_cpu)->runq))
> -/* Is the first element of _cpu's runq its idle vcpu? */
> -#define IS_RUNQ_IDLE(_cpu)  (list_empty(RUNQ(_cpu)) || \
> - 
> is_idle_vcpu(__runq_elem(RUNQ(_cpu)->next)->vcpu))
>  
>  
>  /*
> @@ -248,6 +245,33 @@ __runq_elem(struct list_head *elem)
>  return list_entry(elem, struct csched_vcpu, runq_elem);
>  }
>  
> +/* Is the first element of cpu's runq (if any) cpu's idle vcpu? */
> +static inline bool_t is_runq_idle(unsigned int cpu)
> +{
> +/*
> + * If we are on cpu, and we are peeking at our own runq while cpu itself
> + * is not idle, that's fine even if we don't hold the runq lock. In fact,
> + * the fact that there is a (non idle!) vcpu running means that at least
> + * the idle vcpu is in the runq. And since only cpu itself (via work
> + * stealing) can add stuff to the runq, and no other cpu will ever steal
> + * our idle vcpu, that maks the runq manipulations done below safe, even
> + * without locks.

Thanks for investigating this and figuring out why the lockless access
hasn't caused a problem before.  But relying on this behavior going
forward doesn't really seem like a great idea if we can avoid it.

We can't grab the pcpu scheduler lock in csched_tick(), or in the whole
of csched_vcpu_acct() because we grab the private lock in
__csched_vcpu_acct_start() (and that violates the locking order).  But
is there a reason we can't grab the pcpu lock just around the call to
_csched_cpu_pick?

If not, we would then need to put a comment in the runq struct listing
the restrictions on access: namely, that nothing can be inserted from
other pcpus; and ideally a wrapper for all list modification operations
to ASSERT() that we're on the right pcpu.

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] xen: credit1: fix a race when picking initial pCPU for a vCPU

2016-08-19 Thread Dario Faggioli
On Fri, 2016-08-19 at 13:23 +0100, George Dunlap wrote:
> On 18/08/16 11:00, Dario Faggioli wrote:
> > @@ -248,6 +245,33 @@ __runq_elem(struct list_head *elem)
> >  return list_entry(elem, struct csched_vcpu, runq_elem);
> >  }
> >  
> > +/* Is the first element of cpu's runq (if any) cpu's idle vcpu? */
> > +static inline bool_t is_runq_idle(unsigned int cpu)
> > +{
> > +/*
> > + * If we are on cpu, and we are peeking at our own runq while
> > cpu itself
> > + * is not idle, that's fine even if we don't hold the runq
> > lock. In fact,
> > + * the fact that there is a (non idle!) vcpu running means
> > that at least
> > + * the idle vcpu is in the runq. And since only cpu itself
> > (via work
> > + * stealing) can add stuff to the runq, and no other cpu will
> > ever steal
> > + * our idle vcpu, that maks the runq manipulations done below
> > safe, even
> > + * without locks.
> Thanks for investigating this and figuring out why the lockless
> access
> hasn't caused a problem before.  But relying on this behavior going
> forward doesn't really seem like a great idea if we can avoid it.
> 
I totally agree.

> We can't grab the pcpu scheduler lock in csched_tick(), or in the
> whole
> of csched_vcpu_acct() because we grab the private lock in
> __csched_vcpu_acct_start() (and that violates the locking
> order).  But
> is there a reason we can't grab the pcpu lock just around the call to
> _csched_cpu_pick?
> 
The first version of this patch, here in my stgit patchqueue, looked
exactly like that. ISTR I even tested it, and it works.

Then I thought that, since in this case it's all about making an
ASSERT() happy, it may be a good thing to avoid introducing more
contention. Also, I see your point on robustness/reliability. My view
is that locking on this path (if not on Credit1 in general) is already
so bad, that I don't think it's possible to make it any worse (and
hence wans't feeling guilty about taking going the way I did). :-)

*BUT* I don't have a too strong opinion, and if you prefer 'take lock'
approach, I'm fine with that.

I'll send v3.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel