Actually CCing Rik now!
On Thu, Dec 02, 2010 at 08:57:16PM +0530, Srivatsa Vaddagiri wrote:
> On Thu, Dec 02, 2010 at 03:49:44PM +0200, Avi Kivity wrote:
> > On 12/02/2010 03:13 PM, Srivatsa Vaddagiri wrote:
> > >On Thu, Dec 02, 2010 at 02:41:35PM +0200, Avi Kivity wrote:
> > >> >> What I'd like to see in directed yield is donating exactly the
> > >> >> amount of vruntime that's needed to make the target thread run.
> > >> >
> > >> >I presume this requires the target vcpu to move left in rb-tree to run
> > >> >earlier than scheduled currently and that it doesn't involve any
> > >> >change to the sched_period() of target vcpu?
> > >> >
> > >> >Just was wondering how this would work in case of buggy guests. Let's
> > >> >say that a guest ran into an AB<->BA deadlock. VCPU0 spins on lock B
> > >> >(held by VCPU1 currently), while VCPU1 spins on lock A (held by VCPU0
> > >> >currently). Both keep boosting each other's vruntime, potentially
> > >> >affecting fairness for other guests (to the point of starving them
> > >> >perhaps)?
> > >>
> > >> We preserve vruntime overall. If you give vruntime to someone, it
> > >> comes at your own expense. Overall vruntime is preserved.
> > >
> > >Hmm .. so I presume that this means we don't affect the target thread's
> > >position in the rb-tree upon donation, rather we influence its
> > >sched_period() to include the donated time? IOW donation has no effect on
> > >causing the target thread to run "immediately", rather it will have the
> > >effect of causing it to run "longer" whenever it runs next?
> >
> > No. The intent (at least mine, maybe Rik has other ideas) is to
>
> CCing Rik now ..
>
> > move some vruntime from current to target such that target would be
> > placed before current in the timeline.
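
A minimal standalone sketch of that idea (plain userspace C, not actual
scheduler code; the struct, the field names and the "half the gap plus one"
heuristic are only illustrative assumptions made here):

#include <stdio.h>

/* Toy stand-in for a scheduling entity; vruntime in arbitrary units. */
struct vcpu_thread {
	const char *name;
	unsigned long long vruntime;
};

/*
 * Directed yield as described above: shift just enough vruntime so that
 * the lock holder ("target") sorts before the spinning vcpu ("current")
 * on the timeline.  The shift is symmetric, so the sum
 * spinner->vruntime + holder->vruntime is unchanged.
 */
static void directed_yield(struct vcpu_thread *spinner,
			   struct vcpu_thread *holder)
{
	unsigned long long delta;

	if (holder->vruntime <= spinner->vruntime)
		return;			/* holder already runs first */

	delta = (holder->vruntime - spinner->vruntime) / 2 + 1;
	spinner->vruntime += delta;	/* the spinner pays ...      */
	holder->vruntime  -= delta;	/* ... and the holder gains  */
}

int main(void)
{
	struct vcpu_thread spinner = { "vcpu0", 100 };
	struct vcpu_thread holder  = { "vcpu1", 120 };

	directed_yield(&spinner, &holder);	/* -> vcpu0=111 vcpu1=109 */
	printf("%s=%llu %s=%llu sum=%llu\n",
	       spinner.name, spinner.vruntime,
	       holder.name, holder.vruntime,
	       spinner.vruntime + holder.vruntime);
	return 0;
}

An in-kernel version would of course have to update the entities on the
cfs_rq under the runqueue lock and requeue the target; the sketch is only
about the arithmetic of the transfer.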
>
> Well ok, then this is what I had presumed earlier (about shifting the target
> towards the left in the rb-tree).
>
> > >Even that would require some precaution in directed yield to ensure that
> > >it doesn't unduly inflate the target's vruntime, hurting fairness for
> > >other guests on the same cpu as the target (example guest code that can
> > >lead to this situation below):
> > >
> > >vcpu0:                          vcpu1:
> > >
> > >                                spinlock(A);
> > >
> > >spinlock(A);
> > >
> > >                                while(1)
> > >                                        ;
> > >
> > >                                spin_unlock(A);
> >
> > directed yield should preserve the invariant that sum(vruntime) does
> > not change.
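
(In the toy numbers from the sketch above, vcpu0 goes from 100 to 111 and
vcpu1 from 120 to 109; other entities on the same runqueue still see the pair
accounting for 220 units in total, which is the invariant being referred to.)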
>
> Hmm, don't think I understand this invariant sum() part. Let's take a simple
> example as below:
>
>
> p0 -> A0 B0 A1
>
> p1 -> B1 C0 C1
>
> A/B/C are VMs and A0 etc. are virtual cpus; p0/p1 are physical cpus.
>
> Let's say A0/A1 hit an AB-BA spin-deadlock (which one can write in userspace
> deliberately). When A0 spins and exits (due to PLE), what does its directed
> yield do? Going by your statement, it can put the target before current,
> perhaps leading to this arrangement in the runqueue:
>
> p0 -> A1 B0 A0
>
> Now A1 spins and wants to do a directed yield back to A0, leading to:
>
> p0 -> A0 B0 A1
>
> This can go back and forth, starving B0 (iow leading to some sort of DoS
> attack).
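
To make that concrete (vruntime numbers invented purely for illustration):
suppose A0 sits at 100, B0 at 110 and A1 at 120 on p0. A0 spins, PLE-exits and
shifts 40 to A1: A0 becomes 140, A1 becomes 80, so A1 now runs ahead of B0. A1
spins on the other lock, burns a little CPU (say its vruntime climbs to 95),
then shifts 60 back: A1 becomes 155, A0 becomes 80, and A0 again jumps ahead
of B0. The A0+A1 sum only grows by the time actually consumed, yet B0 keeps
getting leapfrogged for as long as the pair can shift enough to dip below 110.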
>
> Where does the "invariant sum" part of directed yield kick in to avoid such
> nastiness?
>
> - vatsa