At 19:29 +0300 on 08 Aug (1470684579), Razvan Cojocaru wrote:
> On 08/08/16 18:52, Tamas K Lengyel wrote:
> >> I think the issue might be that vm_event_vcpu_pause() uses
> >> > vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in
> >> > the course of sending out a vm_event.
> >> >
> >> > So this creates the premise for a race condition: either some more
> >> > things happen between sending out the vm_event and replying to it that
> >> > cause v->arch.user_regs to be synchronized - in which case everything
> >> > works (this has been the case when I was reading the VCPU context via a
> >> > domctl that did vcpu_pause()) -, or not, in which case all bets are off.
> >> >
> >> > In this case, we should either drop vm_event_vcpu_pause() completely and
> >> > just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync()
> >> > (and basically turn it into vcpu_pause()), or only sync in
> >> > vm_event_set_registers() as I've suggested.
> >> >
> > I would say just using vcpu_pause() would make sense as it's the least
> > complicated route. Is there any downside of doing the sync() in all
> > cases? Was the current way implemented perhaps the way it is for
> > performance reasons? If so, is it noticeable at all?
> 
> I was hoping you'd know why it's implemented like this :), I think maybe
> Tim Deegan (CCd, hopefully the email address is not out of date) did the
> original implementation?

Wasn't me!  I'm missing some context here but it looks like
vm_event_vcpu_pause() is always called on current, which means the
vcpu:
 - is "running", i.e. scheduled on this pcpu, so vcpu_pause_sync()
   would deadlock in vcpu_sleep_sync() (well, it would ASSERT first).
 - is not in guest mode, so any VMCx state should have been synced
   onto the local stack at the last vmexit.

If your vm_event response hypercall needs access to remote vcpu state,
then you should call vcpu_pause_sync() on _that_ path, and
vcpu_unpause() when you're done.  If you can _guarantee_ (even with
buggy/malicious tools) that the target vcpu is already paused and
nothing can unpause it underfoot, then just calling vcpu_sleep_sync()
before accessing the state is enough.

The *_sync() functions are dangerous - you can't ever let a domain
call them on its own vcpus, or have two domains that can call them on
each other, without some other interlock to stop two vcpus deadlocking
waiting for each other to be descheduled.

Cheers,

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Reply via email to