>>> On 02.02.16 at 05:48, <feng...@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Friday, January 29, 2016 5:32 PM
>> To: Wu, Feng <feng...@intel.com>
>> Cc: Andrew Cooper <andrew.coop...@citrix.com>; Dario Faggioli
>> <dario.faggi...@citrix.com>; George Dunlap <george.dun...@eu.citrix.com>;
>> Tian, Kevin <kevin.t...@intel.com>; xen-devel@lists.xen.org; Keir Fraser
>> <k...@xen.org>
>> Subject: RE: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
>> 
>> >>> On 29.01.16 at 02:53, <feng...@intel.com> wrote:
>> >> From: Jan Beulich [mailto:jbeul...@suse.com]
>> >> Sent: Friday, January 29, 2016 12:38 AM
>> >> >>> On 28.01.16 at 06:12, <feng...@intel.com> wrote:
>> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> > @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int
>> msr,
>> >> uint64_t msr_content);
>> >> >  static void vmx_invlpg_intercept(unsigned long vaddr);
>> >> >  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
>> >> >
>> >> > +static void vmx_vcpu_block(struct vcpu *v)
>> >> > +{
>> >> > +    unsigned long flags;
>> >> > +    unsigned int dest;
>> >> > +
>> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>> >>
>> >> Stray blank line between declarations.
>> >>
>> >> > +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
>> >>
>> >> I think this needs to be done after acquiring the lock.
>> >
>> > I am not sure what you mean here. The ASSERT here means that
>> > the vCPU is not on any pCPU list when we get here.
>> 
>> My point is that until you've taken the lock, whether the vCPU is on
>> any such list may change. All state updates are done under lock, so
>> checking for valid incoming state should be done under lock too, to
>> be most useful.
> 
> I don't think so. We get this function via vcpu_block()/do_poll() ->
> arch_vcpu_block() -> ... -> vmx_vcpu_block(), and the parameter
> of this function should point to the current vCPU, so when we
> get here, the v->arch.hvm_vmx.pi_block_list_lock should be NULL,
> which means the vCPU is not in any blocking. and no one can change
> it behind us this this moment.

Well, if we were to only think about how the code currently looks
like, then ASSERT()s like this one could be left out. Yet they're
there to also catch unexpected state when later another caller
or another place setting pi_block_list_lock to non-NULL get
introduced. With what you write, an alternative to what I've
suggested above might be to _also_ ASSERT(v == current), but
this would still be a weaker check than if the assertion was moved
into the locked region.

>> >> > +static void vmx_pi_do_resume(struct vcpu *v)
>> >> > +{
>> >> > +    unsigned long flags;
>> >> > +    spinlock_t *pi_block_list_lock;
>> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>> >> > +
>> >> > +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>> >> > +
>> >> > +    /*
>> >> > +     * Set 'NV' field back to posted_intr_vector, so the
>> >> > +     * Posted-Interrupts can be delivered to the vCPU when
>> >> > +     * it is running in non-root mode.
>> >> > +     */
>> >> > +    if ( pi_desc->nv != posted_intr_vector )
>> >> > +        write_atomic(&pi_desc->nv, posted_intr_vector);
>> >>
>> >> Perhaps this was discussed before, but I don't recall and now
>> >> wonder - why inside an if()? This is a simple memory write on
>> >> x86.
>> >
>> > The initial purpose is that if NV is already equal to 'posted_intr_vector',
>> > we can save the following atomically write operation. There are some
>> > volatile stuff and barriers in write_atomic().
>> 
>> But what does the final generated code look like? As I said, I'm
>> sure it's just a single MOV. And putting a conditional around it will
>> likely make things slower rather than faster.
> 
> Looking at the implementation of wirte_atomic(), it has volatile key
> word barrier inside, if you think this is not a big deal, I am fine to
> remove the check.

Once again - what I think matters only if it matches reality. I.e.
I may be wrong, and you are then free to correct me. That is,
rather than blindly changing the code as I suggest, why don't
you convince yourself which variant is the better one. It is for
that reason that I asked you to inspect the actual generated
code.

>> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
>> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
>> >> > @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
>> >> >          pdev->domain = target;
>> >> >      }
>> >> >
>> >> > +    vmx_pi_hooks_reassign(source, target);
>> >> > +
>> >> >      return ret;
>> >> >  }
>> >>
>> >> I think you need to clear source's hooks here, but target's need to
>> >> get set ahead of the actual assignment.
>> >
>> > I think this is the place where the device is moved from
>> > &source->arch.pdev_list to &target->arch.pdev_list, if that is the
>> > case, we should clear source and set target here, right?
>> 
>> No - target needs to be ready to deal with events from the device
>> _before_ the device actually gets assigned to it.
> 
> I still don't get your point. I still think this is the place where the 
> device
> is being got assigned. Or maybe you can share more info about the
> place "_before_ the device actually gets assigned to it " ?

The issue is with the sequence of events. Right now you assign
the device and only then prepare target to deal with it having a
device assigned. Whereas the needed sequence is the other
way around - prepare target, and only then assign the device.
I.e. you need to do things in this order (all inside this function)
- prepare target
- re-assign device
- clean up source

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Reply via email to