On Thu, 29 Jun 2017 16:10:20 +0300
Roman Kagan <rka...@virtuozzo.com> wrote:

> On Thu, Jun 29, 2017 at 01:53:29PM +0200, Igor Mammedov wrote:
> > On Thu, 29 Jun 2017 12:53:27 +0300
> > Roman Kagan <rka...@virtuozzo.com> wrote:
> >   
> > > On Wed, Jun 28, 2017 at 04:47:43PM +0200, Igor Mammedov wrote:  
> > > > On Wed, 21 Jun 2017 19:24:08 +0300
> > > > Roman Kagan <rka...@virtuozzo.com> wrote:
> > > >     
> > > > > Hyper-V identifies vCPUs by Virtual Processor (VP) index, which can be
> > > > > queried by the guest via the HV_X64_MSR_VP_INDEX MSR.  The spec defines
> > > > > it as a sequential number which can't exceed the maximum number of
> > > > > vCPUs per VM.
> > > > > 
> > > > > It has to be owned by QEMU in order to preserve it across migration.
> > > > > 
> > > > > However, the initial implementation in KVM didn't allow setting this
> > > > > MSR, and KVM used its own notion of VP index.  Fortunately, the way
> > > > > vCPUs are created in QEMU/KVM makes it likely that the KVM value is
> > > > > equal to QEMU's cpu_index.
> > > > > 
> > > > > So choose cpu_index as the value for vp_index, and push that to KVM on
> > > > > kernels that support setting the MSR.  On older ones that don't, query
> > > > > the kernel value and assert that it's in sync with QEMU.
> > > > > 
> > > > > Besides, since handling errors from vCPU init at hotplug time is
> > > > > impossible, disable vCPU hotplug.    
> > > > The proper place to check whether a CPU may be created is
> > > > pc_cpu_pre_plug(), where you can gracefully abort the CPU creation
> > > > process.
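> > > > 
> > > > Something like this (a minimal sketch; pc_cpu_pre_plug() is the real
> > > > hook, while hyperv_vp_index_settable() and the cpu->hyperv_synic check
> > > > are placeholders for whatever condition the patch ends up using):
> > > > 
> > > >     static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
> > > >                                 DeviceState *dev, Error **errp)
> > > >     {
> > > >         X86CPU *cpu = X86_CPU(dev);
> > > > 
> > > >         /* ... existing topology checks ... */
> > > > 
> > > >         /* refuse hotplug up front instead of failing vCPU init later */
> > > >         if (dev->hotplugged && cpu->hyperv_synic &&
> > > >             !hyperv_vp_index_settable()) {
> > > >             error_setg(errp, "hyperv: kernel cannot set VP index, "
> > > >                        "CPU hotplug is not supported");
> > > >         }
> > > >     }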
> > > >    
> > > 
> > > Thanks for the suggestion, I'll rework it this way.
> > >   
> > > > Also, it's possible to create cold-plugged CPUs out of order
> > > > using
> > > >  -device cpu-foo on the CLI;
> > > > will the hyperv kvm/guest side be OK with that?
> > > 
> > > On kernels that support setting HV_X64_MSR_VP_INDEX QEMU will
> > > synchronize all sides.  On kernels that don't, if out-of-order creation
> > > results in vp_index mismatch between the kernel and QEMU, vcpu creation
> > > will fail.  
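> > > 
> > > Roughly (a sketch of the scheme, not the actual patch;
> > > kvm_vp_index_settable() and the one-MSR accessors kvm_set_msr() /
> > > kvm_get_msr() are stand-in names):
> > > 
> > >     static int hyperv_init_vp_index(X86CPU *cpu)
> > >     {
> > >         CPUState *cs = CPU(cpu);
> > >         uint64_t value = cs->cpu_index;
> > > 
> > >         if (kvm_vp_index_settable()) {
> > >             /* new kernel: push QEMU's choice of vp_index */
> > >             return kvm_set_msr(cpu, HV_X64_MSR_VP_INDEX, value);
> > >         }
> > > 
> > >         /* old kernel: read the kernel-chosen value and insist it matches */
> > >         kvm_get_msr(cpu, HV_X64_MSR_VP_INDEX, &value);
> > >         return value == cs->cpu_index ? 0 : -EINVAL;
> > >     }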
> > 
> > An additional question:
> > what would happen if a VM is started on a host supporting VP index setting
> > and then migrated to a host without it?
> 
> The destination QEMU will attempt to initialize vCPUs, and if that
> fails (e.g. due to vp_index mismatch), the migration will be aborted and
> the source VM will continue running.
> 
> If the destination QEMU is old, too, there's a chance that vp_index will
> change.  Then we just keep our fingers crossed that the guest doesn't
> notice (this is the behavior we have now).
On the source, putting a flag in the migration stream indicating that
setting HV_X64_MSR_VP_INDEX is in use should prevent migration to an old
destination, or to a new destination without kernel support.
It also might make sense to disable the feature for old machine types,
so that new->old migration keeps working as before even if the
destination kernel supports the feature.
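
Something along these lines (a sketch assuming a vmstate subsection; the
field and flag names are illustrative, not from the series):

    static bool hyperv_vp_index_needed(void *opaque)
    {
        X86CPU *cpu = opaque;

        /* Send the subsection only when the feature is in use; an old
         * destination (or one lacking kernel support) then rejects the
         * unknown subsection and the migration fails cleanly. */
        return cpu->hyperv_vp_index_settable;
    }

    static const VMStateDescription vmstate_hyperv_vp_index = {
        .name = "cpu/hyperv_vp_index",
        .version_id = 1,
        .minimum_version_id = 1,
        .needed = hyperv_vp_index_needed,
        .fields = (VMStateField[]) {
            VMSTATE_UINT32(hyperv_vp_index, X86CPU),
            VMSTATE_END_OF_LIST()
        }
    };

The old-machine-type side would then be a compat property forcing the
flag off, so the subsection is never sent.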

> > > > > +X86CPU *hyperv_find_vcpu(uint32_t vp_index)
> > > > > +{
> > > > > +    return X86_CPU(qemu_get_cpu(vp_index));
> > > > > +}    
> > > > this helper isn't used in this patch; add it in the patch that
> > > > actually uses it
> > > 
> > > I thought I would put the only two functions that encapsulate the
> > > knowledge of how vp_index is related to cpu_index into a single patch.
> > > 
> > > I'm now thinking of open-coding the iteration over cpus here and
> > > directly looking for the cpu whose hyperv_vp_index() matches.  Then that
> > > knowledge will be encapsulated in a single place, and indeed, this
> > > helper can go into another patch where it's used.
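> > > 
> > > I.e. something like (a sketch; hyperv_vp_index() as mentioned above):
> > > 
> > >     X86CPU *hyperv_find_vcpu(uint32_t vp_index)
> > >     {
> > >         CPUState *cs;
> > > 
> > >         CPU_FOREACH(cs) {
> > >             X86CPU *cpu = X86_CPU(cs);
> > > 
> > >             if (hyperv_vp_index(cpu) == vp_index) {
> > >                 return cpu;
> > >             }
> > >         }
> > >         return NULL;
> > >     }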
> > >   
> > > > Also, if qemu_get_cpu() were called from each CPU's init,
> > > > it would incur O(N^2) complexity; could you do without it?
> > > 
> > > It isn't called on hot paths (ATM it's called only when SINT routes are
> > > created, which is at most one per cpu).  I don't see a problem here.  
> > For what/where do you need this lookup?  
> 
> The guest configures devices to signal their events with synthetic
> interrupts on specific cpus, identified by vp_index.  When we receive
> such a request we look up the cpu and set up a SINT route to be able to
> deliver interrupts to its synic.
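> 
> Schematically (a sketch; synic_sint_route_new() is a stand-in for
> whatever helper the series ends up with):
> 
>     /* guest asked for SINT 'sint' on the vCPU with the given vp_index */
>     X86CPU *cpu = hyperv_find_vcpu(vp_index);
> 
>     if (!cpu) {
>         return -EINVAL;
>     }
>     sint_route = synic_sint_route_new(cpu, sint, notify_cb, notify_data);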
> 
> Or did I misunderstand the question perhaps?
Since there is a 1:1 mapping between synic and vp_index, and
vp_index is a dense interval [0..maxcpus),
I'd suggest maintaining an internal synic map where vp_index
can be used as an array index to fetch the addressed synic.

Look at local_apics for an example.
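
E.g. (a sketch; the SynICState type and the max_cpus bound are
assumptions here):

    static SynICState **synic_map; /* indexed by vp_index, max_cpus entries */

    static void synic_map_init(void)
    {
        synic_map = g_new0(SynICState *, max_cpus);
    }

    static void synic_map_add(SynICState *synic, uint32_t vp_index)
    {
        assert(vp_index < max_cpus);
        synic_map[vp_index] = synic;
    }

    static SynICState *synic_get(uint32_t vp_index)
    {
        /* O(1) lookup instead of iterating over all CPUs */
        return vp_index < max_cpus ? synic_map[vp_index] : NULL;
    }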

[...]
