On Mon, Jan 17, 2011 at 08:18:22PM +0800, Sheng Yang wrote:
> > > +         goto out;
> > > +
> > > + mmio = &mmio_dev->mmio[idx];
> > > + entry = (addr - mmio->table_base_addr) / PCI_MSIX_ENTRY_SIZE;
> > > + entry_base = mmio->table_base_va + entry * PCI_MSIX_ENTRY_SIZE;
> > > + ctrl_pos = (u32 *)(entry_base + PCI_MSIX_ENTRY_VECTOR_CTRL);
> > > +
> > > + if (get_user(old_ctrl, ctrl_pos))
> > > +         goto out;
> > > +
> > > + /* No allow writing to other fields when entry is unmasked */
> > > + if (!(old_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT) &&
> > > +     offset != PCI_MSIX_ENTRY_VECTOR_CTRL)
> > > +         goto out;
> > > +
> > > + if (copy_to_user((void __user *)(entry_base + offset), val, len))
> > > +         goto out;
> > 
> > Instead of copying to/from userspace (which is subject to swapin,
> > unexpected values), you could include the guest written value in a
> > kvm_run structure, along with address. Qemu-kvm would use that to
> > synchronize its copy of the table, on KVM_EXIT_MSIX_ROUTING_UPDATE exit.
> 
> We want to acelerate MSI-X mask bit accessing, which won't exit to userspace 
> in 
> the most condition. That's the cost we want to optimize. Also it's possible 
> to 
> userspace to read the correct value of MMIO(but mostly userspace can't write 
> to it 
> in order to prevent synchronize issue).

Yes, i meant exit to userspace only when necessary, but instead of
copying directly everytime record the value the guest has written in
kvm_run and exit with KVM_EXIT_MSIX_ROUTING_UPDATE.

> > > + if (get_user(new_ctrl, ctrl_pos))
> > > +         goto out;
> > > +
> > > + if ((offset < PCI_MSIX_ENTRY_VECTOR_CTRL && len == 4) ||
> > > +     (offset < PCI_MSIX_ENTRY_DATA && len == 8))
> > > +         ret = -ENOTSYNC;
> > > + if (old_ctrl == new_ctrl)
> > > +         goto out;
> > > + if (!(old_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT) &&
> > > +                 (new_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT))
> > > +         r = update_msix_mask_bit(mmio_dev->kvm, mmio, entry, 1);
> > > + else if ((old_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT) &&
> > > +                 !(new_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT))
> > > +         r = update_msix_mask_bit(mmio_dev->kvm, mmio, entry, 0);
> > 
> > Then you rely on the kernel copy of the values to enable/disable irq.
> 
> Yes, they are guest owned assigned IRQs. Any potential issue?

Nothing prevents the kernel from enabling or disabling irq multiple
times with this code (what prevents it is a qemu-kvm that behaves as
expected). This is not very good.

Perhaps the guest can only harm itself with that, but i'm not sure.

And also if an msix table page is swapped out guest will hang.

> > > + return r;
> > > +}
> > 
> > This is not consistent with registration, where there are no checks
> > regarding validity of assigned device id. So why is it necessary?
> 
> I am not quite understand. We need to free mmio anyway, otherwise it would 
> result 
> in wrong mmio interception...

If you return -EOPNOTSUPP in case kvm_find_assigned_dev fails in the
read/write paths, there is no need to free anything.

> 
> > 
> > There is a lock ordering problem BTW:
> > 
> > - read/write handlers: dev->lock -> kvm->lock
> > - vm_ioctl_deassign_device -> kvm_free_msix_mmio: kvm->lock -> dev->lock
> 
> Good catch! Would fix it(and other comments of course).
> 
> --
> regards
> Yang, Sheng
> 
> > 
> > > +
> > > +int kvm_vm_ioctl_unregister_msix_mmio(struct kvm *kvm,
> > > +                               struct kvm_msix_mmio_user *mmio_user)
> > > +{
> > > + struct kvm_msix_mmio mmio;
> > > +
> > > + mmio.dev_id = mmio_user->dev_id;
> > > + mmio.type = mmio_user->type;
> > > +
> > > + return kvm_free_msix_mmio(kvm, &mmio);
> > > +}
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to