Re: [PATCH v10 4/7] PCI: add SR-IOV API for Physical Function driver

2009-03-06 Thread Greg KH
On Fri, Mar 06, 2009 at 01:37:18PM -0700, Matthew Wilcox wrote:
> On Fri, Feb 20, 2009 at 02:54:45PM +0800, Yu Zhao wrote:
> > +   virtfn->sysdata = dev->bus->sysdata;
> > +   virtfn->dev.parent = dev->dev.parent;
> > +   virtfn->dev.bus = dev->dev.bus;
> > +   virtfn->devfn = devfn;
> > +   virtfn->hdr_type = PCI_HEADER_TYPE_NORMAL;
> > +   virtfn->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
> > +   virtfn->error_state = pci_channel_io_normal;
> > +   virtfn->current_state = PCI_UNKNOWN;
> > +   virtfn->is_pcie = 1;
> > +   virtfn->pcie_type = PCI_EXP_TYPE_ENDPOINT;
> > +   virtfn->dma_mask = 0x;
> > +   virtfn->vendor = dev->vendor;
> > +   virtfn->subsystem_vendor = dev->subsystem_vendor;
> > +   virtfn->class = dev->class;
> 
> There seems to be a certain amount of commonality between this and
> pci_scan_device().  Have you considered trying to make a common helper
> function, or does it not work out well?
> 
> > +   pci_device_add(virtfn, virtfn->bus);
> 
> Greg is probably going to ding you here for adding the device, then
> creating the symlinks.  I believe it's now best practice to create the
> symlinks first, so there's no window where userspace can get confused.

If the uevent gets sent before the symlinks are created, it's a bug.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 0/7] PCI: Linux kernel SR-IOV support

2009-03-06 Thread Greg KH
On Fri, Mar 06, 2009 at 12:44:11PM -0700, Matthew Wilcox wrote:
> > Physical Function driver patches for Intel 82576 NIC are available:
> >   http://patchwork.kernel.org/patch/8063/
> >   http://patchwork.kernel.org/patch/8064/
> >   http://patchwork.kernel.org/patch/8065/
> >   http://patchwork.kernel.org/patch/8066/
> 
> I need to review this driver; I haven't done that yet.  Has anyone else?

The driver was rejected by the upstream developers, who said it would
never be accepted.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 1/7] PCI: initialize and release SR-IOV capability

2009-03-06 Thread Greg KH
On Fri, Mar 06, 2009 at 01:08:10PM -0700, Matthew Wilcox wrote:
> On Fri, Feb 20, 2009 at 02:54:42PM +0800, Yu Zhao wrote:
> > +   list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> > +   if (pdev->sriov)
> > +   break;
> > +   if (list_empty(&dev->bus->devices) || !pdev->sriov)
> > +   pdev = NULL;
> > +   ctrl = 0;
> > +   if (!pdev && pci_ari_enabled(dev->bus))
> > +   ctrl |= PCI_SRIOV_CTRL_ARI;
> > +
> 
> I don't like this loop.  At the end of a list_for_each_entry() loop,
> pdev will not be pointing at a pci_device, it'll be pointing to some
> offset from &dev->bus->devices.  So checking pdev->sriov at this point
> is really, really bad.  I would prefer to see something like this:
> 
> ctrl = 0;
> list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
> if (pdev->sriov)
> goto ari_enabled;
> }
> 
> if (pci_ari_enabled(dev->bus))
> ctrl = PCI_SRIOV_CTRL_ARI;
>  ari_enabled:
> pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);

No, please use bus_for_each_dev() instead, or bus_find_device(), don't
walk the bus list by hand.  I'm kind of surprised that even builds.  Hm,
in looking at the 2.6.29-rc kernels, I notice it will not even build at
all, you are now forced to use those functions, which is good.

Has anyone even tried to build this patch recently?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v10 1/7] PCI: initialize and release SR-IOV capability

2009-03-06 Thread Duyck, Alexander H
Randy Dunlap wrote:
> Matthew Wilcox wrote:
>> On Fri, Feb 20, 2009 at 02:54:42PM +0800, Yu Zhao wrote:
>> PCI MSI can also be disabled at runtime (and Fedora do by default).
>> Since SR-IOV really does require MSI, we need to put in a runtime
>> check to see if pci_msi_enabled() is false.
>> 
>> We don't depend on PCIEPORTBUS (a horribly named symbol).  Should we?
>> SR-IOV is only supported for PCI Express machines.  I'm not sure of
>> the right answer here, but I thought I should raise the question.
>> 
>>> +   help
>>> + PCI-SIG I/O Virtualization (IOV) Specifications support.
>>> + Single Root IOV: allows the Physical Function driver to enable
>>> + the hardware capability, so the Virtual Function is accessible
>>> + via the PCI Configuration Space using its own Bus, Device and
>>> + Function Numbers. Each Virtual Function also has the PCI Memory
>>> + Space to map the device specific register set.
> 
> Too spec. and implementation specific for users.
> 
>> I'm not convinced this is the most helpful we could be to the user
>> who's configuring their own kernel.  How about something like this? 
>> (Randy, I particularly look to you to make my prose less turgid).
>> 
>>  help
>>IO Virtualisation is a PCI feature supported by some devices 
>>z ;) which allows you to create virtual PCI devices and assign
>>them to guest OSes.  This option needs to be selected in the host
>>or Dom0 kernel, but does not need to be selected in the guest
>>or DomU kernel.  If you don't know whether your hardware supports
>>it, you can check by using lspci to look for the SR-IOV
>> capability. 
>> 
>>If you have no idea what any of that means, it is safe to
>>answer 'N' here.
> 
> That's certainly more readable and user-friendly.
> I don't know what else it needs.  Looks good to me.
> 
> ~Randy

I'm not sure about this help text because SR-IOV and direct assignment are two 
very different features, and based on this text I would think that SR-IOV is 
all you need to direct assign VFs into guests when all it actually does is 
generate the devices that can then be assigned.  We already have questions 
about if VFs can be used on the host OS and this help text doesn't resolve that 
and would likely lead to similar questions in the future.

I would recommend keeping things simple and just stating "IO Virtualization is 
a PCI feature, supported by some devices, which allows the creation of virtual 
PCI devices that contain a subset of the original device's resources.  If you 
don't know if you hardware supports it, you can check by using lspci to check 
for the SR-IOV capability".

-Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] qcow2 corruption observed, fixed by reverting old change

2009-03-06 Thread Filip Navara
On Wed, Feb 11, 2009 at 5:48 PM, Jamie Lokier  wrote:
> Kevin Wolf wrote:
>> Besides reviewing the code over and over again, I think the only real
>> chance is that you can get a non-productive copy of your image and add
>> some debug code so that we can see at least which code path is causing
>> problems.
>
> I have a copy of my image to reproduce the bug, so I can test patches
> including diagnostic patches.  That's what I did to narrow it down.

Let's see. I have looked at the change in revision 5006 back and forth
and this is the only bug that I can see...

Does the patch help any?

Best regards,
Filip Navara


block-qcow2.diff
Description: Binary data


Re: [PATCH v10 6/7] PCI: document SR-IOV sysfs entries

2009-03-06 Thread Randy Dunlap
Matthew Wilcox wrote:
> Randy, can you wordsmith this one?

I'll try.

> I think I'm starting to understand the difference between physfn and
> dep_link, but an example would definitely help.  It may or may not be
> appropriate to put it in.
> 
> On Fri, Feb 20, 2009 at 02:54:47PM +0800, Yu Zhao wrote:
>> Signed-off-by: Yu Zhao 
>> ---
>>  Documentation/ABI/testing/sysfs-bus-pci |   27 +++
>>  1 files changed, 27 insertions(+), 0 deletions(-)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-pci 
>> b/Documentation/ABI/testing/sysfs-bus-pci
>> index ceddcff..84dc100 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-pci
>> +++ b/Documentation/ABI/testing/sysfs-bus-pci
>> @@ -9,3 +9,30 @@ Description:
>>  that some devices may have malformatted data.  If the
>>  underlying VPD has a writable section then the
>>  corresponding section of this file will be writable.
>> +
>> +What:   /sys/bus/pci/devices/.../virtfn/N
>> +Date:   February 2009
>> +Contact:Yu Zhao 
>> +Description:
>> +This symbol link appears when hardware supports SR-IOV

 symbolic  supports the SR-IOV

>> +capability and Physical Function driver has enabled it.

   ^the | its | a

>> +The symbol link points to the PCI device sysfs entry of

symbolic 

>> +Virtual Function whose index is N (0...MaxVFs-1).

the Virtual Function

>> +
>> +What:   /sys/bus/pci/devices/.../virtfn/dep_link
>> +Date:   February 2009
>> +Contact:Yu Zhao 
>> +Description:
>> +This symbol link appears when hardware supports SR-IOV

 symbolic  supports the SR-IOV

>> +capability and Physical Function driver has enabled it,

   ^its | the | a

>> +and this device has vendor specific dependencies with
>> +others. The symbol link points to the PCI device sysfs

symbolic

>> +entry of Physical Function this device depends on.
>> +
>> +What:   /sys/bus/pci/devices/.../physfn
>> +Date:   February 2009
>> +Contact:Yu Zhao 
>> +Description:
>> +This symbol link appears when a device is Virtual Function.

 symbolic  is a Virtual Function.

>> +The symbol link points to the PCI device sysfs entry of

symbolic   entry of the

>> +Physical Function this device associates with.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 1/7] PCI: initialize and release SR-IOV capability

2009-03-06 Thread Randy Dunlap
Matthew Wilcox wrote:
> On Fri, Feb 20, 2009 at 02:54:42PM +0800, Yu Zhao wrote:
>> +config PCI_IOV
>> +bool "PCI IOV support"
>> +depends on PCI
>> +select PCI_MSI
> 
> My understanding is that having 'select' of a config symbol that the
> user can choose is bad.  I think we should probably make this 'depends
> on PCI_MSI'.

Ack.

> PCI MSI can also be disabled at runtime (and Fedora do by default).
> Since SR-IOV really does require MSI, we need to put in a runtime check
> to see if pci_msi_enabled() is false.
> 
> We don't depend on PCIEPORTBUS (a horribly named symbol).  Should we?
> SR-IOV is only supported for PCI Express machines.  I'm not sure of the
> right answer here, but I thought I should raise the question.
> 
>> +help
>> +  PCI-SIG I/O Virtualization (IOV) Specifications support.
>> +  Single Root IOV: allows the Physical Function driver to enable
>> +  the hardware capability, so the Virtual Function is accessible
>> +  via the PCI Configuration Space using its own Bus, Device and
>> +  Function Numbers. Each Virtual Function also has the PCI Memory
>> +  Space to map the device specific register set.

Too spec. and implementation specific for users.

> I'm not convinced this is the most helpful we could be to the user who's
> configuring their own kernel.  How about something like this?  (Randy, I
> particularly look to you to make my prose less turgid).
> 
>   help
> IO Virtualisation is a PCI feature supported by some devices
 z ;)
> which allows you to create virtual PCI devices and assign them
> to guest OSes.  This option needs to be selected in the host
> or Dom0 kernel, but does not need to be selected in the guest
> or DomU kernel.  If you don't know whether your hardware supports
> it, you can check by using lspci to look for the SR-IOV capability.
> 
> If you have no idea what any of that means, it is safe to
> answer 'N' here.

That's certainly more readable and user-friendly.
I don't know what else it needs.  Looks good to me.

~Randy

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 4/7] PCI: add SR-IOV API for Physical Function driver

2009-03-06 Thread Randy Dunlap
Matthew Wilcox wrote:
> On Fri, Feb 20, 2009 at 02:54:45PM +0800, Yu Zhao wrote:
> 
>> +if (nres != iov->nres) {
>> +dev_err(&dev->dev, "no enough MMIO for SR-IOV\n");
>> +return -ENOMEM;
>> +}

"not enough MMIO BARs for SR-IOV"
or
"not enough MMIO resources for SR-IOV"
or
"too few MMIO BARs for SR-IOV"
?

> Randy, can you help us out with better wording here?
> 
>> +dev_err(&dev->dev, "no enough bus range for SR-IOV\n");
> 
> and here.

"SR-IOV: bus number too large"
or
"SR-IOV: bus number out of range"
or
"SR-IOV: cannot allocate valid bus number"
?

>> +if (iov->link != dev->devfn) {
>> +rc = -ENODEV;
>> +list_for_each_entry(link, &dev->bus->devices, bus_list) {
>> +if (link->sriov && link->devfn == iov->link)
>> +rc = sysfs_create_link(&iov->dev.kobj,
>> +&link->dev.kobj, "dep_link");

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 7/7] PCI: manual for SR-IOV user and driver developer

2009-03-06 Thread Matthew Wilcox

I think we'll need to go a few rounds on sorting this one out.  I don't
have time right now, but I think having this document is important.

On Fri, Feb 20, 2009 at 02:54:48PM +0800, Yu Zhao wrote:
> Signed-off-by: Yu Zhao 
> ---
>  Documentation/DocBook/kernel-api.tmpl |1 +
>  Documentation/PCI/pci-iov-howto.txt   |   99 
> +
>  2 files changed, 100 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/PCI/pci-iov-howto.txt
> 
> diff --git a/Documentation/DocBook/kernel-api.tmpl 
> b/Documentation/DocBook/kernel-api.tmpl
> index 5818ff7..506e611 100644
> --- a/Documentation/DocBook/kernel-api.tmpl
> +++ b/Documentation/DocBook/kernel-api.tmpl
> @@ -251,6 +251,7 @@ X!Edrivers/pci/hotplug.c
>  -->
>  !Edrivers/pci/probe.c
>  !Edrivers/pci/rom.c
> +!Edrivers/pci/iov.c
>   
>   PCI Hotplug Support Library
>  !Edrivers/pci/hotplug/pci_hotplug_core.c
> diff --git a/Documentation/PCI/pci-iov-howto.txt 
> b/Documentation/PCI/pci-iov-howto.txt
> new file mode 100644
> index 000..fc73ef5
> --- /dev/null
> +++ b/Documentation/PCI/pci-iov-howto.txt
> @@ -0,0 +1,99 @@
> + PCI Express I/O Virtualization Howto
> + Copyright (C) 2009 Intel Corporation
> + Yu Zhao 
> +
> +
> +1. Overview
> +
> +1.1 What is SR-IOV
> +
> +Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
> +capability which makes one physical device appear as multiple virtual
> +devices. The physical device is referred to as Physical Function (PF)
> +while the virtual devices are referred to as Virtual Functions (VF).
> +Allocation of the VF can be dynamically controlled by the PF via
> +registers encapsulated in the capability. By default, this feature is
> +not enabled and the PF behaves as traditional PCIe device. Once it's
> +turned on, each VF's PCI configuration space can be accessed by its own
> +Bus, Device and Function Number (Routing ID). And each VF also has PCI
> +Memory Space, which is used to map its register set. VF device driver
> +operates on the register set so it can be functional and appear as a
> +real existing PCI device.
> +
> +2. User Guide
> +
> +2.1 How can I enable SR-IOV capability
> +
> +The device driver (PF driver) will control the enabling and disabling
> +of the capability via API provided by SR-IOV core. If the hardware
> +has SR-IOV capability, loading its PF driver would enable it and all
> +VFs associated with the PF.
> +
> +2.2 How can I use the Virtual Functions
> +
> +The VF is treated as hot-plugged PCI devices in the kernel, so they
> +should be able to work in the same way as real PCI devices. The VF
> +requires device driver that is same as a normal PCI device's.
> +
> +3. Developer Guide
> +
> +3.1 SR-IOV API
> +
> +To enable SR-IOV capability:
> + int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
> + 'nr_virtfn' is number of VFs to be enabled.
> +
> +To disable SR-IOV capability:
> + void pci_disable_sriov(struct pci_dev *dev);
> +
> +To notify SR-IOV core of Virtual Function Migration:
> + irqreturn_t pci_sriov_migration(struct pci_dev *dev);
> +
> +3.2 Usage example
> +
> +Following piece of code illustrates the usage of the SR-IOV API.
> +
> +static int __devinit dev_probe(struct pci_dev *dev, const struct 
> pci_device_id *id)
> +{
> + pci_enable_sriov(dev, NR_VIRTFN);
> +
> + ...
> +
> + return 0;
> +}
> +
> +static void __devexit dev_remove(struct pci_dev *dev)
> +{
> + pci_disable_sriov(dev);
> +
> + ...
> +}
> +
> +static int dev_suspend(struct pci_dev *dev, pm_message_t state)
> +{
> + ...
> +
> + return 0;
> +}
> +
> +static int dev_resume(struct pci_dev *dev)
> +{
> + ...
> +
> + return 0;
> +}
> +
> +static void dev_shutdown(struct pci_dev *dev)
> +{
> + ...
> +}
> +
> +static struct pci_driver dev_driver = {
> + .name = "SR-IOV Physical Function driver",
> + .id_table = dev_id_table,
> + .probe =dev_probe,
> + .remove =   __devexit_p(dev_remove),
> + .suspend =  dev_suspend,
> + .resume =   dev_resume,
> + .shutdown = dev_shutdown,
> +};
> -- 
> 1.6.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 6/7] PCI: document SR-IOV sysfs entries

2009-03-06 Thread Matthew Wilcox

Randy, can you wordsmith this one?

I think I'm starting to understand the difference between physfn and
dep_link, but an example would definitely help.  It may or may not be
appropriate to put it in.

On Fri, Feb 20, 2009 at 02:54:47PM +0800, Yu Zhao wrote:
> Signed-off-by: Yu Zhao 
> ---
>  Documentation/ABI/testing/sysfs-bus-pci |   27 +++
>  1 files changed, 27 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci 
> b/Documentation/ABI/testing/sysfs-bus-pci
> index ceddcff..84dc100 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -9,3 +9,30 @@ Description:
>   that some devices may have malformatted data.  If the
>   underlying VPD has a writable section then the
>   corresponding section of this file will be writable.
> +
> +What:/sys/bus/pci/devices/.../virtfn/N
> +Date:February 2009
> +Contact: Yu Zhao 
> +Description:
> + This symbol link appears when hardware supports SR-IOV
> + capability and Physical Function driver has enabled it.
> + The symbol link points to the PCI device sysfs entry of
> + Virtual Function whose index is N (0...MaxVFs-1).
> +
> +What:/sys/bus/pci/devices/.../virtfn/dep_link
> +Date:February 2009
> +Contact: Yu Zhao 
> +Description:
> + This symbol link appears when hardware supports SR-IOV
> + capability and Physical Function driver has enabled it,
> + and this device has vendor specific dependencies with
> + others. The symbol link points to the PCI device sysfs
> + entry of Physical Function this device depends on.
> +
> +What:/sys/bus/pci/devices/.../physfn
> +Date:February 2009
> +Contact: Yu Zhao 
> +Description:
> + This symbol link appears when a device is Virtual Function.
> + The symbol link points to the PCI device sysfs entry of
> + Physical Function this device associates with.
> -- 
> 1.6.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 5/7] PCI: handle SR-IOV Virtual Function Migration

2009-03-06 Thread Matthew Wilcox
On Fri, Feb 20, 2009 at 02:54:46PM +0800, Yu Zhao wrote:
> +static int sriov_migration(struct pci_dev *dev)
> +{
> + u16 status;
> + struct pci_sriov *iov = dev->sriov;
> +
> + if (!iov->nr_virtfn)
> + return 0;
> +
> + if (!(iov->cap & PCI_SRIOV_CAP_VFM))
> + return 0;
> +
> + pci_read_config_word(iov->self, iov->pos + PCI_SRIOV_STATUS, &status);

You passed in dev here, you don't need to use iov->self, right?

> + if (!(status & PCI_SRIOV_STATUS_VFM))
> + return 0;
> +
> + schedule_work(&iov->mtask);
> +
> + return 1;
> +}

> +/**
> + * pci_sriov_migration - notify SR-IOV core of Virtual Function Migration
> + * @dev: the PCI device
> + *
> + * Returns IRQ_HANDLED if the IRQ is handled, or IRQ_NONE if not.
> + *
> + * Physical Function driver is responsible to register IRQ handler using
> + * VF Migration Interrupt Message Number, and call this function when the
> + * interrupt is generated by the hardware.
> + */
> +irqreturn_t pci_sriov_migration(struct pci_dev *dev)
> +{
> + if (!dev->sriov)
> + return IRQ_NONE;
> +
> + return sriov_migration(dev) ? IRQ_HANDLED : IRQ_NONE;
> +}
> +EXPORT_SYMBOL_GPL(pci_sriov_migration);

OK, I think I get it -- you've basically written an interrupt handler
for the driver to call from its interrupt handler.  Am I right in
thinking that the reason the driver needs to do the interrupt handler
here is because we don't currently have an interface that looks like:

int pci_get_msix_interrupt(struct pci_dev *dev, unsigned vector);

?  If so, we should probably add it; I want it for my MSI-X rewrite
anyway.

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 4/7] PCI: add SR-IOV API for Physical Function driver

2009-03-06 Thread Matthew Wilcox
On Fri, Feb 20, 2009 at 02:54:45PM +0800, Yu Zhao wrote:
> + virtfn->sysdata = dev->bus->sysdata;
> + virtfn->dev.parent = dev->dev.parent;
> + virtfn->dev.bus = dev->dev.bus;
> + virtfn->devfn = devfn;
> + virtfn->hdr_type = PCI_HEADER_TYPE_NORMAL;
> + virtfn->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
> + virtfn->error_state = pci_channel_io_normal;
> + virtfn->current_state = PCI_UNKNOWN;
> + virtfn->is_pcie = 1;
> + virtfn->pcie_type = PCI_EXP_TYPE_ENDPOINT;
> + virtfn->dma_mask = 0x;
> + virtfn->vendor = dev->vendor;
> + virtfn->subsystem_vendor = dev->subsystem_vendor;
> + virtfn->class = dev->class;

There seems to be a certain amount of commonality between this and
pci_scan_device().  Have you considered trying to make a common helper
function, or does it not work out well?

> + pci_device_add(virtfn, virtfn->bus);

Greg is probably going to ding you here for adding the device, then
creating the symlinks.  I believe it's now best practice to create the
symlinks first, so there's no window where userspace can get confused.

> + mutex_unlock(&iov->pdev->sriov->lock);

I question the existance of this mutex now.  What's it protecting?

Aren't we going to be implicitly protected by virtue of the Physical
Function device driver being the only one calling this function, and the
driver will be calling it from the ->probe routine which is not called
simultaneously for the same device.

> + virtfn->physfn = pci_dev_get(dev);
> +
> + rc = pci_bus_add_device(virtfn);
> + if (rc)
> + goto failed1;
> + sprintf(buf, "%d", id);

%u, perhaps?  And maybe 'id' should always be unsigned?  Just a thought.

> + rc = sysfs_create_link(&iov->dev.kobj, &virtfn->dev.kobj, buf);
> + if (rc)
> + goto failed1;
> + rc = sysfs_create_link(&virtfn->dev.kobj, &dev->dev.kobj, "physfn");
> + if (rc)
> + goto failed2;

I'm glad to see these symlinks documented in later patches!

> + nres = 0;
> + for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
> + res = dev->resource + PCI_SRIOV_RESOURCES + i;
> + if (!res->parent)
> + continue;
> + nres++;
> + }

Can't this be written more simply as:

for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
res = dev->resource + PCI_SRIOV_RESOURCES + i;
if (res->parent)
nres++;
}
?

> + if (nres != iov->nres) {
> + dev_err(&dev->dev, "no enough MMIO for SR-IOV\n");
> + return -ENOMEM;
> + }

Randy, can you help us out with better wording here?

> + dev_err(&dev->dev, "no enough bus range for SR-IOV\n");

and here.

> + if (iov->link != dev->devfn) {
> + rc = -ENODEV;
> + list_for_each_entry(link, &dev->bus->devices, bus_list) {
> + if (link->sriov && link->devfn == iov->link)
> + rc = sysfs_create_link(&iov->dev.kobj,
> + &link->dev.kobj, "dep_link");

I skipped to the end and read patch 7/7 and I still don't understand
what dep_link is for.  Can you explain please?  In particular, how is it
different from physfn?

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 3/7] PCI: reserve bus range for SR-IOV device

2009-03-06 Thread Matthew Wilcox
On Fri, Feb 20, 2009 at 02:54:44PM +0800, Yu Zhao wrote:
> +static inline void virtfn_bdf(struct pci_dev *dev, int id, u8 *busnr, u8 
> *devfn)
> +{
> + u16 bdf;
> +
> + bdf = (dev->bus->number << 8) + dev->devfn +
> +   dev->sriov->offset + dev->sriov->stride * id;
> + *busnr = bdf >> 8;
> + *devfn = bdf & 0xff;
> +}

I find the interface here a bit clunky -- a function returning void
while having two OUT parameters.  How about this variation on the theme
(viewers are encouraged to come up with their own preferred
implementations and interfaces):

static inline __pure u16 virtfn_bdf(struct pci_dev *dev, int id)
{
return (dev->bus->number << 8) + dev->devfn + dev->sriov->offset +
dev->sriov->stride * id;
}

#define VIRT_BUS(dev, id)   (virtfn_bdf(dev, id) >> 8)
#define VIRT_DEVFN(dev, id) (virtfn_bdf(dev, id) & 0xff)

We rely on GCC to do CSE and not actually invoke virtfn_bdf more than
once.

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 2/7] PCI: restore saved SR-IOV state

2009-03-06 Thread Matthew Wilcox
On Fri, Feb 20, 2009 at 02:54:43PM +0800, Yu Zhao wrote:
> Signed-off-by: Yu Zhao 

It needs a changelog ...

> +static void sriov_restore_state(struct pci_dev *dev)
> +{
[...]
> +}
> +
[...]
> +/**
> + * pci_restore_iov_state - restore the state of the IOV capability
> + * @dev: the PCI device
> + */
> +void pci_restore_iov_state(struct pci_dev *dev)
> +{
> + if (dev->sriov)
> + sriov_restore_state(dev);
> +}

Apart from the design pattern I mentioned in my previous email also
appearing here, I see no problems with this patch.

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 1/7] PCI: initialize and release SR-IOV capability

2009-03-06 Thread Matthew Wilcox
On Fri, Feb 20, 2009 at 02:54:42PM +0800, Yu Zhao wrote:
> +config PCI_IOV
> + bool "PCI IOV support"
> + depends on PCI
> + select PCI_MSI

My understanding is that having 'select' of a config symbol that the
user can choose is bad.  I think we should probably make this 'depends
on PCI_MSI'.

PCI MSI can also be disabled at runtime (and Fedora do by default).
Since SR-IOV really does require MSI, we need to put in a runtime check
to see if pci_msi_enabled() is false.

We don't depend on PCIEPORTBUS (a horribly named symbol).  Should we?
SR-IOV is only supported for PCI Express machines.  I'm not sure of the
right answer here, but I thought I should raise the question.

> + default n

You don't need this -- the default default is n ;-)

> + help
> +   PCI-SIG I/O Virtualization (IOV) Specifications support.
> +   Single Root IOV: allows the Physical Function driver to enable
> +   the hardware capability, so the Virtual Function is accessible
> +   via the PCI Configuration Space using its own Bus, Device and
> +   Function Numbers. Each Virtual Function also has the PCI Memory
> +   Space to map the device specific register set.

I'm not convinced this is the most helpful we could be to the user who's
configuring their own kernel.  How about something like this?  (Randy, I
particularly look to you to make my prose less turgid).

help
  IO Virtualisation is a PCI feature supported by some devices
  which allows you to create virtual PCI devices and assign them
  to guest OSes.  This option needs to be selected in the host
  or Dom0 kernel, but does not need to be selected in the guest
  or DomU kernel.  If you don't know whether your hardware supports
  it, you can check by using lspci to look for the SR-IOV capability.

  If you have no idea what any of that means, it is safe to
  answer 'N' here.

> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> index 3d07ce2..ba99282 100644
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -29,6 +29,9 @@ obj-$(CONFIG_DMAR) += dmar.o iova.o intel-iommu.o
>  
>  obj-$(CONFIG_INTR_REMAP) += dmar.o intr_remapping.o
>  
> +# PCI IOV support
> +obj-$(CONFIG_PCI_IOV) += iov.o

I see you're following the gerneal style in this file, but the comments
really add no value.  I should send a patch to take out the existing ones.

> + list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> + if (pdev->sriov)
> + break;
> + if (list_empty(&dev->bus->devices) || !pdev->sriov)
> + pdev = NULL;
> + ctrl = 0;
> + if (!pdev && pci_ari_enabled(dev->bus))
> + ctrl |= PCI_SRIOV_CTRL_ARI;
> +

I don't like this loop.  At the end of a list_for_each_entry() loop,
pdev will not be pointing at a pci_device, it'll be pointing to some
offset from &dev->bus->devices.  So checking pdev->sriov at this point
is really, really bad.  I would prefer to see something like this:

ctrl = 0;
list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
if (pdev->sriov)
goto ari_enabled;
}

if (pci_ari_enabled(dev->bus))
ctrl = PCI_SRIOV_CTRL_ARI;
 ari_enabled:
pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);

> + if (pdev)
> + iov->pdev = pci_dev_get(pdev);
> + else {
> + iov->pdev = dev;
> + mutex_init(&iov->lock);
> + }

Now I'm confused.  Why don't we need to init the mutex if there's another
device on the same bus which also has an iov capability?

> +static void sriov_release(struct pci_dev *dev)
> +{
> + if (dev == dev->sriov->pdev)
> + mutex_destroy(&dev->sriov->lock);
> + else
> + pci_dev_put(dev->sriov->pdev);
> +
> + kfree(dev->sriov);
> + dev->sriov = NULL;
> +}

> +void pci_iov_release(struct pci_dev *dev)
> +{
> + if (dev->sriov)
> + sriov_release(dev);
> +}

This seems to be a bit of a design pattern with you, and I'm not quite sure why 
you do it like this instead of just doing:

void pci_iov_release(struct pci_dev *dev)
{
if (!dev->sriov)
return;
[...]
}

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 0/7] PCI: Linux kernel SR-IOV support

2009-03-06 Thread Matthew Wilcox
On Fri, Feb 20, 2009 at 02:54:41PM +0800, Yu Zhao wrote:
> Following patches are intended to support SR-IOV capability in the
> Linux kernel. With these patches, people can turn a PCI device with
> the capability into multiple ones from software perspective, which
> will benefit KVM and achieve other purposes such as QoS, security,
> and etc.

I reviewed this round of patches on the plane ... I'll respond to each
patch individually, but in general this all looks much better than the
first round I reviewed.

> Physical Function driver patches for Intel 82576 NIC are available:
>   http://patchwork.kernel.org/patch/8063/
>   http://patchwork.kernel.org/patch/8064/
>   http://patchwork.kernel.org/patch/8065/
>   http://patchwork.kernel.org/patch/8066/

I need to review this driver; I haven't done that yet.  Has anyone else?

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 0/7] PCI: Linux kernel SR-IOV support

2009-03-06 Thread Matthew Wilcox
On Tue, Feb 24, 2009 at 12:47:38PM +0200, Avi Kivity wrote:
> Do those patches allow using a VF on the host (in other words, does the 
> kernel emulate config space accesses)?

SR-IOV hardware handles config space accesses to virtual functions.  No
kernel changes needed for that aspect of it.

-- 
Matthew Wilcox  Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kvm-84 kernel panic

2009-03-06 Thread Johannes Baumann
i installed esxi, same error in ipmi after a while, talking to dell..


Marcelo Tosatti schrieb:
> On Fri, Mar 06, 2009 at 06:02:00AM +0100, Johannes Baumann wrote:
>> Marcelo Tosatti schrieb:
>>> On Thu, Mar 05, 2009 at 11:31:35PM +0100, Johannes Baumann wrote:
 Ryan Harper schrieb:
> * Johannes Baumann  [2009-03-05 16:08]:
>> Ryan Harper schrieb:
>>> * Johannes Baumann  [2009-03-05 15:52]:
 update: its not only happening with smp guest, but with non-smp guest.
 mcelog is empty. the commandline for a debian lenny im using is:

 /usr/src/kvm-84/qemu/x86_64-softmmu/qemu-system-x86_64 -S -M pc -m 512
 -smp 1 -name vm03 -uuid 0b5e3a60-9111-0289-84b5-98cdf27450ac -monitor
 pty -boot c -drive
 file=/kvm/domains/linux-test/image.img,if=ide,index=0,boot=on -drive
 file=,if=ide,media=cdrom,index=2 -net
 nic,macaddr=54:52:00:0b:a1:26,vlan=0 -net
 tap,fd=12,script=,vlan=0,ifname=vnet0 -serial none -parallel none -usb
 -vnc 127.0.0.1:0
>>> you'll want -L /usr/src/kvm-84/qemu/pc-bios to ensure you use the bios
>>> that came with kvm-84 instead of some other location.
>> i could try but i dont think this is the point since every kvm version
>> had the same problems so far.
> if you don't use the right bios that comes with the kernel modules,
> things _will_ break.
 k, trying the -L switch, with vt enabled

 root  6265 23.8 25.6 1100904 1039068 ? Sl   23:24   1:29
 /usr/src/kvm-84/qemu/x86_64-softmmu/qemu-system-x86_64 -L
 /usr/src/kvm-84/qemu/pc-bios -S -M pc -m 1000 -smp 1 -name vm01 -uuid
 3490fdf2-9550-4300-883a-a10fbf9b58e6 -monitor pty -localtime -boot c
 -drive
 file=/kvm/domains/windows2003-enterprise-jb/image.img,if=ide,index=0,boot=on
 -drive file=,if=ide,media=cdrom,index=2 -net
 nic,macaddr=00:16:d3:c2:36:ff,vlan=0,model=e1000 -net
 tap,fd=18,script=,vlan=0,ifname=vnet1 -serial none -parallel none -usb
 -usbdevice tablet -vnc 127.0.0.1:1
>> seems realy be guest bios related, since using the -L switch the host
>> System didnt crash anymore. I got a lot of mcelogs:
>>
>> MCE 0
>> HARDWARE ERROR. This is *NOT* a software problem!
>> Please contact your hardware vendor
>> CPU 2 BANK 0 TSC 34186b6364dd
>> MCG status:
>> MCi status:
>> Error overflow
>> Uncorrected error
>> Error enabled
>> Processor context corrupt
>> MCA:BUS Level-0 Originated-request Generic Memory-access Request-timeout
>> Error
>> Model:
>> STATUS f20008001800 MCGSTATUS 0
>> MCE 1
>> HARDWARE ERROR. This is *NOT* a software problem!
>> Please contact your hardware vendor
>> CPU 2 BANK 5 TSC 34186b6373bd
>> MCG status:
>> MCi status:
>> Uncorrected error
>> Error enabled
>> Processor context corrupt
>> MCA:BUS Generic Generic Generic Other-transaction Request-timeout Error
>> Model:
>> STATUS b20806000e0f MCGSTATUS 0
>> MCE 2
>> HARDWARE ERROR. This is *NOT* a software problem!
>> Please contact your hardware vendor
>> CPU 0 BANK 5 TSC 34186b637434
>> MCG status:RIPV MCIP
>> MCi status:
>> Uncorrected error
>> Error enabled
>> Processor context corrupt
>> MCA:BUS Generic Generic Generic Other-transaction Request-timeout Error
>> Model:Pad state machine
>>
>> and so on..
>>
>>>
  2d | 02/19/2009 | 02:54:31 | Processor #0x0d | Transition to
 Non-recoverable
   2e | 02/19/2009 | 02:54:32 | Processor #0x60 | IERR | Asserted
>>> http://bugs.centos.org/print_bug_page.php?bug_id=2619
>>>
>>> Likely to be a HW problem.
>> I wasn able to reproduce without kvm running, still a hw problem?
> 
> MCE usually indicate hardware problems. Take this to Dell support.
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: does anyone run guests for more than 5 minutes? (virtio-net perf anomaly)

2009-03-06 Thread Alex Williamson
On Fri, 2009-03-06 at 14:17 -0300, Marcelo Tosatti wrote:
> On Fri, Mar 06, 2009 at 08:26:33AM -0700, Alex Williamson wrote:
> > On Thu, 2009-03-05 at 21:25 -0700, Alex Williamson wrote:
> > > On Thu, 2009-03-05 at 21:04 -0300, Marcelo Tosatti wrote:
> > > > On Tue, Mar 03, 2009 at 01:13:41PM -0700, Alex Williamson wrote:
> > > > > Any guesses as to what might be going on?  Can anyone reproduce?  I'm
> > > > > hoping that I'm doing something dumb, but can't figure out what it is.
> > > > > The system is running v2.6.29-rc6-121-g64e7130 in the guest,
> > > > > v2.6.29-rc6-123-gbd7b3b4 on the host, kvm module kvm-84-620-g5bc 
> > > > > and
> > > > > userspace kvm-84-95-gea1b668.  Thanks,
> > > > 
> > > > Nope. Collect kvm_stat -l before/after the slowdown?
> > > 
> > > Attached.  This shows about 1 minute of data before the slowdown and a
> > > dramatic change starting around the 80th row.  Thanks,
> > 
> > For a bit easier consumption, here's a google spreadsheet and chart of
> > what appear to be the interesting columns:
> > 
> > http://spreadsheets.google.com/pub?key=pdwpc4VwMbjWxyQfvw8AEYg
> > 
> > Alex
> 
> irq_injection goes up significantly. Is it virtio_net generating more
> interrupts? Check the rate of irq's generated by hw/virtio-net.c and
> compare that with rate seen in /proc/interrupts inside the guest?

The rate inside the guest more than doubles when things slow down, ~2k/s
-> ~4.5k/s.  Also interesting, the soft interrupt time reported by top
in the guest goes from ~65% to zero.  I haven't checked what
hw/virtio-net.c is generating.  FWIW, it's not processor specific, I see
the same thing on an AMD based system, as shown in the updated
spreadsheet.  Thanks,

Alex


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: does anyone run guests for more than 5 minutes? (virtio-net perf anomaly)

2009-03-06 Thread Marcelo Tosatti
On Fri, Mar 06, 2009 at 08:26:33AM -0700, Alex Williamson wrote:
> On Thu, 2009-03-05 at 21:25 -0700, Alex Williamson wrote:
> > On Thu, 2009-03-05 at 21:04 -0300, Marcelo Tosatti wrote:
> > > On Tue, Mar 03, 2009 at 01:13:41PM -0700, Alex Williamson wrote:
> > > > Any guesses as to what might be going on?  Can anyone reproduce?  I'm
> > > > hoping that I'm doing something dumb, but can't figure out what it is.
> > > > The system is running v2.6.29-rc6-121-g64e7130 in the guest,
> > > > v2.6.29-rc6-123-gbd7b3b4 on the host, kvm module kvm-84-620-g5bc and
> > > > userspace kvm-84-95-gea1b668.  Thanks,
> > > 
> > > Nope. Collect kvm_stat -l before/after the slowdown?
> > 
> > Attached.  This shows about 1 minute of data before the slowdown and a
> > dramatic change starting around the 80th row.  Thanks,
> 
> For a bit easier consumption, here's a google spreadsheet and chart of
> what appear to be the interesting columns:
> 
> http://spreadsheets.google.com/pub?key=pdwpc4VwMbjWxyQfvw8AEYg
> 
> Alex

irq_injection goes up significantly. Is it virtio_net generating more
interrupts? Check the rate of irq's generated by hw/virtio-net.c and
compare that with rate seen in /proc/interrupts inside the guest?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: does anyone run guests for more than 5 minutes? (virtio-net perf anomaly)

2009-03-06 Thread Alex Williamson
On Thu, 2009-03-05 at 21:25 -0700, Alex Williamson wrote:
> On Thu, 2009-03-05 at 21:04 -0300, Marcelo Tosatti wrote:
> > On Tue, Mar 03, 2009 at 01:13:41PM -0700, Alex Williamson wrote:
> > > Any guesses as to what might be going on?  Can anyone reproduce?  I'm
> > > hoping that I'm doing something dumb, but can't figure out what it is.
> > > The system is running v2.6.29-rc6-121-g64e7130 in the guest,
> > > v2.6.29-rc6-123-gbd7b3b4 on the host, kvm module kvm-84-620-g5bc and
> > > userspace kvm-84-95-gea1b668.  Thanks,
> > 
> > Nope. Collect kvm_stat -l before/after the slowdown?
> 
> Attached.  This shows about 1 minute of data before the slowdown and a
> dramatic change starting around the 80th row.  Thanks,

For a bit easier consumption, here's a google spreadsheet and chart of
what appear to be the interesting columns:

http://spreadsheets.google.com/pub?key=pdwpc4VwMbjWxyQfvw8AEYg

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/6] kvm/x86/mmu: handle invlpg on large pages

2009-03-06 Thread Joerg Roedel
On Thu, Mar 05, 2009 at 06:11:22PM -0300, Marcelo Tosatti wrote:
> On Thu, Mar 05, 2009 at 01:12:31PM +0100, Joerg Roedel wrote:
> > Signed-off-by: Joerg Roedel 
> > ---
> >  arch/x86/kvm/paging_tmpl.h |   12 +---
> >  1 files changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> > index 79668ba..aa79396 100644
> > --- a/arch/x86/kvm/paging_tmpl.h
> > +++ b/arch/x86/kvm/paging_tmpl.h
> > @@ -441,6 +441,7 @@ out_unlock:
> >  static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
> >  {
> > struct kvm_shadow_walk_iterator iterator;
> > +   struct kvm_mmu_page *sp;
> > pt_element_t gpte;
> > gpa_t pte_gpa = -1;
> > int level;
> > @@ -451,12 +452,17 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, 
> > gva_t gva)
> > for_each_shadow_entry(vcpu, gva, iterator) {
> > level = iterator.level;
> > sptep = iterator.sptep;
> > +   sp = page_header(__pa(sptep));
> > +
> > +   if (sp->role.direct) {
> > +   /* mapped from a guest's large_pte */
> > +   kvm_mmu_zap_page(vcpu->kvm, sp);
> > +   kvm_flush_remote_tlbs(vcpu->kvm);
> > +   return;
> > +   }
> 
> If the guest has 32-bit pte's there might be:
> 
> - two large shadow entries to cover 4MB
> - one large shadow entry and one shadow page with 512 4k entries
> - two shadow pages with 512 4k entries each
> 
> So need to cover all this cases.

Right. Thanks for pointing this out. I will post an updated version of
this patch.

Joerg

-- 
   | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System| 
 Research  | Geschäftsführer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni
 Center| Sitz: Dornach, Gemeinde Aschheim, Landkreis München
   | Registergericht München, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] various x86 kvm fixes

2009-03-06 Thread Joerg Roedel
On Thu, Mar 05, 2009 at 06:50:37PM -0300, Marcelo Tosatti wrote:
> On Thu, Mar 05, 2009 at 01:12:27PM +0100, Joerg Roedel wrote:
> > Hi Avi, Marcelo,
> > 
> > this small series of patches collects some fixes I have collected during
> > the last days of KVM hacking. Please review and comment or apply if
> > appropriate :)
> 
> Applied 2, 3 and 6.

Thanks. Sorry for the two false positives. After so much mmu debugging
in the last days it seems I start to see ghost bugs :(

Joerg

-- 
   | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System| 
 Research  | Geschäftsführer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni
 Center| Sitz: Dornach, Gemeinde Aschheim, Landkreis München
   | Registergericht München, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


problem with www.linux-kvm.com

2009-03-06 Thread Paolo Pedaletti

ciao,
I hope that some admin/webmaster of www.linux-kvm.com read this ml.
I have tried to login to the website, but:

1) after login adn password, appears:
Access denied
You are not authorized to access this page.

2) the mail-form at http://www.linux-kvm.com/contact
has a captcha that doesn't works
(ie I enter the right word, and it says "Invalid CAPTCHA token."

:-(

thank you

--
Paolo Pedaletti

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html