Re: [PATCH] virtio-pci: fix leaks of msix_affinity_masks

2013-06-20 Thread Michael S. Tsirkin
On Thu, Jun 20, 2013 at 09:36:50AM +0400, Andrey Vagin wrote:
> From: Andrew Vagin 
> 
> vp_dev->msix_vectors should be initialized before allocating
> msix_affinity_masks, otherwise vp_free_vectors will not free these
> objects.
> 
> unreferenced object 0x88010f969d88 (size 512):
>   comm "systemd-udevd", pid 158, jiffies 4294673645 (age 80.545s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace:
> [] kmemleak_alloc+0x5e/0xc0
> [] kmem_cache_alloc_node_trace+0x141/0x2c0
> [] alloc_cpumask_var_node+0x23/0x80
> [] alloc_cpumask_var+0xe/0x10
> [] vp_try_to_find_vqs+0x25d/0x810
> [] vp_find_vqs+0x81/0xb0
> [] init_vqs+0x85/0x120 [virtio_balloon]
> [] virtballoon_probe+0xf9/0x1a0 [virtio_balloon]
> [] virtio_dev_probe+0xde/0x140
> [] driver_probe_device+0x98/0x3a0
> [] __driver_attach+0xab/0xb0
> [] bus_for_each_dev+0x94/0xb0
> [] driver_attach+0x1e/0x20
> [] bus_add_driver+0x200/0x280
> [] driver_register+0x74/0x160
> [] register_virtio_driver+0x20/0x40
> 
> v2: change msix_vectors uncoditionaly in vp_free_vectors
> 
> Cc: Rusty Russell 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Signed-off-by: Andrew Vagin 
> Signed-off-by: Andrey Vagin 

Acked-by: Michael S. Tsirkin 

> ---
>  drivers/virtio/virtio_pci.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
> index a7ce730..1aba255 100644
> --- a/drivers/virtio/virtio_pci.c
> +++ b/drivers/virtio/virtio_pci.c
> @@ -289,9 +289,9 @@ static void vp_free_vectors(struct virtio_device *vdev)
>  
>   pci_disable_msix(vp_dev->pci_dev);
>   vp_dev->msix_enabled = 0;
> - vp_dev->msix_vectors = 0;
>   }
>  
> + vp_dev->msix_vectors = 0;
>   vp_dev->msix_used_vectors = 0;
>   kfree(vp_dev->msix_names);
>   vp_dev->msix_names = NULL;
> @@ -309,6 +309,8 @@ static int vp_request_msix_vectors(struct virtio_device 
> *vdev, int nvectors,
>   unsigned i, v;
>   int err = -ENOMEM;
>  
> + vp_dev->msix_vectors = nvectors;
> +
>   vp_dev->msix_entries = kmalloc(nvectors * sizeof *vp_dev->msix_entries,
>  GFP_KERNEL);
>   if (!vp_dev->msix_entries)
> @@ -336,7 +338,6 @@ static int vp_request_msix_vectors(struct virtio_device 
> *vdev, int nvectors,
>   err = -ENOSPC;
>   if (err)
>   goto error;
> - vp_dev->msix_vectors = nvectors;
>   vp_dev->msix_enabled = 1;
>  
>   /* Set the vector used for configuration */
> -- 
> 1.8.1.4
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-spec: add field for scsi command size

2013-06-20 Thread Paolo Bonzini
Il 20/06/2013 04:40, Rusty Russell ha scritto:
> Paolo Bonzini  writes:
>> Il 19/06/2013 10:24, Michael S. Tsirkin ha scritto:
> 2) We introduce VIRTIO_NET_F_ANY_LAYOUT and VIRTIO_BLK_F_ANY_LAYOUT
>specifically for net and block (note the new names).
>>
>> So why not a transport feature?  Is it just because the SCSI commands
>> for virtio-blk also require a config space field?  Sorry if I missed
>> this upthread.
> 
> Mainly because I'm not sure that *all* devices are now safe.  Are they?

virtio-scsi's implementation in QEMU is not safe (been delaying that for
too long, sorry), but the spec is safe.

Paolo

> I had a stress test half-written for this, pasted below.
> 
> Otherwise I'd be happy to do both: use feature 25 for
> VIRTIO_F_ANY_LAYOUT and another feature bit for the virtio-blk layout
> change.
> 
> Cheers,
> Rusty.
> 
> virtio: CONFIG_VIRTIO_DEVICE_TORTURE
> 
> Virtio devices are not supposed to depend on the framing of the scatter-gather
> lists, but various implementations did.  Safeguard this in future by adding
> an option to deliberately create perverse descriptors.
> 
> Signed-off-by: Rusty Russell 
> 
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index 8d5bddb..99c0187 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -5,6 +5,22 @@ config VIRTIO
> bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_LGUEST,
> CONFIG_RPMSG or CONFIG_S390_GUEST.
>  
> +config VIRTIO_TORTURE
> + bool "Virtio torture debugging"
> + depends on VIRTIO && DEBUG_KERNEL
> + help
> +
> +   This makes the virtio_ring implementation stress-test
> +   devices.  In particularly, creatively change the format of
> +   requests to make sure that devices are properly implemented,
> +   as well as implement various checks to ensure drivers are
> +   correct.  This will make your virtual machine slow *and*
> +   unreliable!  Say N.
> +
> +   Put virtio_ring.device_torture to your boot commandline to
> +   torture devices (otherwise only simply sanity checks are
> +   done).
> +
>  menu "Virtio drivers"
>  
>  config VIRTIO_PCI
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index e82821a..6e5271c 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -45,35 +45,6 @@
>  #define virtio_wmb(vq) wmb()
>  #endif
>  
> -#ifdef DEBUG
> -/* For development, we want to crash whenever the ring is screwed. */
> -#define BAD_RING(_vq, fmt, args...)  \
> - do {\
> - dev_err(&(_vq)->vq.vdev->dev,   \
> - "%s:"fmt, (_vq)->vq.name, ##args);  \
> - BUG();  \
> - } while (0)
> -/* Caller is supposed to guarantee no reentry. */
> -#define START_USE(_vq)   \
> - do {\
> - if ((_vq)->in_use)  \
> - panic("%s:in_use = %i\n",   \
> -   (_vq)->vq.name, (_vq)->in_use);   \
> - (_vq)->in_use = __LINE__;   \
> - } while (0)
> -#define END_USE(_vq) \
> - do { BUG_ON(!(_vq)->in_use); (_vq)->in_use = 0; } while(0)
> -#else
> -#define BAD_RING(_vq, fmt, args...)  \
> - do {\
> - dev_err(&_vq->vq.vdev->dev, \
> - "%s:"fmt, (_vq)->vq.name, ##args);  \
> - (_vq)->broken = true;   \
> - } while (0)
> -#define START_USE(vq)
> -#define END_USE(vq)
> -#endif
> -
>  struct indirect_cache {
>   unsigned int max;
>   struct vring_desc *cache;
> @@ -109,7 +80,7 @@ struct vring_virtqueue
>   /* How to notify other side. FIXME: commonalize hcalls! */
>   void (*notify)(struct virtqueue *vq);
>  
> -#ifdef DEBUG
> +#ifdef CONFIG_VIRTIO_TORTURE
>   /* They're supposed to lock for us. */
>   unsigned int in_use;
>  
> @@ -134,6 +105,200 @@ static inline struct indirect_cache 
> *indirect_cache(struct vring_virtqueue *vq)
>   return (struct indirect_cache *)&vq->data[vq->vring.num];
>  }
>  
> +#ifdef CONFIG_VIRTIO_TORTURE
> +/* For development, we want to crash whenever the ring is screwed. */
> +#define BAD_RING(_vq, fmt, args...)  \
> + do {\
> + dev_err(&(_vq)->vq.vdev->dev,   \
> + "%s:"fmt, (_vq)->vq.name, ##args);  \
> + BUG();  \
> + } while (0)
> +/* Caller is supposed to guarantee no reentry. */
> +#define START_USE(_vq)   \
> + do {

Re: what should a virtio-mmio transport without a backend look like?

2013-06-20 Thread Pawel Moll
On Thu, 2013-06-20 at 11:29 +0100, Peter Maydell wrote:
> I'm (finally) trying to add virtio-mmio support properly to
> QEMU now Fred has put all the refactoring foundations in place.
> 
> 1. One question I've run into is: what should a virtio-mmio transport
> with no backend look like to the guest OS? The spec as written
> seems to assume that there's always some backend present.
> (The idea is that QEMU might just always instantiate say 8
> mmio transports, and then whether they actually have a
> blk/net/whatever backend depends on user options).
> 
> It looks as if the current linux driver insists (if it sees a
> device tree node) that the MagicValue register at least is
> correct (otherwise it complains). So one possibility would
> be "MagicValue/Version/VendorID must read as usual, DeviceID
> should read as some special "nothing here" value (0?), everything
> else can RAZ/WI".
> 
> We could just say "all RAZ/WI", since this merely causes Linux
> currently to print a warning about the bad magic number, and
> then subsequently make Linux less alarmist about the zero.
> 
> We could also define that the transport should look as if
> there's some sort of 'null' backend plugged in. This would
> be more effort on the qemu side though, I think.

There are two aspects of the problem:

1. Current implementation of the virtio core won't do anything to the
device if the device/vendor IDs don't match with any of the drivers.

2. All that current virtio-mmio implementation will do is:
* read magic
* read device & vendor id
* write page size

So, a device behaving as you mentioned - magic ok, all register read as
zero, all writes ignored, will do exactly what you want.

Now, as to mandating this:

1. We could mandate device ID 0 (zero) as "NOOP". This is in line with
current ID numbers allocation, just needs formalizing at the top level
of the spec.

2. I have no problem with adding the magic/RAZ/WI behaviour to the
current appendix X, however I must say that the "write page size" will
disappear in the next version of the spec (no more page numbers, just
normal 64-bit addresses), so defining device ID as above will cover your
use case, I believe (as in: correct magic == correct device, NOOP device
== no further actions).

> 2. What was the rationale behind prohibiting interrupt
> line sharing between two virtio-mmio transports? ("device
> operation" section of appendix X). Lack of spare IRQs
> turns out to be the major limit on how many transports you
> can have.

Hm. Simplicity was the intention, really, no hidden agenda. I've never
actually tried to have two platform devices sharing an interrupt, so I'm
not sure how will the generic infrastructure behave in such situation.

> (Is there a mailing list I should be asking this question on?)

virtualization@lists.linux-foundation.org (now on copy)

Paweł


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH net] vhost-net: fix use-after-free in vhost_net_flush

2013-06-20 Thread Michael S. Tsirkin
vhost_net_ubuf_put_and_wait has a confusing name:
it will actually also free it's argument.
Thus since commit 1280c27f8e29acf4af2da914e80ec27c3dbd5c01
vhost_net_flush tries to use the argument after passing it
to vhost_net_ubuf_put_and_wait, this results
in use after free.
To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
add an new API for callers that want to free ubufs.

Signed-off-by: Michael S. Tsirkin 
---

Dave, this is needed for stable as well, but
the code has moved a bit since then.
I'll post a patch tweaked to apply against 3.9 and 3.8 separately.

 drivers/vhost/net.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 5c77d6a..534adb0 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -149,6 +149,11 @@ static void vhost_net_ubuf_put_and_wait(struct 
vhost_net_ubuf_ref *ubufs)
 {
kref_put(&ubufs->kref, vhost_net_zerocopy_done_signal);
wait_event(ubufs->wait, !atomic_read(&ubufs->kref.refcount));
+}
+
+static void vhost_net_ubuf_put_wait_and_free(struct vhost_net_ubuf_ref *ubufs)
+{
+   vhost_net_ubuf_put_and_wait(ubufs);
kfree(ubufs);
 }
 
@@ -1073,7 +1078,7 @@ static long vhost_net_set_backend(struct vhost_net *n, 
unsigned index, int fd)
mutex_unlock(&vq->mutex);
 
if (oldubufs) {
-   vhost_net_ubuf_put_and_wait(oldubufs);
+   vhost_net_ubuf_put_wait_and_free(oldubufs);
mutex_lock(&vq->mutex);
vhost_zerocopy_signal_used(n, vq);
mutex_unlock(&vq->mutex);
@@ -1091,7 +1096,7 @@ err_used:
vq->private_data = oldsock;
vhost_net_enable_vq(n, vq);
if (ubufs)
-   vhost_net_ubuf_put_and_wait(ubufs);
+   vhost_net_ubuf_put_wait_and_free(ubufs);
 err_ubufs:
fput(sock->file);
 err_vq:
-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net for-stable] vhost-net: fix use-after-free in vhost_net_flush

2013-06-20 Thread Michael S. Tsirkin
vhost_net_ubuf_put_and_wait has a confusing name:
it will actually also free it's argument.
Thus since commit 1280c27f8e29acf4af2da914e80ec27c3dbd5c01
vhost_net_flush tries to use the argument after passing it
to vhost_net_ubuf_put_and_wait, this results
in use after free.
To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
add an new API for callers that want to free ubufs.

Signed-off-by: Michael S. Tsirkin 
---

I sent the same patch against 3.10 separately.
This one is for stable, against 3.9/3.8 - the code moved around
since then.
Not sending to stable directly since Dave Miller handles
it for net drivers normally.

 drivers/vhost/net.c   | 4 ++--
 drivers/vhost/vhost.c | 5 +
 drivers/vhost/vhost.h | 1 +
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index ec6fb3f..e5ff7a5 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -856,7 +856,7 @@ static long vhost_net_set_backend(struct vhost_net *n, 
unsigned index, int fd)
mutex_unlock(&vq->mutex);
 
if (oldubufs) {
-   vhost_ubuf_put_and_wait(oldubufs);
+   vhost_ubuf_put_wait_and_free(oldubufs);
mutex_lock(&vq->mutex);
vhost_zerocopy_signal_used(n, vq);
mutex_unlock(&vq->mutex);
@@ -874,7 +874,7 @@ err_used:
rcu_assign_pointer(vq->private_data, oldsock);
vhost_net_enable_vq(n, vq);
if (ubufs)
-   vhost_ubuf_put_and_wait(ubufs);
+   vhost_ubuf_put_wait_and_free(ubufs);
 err_ubufs:
fput(sock->file);
 err_vq:
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 9759249..c01d22f 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1581,5 +1581,10 @@ void vhost_ubuf_put_and_wait(struct vhost_ubuf_ref 
*ubufs)
 {
kref_put(&ubufs->kref, vhost_zerocopy_done_signal);
wait_event(ubufs->wait, !atomic_read(&ubufs->kref.refcount));
+}
+
+void vhost_ubuf_put_wait_and_free(struct vhost_ubuf_ref *ubufs)
+{
+   vhost_ubuf_put_and_wait(ubufs);
kfree(ubufs);
 }
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 17261e2..b53cb28 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -63,6 +63,7 @@ struct vhost_ubuf_ref {
 struct vhost_ubuf_ref *vhost_ubuf_alloc(struct vhost_virtqueue *, bool zcopy);
 void vhost_ubuf_put(struct vhost_ubuf_ref *);
 void vhost_ubuf_put_and_wait(struct vhost_ubuf_ref *);
+void vhost_ubuf_put_wait_and_free(struct vhost_ubuf_ref *);
 
 struct ubuf_info;
 
-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH RFC] vhost-net: make more functions static

2013-06-20 Thread Michael S. Tsirkin
Make two more functions static - they only have local callers.

Signed-off-by: Michael S. Tsirkin 
---

Will be merged through the vhost tree - no need to apply
directly.

 drivers/vhost/net.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 534adb0..4c5de87 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -167,7 +167,7 @@ static void vhost_net_clear_ubuf_info(struct vhost_net *n)
}
 }
 
-int vhost_net_set_ubuf_info(struct vhost_net *n)
+static int vhost_net_set_ubuf_info(struct vhost_net *n)
 {
bool zcopy;
int i;
@@ -188,7 +188,7 @@ err:
return -ENOMEM;
 }
 
-void vhost_net_vq_reset(struct vhost_net *n)
+static void vhost_net_vq_reset(struct vhost_net *n)
 {
int i;
 
-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net] vhost-net: fix use-after-free in vhost_net_flush

2013-06-20 Thread Sergei Shtylyov

Hello.

On 20-06-2013 15:48, Michael S. Tsirkin wrote:


vhost_net_ubuf_put_and_wait has a confusing name:
it will actually also free it's argument.
Thus since commit 1280c27f8e29acf4af2da914e80ec27c3dbd5c01


   Please also specify that commit's summary line in parens.


vhost_net_flush tries to use the argument after passing it
to vhost_net_ubuf_put_and_wait, this results
in use after free.
To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
add an new API for callers that want to free ubufs.



Signed-off-by: Michael S. Tsirkin 


WBR, Sergei

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: what should a virtio-mmio transport without a backend look like?

2013-06-20 Thread Christopher Covington
Hi Peter,

On 06/20/2013 07:08 AM, Pawel Moll wrote:
> On Thu, 2013-06-20 at 11:29 +0100, Peter Maydell wrote:
>> I'm (finally) trying to add virtio-mmio support properly to
>> QEMU now Fred has put all the refactoring foundations in place.
>>
>> 1. One question I've run into is: what should a virtio-mmio transport
>> with no backend look like to the guest OS? The spec as written
>> seems to assume that there's always some backend present.
>> (The idea is that QEMU might just always instantiate say 8
>> mmio transports, and then whether they actually have a
>> blk/net/whatever backend depends on user options).
>>
>> It looks as if the current linux driver insists (if it sees a
>> device tree node) that the MagicValue register at least is
>> correct (otherwise it complains). So one possibility would
>> be "MagicValue/Version/VendorID must read as usual, DeviceID
>> should read as some special "nothing here" value (0?), everything
>> else can RAZ/WI".

Might it be reasonably easy to just not enumerate unused transports in the
device tree or kernel parameters?

Regards,
Christopher

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by the Linux Foundation.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 0/3] deprecate usage of pci_scan_bus_parented()

2013-06-20 Thread Jiang Liu
From: Jiang Liu 

This patch tries to deprecate usage of pci_scan_bus_parented().
It applies to
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next

Jiang Liu (3):
  PCI: export three functions to support modular host bridge driver
  PCI, xen-pcifront: use new PCI interfaces to simplify implementation
  PCI: mark pci_scan_bus_parented() as __deprecated

 arch/tile/kernel/pci.c |  3 --
 drivers/pci/host-bridge.c  |  1 +
 drivers/pci/probe.c|  1 +
 drivers/pci/remove.c   |  7 +
 drivers/pci/xen-pcifront.c | 70 --
 include/linux/pci.h|  5 ++--
 6 files changed, 42 insertions(+), 45 deletions(-)

-- 
1.8.1.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 1/3] PCI: export three functions to support modular host bridge driver

2013-06-20 Thread Jiang Liu
From: Jiang Liu 

The xen-pcifront host bridge driver could be built as a module,
so export pci_create_root_bus(), pci_stop_and_remove_root_bus() and
pci_set_host_bridge_release() to support modular host bridge drivers.
This patch is a preparation for coming xen-pcifront refinement.

Signed-off-by: Jiang Liu 
Cc: Konrad Rzeszutek Wilk 
Cc: Jeremy Fitzhardinge 
Cc: xen-de...@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
---
 drivers/pci/host-bridge.c | 1 +
 drivers/pci/probe.c   | 1 +
 drivers/pci/remove.c  | 7 +++
 include/linux/pci.h   | 1 +
 4 files changed, 10 insertions(+)

diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index a68dc61..6e390a6 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -34,6 +34,7 @@ void pci_set_host_bridge_release(struct pci_host_bridge 
*bridge,
bridge->release_fn = release_fn;
bridge->release_data = release_data;
 }
+EXPORT_SYMBOL(pci_set_host_bridge_release);
 
 static bool resource_contains(struct resource *res1, struct resource *res2)
 {
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 46ada5c..ed768d8 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1801,6 +1801,7 @@ err_out:
kfree(b);
return NULL;
 }
+EXPORT_SYMBOL(pci_create_root_bus);
 
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
 {
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 8fc54b7..f328668 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -147,3 +147,10 @@ void pci_remove_root_bus(struct pci_bus *bus)
/* remove the host bridge */
put_device(&host_bridge->dev);
 }
+
+void pci_stop_and_remove_root_bus(struct pci_bus *bus)
+{
+   pci_stop_root_bus(bus);
+   pci_remove_root_bus(bus);
+}
+EXPORT_SYMBOL(pci_stop_and_remove_root_bus);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0fd1f15..f1229c7 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -757,6 +757,7 @@ void pci_remove_bus(struct pci_bus *b);
 void pci_stop_and_remove_bus_device(struct pci_dev *dev);
 void pci_stop_root_bus(struct pci_bus *bus);
 void pci_remove_root_bus(struct pci_bus *bus);
+void pci_stop_and_remove_root_bus(struct pci_bus *bus);
 void pci_setup_cardbus(struct pci_bus *bus);
 void pci_sort_breadthfirst(void);
 #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
-- 
1.8.1.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 2/3] PCI, xen-pcifront: use new PCI interfaces to simplify implementation

2013-06-20 Thread Jiang Liu
From: Jiang Liu 

Use new PCI interfaces to simplify xen-pcifront implementation:
1) Use pci_create_root_bus() instead of pci_scan_bus_parented()
   because pci_scan_bus_parented() will be marked as __deprecated.
   This also gets rid of a duplicated call to pci_bus_start_devices().
2) Use pci_stop_and_remove_root_bus() instead of open-coded private
   implementation.
3) Use pci_set_host_bridge_release() to release data structures
   associated with PCI root buses.

Signed-off-by: Jiang Liu 
Cc: Konrad Rzeszutek Wilk 
Cc: Jeremy Fitzhardinge 
Cc: xen-de...@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
---
 drivers/pci/xen-pcifront.c | 70 --
 1 file changed, 30 insertions(+), 40 deletions(-)

diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index 966abc6..1b01960 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -25,11 +25,6 @@
 #define INVALID_GRANT_REF (0)
 #define INVALID_EVTCHN(-1)
 
-struct pci_bus_entry {
-   struct list_head list;
-   struct pci_bus *bus;
-};
-
 #define _PDEVB_op_active   (0)
 #define PDEVB_op_active(1 << (_PDEVB_op_active))
 
@@ -47,12 +42,12 @@ struct pcifront_device {
struct xen_pci_sharedinfo *sh_info;
struct work_struct op_work;
unsigned long flags;
-
 };
 
 struct pcifront_sd {
int domain;
struct pcifront_device *pdev;
+   struct resource busn_res;
 };
 
 static inline struct pcifront_device *
@@ -67,6 +62,12 @@ static inline void pcifront_init_sd(struct pcifront_sd *sd,
 {
sd->domain = domain;
sd->pdev = pdev;
+
+   /* Xen pci-backend doesn't export P2P bridges */
+   sd->busn_res.start = bus;
+   sd->busn_res.end = bus;
+   sd->busn_res.flags = IORESOURCE_BUS;
+   sd->busn_res.name = "PCI busn";
 }
 
 static DEFINE_SPINLOCK(pcifront_dev_lock);
@@ -441,12 +442,19 @@ static int pcifront_scan_bus(struct pcifront_device *pdev,
return 0;
 }
 
+static void pcifront_release_sd(struct pci_host_bridge *bridge)
+{
+   struct pcifront_sd *sd = bridge->release_data;
+
+   kfree(sd);
+}
+
 static int pcifront_scan_root(struct pcifront_device *pdev,
 unsigned int domain, unsigned int bus)
 {
struct pci_bus *b;
struct pcifront_sd *sd = NULL;
-   struct pci_bus_entry *bus_entry = NULL;
+   LIST_HEAD(resources);
int err = 0;
 
 #ifndef CONFIG_PCI_DOMAINS
@@ -463,16 +471,18 @@ static int pcifront_scan_root(struct pcifront_device 
*pdev,
dev_info(&pdev->xdev->dev, "Creating PCI Frontend Bus %04x:%02x\n",
 domain, bus);
 
-   bus_entry = kmalloc(sizeof(*bus_entry), GFP_KERNEL);
-   sd = kmalloc(sizeof(*sd), GFP_KERNEL);
-   if (!bus_entry || !sd) {
+   sd = kzalloc(sizeof(*sd), GFP_KERNEL);
+   if (!sd) {
err = -ENOMEM;
goto err_out;
}
pcifront_init_sd(sd, domain, bus, pdev);
 
-   b = pci_scan_bus_parented(&pdev->xdev->dev, bus,
- &pcifront_bus_ops, sd);
+   pci_add_resource(&resources, &ioport_resource);
+   pci_add_resource(&resources, &iomem_resource);
+   pci_add_resource(&resources, &sd->busn_res);
+   b = pci_create_root_bus(&pdev->xdev->dev, bus, &pcifront_bus_ops,
+   sd, &resources);
if (!b) {
dev_err(&pdev->xdev->dev,
"Error creating PCI Frontend Bus!\n");
@@ -480,9 +490,8 @@ static int pcifront_scan_root(struct pcifront_device *pdev,
goto err_out;
}
 
-   bus_entry->bus = b;
-
-   list_add(&bus_entry->list, &pdev->root_buses);
+   pci_set_host_bridge_release(to_pci_host_bridge(b->bridge),
+   pcifront_release_sd, sd);
 
/* pci_scan_bus_parented skips devices which do not have a have
* devfn==0. The pcifront_scan_bus enumerates all devfn. */
@@ -497,7 +506,6 @@ static int pcifront_scan_root(struct pcifront_device *pdev,
return err;
 
 err_out:
-   kfree(bus_entry);
kfree(sd);
 
return err;
@@ -538,35 +546,17 @@ static int pcifront_rescan_root(struct pcifront_device 
*pdev,
return err;
 }
 
-static void free_root_bus_devs(struct pci_bus *bus)
-{
-   struct pci_dev *dev;
-
-   while (!list_empty(&bus->devices)) {
-   dev = container_of(bus->devices.next, struct pci_dev,
-  bus_list);
-   dev_dbg(&dev->dev, "removing device\n");
-   pci_stop_and_remove_bus_device(dev);
-   }
-}
-
 static void pcifront_free_roots(struct pcifront_device *pdev)
 {
-   struct pci_bus_entry *bus_entry, *t;
+   struct pcifront_sd *sd;
+   struct pci_bus *bus, *temp;
 
dev_dbg(&pdev->xdev->dev, "cleaning u

[PATCH 3/3] PCI: mark pci_scan_bus_parented() as __deprecated

2013-06-20 Thread Jiang Liu
From: Jiang Liu 

Mark pci_scan_bus_parented() as __deprecated and clean up outdated
comments.

Signed-off-by: Jiang Liu 
Cc: Chris Metcalf 
Cc: Greg Kroah-Hartman 
Cc: Thierry Reding 
Cc: linux-ker...@vger.kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/tile/kernel/pci.c | 3 ---
 include/linux/pci.h| 4 ++--
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/tile/kernel/pci.c b/arch/tile/kernel/pci.c
index 67237d3..936e087 100644
--- a/arch/tile/kernel/pci.c
+++ b/arch/tile/kernel/pci.c
@@ -309,9 +309,6 @@ int __init pcibios_init(void)
 *
 * It reads the PCI tree for this bus into the Linux
 * data structures.
-*
-* This is inlined in linux/pci.h and calls into
-* pci_scan_bus_parented() in probe.c.
 */
pci_add_resource(&resources, &ioport_resource);
pci_add_resource(&resources, &iomem_resource);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f1229c7..b72d275 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -720,8 +720,8 @@ void pcibios_bus_to_resource(struct pci_dev *dev, struct 
resource *res,
 void pcibios_scan_specific_bus(int busn);
 struct pci_bus *pci_find_bus(int domain, int busnr);
 void pci_bus_add_devices(const struct pci_bus *bus);
-struct pci_bus *pci_scan_bus_parented(struct device *parent, int bus,
- struct pci_ops *ops, void *sysdata);
+struct pci_bus * __deprecated pci_scan_bus_parented(struct device *parent,
+   int bus, struct pci_ops *ops, void *sysdata);
 struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
 struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
struct pci_ops *ops, void *sysdata,
-- 
1.8.1.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 3/3] PCI: mark pci_scan_bus_parented() as __deprecated

2013-06-20 Thread Greg Kroah-Hartman
On Fri, Jun 21, 2013 at 01:01:05AM +0800, Jiang Liu wrote:
> From: Jiang Liu 
> 
> Mark pci_scan_bus_parented() as __deprecated and clean up outdated
> comments.

Why not just delete the function, if no in-kernel users are calling it,
it's no longer needed at all.

thanks,

greg k-h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 3/3] PCI: mark pci_scan_bus_parented() as __deprecated

2013-06-20 Thread Jiang Liu
On 06/21/2013 01:08 AM, Greg Kroah-Hartman wrote:
> On Fri, Jun 21, 2013 at 01:01:05AM +0800, Jiang Liu wrote:
>> From: Jiang Liu 
>>
>> Mark pci_scan_bus_parented() as __deprecated and clean up outdated
>> comments.
> 
> Why not just delete the function, if no in-kernel users are calling it,
> it's no longer needed at all.
Hi Greg,
I thought that may break out of tree drivers, so give a warning first
for smooth transitions. Any guidelines here? I have some other similar
cases to keep some exported symbols just for out of tree drivers.
Regards!
Gerry

> 
> thanks,
> 
> greg k-h
> 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 3/3] PCI: mark pci_scan_bus_parented() as __deprecated

2013-06-20 Thread Greg Kroah-Hartman
On Fri, Jun 21, 2013 at 01:14:23AM +0800, Jiang Liu wrote:
> On 06/21/2013 01:08 AM, Greg Kroah-Hartman wrote:
> > On Fri, Jun 21, 2013 at 01:01:05AM +0800, Jiang Liu wrote:
> >> From: Jiang Liu 
> >>
> >> Mark pci_scan_bus_parented() as __deprecated and clean up outdated
> >> comments.
> > 
> > Why not just delete the function, if no in-kernel users are calling it,
> > it's no longer needed at all.
> Hi Greg,
> I thought that may break out of tree drivers, so give a warning first
> for smooth transitions. Any guidelines here? I have some other similar
> cases to keep some exported symbols just for out of tree drivers.

Don't care about out-of-tree drivers, as they obviously don't care about
you, or the in-kernel code.  You are doing no one any favors by keeping
these functions around for a while, only delaying the time that these
out-of-tree drivers will have to be updated, they will not be updated by
a mere __depreciated warning.

So just delete them, that's what the rest of the kernel does, it's the
price that out-of-tree drivers pay, and the authors of them know this
quite well.

thanks,

greg k-h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-pci: fix leaks of msix_affinity_masks

2013-06-20 Thread Jason Wang
On 06/20/2013 01:36 PM, Andrey Vagin wrote:
> From: Andrew Vagin 
>
> vp_dev->msix_vectors should be initialized before allocating
> msix_affinity_masks, otherwise vp_free_vectors will not free these
> objects.
>
> unreferenced object 0x88010f969d88 (size 512):
>   comm "systemd-udevd", pid 158, jiffies 4294673645 (age 80.545s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace:
> [] kmemleak_alloc+0x5e/0xc0
> [] kmem_cache_alloc_node_trace+0x141/0x2c0
> [] alloc_cpumask_var_node+0x23/0x80
> [] alloc_cpumask_var+0xe/0x10
> [] vp_try_to_find_vqs+0x25d/0x810
> [] vp_find_vqs+0x81/0xb0
> [] init_vqs+0x85/0x120 [virtio_balloon]
> [] virtballoon_probe+0xf9/0x1a0 [virtio_balloon]
> [] virtio_dev_probe+0xde/0x140
> [] driver_probe_device+0x98/0x3a0
> [] __driver_attach+0xab/0xb0
> [] bus_for_each_dev+0x94/0xb0
> [] driver_attach+0x1e/0x20
> [] bus_add_driver+0x200/0x280
> [] driver_register+0x74/0x160
> [] register_virtio_driver+0x20/0x40
>
> v2: change msix_vectors uncoditionaly in vp_free_vectors
>
> Cc: Rusty Russell 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Signed-off-by: Andrew Vagin 
> Signed-off-by: Andrey Vagin 
> ---
>  drivers/virtio/virtio_pci.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
> index a7ce730..1aba255 100644
> --- a/drivers/virtio/virtio_pci.c
> +++ b/drivers/virtio/virtio_pci.c
> @@ -289,9 +289,9 @@ static void vp_free_vectors(struct virtio_device *vdev)
>  
>   pci_disable_msix(vp_dev->pci_dev);
>   vp_dev->msix_enabled = 0;
> - vp_dev->msix_vectors = 0;
>   }
>  
> + vp_dev->msix_vectors = 0;
>   vp_dev->msix_used_vectors = 0;
>   kfree(vp_dev->msix_names);
>   vp_dev->msix_names = NULL;
> @@ -309,6 +309,8 @@ static int vp_request_msix_vectors(struct virtio_device 
> *vdev, int nvectors,
>   unsigned i, v;
>   int err = -ENOMEM;
>  
> + vp_dev->msix_vectors = nvectors;
> +
>   vp_dev->msix_entries = kmalloc(nvectors * sizeof *vp_dev->msix_entries,
>  GFP_KERNEL);
>   if (!vp_dev->msix_entries)
> @@ -336,7 +338,6 @@ static int vp_request_msix_vectors(struct virtio_device 
> *vdev, int nvectors,
>   err = -ENOSPC;
>   if (err)
>   goto error;
> - vp_dev->msix_vectors = nvectors;
>   vp_dev->msix_enabled = 1;
>  
>   /* Set the vector used for configuration */
Acked-by: Jason Wang 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net] vhost-net: fix use-after-free in vhost_net_flush

2013-06-20 Thread Jason Wang
On 06/20/2013 07:48 PM, Michael S. Tsirkin wrote:
> vhost_net_ubuf_put_and_wait has a confusing name:
> it will actually also free it's argument.
> Thus since commit 1280c27f8e29acf4af2da914e80ec27c3dbd5c01
> vhost_net_flush tries to use the argument after passing it
> to vhost_net_ubuf_put_and_wait, this results
> in use after free.
> To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
> add an new API for callers that want to free ubufs.
>
> Signed-off-by: Michael S. Tsirkin 
> ---
>
> Dave, this is needed for stable as well, but
> the code has moved a bit since then.
> I'll post a patch tweaked to apply against 3.9 and 3.8 separately.
>
>  drivers/vhost/net.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 5c77d6a..534adb0 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -149,6 +149,11 @@ static void vhost_net_ubuf_put_and_wait(struct 
> vhost_net_ubuf_ref *ubufs)
>  {
>   kref_put(&ubufs->kref, vhost_net_zerocopy_done_signal);
>   wait_event(ubufs->wait, !atomic_read(&ubufs->kref.refcount));
> +}
> +
> +static void vhost_net_ubuf_put_wait_and_free(struct vhost_net_ubuf_ref 
> *ubufs)
> +{
> + vhost_net_ubuf_put_and_wait(ubufs);
>   kfree(ubufs);
>  }
>  
> @@ -1073,7 +1078,7 @@ static long vhost_net_set_backend(struct vhost_net *n, 
> unsigned index, int fd)
>   mutex_unlock(&vq->mutex);
>  
>   if (oldubufs) {
> - vhost_net_ubuf_put_and_wait(oldubufs);
> + vhost_net_ubuf_put_wait_and_free(oldubufs);
>   mutex_lock(&vq->mutex);
>   vhost_zerocopy_signal_used(n, vq);
>   mutex_unlock(&vq->mutex);
> @@ -1091,7 +1096,7 @@ err_used:
>   vq->private_data = oldsock;
>   vhost_net_enable_vq(n, vq);
>   if (ubufs)
> - vhost_net_ubuf_put_and_wait(ubufs);
> + vhost_net_ubuf_put_wait_and_free(ubufs);
>  err_ubufs:
>   fput(sock->file);
>  err_vq:

Acked-by: Jason Wang 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization