[PATCH AUTOSEL 4.9 13/16] virtio-blk: limit number of hw queues by nr_cpu_ids

2019-04-26 Thread Sasha Levin
From: Dongli Zhang 

[ Upstream commit bf348f9b78d413e75bb079462751a1d86b6de36c ]

When tag_set->nr_maps is 1, the block layer limits the number of hw queues
to nr_cpu_ids. Since virtio-blk has tag_set->nr_maps == 1, it can use at
most nr_cpu_ids hw queues no matter how many it asks for.

In addition, specifically in the pci scenario, when the 'num-queues'
specified by qemu is larger than maxcpus, virtio-blk is not able to
allocate more than maxcpus vectors and therefore cannot have one vector
per queue. As a result, it falls back to MSI-X with one vector for config
and one shared by all queues.

For the above reasons, this patch limits the number of hw queues used by
virtio-blk to nr_cpu_ids.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Dongli Zhang 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/virtio_blk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 10332c24f961..44ef1d66caa6 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -392,6 +392,8 @@ static int init_vq(struct virtio_blk *vblk)
if (err)
num_vqs = 1;
 
+   num_vqs = min_t(unsigned int, nr_cpu_ids, num_vqs);
+
vblk->vqs = kmalloc_array(num_vqs, sizeof(*vblk->vqs), GFP_KERNEL);
if (!vblk->vqs)
return -ENOMEM;
-- 
2.19.1
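
For illustration, a minimal sketch of the clamp being added (the values
below are hypothetical, not part of the patch): min_t() compares both
arguments as the given type and yields the smaller one, so a device
offering more queues than there are possible CPUs gets capped.

    unsigned int num_vqs = 8;   /* hypothetical: num-queues=8 from qemu */

    /* nr_cpu_ids is the kernel's count of possible CPUs, say 4 here */
    num_vqs = min_t(unsigned int, nr_cpu_ids, num_vqs);
    /* num_vqs == 4 now: one MSI-X vector per queue fits within maxcpus */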



[PATCH AUTOSEL 4.14 26/32] virtio-blk: limit number of hw queues by nr_cpu_ids

2019-04-26 Thread Sasha Levin
From: Dongli Zhang 

[ Upstream commit bf348f9b78d413e75bb079462751a1d86b6de36c ]

When tag_set->nr_maps is 1, the block layer limits the number of hw queues
to nr_cpu_ids. Since virtio-blk has tag_set->nr_maps == 1, it can use at
most nr_cpu_ids hw queues no matter how many it asks for.

In addition, specifically in the pci scenario, when the 'num-queues'
specified by qemu is larger than maxcpus, virtio-blk is not able to
allocate more than maxcpus vectors and therefore cannot have one vector
per queue. As a result, it falls back to MSI-X with one vector for config
and one shared by all queues.

For the above reasons, this patch limits the number of hw queues used by
virtio-blk to nr_cpu_ids.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Dongli Zhang 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/virtio_blk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 68846897d213..8767401f75e0 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -437,6 +437,8 @@ static int init_vq(struct virtio_blk *vblk)
if (err)
num_vqs = 1;
 
+   num_vqs = min_t(unsigned int, nr_cpu_ids, num_vqs);
+
vblk->vqs = kmalloc_array(num_vqs, sizeof(*vblk->vqs), GFP_KERNEL);
if (!vblk->vqs)
return -ENOMEM;
-- 
2.19.1



[PATCH AUTOSEL 4.14 20/32] virtio_pci: fix a NULL pointer reference in vp_del_vqs

2019-04-26 Thread Sasha Levin
From: Longpeng 

[ Upstream commit 6a8aae68c87349dbbcd46eac380bc43cdb98a13b ]

If the allocation of msix_affinity_masks fails, we then try to free some
resources in vp_free_vectors() that access it directly, which leads to a
NULL pointer dereference.

We met the following stack in our production:
[   29.296767] BUG: unable to handle kernel NULL pointer dereference at  (null)
[   29.311151] IP: [] vp_free_vectors+0x6a/0x150 [virtio_pci]
[   29.324787] PGD 0
[   29.333224] Oops:  [#1] SMP
[...]
[   29.425175] RIP: 0010:[]  [] 
vp_free_vectors+0x6a/0x150 [virtio_pci]
[   29.441405] RSP: 0018:9a55c2dcfa10  EFLAGS: 00010206
[   29.453491] RAX:  RBX: 9a55c322c400 RCX: 
[   29.467488] RDX:  RSI:  RDI: 9a55c322c400
[   29.481461] RBP: 9a55c2dcfa20 R08:  R09: c1b6806ff020
[   29.495427] R10: 0e95 R11: 00aa R12: 
[   29.509414] R13: 0001 R14: 9a55bd2d9e98 R15: 9a55c322c400
[   29.523407] FS:  7fdcba69f8c0() GS:9a55c284() 
knlGS:
[   29.538472] CS:  0010 DS:  ES:  CR0: 80050033
[   29.551621] CR2:  CR3: 3ce52000 CR4: 003607a0
[   29.565886] DR0:  DR1:  DR2: 
[   29.580055] DR3:  DR6: fffe0ff0 DR7: 0400
[   29.594122] Call Trace:
[   29.603446]  [] vp_request_msix_vectors+0xe2/0x260 
[virtio_pci]
[   29.618017]  [] vp_try_to_find_vqs+0x95/0x3b0 [virtio_pci]
[   29.632152]  [] vp_find_vqs+0x37/0xb0 [virtio_pci]
[   29.645582]  [] init_vq+0x153/0x260 [virtio_blk]
[   29.658831]  [] virtblk_probe+0xe8/0x87f [virtio_blk]
[...]

Cc: Gonglei 
Signed-off-by: Longpeng 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Gonglei 
Signed-off-by: Sasha Levin 
---
 drivers/virtio/virtio_pci_common.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 1c4797e53f68..80a3704939cd 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -254,9 +254,11 @@ void vp_del_vqs(struct virtio_device *vdev)
for (i = 0; i < vp_dev->msix_used_vectors; ++i)
free_irq(pci_irq_vector(vp_dev->pci_dev, i), vp_dev);
 
-   for (i = 0; i < vp_dev->msix_vectors; i++)
-   if (vp_dev->msix_affinity_masks[i])
-   free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+   if (vp_dev->msix_affinity_masks) {
+   for (i = 0; i < vp_dev->msix_vectors; i++)
+   if (vp_dev->msix_affinity_masks[i])
+   free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+   }
 
if (vp_dev->msix_enabled) {
/* Disable the vector used for configuration */
-- 
2.19.1
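
The pattern at fault, reduced to its essentials (a sketch with
hypothetical names, not the driver code): cleanup that is shared between
the error path and the normal path must tolerate the very allocation it
tears down being NULL.

    /* before: crashes if the masks array itself was never allocated */
    for (i = 0; i < nvec; i++)
            if (masks[i])           /* NULL pointer dereference here */
                    free_cpumask_var(masks[i]);

    /* after: guard the whole loop on the array having been allocated */
    if (masks) {
            for (i = 0; i < nvec; i++)
                    if (masks[i])
                            free_cpumask_var(masks[i]);
    }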



[PATCH AUTOSEL 4.19 44/53] virtio-blk: limit number of hw queues by nr_cpu_ids

2019-04-26 Thread Sasha Levin
From: Dongli Zhang 

[ Upstream commit bf348f9b78d413e75bb079462751a1d86b6de36c ]

When tag_set->nr_maps is 1, the block layer limits the number of hw queues
to nr_cpu_ids. Since virtio-blk has tag_set->nr_maps == 1, it can use at
most nr_cpu_ids hw queues no matter how many it asks for.

In addition, specifically in the pci scenario, when the 'num-queues'
specified by qemu is larger than maxcpus, virtio-blk is not able to
allocate more than maxcpus vectors and therefore cannot have one vector
per queue. As a result, it falls back to MSI-X with one vector for config
and one shared by all queues.

For the above reasons, this patch limits the number of hw queues used by
virtio-blk to nr_cpu_ids.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Dongli Zhang 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/virtio_blk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 23752dc99b00..dd64f586679e 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -446,6 +446,8 @@ static int init_vq(struct virtio_blk *vblk)
if (err)
num_vqs = 1;
 
+   num_vqs = min_t(unsigned int, nr_cpu_ids, num_vqs);
+
vblk->vqs = kmalloc_array(num_vqs, sizeof(*vblk->vqs), GFP_KERNEL);
if (!vblk->vqs)
return -ENOMEM;
-- 
2.19.1



[PATCH AUTOSEL 4.19 37/53] virtio_pci: fix a NULL pointer reference in vp_del_vqs

2019-04-26 Thread Sasha Levin
From: Longpeng 

[ Upstream commit 6a8aae68c87349dbbcd46eac380bc43cdb98a13b ]

If the allocation of msix_affinity_masks fails, we then try to free some
resources in vp_free_vectors() that access it directly, which leads to a
NULL pointer dereference.

We met the following stack in our production:
[   29.296767] BUG: unable to handle kernel NULL pointer dereference at  (null)
[   29.311151] IP: [] vp_free_vectors+0x6a/0x150 [virtio_pci]
[   29.324787] PGD 0
[   29.333224] Oops:  [#1] SMP
[...]
[   29.425175] RIP: 0010:[]  [] 
vp_free_vectors+0x6a/0x150 [virtio_pci]
[   29.441405] RSP: 0018:9a55c2dcfa10  EFLAGS: 00010206
[   29.453491] RAX:  RBX: 9a55c322c400 RCX: 
[   29.467488] RDX:  RSI:  RDI: 9a55c322c400
[   29.481461] RBP: 9a55c2dcfa20 R08:  R09: c1b6806ff020
[   29.495427] R10: 0e95 R11: 00aa R12: 
[   29.509414] R13: 0001 R14: 9a55bd2d9e98 R15: 9a55c322c400
[   29.523407] FS:  7fdcba69f8c0() GS:9a55c284() 
knlGS:
[   29.538472] CS:  0010 DS:  ES:  CR0: 80050033
[   29.551621] CR2:  CR3: 3ce52000 CR4: 003607a0
[   29.565886] DR0:  DR1:  DR2: 
[   29.580055] DR3:  DR6: fffe0ff0 DR7: 0400
[   29.594122] Call Trace:
[   29.603446]  [] vp_request_msix_vectors+0xe2/0x260 
[virtio_pci]
[   29.618017]  [] vp_try_to_find_vqs+0x95/0x3b0 [virtio_pci]
[   29.632152]  [] vp_find_vqs+0x37/0xb0 [virtio_pci]
[   29.645582]  [] init_vq+0x153/0x260 [virtio_blk]
[   29.658831]  [] virtblk_probe+0xe8/0x87f [virtio_blk]
[...]

Cc: Gonglei 
Signed-off-by: Longpeng 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Gonglei 
Signed-off-by: Sasha Levin 
---
 drivers/virtio/virtio_pci_common.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 465a6f5142cc..45b04bc91f24 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -255,9 +255,11 @@ void vp_del_vqs(struct virtio_device *vdev)
for (i = 0; i < vp_dev->msix_used_vectors; ++i)
free_irq(pci_irq_vector(vp_dev->pci_dev, i), vp_dev);
 
-   for (i = 0; i < vp_dev->msix_vectors; i++)
-   if (vp_dev->msix_affinity_masks[i])
-   free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+   if (vp_dev->msix_affinity_masks) {
+   for (i = 0; i < vp_dev->msix_vectors; i++)
+   if (vp_dev->msix_affinity_masks[i])
+   free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+   }
 
if (vp_dev->msix_enabled) {
/* Disable the vector used for configuration */
-- 
2.19.1



[PATCH AUTOSEL 5.0 65/79] virtio-blk: limit number of hw queues by nr_cpu_ids

2019-04-26 Thread Sasha Levin
From: Dongli Zhang 

[ Upstream commit bf348f9b78d413e75bb079462751a1d86b6de36c ]

When tag_set->nr_maps is 1, the block layer limits the number of hw queues
to nr_cpu_ids. Since virtio-blk has tag_set->nr_maps == 1, it can use at
most nr_cpu_ids hw queues no matter how many it asks for.

In addition, specifically in the pci scenario, when the 'num-queues'
specified by qemu is larger than maxcpus, virtio-blk is not able to
allocate more than maxcpus vectors and therefore cannot have one vector
per queue. As a result, it falls back to MSI-X with one vector for config
and one shared by all queues.

For the above reasons, this patch limits the number of hw queues used by
virtio-blk to nr_cpu_ids.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Dongli Zhang 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/block/virtio_blk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index b16a887bbd02..29bede887237 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -513,6 +513,8 @@ static int init_vq(struct virtio_blk *vblk)
if (err)
num_vqs = 1;
 
+   num_vqs = min_t(unsigned int, nr_cpu_ids, num_vqs);
+
vblk->vqs = kmalloc_array(num_vqs, sizeof(*vblk->vqs), GFP_KERNEL);
if (!vblk->vqs)
return -ENOMEM;
-- 
2.19.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 5.0 48/79] virtio_pci: fix a NULL pointer reference in vp_del_vqs

2019-04-26 Thread Sasha Levin
From: Longpeng 

[ Upstream commit 6a8aae68c87349dbbcd46eac380bc43cdb98a13b ]

If the allocation of msix_affinity_masks fails, we then try to free some
resources in vp_free_vectors() that access it directly, which leads to a
NULL pointer dereference.

We met the following stack in our production:
[   29.296767] BUG: unable to handle kernel NULL pointer dereference at  (null)
[   29.311151] IP: [] vp_free_vectors+0x6a/0x150 [virtio_pci]
[   29.324787] PGD 0
[   29.333224] Oops:  [#1] SMP
[...]
[   29.425175] RIP: 0010:[]  [] 
vp_free_vectors+0x6a/0x150 [virtio_pci]
[   29.441405] RSP: 0018:9a55c2dcfa10  EFLAGS: 00010206
[   29.453491] RAX:  RBX: 9a55c322c400 RCX: 
[   29.467488] RDX:  RSI:  RDI: 9a55c322c400
[   29.481461] RBP: 9a55c2dcfa20 R08:  R09: c1b6806ff020
[   29.495427] R10: 0e95 R11: 00aa R12: 
[   29.509414] R13: 0001 R14: 9a55bd2d9e98 R15: 9a55c322c400
[   29.523407] FS:  7fdcba69f8c0() GS:9a55c284() 
knlGS:
[   29.538472] CS:  0010 DS:  ES:  CR0: 80050033
[   29.551621] CR2:  CR3: 3ce52000 CR4: 003607a0
[   29.565886] DR0:  DR1:  DR2: 
[   29.580055] DR3:  DR6: fffe0ff0 DR7: 0400
[   29.594122] Call Trace:
[   29.603446]  [] vp_request_msix_vectors+0xe2/0x260 
[virtio_pci]
[   29.618017]  [] vp_try_to_find_vqs+0x95/0x3b0 [virtio_pci]
[   29.632152]  [] vp_find_vqs+0x37/0xb0 [virtio_pci]
[   29.645582]  [] init_vq+0x153/0x260 [virtio_blk]
[   29.658831]  [] virtblk_probe+0xe8/0x87f [virtio_blk]
[...]

Cc: Gonglei 
Signed-off-by: Longpeng 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Gonglei 
Signed-off-by: Sasha Levin 
---
 drivers/virtio/virtio_pci_common.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index d0584c040c60..7a0398bb84f7 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -255,9 +255,11 @@ void vp_del_vqs(struct virtio_device *vdev)
for (i = 0; i < vp_dev->msix_used_vectors; ++i)
free_irq(pci_irq_vector(vp_dev->pci_dev, i), vp_dev);
 
-   for (i = 0; i < vp_dev->msix_vectors; i++)
-   if (vp_dev->msix_affinity_masks[i])
-   free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+   if (vp_dev->msix_affinity_masks) {
+   for (i = 0; i < vp_dev->msix_vectors; i++)
+   if (vp_dev->msix_affinity_masks[i])
+   free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+   }
 
if (vp_dev->msix_enabled) {
/* Disable the vector used for configuration */
-- 
2.19.1



Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

2019-04-26 Thread Thiago Jung Bauermann


Michael S. Tsirkin  writes:

> On Wed, Apr 24, 2019 at 10:01:56PM -0300, Thiago Jung Bauermann wrote:
>>
>> Michael S. Tsirkin  writes:
>>
>> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote:
>> >>
>> >> Michael S. Tsirkin  writes:
>> >>
>> >> > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote:
>> >> >>
>> >> >> Michael S. Tsirkin  writes:
>> >> >>
>> >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann 
>> >> >> > wrote:
>> >> >> >> From what I understand of the ACCESS_PLATFORM definition, the host
>> >> >> >> will only ever try to access memory addresses that are supplied to
>> >> >> >> it by the guest, so all of the secure guest memory that the host
>> >> >> >> cares about is accessible:
>> >> >> >> If this feature bit is set to 0, then the device has same
>> >> >> >> access to memory addresses supplied to it as the driver has.
>> >> >> >> In particular, the device will always use physical addresses
>> >> >> >> matching addresses used by the driver (typically meaning
>> >> >> >> physical addresses used by the CPU) and not translated further,
>> >> >> >> and can access any address supplied to it by the driver. When
>> >> >> >> clear, this overrides any platform-specific description of
>> >> >> >> whether device access is limited or translated in any way, e.g.
>> >> >> >> whether an IOMMU may be present.
>> >> >> >>
>> >> >> >> All of the above is true for POWER guests, whether they are secure
>> >> >> >> guests or not.
>> >> >> >>
>> >> >> >> Or are you saying that a virtio device may want to access memory
>> >> >> >> addresses that weren't supplied to it by the driver?
>> >> >> >
>> >> >> > Your logic would apply to IOMMUs as well.  For your mode, there are
>> >> >> > specific encrypted memory regions that driver has access to but
>> >> >> > device does not. that seems to violate the constraint.
>> >> >>
>> >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that
>> >> >> the device can ignore the IOMMU for all practical purposes I would
>> >> >> indeed say that the logic would apply to IOMMUs as well. :-)
>> >> >>
>> >> >> I guess I'm still struggling with the purpose of signalling to the
>> >> >> driver that the host may not have access to memory addresses that it
>> >> >> will never try to access.
>> >> >
>> >> > For example, one of the benefits is to signal to host that driver does
>> >> > not expect ability to access all memory. If it does, host can
>> >> > fail initialization gracefully.
>> >>
>> >> But why would the ability to access all memory be necessary or even
>> >> useful? When would the host access memory that the driver didn't tell it
>> >> to access?
>> >
>> > When I say all memory I mean even memory not allowed by the IOMMU.
>>
>> Yes, but why? How is that memory relevant?
>
> It's relevant when driver is not trusted to only supply correct
> addresses. The feature was originally designed to support userspace
> drivers within guests.

Ah, thanks for clarifying. I don't think that's a problem in our case.
If the guest provides an incorrect address, the hardware simply won't
allow the host to access it.

>> >> >> > Another idea is maybe something like virtio-iommu?
>> >> >>
>> >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU
>> >> >> bypass? If so, it's an interesting idea for new guests but it doesn't
>> >> >> help with guests that are out today in the field, which don't have
>> >> >> a virtio-iommu driver.
>> >> >
>> >> > I presume legacy guests don't use encrypted memory so why do we
>> >> > worry about them at all?
>> >>
>> >> They don't use encrypted memory, but a host machine will run a mix of
>> >> secure and legacy guests. And since the hypervisor doesn't know whether
>> >> a guest will be secure or not at the time it is launched, legacy guests
>> >> will have to be launched with the same configuration as secure guests.
>> >
>> > OK and so I think the issue is that hosts generally fail if they set
>> > ACCESS_PLATFORM and guests do not negotiate it.
>> > So you can not just set ACCESS_PLATFORM for everyone.
>> > Is that the issue here?
>>
>> Yes, that is one half of the issue. The other is that even if hosts
>> didn't fail, existing legacy guests wouldn't "take the initiative" of
>> not negotiating ACCESS_PLATFORM to get the improved performance. They'd
>> have to be modified to do that.
>
> So there's a non-encrypted guest, hypervisor wants to set
> ACCESS_PLATFORM to allow encrypted guests but that will slow down legacy
> guests since their vIOMMU emulation is very slow.

Yes.

> So enabling support for encryption slows down non-encrypted guests. Not
> great but not the end of the world, considering even older guests that
> don't support ACCESS_PLATFORM are completely broken and you do not seem
> to be too worried by that.


Re: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization

2019-04-26 Thread Christoph Hellwig
On Fri, Apr 26, 2019 at 08:32:39PM +0200, Halil Pasic wrote:
> +EXPORT_SYMBOL_GPL(set_memory_encrypted);

> +EXPORT_SYMBOL_GPL(set_memory_decrypted);

> +EXPORT_SYMBOL_GPL(sev_active);

Why do you export these?  I know x86 exports those as well, but
it shouldn't be needed there either.


[PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers

2019-04-26 Thread Halil Pasic
Before this change, virtio-ccw could get away with not using the DMA API
for the pieces of memory it does ccw I/O with. With protected
virtualization this has to change, since the hypervisor needs to read and
sometimes also write these pieces of memory.

The hypervisor is supposed to poke the classic notifiers, if these are
used, out of band with regard to ccw I/O. So these need to be allocated
as DMA memory (which is shared memory for protected virtualization
guests).

Let us factor out everything from struct virtio_ccw_device that needs to
be DMA memory in a satellite that is allocated as such.

Note: The control blocks of I/O instructions do not need to be shared.
These are marshalled by the ultravisor.

Signed-off-by: Halil Pasic 
---
 drivers/s390/virtio/virtio_ccw.c | 177 +--
 1 file changed, 96 insertions(+), 81 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 1f3e7d56924f..613b18001a0c 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -46,9 +46,15 @@ struct vq_config_block {
 #define VIRTIO_CCW_CONFIG_SIZE 0x100
 /* same as PCI config space size, should be enough for all drivers */
 
+struct vcdev_dma_area {
+   unsigned long indicators;
+   unsigned long indicators2;
+   struct vq_config_block config_block;
+   __u8 status;
+};
+
 struct virtio_ccw_device {
struct virtio_device vdev;
-   __u8 *status;
__u8 config[VIRTIO_CCW_CONFIG_SIZE];
struct ccw_device *cdev;
__u32 curr_io;
@@ -58,24 +64,22 @@ struct virtio_ccw_device {
spinlock_t lock;
struct mutex io_lock; /* Serializes I/O requests */
struct list_head virtqueues;
-   unsigned long indicators;
-   unsigned long indicators2;
-   struct vq_config_block *config_block;
bool is_thinint;
bool going_away;
bool device_lost;
unsigned int config_ready;
void *airq_info;
+   struct vcdev_dma_area *dma_area;
 };
 
 static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
 {
-   return &vcdev->indicators;
+   return &vcdev->dma_area->indicators;
 }
 
 static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
 {
-   return &vcdev->indicators2;
+   return &vcdev->dma_area->indicators2;
 }
 
 struct vq_info_block_legacy {
@@ -176,6 +180,22 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
return container_of(vdev, struct virtio_ccw_device, vdev);
 }
 
+static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size)
+{
+   return ccw_device_dma_zalloc(to_vc_device(vdev)->cdev, size);
+}
+
+static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
+void *cpu_addr)
+{
+   return ccw_device_dma_free(to_vc_device(vdev)->cdev, cpu_addr, size);
+}
+
+#define vc_dma_alloc_struct(vdev, ptr) \
+   ({ptr = __vc_dma_alloc(vdev, sizeof(*(ptr))); })
+#define vc_dma_free_struct(vdev, ptr) \
+   __vc_dma_free(vdev, sizeof(*(ptr)), (ptr))
+
 static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
 {
unsigned long i, flags;
@@ -335,8 +355,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
struct airq_info *airq_info = vcdev->airq_info;
 
if (vcdev->is_thinint) {
-   thinint_area = kzalloc(sizeof(*thinint_area),
-  GFP_DMA | GFP_KERNEL);
+   vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
if (!thinint_area)
return;
thinint_area->summary_indicator =
@@ -347,8 +366,8 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
ccw->cda = (__u32)(unsigned long) thinint_area;
} else {
/* payload is the address of the indicators */
-   indicatorp = kmalloc(sizeof(indicators(vcdev)),
-GFP_DMA | GFP_KERNEL);
+   indicatorp = __vc_dma_alloc(&vcdev->vdev,
+  sizeof(indicators(vcdev)));
if (!indicatorp)
return;
*indicatorp = 0;
@@ -368,8 +387,9 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 "Failed to deregister indicators (%d)\n", ret);
else if (vcdev->is_thinint)
virtio_ccw_drop_indicators(vcdev);
-   kfree(indicatorp);
-   kfree(thinint_area);
+   __vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
+ indicatorp);
+   vc_dma_free_struct(&vcdev->vdev, thinint_area);
 }
 
 static inline long __do_kvm_notify(struct subchannel_id schid,
@@ -416,15 +436,15 @@ static int virtio_ccw_read_vq_conf(struct virtio_ccw_device *vcdev,
 {
int ret;
 
-   vcdev->config_block->index = index;
+   vcdev->dma_area->config_block.index = index;
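
In effect the patch replaces several GFP_DMA kmallocs with one satellite
allocated through the ccw device DMA API from patch 06; a condensed
sketch of the resulting lifetime (error handling abbreviated, assumed
probe/remove context):

    /* probe: one DMA (shared) satellite for everything the host touches */
    vcdev->dma_area = ccw_device_dma_zalloc(vcdev->cdev,
                                            sizeof(*vcdev->dma_area));
    if (!vcdev->dma_area)
            return -ENOMEM;

    /* remove: give it back to the device's DMA pool */
    ccw_device_dma_free(vcdev->cdev, vcdev->dma_area,
                        sizeof(*vcdev->dma_area));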
  

[PATCH 10/10] virtio/s390: make airq summary indicators DMA

2019-04-26 Thread Halil Pasic
The hypervisor needs to interact with the summary indicators, so these
need to be DMA memory as well (at least for protected virtualization
guests).

Signed-off-by: Halil Pasic 
---
 drivers/s390/virtio/virtio_ccw.c | 24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 613b18001a0c..6058b07fea08 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -140,11 +140,17 @@ static int virtio_ccw_use_airq = 1;
 
 struct airq_info {
rwlock_t lock;
-   u8 summary_indicator;
+   u8 summary_indicator_idx;
struct airq_struct airq;
struct airq_iv *aiv;
 };
 static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
+static u8 *summary_indicators;
+
+static inline u8 *get_summary_indicator(struct airq_info *info)
+{
+   return summary_indicators + info->summary_indicator_idx;
+}
 
 #define CCW_CMD_SET_VQ 0x13
 #define CCW_CMD_VDEV_RESET 0x33
@@ -225,7 +231,7 @@ static void virtio_airq_handler(struct airq_struct *airq)
break;
vring_interrupt(0, (void *)airq_iv_get_ptr(info->aiv, ai));
}
-   info->summary_indicator = 0;
+   *(get_summary_indicator(info)) = 0;
smp_wmb();
/* Walk through indicators field, summary indicator not active. */
for (ai = 0;;) {
@@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq)
read_unlock(&info->lock);
 }
 
-static struct airq_info *new_airq_info(void)
+/* call with airq_areas_lock held */
+static struct airq_info *new_airq_info(int index)
 {
struct airq_info *info;
int rc;
@@ -252,7 +259,8 @@ static struct airq_info *new_airq_info(void)
return NULL;
}
info->airq.handler = virtio_airq_handler;
-   info->airq.lsi_ptr = &info->summary_indicator;
+   info->summary_indicator_idx = index;
+   info->airq.lsi_ptr = get_summary_indicator(info);
info->airq.lsi_mask = 0xff;
info->airq.isc = VIRTIO_AIRQ_ISC;
rc = register_adapter_interrupt(&info->airq);
@@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
unsigned long bit, flags;
 
for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
+   /* TODO: this seems to be racy */
if (!airq_areas[i])
-   airq_areas[i] = new_airq_info();
+   airq_areas[i] = new_airq_info(i);
info = airq_areas[i];
if (!info)
return 0;
@@ -359,7 +368,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
if (!thinint_area)
return;
thinint_area->summary_indicator =
-   (unsigned long) &airq_info->summary_indicator;
+   (unsigned long) get_summary_indicator(airq_info);
thinint_area->isc = VIRTIO_AIRQ_ISC;
ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
ccw->count = sizeof(*thinint_area);
@@ -624,7 +633,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
}
info = vcdev->airq_info;
thinint_area->summary_indicator =
-   (unsigned long) &info->summary_indicator;
+   (unsigned long) get_summary_indicator(info);
thinint_area->isc = VIRTIO_AIRQ_ISC;
ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
ccw->flags = CCW_FLAG_SLI;
@@ -1500,6 +1509,7 @@ static int __init virtio_ccw_init(void)
 {
/* parse no_auto string before we do anything further */
no_auto_parse();
+   summary_indicators = cio_dma_zalloc(MAX_AIRQ_AREAS);
return ccw_driver_register(&virtio_ccw_driver);
 }
 device_initcall(virtio_ccw_init);
-- 
2.16.4
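
Taken together, the hunks above amount to the following layout change
(recap sketch): the indicator byte moves out of struct airq_info into
one DMA-shared array, and each info refers to its slot by index.

    /* one shared, DMA-able byte per adapter interrupt area: */
    summary_indicators = cio_dma_zalloc(MAX_AIRQ_AREAS);

    /* struct airq_info only keeps summary_indicator_idx; the byte the
     * hypervisor writes is reached through the shared array: */
    u8 *byte = summary_indicators + info->summary_indicator_idx;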



[PATCH 07/10] s390/airq: use DMA memory for adapter interrupts

2019-04-26 Thread Halil Pasic
Protected virtualization guests have to use shared pages for airq
notifier bit vectors, because the hypervisor needs to write these bits.

Let us make sure we allocate DMA memory for the notifier bit vectors.

Signed-off-by: Halil Pasic 
---
 arch/s390/include/asm/airq.h |  2 ++
 drivers/s390/cio/airq.c  | 18 ++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index fcf539efb32f..1492d4856049 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -11,6 +11,7 @@
 #define _ASM_S390_AIRQ_H
 
 #include 
+#include 
 
 struct airq_struct {
struct hlist_node list; /* Handler queueing. */
@@ -29,6 +30,7 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
 /* Adapter interrupt bit vector */
 struct airq_iv {
unsigned long *vector;  /* Adapter interrupt bit vector */
+   dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
unsigned long *avail;   /* Allocation bit mask for the bit vector */
unsigned long *bitlock; /* Lock bit mask for the bit vector */
unsigned long *ptr; /* Pointer associated with each bit */
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index a45011e4529e..7a5c0a08ee09 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -19,6 +19,7 @@
 
 #include 
 #include 
+#include 
 
 #include "cio.h"
 #include "cio_debug.h"
@@ -113,6 +114,11 @@ void __init init_airq_interrupts(void)
setup_irq(THIN_INTERRUPT, &airq_interrupt);
 }
 
+static inline unsigned long iv_size(unsigned long bits)
+{
+   return BITS_TO_LONGS(bits) * sizeof(unsigned long);
+}
+
 /**
  * airq_iv_create - create an interrupt vector
  * @bits: number of bits in the interrupt vector
@@ -123,14 +129,15 @@ void __init init_airq_interrupts(void)
 struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 {
struct airq_iv *iv;
-   unsigned long size;
+   unsigned long size = 0;
 
iv = kzalloc(sizeof(*iv), GFP_KERNEL);
if (!iv)
goto out;
iv->bits = bits;
-   size = BITS_TO_LONGS(bits) * sizeof(unsigned long);
-   iv->vector = kzalloc(size, GFP_KERNEL);
+   size = iv_size(bits);
+   iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
+                                   &iv->vector_dma, GFP_KERNEL);
if (!iv->vector)
goto out_free;
if (flags & AIRQ_IV_ALLOC) {
@@ -165,7 +172,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
kfree(iv->ptr);
kfree(iv->bitlock);
kfree(iv->avail);
-   kfree(iv->vector);
+   dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
+ iv->vector_dma);
kfree(iv);
 out:
return NULL;
@@ -182,6 +190,8 @@ void airq_iv_release(struct airq_iv *iv)
kfree(iv->ptr);
kfree(iv->bitlock);
kfree(iv->vector);
+   dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
+ iv->vector, iv->vector_dma);
kfree(iv->avail);
kfree(iv);
 }
-- 
2.16.4
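
For context, the caller's view of the changed allocation (the size is
hypothetical): only the vector itself, which the hypervisor writes,
moves to DMA memory; the avail/bitlock/ptr bookkeeping stays in
ordinary kernel memory.

    /* a vector with 256 interrupt bits, allocatable via airq_iv_alloc() */
    struct airq_iv *iv = airq_iv_create(256, AIRQ_IV_ALLOC);
    if (!iv)
            return -ENOMEM;
    /* iv->vector is now dma_alloc_coherent() memory backed by the css
     * device; iv->vector_dma holds the matching DMA address */
    airq_iv_release(iv);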



[PATCH 06/10] s390/cio: add basic protected virtualization support

2019-04-26 Thread Halil Pasic
As virtio-ccw devices are channel devices, we need to use the dma area
for any communication with the hypervisor.

This patch addresses the most basic stuff (mostly what is required for
virtio-ccw), and does not take care of QDIO or any other devices.

An interesting side effect is that virtio structures are now going to
get allocated in 31 bit addressable storage.

Signed-off-by: Halil Pasic 
---
 arch/s390/include/asm/ccwdev.h   |  4 +++
 drivers/s390/cio/ccwreq.c|  8 ++---
 drivers/s390/cio/device.c| 65 +---
 drivers/s390/cio/device_fsm.c| 40 -
 drivers/s390/cio/device_id.c | 18 +--
 drivers/s390/cio/device_ops.c| 21 +++--
 drivers/s390/cio/device_pgid.c   | 20 ++---
 drivers/s390/cio/device_status.c | 24 +++
 drivers/s390/cio/io_sch.h| 21 +
 drivers/s390/virtio/virtio_ccw.c | 10 ---
 10 files changed, 148 insertions(+), 83 deletions(-)

diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h
index a29dd430fb40..865ce1cb86d5 100644
--- a/arch/s390/include/asm/ccwdev.h
+++ b/arch/s390/include/asm/ccwdev.h
@@ -226,6 +226,10 @@ extern int ccw_device_enable_console(struct ccw_device *);
 extern void ccw_device_wait_idle(struct ccw_device *);
 extern int ccw_device_force_console(struct ccw_device *);
 
+extern void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size);
+extern void ccw_device_dma_free(struct ccw_device *cdev,
+   void *cpu_addr, size_t size);
+
 int ccw_device_siosl(struct ccw_device *);
 
 extern void ccw_device_get_schid(struct ccw_device *, struct subchannel_id *);
diff --git a/drivers/s390/cio/ccwreq.c b/drivers/s390/cio/ccwreq.c
index 603268a33ea1..dafbceb311b3 100644
--- a/drivers/s390/cio/ccwreq.c
+++ b/drivers/s390/cio/ccwreq.c
@@ -63,7 +63,7 @@ static void ccwreq_stop(struct ccw_device *cdev, int rc)
return;
req->done = 1;
ccw_device_set_timeout(cdev, 0);
-   memset(&cdev->private->irb, 0, sizeof(struct irb));
+   memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
if (rc && rc != -ENODEV && req->drc)
rc = req->drc;
req->callback(cdev, req->data, rc);
@@ -86,7 +86,7 @@ static void ccwreq_do(struct ccw_device *cdev)
continue;
}
/* Perform start function. */
-   memset(&cdev->private->irb, 0, sizeof(struct irb));
+   memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
rc = cio_start(sch, cp, (u8) req->mask);
if (rc == 0) {
/* I/O started successfully. */
@@ -169,7 +169,7 @@ int ccw_request_cancel(struct ccw_device *cdev)
  */
 static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
 {
-   struct irb *irb = &cdev->private->irb;
+   struct irb *irb = &cdev->private->dma_area->irb;
struct cmd_scsw *scsw = &irb->scsw.cmd;
enum uc_todo todo;
 
@@ -187,7 +187,7 @@ static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
CIO_TRACE_EVENT(2, "sensedata");
CIO_HEX_EVENT(2, &cdev->private->dev_id,
  sizeof(struct ccw_dev_id));
-   CIO_HEX_EVENT(2, &cdev->private->irb.ecw, SENSE_MAX_COUNT);
+   CIO_HEX_EVENT(2, &cdev->private->dma_area->irb.ecw, SENSE_MAX_COUNT);
/* Check for command reject. */
if (irb->ecw[0] & SNS0_CMD_REJECT)
return IO_REJECTED;
diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
index 1540229a37bb..a3310ee14a4a 100644
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -687,6 +688,9 @@ ccw_device_release(struct device *dev)
struct ccw_device *cdev;
 
cdev = to_ccwdev(dev);
+   cio_gp_dma_free(cdev->private->dma_pool, cdev->private->dma_area,
+   sizeof(*cdev->private->dma_area));
+   cio_gp_dma_destroy(cdev->private->dma_pool, &cdev->dev);
/* Release reference of parent subchannel. */
put_device(cdev->dev.parent);
kfree(cdev->private);
@@ -696,15 +700,31 @@ ccw_device_release(struct device *dev)
 static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
 {
struct ccw_device *cdev;
+   struct gen_pool *dma_pool;
 
cdev  = kzalloc(sizeof(*cdev), GFP_KERNEL);
-   if (cdev) {
-   cdev->private = kzalloc(sizeof(struct ccw_device_private),
-   GFP_KERNEL | GFP_DMA);
-   if (cdev->private)
-   return cdev;
-   }
+   if (!cdev)
+   goto err_cdev;
+   cdev->private = kzalloc(sizeof(struct ccw_device_private),
+   GFP_KERNEL | GFP_DMA);
+   if (!cdev->private)
+
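
A usage sketch of the per-device API this patch exports (the irb is only
an example payload here; in the patch itself the irb lives inside the
per-device dma_area):

    /* memory handed to the channel subsystem now comes from the ccw
     * device's DMA pool: 31 bit addressable, shared under PV */
    struct irb *irb = ccw_device_dma_zalloc(cdev, sizeof(*irb));
    if (!irb)
            return -ENOMEM;
    /* ... perform I/O, inspect the irb ... */
    ccw_device_dma_free(cdev, irb, sizeof(*irb));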

[PATCH 08/10] virtio/s390: add indirection to indicators access

2019-04-26 Thread Halil Pasic
This will come in handy soon when we pull out the indicators from
virtio_ccw_device to a memory area that is shared with the hypervisor
(in particular for protected virtualization guests).

Signed-off-by: Halil Pasic 
---
 drivers/s390/virtio/virtio_ccw.c | 40 +---
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index bb7a92316fc8..1f3e7d56924f 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -68,6 +68,16 @@ struct virtio_ccw_device {
void *airq_info;
 };
 
+static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
+{
+   return &vcdev->indicators;
+}
+
+static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
+{
+   return &vcdev->indicators2;
+}
+
 struct vq_info_block_legacy {
__u64 queue;
__u32 align;
@@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
ccw->cda = (__u32)(unsigned long) thinint_area;
} else {
/* payload is the address of the indicators */
-   indicatorp = kmalloc(sizeof(&vcdev->indicators),
+   indicatorp = kmalloc(sizeof(indicators(vcdev)),
 GFP_DMA | GFP_KERNEL);
if (!indicatorp)
return;
*indicatorp = 0;
ccw->cmd_code = CCW_CMD_SET_IND;
-   ccw->count = sizeof(&vcdev->indicators);
+   ccw->count = sizeof(indicators(vcdev));
ccw->cda = (__u32)(unsigned long) indicatorp;
}
/* Deregister indicators from host. */
-   vcdev->indicators = 0;
+   *indicators(vcdev) = 0;
ccw->flags = 0;
ret = ccw_io_helper(vcdev, ccw,
vcdev->is_thinint ?
@@ -656,10 +666,10 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 * We need a data area under 2G to communicate. Our payload is
 * the address of the indicators.
*/
-   indicatorp = kmalloc(sizeof(&vcdev->indicators), GFP_DMA | GFP_KERNEL);
+   indicatorp = kmalloc(sizeof(indicators(vcdev)), GFP_DMA | GFP_KERNEL);
if (!indicatorp)
goto out;
-   *indicatorp = (unsigned long) &vcdev->indicators;
+   *indicatorp = (unsigned long) indicators(vcdev);
if (vcdev->is_thinint) {
ret = virtio_ccw_register_adapter_ind(vcdev, vqs, nvqs, ccw);
if (ret)
@@ -668,21 +678,21 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
}
if (!vcdev->is_thinint) {
/* Register queue indicators with host. */
-   vcdev->indicators = 0;
+   *indicators(vcdev) = 0;
ccw->cmd_code = CCW_CMD_SET_IND;
ccw->flags = 0;
-   ccw->count = sizeof(&vcdev->indicators);
+   ccw->count = sizeof(indicators(vcdev));
ccw->cda = (__u32)(unsigned long) indicatorp;
ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_IND);
if (ret)
goto out;
}
/* Register indicators2 with host for config changes */
-   *indicatorp = (unsigned long) &vcdev->indicators2;
-   vcdev->indicators2 = 0;
+   *indicatorp = (unsigned long) indicators2(vcdev);
+   *indicators2(vcdev) = 0;
ccw->cmd_code = CCW_CMD_SET_CONF_IND;
ccw->flags = 0;
-   ccw->count = sizeof(&vcdev->indicators2);
+   ccw->count = sizeof(indicators2(vcdev));
ccw->cda = (__u32)(unsigned long) indicatorp;
ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_CONF_IND);
if (ret)
@@ -1092,17 +1102,17 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
vcdev->err = -EIO;
}
virtio_ccw_check_activity(vcdev, activity);
-   for_each_set_bit(i, &vcdev->indicators,
-            sizeof(vcdev->indicators) * BITS_PER_BYTE) {
+   for_each_set_bit(i, indicators(vcdev),
+            sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
/* The bit clear must happen before the vring kick. */
-   clear_bit(i, &vcdev->indicators);
+   clear_bit(i, indicators(vcdev));
barrier();
vq = virtio_ccw_vq_by_ind(vcdev, i);
vring_interrupt(0, vq);
}
-   if (test_bit(0, &vcdev->indicators2)) {
+   if (test_bit(0, indicators2(vcdev))) {
virtio_config_changed(&vcdev->vdev);
-   clear_bit(0, &vcdev->indicators2);
+   clear_bit(0, indicators2(vcdev));
}
 }
 
-- 
2.16.4



[PATCH 01/10] virtio/s390: use vring_create_virtqueue

2019-04-26 Thread Halil Pasic
Commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
establishes a new way of allocating virtqueues (as part of the effort
that taught DMA to virtio rings).

In the future we will want virtio-ccw to use the DMA API as well.

Let us switch from the legacy method of allocating virtqueues to
vring_create_virtqueue() as the first step into that direction.

Signed-off-by: Halil Pasic 
---
 drivers/s390/virtio/virtio_ccw.c | 30 +++---
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 74c328321889..2c66941ef3d0 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -108,7 +108,6 @@ struct virtio_rev_info {
 struct virtio_ccw_vq_info {
struct virtqueue *vq;
int num;
-   void *queue;
union {
struct vq_info_block s;
struct vq_info_block_legacy l;
@@ -423,7 +422,6 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
struct virtio_ccw_device *vcdev = to_vc_device(vq->vdev);
struct virtio_ccw_vq_info *info = vq->priv;
unsigned long flags;
-   unsigned long size;
int ret;
unsigned int index = vq->index;
 
@@ -461,8 +459,6 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
 ret, index);
 
vring_del_virtqueue(vq);
-   size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
-   free_pages_exact(info->queue, size);
kfree(info->info_block);
kfree(info);
 }
@@ -494,8 +490,9 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
int err;
struct virtqueue *vq = NULL;
struct virtio_ccw_vq_info *info;
-   unsigned long size = 0; /* silence the compiler */
+   u64 queue;
unsigned long flags;
+   bool may_reduce;
 
/* Allocate queue. */
info = kzalloc(sizeof(struct virtio_ccw_vq_info), GFP_KERNEL);
@@ -516,33 +513,30 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
err = info->num;
goto out_err;
}
-   size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
-   info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
-   if (info->queue == NULL) {
-   dev_warn(&vcdev->cdev->dev, "no queue\n");
-   err = -ENOMEM;
-   goto out_err;
-   }
+   may_reduce = vcdev->revision > 0;
+   vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
+   vdev, true, may_reduce, ctx,
+   virtio_ccw_kvm_notify, callback, name);
 
-   vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
-true, ctx, info->queue, virtio_ccw_kvm_notify,
-callback, name);
if (!vq) {
/* For now, we fail if we can't get the requested size. */
dev_warn(&vcdev->cdev->dev, "no vq\n");
err = -ENOMEM;
goto out_err;
}
+   /* it may have been reduced */
+   info->num = virtqueue_get_vring_size(vq);
 
/* Register it with the host. */
+   queue = virtqueue_get_desc_addr(vq);
if (vcdev->revision == 0) {
-   info->info_block->l.queue = (__u64)info->queue;
+   info->info_block->l.queue = queue;
info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
info->info_block->l.index = i;
info->info_block->l.num = info->num;
ccw->count = sizeof(info->info_block->l);
} else {
-   info->info_block->s.desc = (__u64)info->queue;
+   info->info_block->s.desc = queue;
info->info_block->s.index = i;
info->info_block->s.num = info->num;
info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
@@ -572,8 +566,6 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
if (vq)
vring_del_virtqueue(vq);
if (info) {
-   if (info->queue)
-   free_pages_exact(info->queue, size);
kfree(info->info_block);
}
kfree(info);
-- 
2.16.4
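
For reference, the prototype of the allocator being switched to, as
declared in include/linux/virtio_ring.h of this period (reproduced here
for the reader; check it against the tree the patch is applied to):

    struct virtqueue *vring_create_virtqueue(unsigned int index,
                                             unsigned int num,
                                             unsigned int vring_align,
                                             struct virtio_device *vdev,
                                             bool weak_barriers,
                                             bool may_reduce_num,
                                             bool ctx,
                                             bool (*notify)(struct virtqueue *),
                                             void (*callback)(struct virtqueue *),
                                             const char *name);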



[PATCH 03/10] virtio/s390: enable packed ring

2019-04-26 Thread Halil Pasic
Nothing precludes accepting VIRTIO_F_RING_PACKED any more.

Signed-off-by: Halil Pasic 
---
 drivers/s390/virtio/virtio_ccw.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 42832a164546..6d989c360f38 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -773,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 static void ccw_transport_features(struct virtio_device *vdev)
 {
/*
-* There shouldn't be anything that precludes supporting packed.
-* TODO: Remove the limitation after having another look into this.
+* Currently nothing to do here.
 */
-   __virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
 }
 
 static int virtio_ccw_finalize_features(struct virtio_device *vdev)
-- 
2.16.4



[PATCH 05/10] s390/cio: introduce DMA pools to cio

2019-04-26 Thread Halil Pasic
To support protected virtualization cio will need to make sure the
memory used for communication with the hypervisor is DMA memory.

Let us introduce one global DMA pool for cio, and some tools for pools
seated at individual devices.

Our DMA pools are implemented as a gen_pool backed with DMA pages. The
idea is to avoid each allocation effectively wasting a page, as we
typically allocate much less than PAGE_SIZE.

Signed-off-by: Halil Pasic 
---
 arch/s390/Kconfig   |   1 +
 arch/s390/include/asm/cio.h |  11 +
 drivers/s390/cio/cio.h  |   1 +
 drivers/s390/cio/css.c  | 101 
 4 files changed, 114 insertions(+)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 5500d05d4d53..5861311d95d9 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -195,6 +195,7 @@ config S390
select VIRT_TO_BUS
select HAVE_NMI
select SWIOTLB
+   select GENERIC_ALLOCATOR
 
 
 config SCHED_OMIT_FRAME_POINTER
diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
index 1727180e8ca1..43c007d2775a 100644
--- a/arch/s390/include/asm/cio.h
+++ b/arch/s390/include/asm/cio.h
@@ -328,6 +328,17 @@ static inline u8 pathmask_to_pos(u8 mask)
 void channel_subsystem_reinit(void);
 extern void css_schedule_reprobe(void);
 
+extern void *cio_dma_zalloc(size_t size);
+extern void cio_dma_free(void *cpu_addr, size_t size);
+extern struct device *cio_get_dma_css_dev(void);
+
+struct gen_pool;
+void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
+   size_t size);
+void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size);
+void cio_gp_dma_destroy(struct gen_pool *gp_dma, struct device *dma_dev);
+struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages);
+
 /* Function from drivers/s390/cio/chsc.c */
 int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
 int chsc_sstpi(void *page, void *result, size_t size);
diff --git a/drivers/s390/cio/cio.h b/drivers/s390/cio/cio.h
index 92eabbb5f18d..f23f7e2c33f7 100644
--- a/drivers/s390/cio/cio.h
+++ b/drivers/s390/cio/cio.h
@@ -113,6 +113,7 @@ struct subchannel {
enum sch_todo todo;
struct work_struct todo_work;
struct schib_config config;
+   u64 dma_mask;
 } __attribute__ ((aligned(8)));
 
 DECLARE_PER_CPU_ALIGNED(struct irb, cio_irb);
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index aea502922646..7087cc314fe9 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -199,6 +201,8 @@ static int css_validate_subchannel(struct subchannel_id schid,
return err;
 }
 
+static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
+
 struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
struct schib *schib)
 {
@@ -224,6 +228,9 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
INIT_WORK(&sch->todo_work, css_sch_todo);
sch->dev.release = &css_subchannel_release;
device_initialize(&sch->dev);
+   sch->dma_mask = css_dev_dma_mask;
+   sch->dev.dma_mask = &sch->dma_mask;
+   sch->dev.coherent_dma_mask = sch->dma_mask;
return sch;
 
 err:
@@ -899,6 +906,9 @@ static int __init setup_css(int nr)
dev_set_name(&css->device, "css%x", nr);
css->device.groups = cssdev_attr_groups;
css->device.release = channel_subsystem_release;
+   /* some cio DMA memory needs to be 31 bit addressable */
+   css->device.coherent_dma_mask = css_dev_dma_mask;
+   css->device.dma_mask = &css_dev_dma_mask;
 
mutex_init(&css->mutex);
css->cssid = chsc_get_cssid(nr);
@@ -1018,6 +1028,96 @@ static struct notifier_block css_power_notifier = {
.notifier_call = css_power_event,
 };
 
+#define POOL_INIT_PAGES 1
+static struct gen_pool *cio_dma_pool;
+/* Currently cio supports only a single css */
+#define  CIO_DMA_GFP (GFP_KERNEL | __GFP_ZERO)
+
+
+struct device *cio_get_dma_css_dev(void)
+{
+   return &channel_subsystems[0]->device;
+}
+
+struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
+{
+   struct gen_pool *gp_dma;
+   void *cpu_addr;
+   dma_addr_t dma_addr;
+   int i;
+
+   gp_dma = gen_pool_create(3, -1);
+   if (!gp_dma)
+   return NULL;
+   for (i = 0; i < nr_pages; ++i) {
+   cpu_addr = dma_alloc_coherent(dma_dev, PAGE_SIZE, &dma_addr,
+ CIO_DMA_GFP);
+   if (!cpu_addr)
+   return gp_dma;
+   gen_pool_add_virt(gp_dma, (unsigned long) cpu_addr,
+ dma_addr, PAGE_SIZE, -1);
+   }
+   return gp_dma;
+}
+
+static void __gp_dma_free_dma(struct gen_pool *pool,
+ struct gen_pool_chunk *chunk, void *data)
+{
+   dma_free_coherent((struct device 
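
A usage sketch of the two flavors introduced here (dma_dev and the sizes
are hypothetical): the global pool serves cio-internal allocations, while
the seated pools serve individual devices.

    /* global cio pool, backed by the css device (31 bit addressable): */
    void *buf = cio_dma_zalloc(64);
    cio_dma_free(buf, 64);

    /* a pool seated at an individual device: */
    struct gen_pool *gp = cio_gp_dma_create(dma_dev, 2 /* pages */);
    void *blk = cio_gp_dma_zalloc(gp, dma_dev, 128);
    cio_gp_dma_free(gp, blk, 128);
    cio_gp_dma_destroy(gp, dma_dev);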

[PATCH 04/10] s390/mm: force swiotlb for protected virtualization

2019-04-26 Thread Halil Pasic
On s390, protected virtualization guests have to use bounce I/O
buffers.  That requires some plumbing.

Let us make sure that any device that correctly uses the DMA API with
direct ops is spared the problems a hypervisor attempting I/O to a
non-shared page would bring.

Signed-off-by: Halil Pasic 
---
 arch/s390/Kconfig   |  4 +++
 arch/s390/include/asm/mem_encrypt.h | 18 +
 arch/s390/mm/init.c | 50 +
 3 files changed, 72 insertions(+)
 create mode 100644 arch/s390/include/asm/mem_encrypt.h

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 1c3fcf19c3af..5500d05d4d53 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -1,4 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
+config ARCH_HAS_MEM_ENCRYPT
+def_bool y
+
 config MMU
def_bool y
 
@@ -191,6 +194,7 @@ config S390
select ARCH_HAS_SCALED_CPUTIME
select VIRT_TO_BUS
select HAVE_NMI
+   select SWIOTLB
 
 
 config SCHED_OMIT_FRAME_POINTER
diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h
new file mode 100644
index ..0898c09a888c
--- /dev/null
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef S390_MEM_ENCRYPT_H__
+#define S390_MEM_ENCRYPT_H__
+
+#ifndef __ASSEMBLY__
+
+#define sme_me_mask 0ULL
+
+static inline bool sme_active(void) { return false; }
+extern bool sev_active(void);
+
+int set_memory_encrypted(unsigned long addr, int numpages);
+int set_memory_decrypted(unsigned long addr, int numpages);
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* S390_MEM_ENCRYPT_H__ */
+
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 3e82f66d5c61..7e3cbd15dcfa 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -29,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -42,6 +44,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir);
 
@@ -126,6 +130,50 @@ void mark_rodata_ro(void)
pr_info("Write protected read-only-after-init data: %luk\n", size >> 
10);
 }
 
+int set_memory_encrypted(unsigned long addr, int numpages)
+{
+   int i;
+
+   /* make all pages unshared (swiotlb, dma_free) */
+   for (i = 0; i < numpages; ++i) {
+   uv_remove_shared(addr);
+   addr += PAGE_SIZE;
+   }
+   return 0;
+}
+EXPORT_SYMBOL_GPL(set_memory_encrypted);
+
+int set_memory_decrypted(unsigned long addr, int numpages)
+{
+   int i;
+   /* make all pages shared (swiotlb, dma_alloc) */
+   for (i = 0; i < numpages; ++i) {
+   uv_set_shared(addr);
+   addr += PAGE_SIZE;
+   }
+   return 0;
+}
+EXPORT_SYMBOL_GPL(set_memory_decrypted);
+
+/* are we a protected virtualization guest? */
+bool sev_active(void)
+{
+   return is_prot_virt_guest();
+}
+EXPORT_SYMBOL_GPL(sev_active);
+
+/* protected virtualization */
+static void pv_init(void)
+{
+   if (!sev_active())
+   return;
+
+   /* make sure bounce buffers are shared */
+   swiotlb_init(1);
+   swiotlb_update_mem_attributes();
+   swiotlb_force = SWIOTLB_FORCE;
+}
+
 void __init mem_init(void)
 {
cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
@@ -134,6 +182,8 @@ void __init mem_init(void)
set_max_mapnr(max_low_pfn);
 high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
 
+   pv_init();
+
/* Setup guest page hinting */
cmma_init();
 
-- 
2.16.4
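
What forcing swiotlb means for an individual driver (a sketch; dev, buf
and len are hypothetical): streaming mappings are transparently bounced
through the swiotlb buffer, which pv_init() above has made shared, so the
hypervisor never needs access to a private page.

    /* driver code stays unchanged -- the bouncing is transparent: */
    dma_addr_t handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
    /* with swiotlb_force == SWIOTLB_FORCE the data is copied from the
     * private page at buf into the shared bounce buffer; handle points
     * at the bounce copy */
    dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);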



[PATCH 00/10] s390: virtio: support protected virtualization

2019-04-26 Thread Halil Pasic
Enhanced virtualization protection technology may require the use of
bounce buffers for I/O. While support for this was built into the virtio
core, virtio-ccw wasn't changed accordingly.

Some background on technology (not part of this series) and the
terminology used.

* Protected Virtualization (PV):

Protected Virtualization guarantees that non-shared memory of a guest
that operates in PV mode stays private to that guest. I.e. any attempts by the
hypervisor or other guests to access it will result in an exception. If
supported by the environment (machine, KVM, guest VM) a guest can decide
to change into PV mode by doing the appropriate ultravisor calls. Unlike
some other enhanced virtualization protection technology, 

* Ultravisor:

A hardware/firmware entity that manages PV guests, and polices access to
their memory. A PV guest prospect needs to interact with the ultravisor,
to enter PV mode, and potentially to share pages (for I/O which should
be encrypted by the guest). A guest interacts with the ultravisor via so
called ultravisor calls. A hypervisor needs to interact with the
ultravisor to facilitate interpretation, emulation and swapping. A
hypervisor interacts with the ultravisor via ultravisor calls and via
the SIE state description. Generally the ultravisor sanitizes hypervisor
inputs so that the guest can not be corrupted (except for denial of
service).


What needs to be done
=

Thus what needs to be done to bring virtio-ccw up to speed with respect
to protected virtualization is:
* use some 'new' common virtio stuff
* make sure that virtio-ccw specific stuff uses shared memory when
  talking to the hypervisor (except control/communication blocks like ORB,
  these are handled by the ultravisor)
* make sure the DMA API does what is necessary to talk through shared
  memory if we are a protected virtualization guest.
* make sure the common IO layer plays along as well (airqs, sense).


Important notes


* This patch set is based on Martin's features branch
 (git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git branch
 'features').

* Documentation is still very sketchy. I'm committed to improving this,
  but I'm currently hampered by some dependencies.

* The existing naming in the common infrastructure (kernel internal
  interfaces) is pretty much based on the AMD SEV terminology. Thus the
  names aren't always perfect. There might be merit to changing these
  names to more abstract ones. I did not put much thought into that at
  the current stage.

* Testing: Please use iommu_platform=on for any virtio devices you are
  going to test this code with (so virtio actually uses the DMA API).

Change log
==

RFC --> v1:
* Fixed bugs found by Connie (may_reduce and handling reduced,  warning,
  split move -- thanks Connie!).
* Fixed console bug found by Sebastian (thanks Sebastian!).
* Removed the completely useless duplicate of dma-mapping.h spotted by
  Christoph (thanks Christoph!).
* Don't use the global DMA pool for subchannel and ccw device
  owned memory as requested by Sebastian. Consequences:
* Both subchannel and ccw devices have their dma masks
now (both specifying 31 bit addressable)
* We require at least 2 DMA pages per ccw device now, most of
this memory is wasted though.
* DMA memory allocated by virtio is also 31 bit addressable now
as virtio uses the parent (which is the ccw device).
* Enabled packed ring.
* Rebased onto Martins feature branch; using the actual uv (ultravisor)
  interface instead of TODO comments.
* Added some explanations to the cover letter (Connie, David).
* Squashed a couple of patches together and fixed some text stuff. 

Looking forward to your review, or any other type of input.

Halil Pasic (10):
  virtio/s390: use vring_create_virtqueue
  virtio/s390: DMA support for virtio-ccw
  virtio/s390: enable packed ring
  s390/mm: force swiotlb for protected virtualization
  s390/cio: introduce DMA pools to cio
  s390/cio: add basic protected virtualization support
  s390/airq: use DMA memory for adapter interrupts
  virtio/s390: add indirection to indicators access
  virtio/s390: use DMA memory for ccw I/O and classic notifiers
  virtio/s390: make airq summary indicators DMA

 arch/s390/Kconfig   |   5 +
 arch/s390/include/asm/airq.h|   2 +
 arch/s390/include/asm/ccwdev.h  |   4 +
 arch/s390/include/asm/cio.h |  11 ++
 arch/s390/include/asm/mem_encrypt.h |  18 +++
 arch/s390/mm/init.c |  50 +++
 drivers/s390/cio/airq.c |  18 ++-
 drivers/s390/cio/ccwreq.c   |   8 +-
 drivers/s390/cio/cio.h  |   1 +
 drivers/s390/cio/css.c  | 101 +
 drivers/s390/cio/device.c   |  65 +++--
 drivers/s390/cio/device_fsm.c   |  40 +++---
 drivers/s390/cio/device_id.c|  18 +--
 drivers/s390/cio/device_ops.c   |  21 ++-
 

[PATCH 02/10] virtio/s390: DMA support for virtio-ccw

2019-04-26 Thread Halil Pasic
Currently virtio-ccw devices do not work if the device has
VIRTIO_F_IOMMU_PLATFORM. In the future we do want to support the DMA
API with virtio-ccw.

Let us do the plumbing, so that the feature VIRTIO_F_IOMMU_PLATFORM
works with virtio-ccw.

Let us also switch from legacy avail/used accessors to the DMA aware
ones (even if it isn't strictly necessary), and remove the legacy
accessors (we were the last users).
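
To illustrate the accessor switch (a sketch only; the patch applies the
same change inline in virtio_ccw_setup_vq(), and the helper below is
made up):

#include <linux/virtio.h>

/* Report ring addresses to the hypervisor: the DMA-aware accessors
 * return dma_addr_t values fit for the host side, instead of the raw
 * CPU virtual addresses the removed legacy accessors returned. */
static void fill_vq_addrs(struct virtqueue *vq, u64 *avail, u64 *used)
{
	*avail = (u64)virtqueue_get_avail_addr(vq); /* was virtqueue_get_avail() */
	*used = (u64)virtqueue_get_used_addr(vq);   /* was virtqueue_get_used() */
}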

Signed-off-by: Halil Pasic 
---
 drivers/s390/virtio/virtio_ccw.c | 22 +++---
 include/linux/virtio.h   | 17 -
 2 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 2c66941ef3d0..42832a164546 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -66,6 +66,7 @@ struct virtio_ccw_device {
bool device_lost;
unsigned int config_ready;
void *airq_info;
+   u64 dma_mask;
 };
 
 struct vq_info_block_legacy {
@@ -539,8 +540,8 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
info->info_block->s.desc = queue;
info->info_block->s.index = i;
info->info_block->s.num = info->num;
-   info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
-   info->info_block->s.used = (__u64)virtqueue_get_used(vq);
+   info->info_block->s.avail = (__u64)virtqueue_get_avail_addr(vq);
+   info->info_block->s.used = (__u64)virtqueue_get_used_addr(vq);
ccw->count = sizeof(info->info_block->s);
}
ccw->cmd_code = CCW_CMD_SET_VQ;
@@ -772,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 static void ccw_transport_features(struct virtio_device *vdev)
 {
/*
-* Packed ring isn't enabled on virtio_ccw for now,
-* because virtio_ccw uses some legacy accessors,
-* e.g. virtqueue_get_avail() and virtqueue_get_used()
-* which aren't available in packed ring currently.
+* There shouldn't be anything that precludes supporting packed.
+* TODO: Remove the limitation after having another look into this.
 */
__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
 }
@@ -1258,6 +1257,16 @@ static int virtio_ccw_online(struct ccw_device *cdev)
ret = -ENOMEM;
goto out_free;
}
+
+   vcdev->vdev.dev.parent = &cdev->dev;
+   cdev->dev.dma_mask = &vcdev->dma_mask;
+   /* we are fine with common virtio infrastructure using 64 bit DMA */
+   ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
+   if (ret) {
+   dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
+   goto out_free;
+   }
+
vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
   GFP_DMA | GFP_KERNEL);
if (!vcdev->config_block) {
@@ -1272,7 +1281,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 
vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
 
-   vcdev->vdev.dev.parent = &cdev->dev;
vcdev->vdev.dev.release = virtio_ccw_release_dev;
vcdev->vdev.config = &virtio_ccw_config_ops;
vcdev->cdev = cdev;
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 673fe3ef3607..15f906e4a748 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -90,23 +90,6 @@ dma_addr_t virtqueue_get_desc_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_avail_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
 
-/*
- * Legacy accessors -- in almost all cases, these are the wrong functions
- * to use.
- */
-static inline void *virtqueue_get_desc(struct virtqueue *vq)
-{
-   return virtqueue_get_vring(vq)->desc;
-}
-static inline void *virtqueue_get_avail(struct virtqueue *vq)
-{
-   return virtqueue_get_vring(vq)->avail;
-}
-static inline void *virtqueue_get_used(struct virtqueue *vq)
-{
-   return virtqueue_get_vring(vq)->used;
-}
-
 /**
  * virtio_device - representation of a device using virtio
  * @index: unique position on the virtio bus
-- 
2.16.4

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] Revert "drm/qxl: drop prime import/export callbacks"

2019-04-26 Thread Daniel Vetter
On Fri, Apr 26, 2019 at 7:33 AM Gerd Hoffmann  wrote:
>
> This reverts commit f4c34b1e2a37d5676180901fa6ff188bcb6371f8.
>
> Similar to commit a0cecc23cfcb Revert "drm/virtio: drop prime
> import/export callbacks".  We have to do the same with qxl,
> for the same reasons (it breaks DRI3).
>
> Drop the WARN_ON_ONCE().
>
> Fixes: f4c34b1e2a37 ("drm/qxl: drop prime import/export callbacks")
> Signed-off-by: Gerd Hoffmann 

Maybe we need some helpers for virtual drivers which only allow
self-reimport and nothing else at all? I think there are a few drivers
(qxl, virgl, vmwgfx, and maybe also vbox) that could use this ... Just
a quick idea.
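
A rough sketch of what such a helper could look like (modeled on the
self-import fast path in drm_prime.c; the function name is made up, and
it would have to live where the static drm_gem_prime_dmabuf_ops is
visible):

#include <linux/err.h>
#include <drm/drm_gem.h>
#include <drm/drm_prime.h>

struct drm_gem_object *
drm_gem_prime_import_self_only(struct drm_device *dev,
			       struct dma_buf *dma_buf)
{
	struct drm_gem_object *obj;

	if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) {
		obj = dma_buf->priv;
		if (obj->dev == dev) {
			/* Re-importing our own buffer: take a reference. */
			drm_gem_object_get(obj);
			return obj;
		}
	}
	/* Anything else is unsupported on these virtual drivers. */
	return ERR_PTR(-ENOSYS);
}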
-Daniel

> ---
>  drivers/gpu/drm/qxl/qxl_drv.c   |  4 
>  drivers/gpu/drm/qxl/qxl_prime.c | 12 
>  2 files changed, 16 insertions(+)
>
> diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
> index 578d867a81d5..f33e349c4ec5 100644
> --- a/drivers/gpu/drm/qxl/qxl_drv.c
> +++ b/drivers/gpu/drm/qxl/qxl_drv.c
> @@ -255,10 +255,14 @@ static struct drm_driver qxl_driver = {
>  #if defined(CONFIG_DEBUG_FS)
> .debugfs_init = qxl_debugfs_init,
>  #endif
> +   .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
> +   .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> .gem_prime_export = drm_gem_prime_export,
> .gem_prime_import = drm_gem_prime_import,
> .gem_prime_pin = qxl_gem_prime_pin,
> .gem_prime_unpin = qxl_gem_prime_unpin,
> +   .gem_prime_get_sg_table = qxl_gem_prime_get_sg_table,
> +   .gem_prime_import_sg_table = qxl_gem_prime_import_sg_table,
> .gem_prime_vmap = qxl_gem_prime_vmap,
> .gem_prime_vunmap = qxl_gem_prime_vunmap,
> .gem_prime_mmap = qxl_gem_prime_mmap,
> diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
> index 8b448eca1cd9..114653b471c6 100644
> --- a/drivers/gpu/drm/qxl/qxl_prime.c
> +++ b/drivers/gpu/drm/qxl/qxl_prime.c
> @@ -42,6 +42,18 @@ void qxl_gem_prime_unpin(struct drm_gem_object *obj)
> qxl_bo_unpin(bo);
>  }
>
> +struct sg_table *qxl_gem_prime_get_sg_table(struct drm_gem_object *obj)
> +{
> +   return ERR_PTR(-ENOSYS);
> +}
> +
> +struct drm_gem_object *qxl_gem_prime_import_sg_table(
> +   struct drm_device *dev, struct dma_buf_attachment *attach,
> +   struct sg_table *table)
> +{
> +   return ERR_PTR(-ENOSYS);
> +}
> +
>  void *qxl_gem_prime_vmap(struct drm_gem_object *obj)
>  {
> struct qxl_bo *bo = gem_to_qxl_bo(obj);
> --
> 2.18.1
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net] vhost_net: fix possible infinite loop

2019-04-26 Thread Jason Wang


On 2019/4/26 1:52 AM, Michael S. Tsirkin wrote:

On Thu, Apr 25, 2019 at 03:33:19AM -0400, Jason Wang wrote:

When the rx buffer is too small for a packet, we will discard the vq
descriptor and retry it for the next packet:

while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
					      &busyloop_intr))) {
	...
	/* On overrun, truncate and discard */
	if (unlikely(headcount > UIO_MAXIOV)) {
		iov_iter_init(&msg.msg_iter, READ, vq->iov, 1, 1);
		err = sock->ops->recvmsg(sock, &msg,
					 1, MSG_DONTWAIT | MSG_TRUNC);
		pr_debug("Discarded rx packet: len %zd\n", sock_len);
		continue;
	}
	...
}

This makes it possible to trigger an infinite while..continue loop
through the cooperation of two VMs like:

1) Malicious VM1 allocates a 1 byte rx buffer and tries to slow down
the vhost process as much as possible, e.g. by using indirect
descriptors.
2) Malicious VM2 generates packets to VM1 as fast as possible.

Fix this by checking against the weight at the end of the RX and TX
loops. This also eliminates other similar cases where:

- userspace is consuming the packets in the meanwhile
- a theoretical TOCTOU attack has the guest move the avail index back
  and forth to hit the continue after vhost finds the guest just added
  new buffers

This addresses CVE-2019-3900.
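
For context, a sketch of the weight check the fix pivots on (modeled on
vhost_exceeds_weight() in drivers/vhost/net.c of that era; treat the
constants as illustrative):

#define VHOST_NET_WEIGHT     0x80000 /* max bytes per handler invocation */
#define VHOST_NET_PKT_WEIGHT 256     /* max packets per handler invocation */

/* True once this invocation has done enough work; the handler then
 * requeues itself via vhost_poll_queue() instead of looping on, so a
 * misbehaving peer cannot keep the vhost worker spinning forever. */
static bool vhost_exceeds_weight(int pkts, int total_len)
{
	return total_len >= VHOST_NET_WEIGHT ||
	       pkts >= VHOST_NET_PKT_WEIGHT;
}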

Fixes: d8316f3991d20 ("vhost: fix total length when packets are too short")

I agree this is the real issue.


Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")

This is just a red herring imho. We can stick this on any vhost patch :)


Signed-off-by: Jason Wang 
---
  drivers/vhost/net.c | 41 +
  1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index df51a35..fb46e6b 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -778,8 +778,9 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
int err;
int sent_pkts = 0;
bool sock_can_batch = (sock->sk->sk_sndbuf == INT_MAX);
+   bool next_round = false;
  
-	for (;;) {
+   do {
bool busyloop_intr = false;
  
		if (nvq->done_idx == VHOST_NET_BATCH)
@@ -845,11 +846,10 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
vq->heads[nvq->done_idx].len = 0;
++nvq->done_idx;
-   if (vhost_exceeds_weight(++sent_pkts, total_len)) {
-   vhost_poll_queue(&vq->poll);
-   break;
-   }
-   }
+   } while (!(next_round = vhost_exceeds_weight(++sent_pkts, total_len)));
+
+   if (next_round)
+   vhost_poll_queue(&vq->poll);
  
  	vhost_tx_batch(net, nvq, sock, &msg);
  }
@@ -873,8 +873,9 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
bool zcopy_used;
int sent_pkts = 0;
+   bool next_round = false;
  
-	for (;;) {
+   do {
bool busyloop_intr;
  
		/* Release DMAs done buffers first */
@@ -951,11 +952,10 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
else
vhost_zerocopy_signal_used(net, vq);
vhost_net_tx_packet(net);
-   if (unlikely(vhost_exceeds_weight(++sent_pkts, total_len))) {
-   vhost_poll_queue(&vq->poll);
-   break;
-   }
-   }
+   } while (!(next_round = vhost_exceeds_weight(++sent_pkts, total_len)));
+
+   if (next_round)
+   vhost_poll_queue(&vq->poll);
  }
  
  /* Expects to be always run from workqueue - which acts as
 * read-size critical section for our kind of RCU. */
@@ -1134,6 +1134,7 @@ static void handle_rx(struct vhost_net *net)
struct iov_iter fixup;
__virtio16 num_buffers;
int recv_pkts = 0;
+   bool next_round = false;
  
	mutex_lock_nested(&vq->mutex, VHOST_NET_VQ_RX);
sock = vq->private_data;
@@ -1153,8 +1154,11 @@ static void handle_rx(struct vhost_net *net)
vq->log : NULL;
mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
  
-	while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
-						      &busyloop_intr))) {
+   do {
+   sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
+					      &busyloop_intr);
+   if (!sock_len)
+   break;
sock_len += sock_hlen;
vhost_len = sock_len + vhost_hlen;
headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx,
@@ -1239,12 +1243,9 @@ static void handle_rx(struct vhost_net *net)
vhost_log_write(vq, vq_log, log, vhost_len,