Re: [PATCH v4 0/4] Implement vdpasim stop operation

2022-05-30 Thread Jason Wang
On Tue, May 31, 2022 at 1:40 PM Michael S. Tsirkin  wrote:
>
> On Mon, May 30, 2022 at 11:39:21AM +0800, Jason Wang wrote:
> > On Fri, May 27, 2022 at 6:56 PM Michael S. Tsirkin  wrote:
> > >
> > > On Thu, May 26, 2022 at 12:54:32PM +, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: Eugenio Pérez 
> > > > > Sent: Thursday, May 26, 2022 8:44 AM
> > > >
> > > > > Implement stop operation for vdpa_sim devices, so vhost-vdpa will 
> > > > > offer
> > > > >
> > > > > that backend feature and userspace can effectively stop the device.
> > > > >
> > > > >
> > > > >
> > > > > This is a must before get virtqueue indexes (base) for live migration,
> > > > >
> > > > > since the device could modify them after userland gets them. There are
> > > > >
> > > > > individual ways to perform that action for some devices
> > > > >
> > > > > (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there
> > > > > was no
> > > > >
> > > > > way to perform it for any vhost device (and, in particular, 
> > > > > vhost-vdpa).
> > > > >
> > > > >
> > > > >
> > > > > After the return of ioctl with stop != 0, the device MUST finish any
> > > > >
> > > > > pending operations like in flight requests. It must also preserve all
> > > > >
> > > > > the necessary state (the virtqueue vring base plus the possible device
> > > > >
> > > > > specific states) that is required for restoring in the future. The
> > > > >
> > > > > device must not change its configuration after that point.
> > > > >
> > > > >
> > > > >
> > > > > After the return of ioctl with stop == 0, the device can continue
> > > > >
> > > > > processing buffers as long as typical conditions are met (vq is 
> > > > > enabled,
> > > > >
> > > > > DRIVER_OK status bit is enabled, etc).
> > > >
> > > > Just to be clear, we are adding vdpa level new ioctl() that doesn’t map 
> > > > to any mechanism in the virtio spec.
> > > >
> > > > Why can't we use this ioctl() to indicate driver to start/stop the 
> > > > device instead of driving it through the driver_ok?
> > > > This is in the context of other discussion we had in the LM series.
> > >
> > > If there's something in the spec that does this then let's use that.
> >
> > Actually, we try to propose a independent feature here:
> >
> > https://lists.oasis-open.org/archives/virtio-dev/202111/msg00020.html
> >
> > Does it make sense to you?
> >
> > Thanks
>
> But I thought the LM patches are trying to replace all that?

I'm not sure, and actually I think they are orthogonal. We need a new
state and the command to set the state could be transport specific or
a virtqueue.

As far as I know, most of the vendors have implemented this semantic.

Thanks

>
>
> > > Unfortunately the LM series seems to be stuck on moving
> > > bits around with the admin virtqueue ...
> > >
> > > --
> > > MST
> > >
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v4 0/4] Implement vdpasim stop operation

2022-05-30 Thread Michael S. Tsirkin
On Thu, May 26, 2022 at 02:43:34PM +0200, Eugenio Pérez wrote:
> Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer
> that backend feature and userspace can effectively stop the device.
> 
> This is a must before get virtqueue indexes (base) for live migration,
> since the device could modify them after userland gets them. There are
> individual ways to perform that action for some devices
> (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
> way to perform it for any vhost device (and, in particular, vhost-vdpa).
> 
> After the return of ioctl with stop != 0, the device MUST finish any
> pending operations like in flight requests. It must also preserve all
> the necessary state (the virtqueue vring base plus the possible device
> specific states) that is required for restoring in the future. The
> device must not change its configuration after that point.
> 
> After the return of ioctl with stop == 0, the device can continue
> processing buffers as long as typical conditions are met (vq is enabled,
> DRIVER_OK status bit is enabled, etc).
> 
> In the future, we will provide features similar to VHOST_USER_GET_INFLIGHT_FD
> so the device can save pending operations.
> 
> Comments are welcome.


So given this is just for simulator and affects UAPI I think it's fine
to make it wait for the next merge window, until there's a consensus.
Right?

> v4:
> * Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.
> 
> v3:
> * s/VHOST_STOP/VHOST_VDPA_STOP/
> * Add documentation and requirements of the ioctl above its definition.
> 
> v2:
> * Replace raw _F_STOP with BIT_ULL(_F_STOP).
> * Fix obtaining of stop ioctl arg (it was not obtained but written).
> * Add stop to vdpa_sim_blk.
> 
> Eugenio Pérez (4):
>   vdpa: Add stop operation
>   vhost-vdpa: introduce STOP backend feature bit
>   vhost-vdpa: uAPI to stop the device
>   vdpa_sim: Implement stop vdpa op
> 
>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 21 +
>  drivers/vdpa/vdpa_sim/vdpa_sim.h |  1 +
>  drivers/vdpa/vdpa_sim/vdpa_sim_blk.c |  3 +++
>  drivers/vdpa/vdpa_sim/vdpa_sim_net.c |  3 +++
>  drivers/vhost/vdpa.c | 34 +++-
>  include/linux/vdpa.h |  6 +
>  include/uapi/linux/vhost.h   | 14 
>  include/uapi/linux/vhost_types.h |  2 ++
>  8 files changed, 83 insertions(+), 1 deletion(-)
> 
> -- 
> 2.31.1
> 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 0/4] Implement vdpasim stop operation

2022-05-30 Thread Michael S. Tsirkin
On Mon, May 30, 2022 at 11:39:21AM +0800, Jason Wang wrote:
> On Fri, May 27, 2022 at 6:56 PM Michael S. Tsirkin  wrote:
> >
> > On Thu, May 26, 2022 at 12:54:32PM +, Parav Pandit wrote:
> > >
> > >
> > > > From: Eugenio Pérez 
> > > > Sent: Thursday, May 26, 2022 8:44 AM
> > >
> > > > Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer
> > > >
> > > > that backend feature and userspace can effectively stop the device.
> > > >
> > > >
> > > >
> > > > This is a must before get virtqueue indexes (base) for live migration,
> > > >
> > > > since the device could modify them after userland gets them. There are
> > > >
> > > > individual ways to perform that action for some devices
> > > >
> > > > (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there
> > > > was no
> > > >
> > > > way to perform it for any vhost device (and, in particular, vhost-vdpa).
> > > >
> > > >
> > > >
> > > > After the return of ioctl with stop != 0, the device MUST finish any
> > > >
> > > > pending operations like in flight requests. It must also preserve all
> > > >
> > > > the necessary state (the virtqueue vring base plus the possible device
> > > >
> > > > specific states) that is required for restoring in the future. The
> > > >
> > > > device must not change its configuration after that point.
> > > >
> > > >
> > > >
> > > > After the return of ioctl with stop == 0, the device can continue
> > > >
> > > > processing buffers as long as typical conditions are met (vq is enabled,
> > > >
> > > > DRIVER_OK status bit is enabled, etc).
> > >
> > > Just to be clear, we are adding vdpa level new ioctl() that doesn’t map 
> > > to any mechanism in the virtio spec.
> > >
> > > Why can't we use this ioctl() to indicate driver to start/stop the device 
> > > instead of driving it through the driver_ok?
> > > This is in the context of other discussion we had in the LM series.
> >
> > If there's something in the spec that does this then let's use that.
> 
> Actually, we try to propose a independent feature here:
> 
> https://lists.oasis-open.org/archives/virtio-dev/202111/msg00020.html
> 
> Does it make sense to you?
> 
> Thanks

But I thought the LM patches are trying to replace all that?


> > Unfortunately the LM series seems to be stuck on moving
> > bits around with the admin virtqueue ...
> >
> > --
> > MST
> >

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: RE: [PATCH v8 1/1] crypto: Introduce RSA algorithm

2022-05-30 Thread zhenwei pi

On 5/30/22 21:31, Gonglei (Arei) wrote:




-Original Message-
From: zhenwei pi [mailto:pizhen...@bytedance.com]
Sent: Friday, May 27, 2022 4:48 PM
To: m...@redhat.com; Gonglei (Arei) 
Cc: qemu-de...@nongnu.org; virtualization@lists.linux-foundation.org;
helei.si...@bytedance.com; berra...@redhat.com; zhenwei pi

Subject: [PATCH v8 1/1] crypto: Introduce RSA algorithm



Skip...


+static int64_t
+virtio_crypto_create_asym_session(VirtIOCrypto *vcrypto,
+   struct virtio_crypto_akcipher_create_session_req
*sess_req,
+   uint32_t queue_id, uint32_t opcode,
+   struct iovec *iov, unsigned int out_num) {
+VirtIODevice *vdev = VIRTIO_DEVICE(vcrypto);
+CryptoDevBackendSessionInfo info = {0};
+CryptoDevBackendAsymSessionInfo *asym_info;
+int64_t session_id;
+int queue_index;
+uint32_t algo, keytype, keylen;
+g_autofree uint8_t *key = NULL;
+Error *local_err = NULL;
+
+algo = ldl_le_p(&sess_req->para.algo);
+keytype = ldl_le_p(&sess_req->para.keytype);
+keylen = ldl_le_p(&sess_req->para.keylen);
+
+if ((keytype != VIRTIO_CRYPTO_AKCIPHER_KEY_TYPE_PUBLIC)
+ && (keytype != VIRTIO_CRYPTO_AKCIPHER_KEY_TYPE_PRIVATE)) {
+error_report("unsupported asym keytype: %d", keytype);
+return -VIRTIO_CRYPTO_NOTSUPP;
+}
+
+if (keylen) {
+key = g_malloc(keylen);
+if (iov_to_buf(iov, out_num, 0, key, keylen) != keylen) {
+virtio_error(vdev, "virtio-crypto asym key incorrect");
+return -EFAULT;


Memory leak.


+}
+iov_discard_front(&iov, &out_num, keylen);
+}
+
+info.op_code = opcode;
+asym_info = &info.u.asym_sess_info;
+asym_info->algo = algo;
+asym_info->keytype = keytype;
+asym_info->keylen = keylen;
+asym_info->key = key;
+switch (asym_info->algo) {
+case VIRTIO_CRYPTO_AKCIPHER_RSA:
+asym_info->u.rsa.padding_algo =
+ldl_le_p(&sess_req->para.u.rsa.padding_algo);
+asym_info->u.rsa.hash_algo =
+ldl_le_p(&sess_req->para.u.rsa.hash_algo);
+break;
+
+/* TODO DSA&ECDSA handling */
+
+default:
+return -VIRTIO_CRYPTO_ERR;
+}
+
+queue_index = virtio_crypto_vq2q(queue_id);
+session_id = cryptodev_backend_create_session(vcrypto->cryptodev,
&info,
+ queue_index, &local_err);
+if (session_id < 0) {
+if (local_err) {
+error_report_err(local_err);
+}
+return -VIRTIO_CRYPTO_ERR;
+}
+
+return session_id;


Where to free the key at both normal and exceptional paths?



Hi, Lei

The key is declared with g_autofree:
g_autofree uint8_t *key = NULL;



Regards,
-Gonglei




--
zhenwei pi
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: Re: [PATCH 0/3] recover hardware corrupted page by virtio balloon

2022-05-30 Thread Peter Xu
On Mon, May 30, 2022 at 07:33:35PM +0800, zhenwei pi wrote:
> A VM uses RAM of 2M huge page. Once a MCE(@HVAy in [HVAx,HVAz)) occurs, the
> 2M([HVAx,HVAz)) of hypervisor becomes unaccessible, but the guest poisons 4K
> (@GPAy in [GPAx, GPAz)) only, it may hit another 511 MCE ([GPAx, GPAz)
> except GPAy). This is the worse case, so I want to add
>  '__le32 corrupted_pages' in struct virtio_balloon_config, it is used in the
> next step: reporting 512 * 4K 'corrupted_pages' to the guest, the guest has
> a chance to isolate the other 511 pages ahead of time. And the guest
> actually loses 2M, fixing 512*4K seems to help significantly.

It sounds hackish to teach a virtio device to assume one page will always
be poisoned in huge page granule.  That's only a limitation to host kernel
not virtio itself.

E.g. there're upstream effort ongoing with enabling doublemap on hugetlbfs
pages so hugetlb pages can be mapped in 4k with it.  It provides potential
possibility to do page poisoning with huge pages in 4k too.  When that'll
be ready the assumption can go away, and that does sound like a better
approach towards this problem.

> 
> > 
> > I assume when talking about "the performance memory drops a lot", you
> > imply that this patch set can mitigate that performance drop?
> > 
> > But why do you see a performance drop? Because we might lose some
> > possible THP candidates (in the host or the guest) and you want to plug
> > does holes? I assume you'll see a performance drop simply because
> > poisoning memory is expensive, including migrating pages around on CE.
> > 
> > If you have some numbers to share, especially before/after this change,
> > that would be great.
> > 
> 
> The CE storm leads 2 problems I have even seen:
> 1, the memory bandwidth slows down to 10%~20%, and the cycles per
> instruction of CPU increases a lot.
> 2, the THR (/proc/interrupts) interrupts frequently, the CPU has to use a
> lot time to handle IRQ.

Totally no good knowledge on CMCI, but if 2) is true then I'm wondering
whether it's necessary to handle the interrupts that frequently.  When I
was reading the Intel CMCI vector handler I stumbled over this comment:

/*
 * The interrupt handler. This is called on every event.
 * Just call the poller directly to log any events.
 * This could in theory increase the threshold under high load,
 * but doesn't for now.
 */
static void intel_threshold_interrupt(void)

I think that matches with what I was thinking..  I mean for 2) not sure
whether it can be seen as a CMCI problem and potentially can be optimized
by adjust the cmci threshold dynamically.

-- 
Peter Xu

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V6 8/9] virtio: harden vring IRQ

2022-05-30 Thread Cornelia Huck
On Fri, May 27 2022, Jason Wang  wrote:

> This is a rework on the previous IRQ hardening that is done for
> virtio-pci where several drawbacks were found and were reverted:
>
> 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
>that is used by some device such as virtio-blk
> 2) done only for PCI transport
>
> The vq->broken is re-used in this patch for implementing the IRQ
> hardening. The vq->broken is set to true during both initialization
> and reset. And the vq->broken is set to false in
> virtio_device_ready(). Then vring_interrupt() can check and return
> when vq->broken is true. And in this case, switch to return IRQ_NONE
> to let the interrupt core aware of such invalid interrupt to prevent
> IRQ storm.
>
> The reason of using a per queue variable instead of a per device one
> is that we may need it for per queue reset hardening in the future.
>
> Note that the hardening is only done for vring interrupt since the
> config interrupt hardening is already done in commit 22b7050a024d7
> ("virtio: defer config changed notifications"). But the method that is
> used by config interrupt can't be reused by the vring interrupt
> handler because it uses spinlock to do the synchronization which is
> expensive.
>
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Cc: Halil Pasic 
> Cc: Cornelia Huck 
> Cc: Vineeth Vijayan 
> Cc: Peter Oberparleiter 
> Cc: linux-s...@vger.kernel.org
> Signed-off-by: Jason Wang 
> ---
>  drivers/s390/virtio/virtio_ccw.c   |  4 
>  drivers/virtio/virtio.c| 15 ---
>  drivers/virtio/virtio_mmio.c   |  5 +
>  drivers/virtio/virtio_pci_modern_dev.c |  5 +
>  drivers/virtio/virtio_ring.c   | 11 +++
>  include/linux/virtio_config.h  | 20 
>  6 files changed, 53 insertions(+), 7 deletions(-)

Reviewed-by: Cornelia Huck 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V6 7/9] virtio: allow to unbreak virtqueue

2022-05-30 Thread Cornelia Huck
On Fri, May 27 2022, Jason Wang  wrote:

> This patch allows the new introduced __virtio_break_device() to
> unbreak the virtqueue.
>
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Cc: Halil Pasic 
> Cc: Cornelia Huck 
> Cc: Vineeth Vijayan 
> Cc: Peter Oberparleiter 
> Cc: linux-s...@vger.kernel.org
> Signed-off-by: Jason Wang 
> ---
>  drivers/virtio/virtio_ring.c | 22 ++
>  include/linux/virtio.h   |  1 +
>  2 files changed, 23 insertions(+)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 9d0bae4293be..9c231e1fded7 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -2395,6 +2395,28 @@ void virtio_break_device(struct virtio_device *dev)
>  }
>  EXPORT_SYMBOL_GPL(virtio_break_device);
>  
> +/*
> + * This should allow the device to be used by the driver. You may
> + * need to grab appropriate locks to flush the write to
> + * vq->broken. This should only be used in some specific case e.g
> + * (probing and restoring). This function should only be called by the

Minor: "...some specific cases, e.g. probing and restoring."

But no need to respin from my side.

> + * core, not directly by the driver.
> + */

Reviewed-by: Cornelia Huck 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v3 2/4] vhost-vdpa: introduce STOP backend feature bit

2022-05-30 Thread Dan Carpenter
On Mon, May 30, 2022 at 05:27:25PM +0300, Dan Carpenter wrote:
> On Fri, May 27, 2022 at 10:36:55AM +0300, Dan Carpenter wrote:
> > static void match_pointer(struct expression *ret_value)
> > {
> > struct symbol *type;
> > char *name;
> > 
> > type = cur_func_return_type();
> > if (!type || sval_type_max(type).value != 1)
> > return;
> > 
> > if (!is_pointer(ret_value))
> > return;
> > 
> > name = expr_to_str(ret_value);
> > sm_msg("'%s' return pointer cast to bool", name);
> > free_string(name);
> > }
> 
> I found a bug in Smatch with this check.  With the Smatch bug fixed then
> there is only only place which does a legitimate implied pointer to bool
> cast.  A different function returns NULL on error instead of false.
> 
> drivers/gpu/drm/i915/display/intel_dmc.c:249 intel_dmc_has_payload() 
> 'i915->dmc.dmc_info[0]->payload' return pointer cast to bool
> drivers/net/wireless/rndis_wlan.c:1980 rndis_bss_info_update() '(0)' return 
> pointer cast to bool
> drivers/net/wireless/rndis_wlan.c:1989 rndis_bss_info_update() '(0)' return 
> pointer cast to bool
> drivers/net/wireless/rndis_wlan.c:1995 rndis_bss_info_update() '(0)' return 
> pointer cast to bool

Hm...  I found another.  I'm not sure why it wasn't showing up before.

lib/assoc_array.c:941 assoc_array_insert_mid_shortcut() 'edit' return pointer 
cast to bool

That's weird.  I will re-run the test.

regards,
dan carpenter

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V6 6/9] virtio-ccw: implement synchronize_cbs()

2022-05-30 Thread Cornelia Huck
On Fri, May 27 2022, Jason Wang  wrote:

> This patch tries to implement the synchronize_cbs() for ccw. For the
> vring_interrupt() that is called via virtio_airq_handler(), the
> synchronization is simply done via the airq_info's lock. For the
> vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> device rwlock is introduced and used in the synchronization method.
>
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Cc: Halil Pasic 
> Cc: Cornelia Huck 
> Cc: Vineeth Vijayan 
> Cc: Peter Oberparleiter 
> Cc: linux-s...@vger.kernel.org
> Reviewed-by: Halil Pasic 
> Signed-off-by: Jason Wang 
> ---
>  drivers/s390/virtio/virtio_ccw.c | 30 ++
>  1 file changed, 30 insertions(+)

Reviewed-by: Cornelia Huck 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v3 2/4] vhost-vdpa: introduce STOP backend feature bit

2022-05-30 Thread Dan Carpenter
On Fri, May 27, 2022 at 10:36:55AM +0300, Dan Carpenter wrote:
> static void match_pointer(struct expression *ret_value)
> {
> struct symbol *type;
> char *name;
> 
> type = cur_func_return_type();
> if (!type || sval_type_max(type).value != 1)
> return;
> 
> if (!is_pointer(ret_value))
> return;
> 
> name = expr_to_str(ret_value);
> sm_msg("'%s' return pointer cast to bool", name);
> free_string(name);
> }

I found a bug in Smatch with this check.  With the Smatch bug fixed then
there is only only place which does a legitimate implied pointer to bool
cast.  A different function returns NULL on error instead of false.

drivers/gpu/drm/i915/display/intel_dmc.c:249 intel_dmc_has_payload() 
'i915->dmc.dmc_info[0]->payload' return pointer cast to bool
drivers/net/wireless/rndis_wlan.c:1980 rndis_bss_info_update() '(0)' return 
pointer cast to bool
drivers/net/wireless/rndis_wlan.c:1989 rndis_bss_info_update() '(0)' return 
pointer cast to bool
drivers/net/wireless/rndis_wlan.c:1995 rndis_bss_info_update() '(0)' return 
pointer cast to bool

regards,
dan carpenter

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 4.9 01/24] drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes

2022-05-30 Thread Sasha Levin
From: Liu Zixian 

[ Upstream commit 194d250cdc4a40ccbd179afd522a9e9846957402 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in 
virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian 
Link: 
http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixi...@huawei.com
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c 
b/drivers/gpu/drm/virtio/virtgpu_display.c
index 58048709c34e..1e528f13959d 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -184,6 +184,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector 
*connector)
DRM_DEBUG("add mode: %dx%d\n", width, height);
mode = drm_cvt_mode(connector->dev, width, height, 60,
false, false, false);
+   if (!mode)
+   return count;
mode->type |= DRM_MODE_TYPE_PREFERRED;
drm_mode_probed_add(connector, mode);
count++;
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 4.14 01/29] drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes

2022-05-30 Thread Sasha Levin
From: Liu Zixian 

[ Upstream commit 194d250cdc4a40ccbd179afd522a9e9846957402 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in 
virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian 
Link: 
http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixi...@huawei.com
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c 
b/drivers/gpu/drm/virtio/virtgpu_display.c
index b6d52055a11f..3a5f73bc2a37 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -187,6 +187,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector 
*connector)
DRM_DEBUG("add mode: %dx%d\n", width, height);
mode = drm_cvt_mode(connector->dev, width, height, 60,
false, false, false);
+   if (!mode)
+   return count;
mode->type |= DRM_MODE_TYPE_PREFERRED;
drm_mode_probed_add(connector, mode);
count++;
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 4.19 01/38] drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes

2022-05-30 Thread Sasha Levin
From: Liu Zixian 

[ Upstream commit 194d250cdc4a40ccbd179afd522a9e9846957402 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in 
virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian 
Link: 
http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixi...@huawei.com
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c 
b/drivers/gpu/drm/virtio/virtgpu_display.c
index 25503b933599..7511f416eb67 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -180,6 +180,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector 
*connector)
DRM_DEBUG("add mode: %dx%d\n", width, height);
mode = drm_cvt_mode(connector->dev, width, height, 60,
false, false, false);
+   if (!mode)
+   return count;
mode->type |= DRM_MODE_TYPE_PREFERRED;
drm_mode_probed_add(connector, mode);
count++;
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 5.4 01/55] drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes

2022-05-30 Thread Sasha Levin
From: Liu Zixian 

[ Upstream commit 194d250cdc4a40ccbd179afd522a9e9846957402 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in 
virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian 
Link: 
http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixi...@huawei.com
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c 
b/drivers/gpu/drm/virtio/virtgpu_display.c
index e622485ae826..7e34307eb075 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -174,6 +174,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector 
*connector)
DRM_DEBUG("add mode: %dx%d\n", width, height);
mode = drm_cvt_mode(connector->dev, width, height, 60,
false, false, false);
+   if (!mode)
+   return count;
mode->type |= DRM_MODE_TYPE_PREFERRED;
drm_mode_probed_add(connector, mode);
count++;
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 5.10 02/76] drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes

2022-05-30 Thread Sasha Levin
From: Liu Zixian 

[ Upstream commit 194d250cdc4a40ccbd179afd522a9e9846957402 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in 
virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian 
Link: 
http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixi...@huawei.com
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c 
b/drivers/gpu/drm/virtio/virtgpu_display.c
index f84b7e61311b..9b2b99e85342 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -177,6 +177,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector 
*connector)
DRM_DEBUG("add mode: %dx%d\n", width, height);
mode = drm_cvt_mode(connector->dev, width, height, 60,
false, false, false);
+   if (!mode)
+   return count;
mode->type |= DRM_MODE_TYPE_PREFERRED;
drm_mode_probed_add(connector, mode);
count++;
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v6 14/22] dma-buf: Introduce new locking convention

2022-05-30 Thread Christian König via Virtualization

Hi Dmitry,

Am 30.05.22 um 15:26 schrieb Dmitry Osipenko:

Hello Christian,

On 5/30/22 09:50, Christian König wrote:

Hi Dmitry,

First of all please separate out this patch from the rest of the series,
since this is a complex separate structural change.

I assume all the patches will go via the DRM tree in the end since the
rest of the DRM patches in this series depend on this dma-buf change.
But I see that separation may ease reviewing of the dma-buf changes, so
let's try it.


That sounds like you are underestimating a bit how much trouble this 
will be.



I have tried this before and failed because catching all the locks in
the right code paths are very tricky. So expect some fallout from this
and make sure the kernel test robot and CI systems are clean.

Sure, I'll fix up all the reported things in the next iteration.

BTW, have you ever posted yours version of the patch? Will be great if
we could compare the changed code paths.


No, I never even finished creating it after realizing how much work it 
would be.



This patch introduces new locking convention for dma-buf users. From now
on all dma-buf importers are responsible for holding dma-buf reservation
lock around operations performed over dma-bufs.

This patch implements the new dma-buf locking convention by:

    1. Making dma-buf API functions to take the reservation lock.

    2. Adding new locked variants of the dma-buf API functions for drivers
   that need to manage imported dma-bufs under the held lock.

Instead of adding new locked variants please mark all variants which
expect to be called without a lock with an _unlocked postfix.

This should make it easier to remove those in a follow up patch set and
then fully move the locking into the importer.

Do we really want to move all the locks to the importers? Seems the
majority of drivers should be happy with the dma-buf helpers handling
the locking for them.


Yes, I clearly think so.




    3. Converting all drivers to the new locking scheme.

I have strong doubts that you got all of them. At least radeon and
nouveau should grab the reservation lock in their ->attach callbacks
somehow.

Radeon and Nouveau use gem_prime_import_sg_table() and they take resv
lock already, seems they should be okay (?)


You are looking at the wrong side. You need to fix the export code path, 
not the import ones.


See for example attach on radeon works like this 
drm_gem_map_attach->drm_gem_pin->radeon_gem_prime_pin->radeon_bo_reserve->ttm_bo_reserve->dma_resv_lock.


Same for nouveau and probably a few other exporters as well. That will 
certainly cause a deadlock if you don't fix it.


I strongly suggest to do this step by step, first attach/detach and then 
the rest.


Regards,
Christian.



I assume all the basics should covered in this v6. At minimum Intel,
Tegra, Panfrost, Lima and Rockchip drivers should be good. If I missed
something, then please let me know and I'll correct it.


Signed-off-by: Dmitry Osipenko 
---
   drivers/dma-buf/dma-buf.c | 270 +++---
   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   6 +-
   drivers/gpu/drm/drm_client.c  |   4 +-
   drivers/gpu/drm/drm_gem.c |  33 +++
   drivers/gpu/drm/drm_gem_framebuffer_helper.c  |   6 +-
   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  10 +-
   drivers/gpu/drm/qxl/qxl_object.c  |  17 +-
   drivers/gpu/drm/qxl/qxl_prime.c   |   4 +-
   .../common/videobuf2/videobuf2-dma-contig.c   |  11 +-
   .../media/common/videobuf2/videobuf2-dma-sg.c |  11 +-
   .../common/videobuf2/videobuf2-vmalloc.c  |  11 +-
   include/drm/drm_gem.h |   3 +
   include/linux/dma-buf.h   |  14 +-
   13 files changed, 241 insertions(+), 159 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 32f55640890c..64a9909ccfa2 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -552,7 +552,6 @@ struct dma_buf *dma_buf_export(const struct
dma_buf_export_info *exp_info)
   file->f_mode |= FMODE_LSEEK;
   dmabuf->file = file;
   -    mutex_init(&dmabuf->lock);

Please make removing dmabuf->lock a separate change.

Alright



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH AUTOSEL 5.15 003/109] drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes

2022-05-30 Thread Sasha Levin
From: Liu Zixian 

[ Upstream commit 194d250cdc4a40ccbd179afd522a9e9846957402 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in 
virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian 
Link: 
http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixi...@huawei.com
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c 
b/drivers/gpu/drm/virtio/virtgpu_display.c
index a6caebd4a0dd..ef1f19083cd3 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -179,6 +179,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector 
*connector)
DRM_DEBUG("add mode: %dx%d\n", width, height);
mode = drm_cvt_mode(connector->dev, width, height, 60,
false, false, false);
+   if (!mode)
+   return count;
mode->type |= DRM_MODE_TYPE_PREFERRED;
drm_mode_probed_add(connector, mode);
count++;
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 5.17 004/135] drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes

2022-05-30 Thread Sasha Levin
From: Liu Zixian 

[ Upstream commit 194d250cdc4a40ccbd179afd522a9e9846957402 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in 
virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian 
Link: 
http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixi...@huawei.com
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c 
b/drivers/gpu/drm/virtio/virtgpu_display.c
index 5b00310ac4cd..f73352e7b832 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -179,6 +179,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector 
*connector)
DRM_DEBUG("add mode: %dx%d\n", width, height);
mode = drm_cvt_mode(connector->dev, width, height, 60,
false, false, false);
+   if (!mode)
+   return count;
mode->type |= DRM_MODE_TYPE_PREFERRED;
drm_mode_probed_add(connector, mode);
count++;
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


RE: [PATCH v8 1/1] crypto: Introduce RSA algorithm

2022-05-30 Thread Gonglei (Arei) via Virtualization



> -Original Message-
> From: zhenwei pi [mailto:pizhen...@bytedance.com]
> Sent: Friday, May 27, 2022 4:48 PM
> To: m...@redhat.com; Gonglei (Arei) 
> Cc: qemu-de...@nongnu.org; virtualization@lists.linux-foundation.org;
> helei.si...@bytedance.com; berra...@redhat.com; zhenwei pi
> 
> Subject: [PATCH v8 1/1] crypto: Introduce RSA algorithm
> 
> 
Skip...

> +static int64_t
> +virtio_crypto_create_asym_session(VirtIOCrypto *vcrypto,
> +   struct virtio_crypto_akcipher_create_session_req
> *sess_req,
> +   uint32_t queue_id, uint32_t opcode,
> +   struct iovec *iov, unsigned int out_num) {
> +VirtIODevice *vdev = VIRTIO_DEVICE(vcrypto);
> +CryptoDevBackendSessionInfo info = {0};
> +CryptoDevBackendAsymSessionInfo *asym_info;
> +int64_t session_id;
> +int queue_index;
> +uint32_t algo, keytype, keylen;
> +g_autofree uint8_t *key = NULL;
> +Error *local_err = NULL;
> +
> +algo = ldl_le_p(&sess_req->para.algo);
> +keytype = ldl_le_p(&sess_req->para.keytype);
> +keylen = ldl_le_p(&sess_req->para.keylen);
> +
> +if ((keytype != VIRTIO_CRYPTO_AKCIPHER_KEY_TYPE_PUBLIC)
> + && (keytype != VIRTIO_CRYPTO_AKCIPHER_KEY_TYPE_PRIVATE)) {
> +error_report("unsupported asym keytype: %d", keytype);
> +return -VIRTIO_CRYPTO_NOTSUPP;
> +}
> +
> +if (keylen) {
> +key = g_malloc(keylen);
> +if (iov_to_buf(iov, out_num, 0, key, keylen) != keylen) {
> +virtio_error(vdev, "virtio-crypto asym key incorrect");
> +return -EFAULT;

Memory leak.

> +}
> +iov_discard_front(&iov, &out_num, keylen);
> +}
> +
> +info.op_code = opcode;
> +asym_info = &info.u.asym_sess_info;
> +asym_info->algo = algo;
> +asym_info->keytype = keytype;
> +asym_info->keylen = keylen;
> +asym_info->key = key;
> +switch (asym_info->algo) {
> +case VIRTIO_CRYPTO_AKCIPHER_RSA:
> +asym_info->u.rsa.padding_algo =
> +ldl_le_p(&sess_req->para.u.rsa.padding_algo);
> +asym_info->u.rsa.hash_algo =
> +ldl_le_p(&sess_req->para.u.rsa.hash_algo);
> +break;
> +
> +/* TODO DSA&ECDSA handling */
> +
> +default:
> +return -VIRTIO_CRYPTO_ERR;
> +}
> +
> +queue_index = virtio_crypto_vq2q(queue_id);
> +session_id = cryptodev_backend_create_session(vcrypto->cryptodev,
> &info,
> + queue_index, &local_err);
> +if (session_id < 0) {
> +if (local_err) {
> +error_report_err(local_err);
> +}
> +return -VIRTIO_CRYPTO_ERR;
> +}
> +
> +return session_id;

Where to free the key at both normal and exceptional paths?


Regards,
-Gonglei


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH AUTOSEL 5.18 005/159] drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes

2022-05-30 Thread Sasha Levin
From: Liu Zixian 

[ Upstream commit 194d250cdc4a40ccbd179afd522a9e9846957402 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in 
virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 
4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian 
Link: 
http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixi...@huawei.com
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c 
b/drivers/gpu/drm/virtio/virtgpu_display.c
index 5b00310ac4cd..f73352e7b832 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -179,6 +179,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector 
*connector)
DRM_DEBUG("add mode: %dx%d\n", width, height);
mode = drm_cvt_mode(connector->dev, width, height, 60,
false, false, false);
+   if (!mode)
+   return count;
mode->type |= DRM_MODE_TYPE_PREFERRED;
drm_mode_probed_add(connector, mode);
count++;
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: Re: [PATCH 3/3] virtio_balloon: Introduce memory recover

2022-05-30 Thread zhenwei pi




On 5/30/22 15:48, David Hildenbrand wrote:



+
  struct virtio_balloon {
struct virtio_device *vdev;
struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq;
@@ -126,6 +133,16 @@ struct virtio_balloon {
/* Free page reporting device */
struct virtqueue *reporting_vq;
struct page_reporting_dev_info pr_dev_info;
+
+   /* Memory recover VQ - VIRTIO_BALLOON_F_RECOVER */
+   struct virtqueue *recover_vq;
+   spinlock_t recover_vq_lock;
+   struct notifier_block memory_failure_nb;
+   struct list_head corrupted_page_list;
+   struct list_head recovered_page_list;
+   spinlock_t recover_page_list_lock;
+   struct __virtio_balloon_recover in_vbr;
+   struct work_struct unpoison_memory_work;


I assume we want all that only with CONFIG_MEMORY_FAILURE.



Sorry, I missed this.


  };
  
  static const struct virtio_device_id id_table[] = {

@@ -494,6 +511,198 @@ static void update_balloon_size_func(struct work_struct 
*work)
queue_work(system_freezable_wq, work);
  }
  
+/*

+ * virtballoon_memory_failure - notified by memory failure, try to fix the
+ *  corrupted page.
+ * The memory failure notifier is designed to call back when the kernel handled
+ * successfully only, WARN_ON_ONCE on the unlikely condition to find out any
+ * error(memory error handling is a best effort, not 100% coverd).
+ */
+static int virtballoon_memory_failure(struct notifier_block *notifier,
+ unsigned long pfn, void *parm)
+{
+   struct virtio_balloon *vb = container_of(notifier, struct 
virtio_balloon,
+memory_failure_nb);
+   struct page *page;
+   struct __virtio_balloon_recover *out_vbr;
+   struct scatterlist sg;
+   unsigned long flags;
+   int err;
+
+   page = pfn_to_online_page(pfn);
+   if (WARN_ON_ONCE(!page))
+   return NOTIFY_DONE;
+
+   if (PageHuge(page))
+   return NOTIFY_DONE;
+
+   if (WARN_ON_ONCE(!PageHWPoison(page)))
+   return NOTIFY_DONE;
+
+   if (WARN_ON_ONCE(page_count(page) != 1))
+   return NOTIFY_DONE;


Relying on the page_count to be 1 for correctness is usually a bit
shaky, for example, when racing against isolate_movable_page() that
might temporarily bump upo the refcount.



The memory notifier is designed to call the chain if a page gets result 
MF_RECOVERED only:

 if (result == MF_RECOVERED)
 blocking_notifier_call_chain(&mf_notifier_list, pfn, NULL);



+
+   get_page(page); /* balloon reference */
+
+   out_vbr = kzalloc(sizeof(*out_vbr), GFP_KERNEL);


Are we always guaranteed to be able to use GFP_KERNEL out of MCE
context? (IOW, are we never atomic?)


+   if (WARN_ON_ONCE(!out_vbr))
+   return NOTIFY_BAD;
+
+   spin_lock(&vb->recover_page_list_lock);
+   balloon_page_push(&vb->corrupted_page_list, page);
+   spin_unlock(&vb->recover_page_list_lock);
+
+   out_vbr->vbr.cmd = VIRTIO_BALLOON_R_CMD_RECOVER;


This makes me wonder if we should have a more generic guest->host
request queue, similar to what e.g., virtio-mem uses, instead of adding
a separate VIRTIO_BALLOON_VQ_RECOVER vq.



I'm OK with either one, I'll follow your decision! :D


+   set_page_pfns(vb, out_vbr->pfns, page);
+   sg_init_one(&sg, out_vbr, sizeof(*out_vbr));
+
+   spin_lock_irqsave(&vb->recover_vq_lock, flags);
+   err = virtqueue_add_outbuf(vb->recover_vq, &sg, 1, out_vbr, GFP_KERNEL);
+   if (unlikely(err)) {
+   spin_unlock_irqrestore(&vb->recover_vq_lock, flags);
+   return NOTIFY_DONE;
+   }
+   virtqueue_kick(vb->recover_vq);
+   spin_unlock_irqrestore(&vb->recover_vq_lock, flags);
+
+   return NOTIFY_OK;
+}
+
+static int recover_vq_get_response(struct virtio_balloon *vb)
+{
+   struct __virtio_balloon_recover *in_vbr;
+   struct scatterlist sg;
+   unsigned long flags;
+   int err;
+
+   spin_lock_irqsave(&vb->recover_vq_lock, flags);
+   in_vbr = &vb->in_vbr;
+   memset(in_vbr, 0x00, sizeof(*in_vbr));
+   sg_init_one(&sg, in_vbr, sizeof(*in_vbr));
+   err = virtqueue_add_inbuf(vb->recover_vq, &sg, 1, in_vbr, GFP_KERNEL);
+   if (unlikely(err)) {
+   spin_unlock_irqrestore(&vb->recover_vq_lock, flags);
+   return err;
+   }
+
+   virtqueue_kick(vb->recover_vq);
+   spin_unlock_irqrestore(&vb->recover_vq_lock, flags);
+
+   return 0;
+}
+
+static void recover_vq_handle_response(struct virtio_balloon *vb, unsigned int 
len)
+{
+   struct __virtio_balloon_recover *in_vbr;
+   struct virtio_balloon_recover *vbr;
+   struct page *page;
+   unsigned int pfns;
+   u32 pfn0, pfn1;
+   __u8 status;
+
+   /* the response is not expected */
+   if (unlikely(len != sizeof(struct __virtio_balloon_recover

Re: Re: [PATCH 0/3] recover hardware corrupted page by virtio balloon

2022-05-30 Thread zhenwei pi




On 5/30/22 15:41, David Hildenbrand wrote:

On 27.05.22 08:32, zhenwei pi wrote:

On 5/27/22 02:37, Peter Xu wrote:

On Wed, May 25, 2022 at 01:16:34PM -0700, Jue Wang wrote:

The hypervisor _must_ emulate poisons identified in guest physical
address space (could be transported from the source VM), this is to
prevent silent data corruption in the guest. With a paravirtual
approach like this patch series, the hypervisor can clear some of the
poisoned HVAs knowing for certain that the guest OS has isolated the
poisoned page. I wonder how much value it provides to the guest if the
guest and workload are _not_ in a pressing need for the extra KB/MB
worth of memory.


I'm curious the same on how unpoisoning could help here.  The reasoning
behind would be great material to be mentioned in the next cover letter.

Shouldn't we consider migrating serious workloads off the host already
where there's a sign of more severe hardware issues, instead?

Thanks,



I'm maintaining 1000,000+ virtual machines, from my experience:
UE is quite unusual and occurs randomly, and I did not hit UE storm case
in the past years. The memory also has no obvious performance drop after
hitting UE.

I hit several CE storm case, the performance memory drops a lot. But I
can't find obvious relationship between UE and CE.

So from the point of my view, to fix the corrupted page for VM seems
good enough. And yes, unpoisoning several pages does not help
significantly, but it is still a chance to make the virtualization better.



I'm curious why we should care about resurrecting a handful of poisoned
pages in a VM. The cover letter doesn't touch on that.

IOW, I'm missing the motivation why we should add additional
code+complexity to unpoison pages at all.

If we're talking about individual 4k pages, it's certainly sub-optimal,
but does it matter in practice? I could understand if we're losing
megabytes of memory. But then, I assume the workload might be seriously
harmed either way already?



Yes, resurrecting a handful of poisoned pages does not help 
significantly. And, in some ways, it seems nice to have. :D


A VM uses RAM of 2M huge page. Once a MCE(@HVAy in [HVAx,HVAz)) occurs, 
the 2M([HVAx,HVAz)) of hypervisor becomes unaccessible, but the guest 
poisons 4K (@GPAy in [GPAx, GPAz)) only, it may hit another 511 MCE 
([GPAx, GPAz) except GPAy). This is the worse case, so I want to add
 '__le32 corrupted_pages' in struct virtio_balloon_config, it is used 
in the next step: reporting 512 * 4K 'corrupted_pages' to the guest, the 
guest has a chance to isolate the other 511 pages ahead of time. And the 
guest actually loses 2M, fixing 512*4K seems to help significantly.




I assume when talking about "the performance memory drops a lot", you
imply that this patch set can mitigate that performance drop?

But why do you see a performance drop? Because we might lose some
possible THP candidates (in the host or the guest) and you want to plug
does holes? I assume you'll see a performance drop simply because
poisoning memory is expensive, including migrating pages around on CE.

If you have some numbers to share, especially before/after this change,
that would be great.



The CE storm leads 2 problems I have even seen:
1, the memory bandwidth slows down to 10%~20%, and the cycles per 
instruction of CPU increases a lot.
2, the THR (/proc/interrupts) interrupts frequently, the CPU has to use 
a lot time to handle IRQ.


But no corrupted page occurs. Migrating VM to another healthy host seems 
a good choice. This patch does not handle CE storm case.


--
zhenwei pi
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 3/3] virtio_balloon: Introduce memory recover

2022-05-30 Thread David Hildenbrand
On 25.05.22 01:32, zhenwei pi wrote:
> 
> 
> On 5/25/22 03:35, Sean Christopherson wrote:
>> On Fri, May 20, 2022, zhenwei pi wrote:
>>> @@ -59,6 +60,12 @@ enum virtio_balloon_config_read {
>>> VIRTIO_BALLOON_CONFIG_READ_CMD_ID = 0,
>>>   };
>>>   
>>> +/* the request body to commucate with host side */
>>> +struct __virtio_balloon_recover {
>>> +   struct virtio_balloon_recover vbr;
>>> +   __virtio32 pfns[VIRTIO_BALLOON_PAGES_PER_PAGE];
>>
>> I assume this is copied from virtio_balloon.pfns, which also uses 
>> __virtio32, but
>> isn't that horribly broken?  PFNs are 'unsigned long', i.e. 64 bits on 
>> 64-bit kernels.
>> x86-64 at least most definitely generates 64-bit PFNs.  Unless there's magic 
>> I'm
>> missing, page_to_balloon_pfn() will truncate PFNs and feed the host bad info.
>>
> 
> Yes, I also noticed this point, I suppose the balloon device can not 
> work on a virtual machine which has physical address larger than 16T.

Yes, that's a historical artifact and we never ran into it in practice
-- because 16TB VMs are still rare, especially when paired with
virtio-balloon inflation/deflation. Most probably the guest should just
stop inflating when hitting such a big PFN. In the future, we might want
a proper sg interface instead.

-- 
Thanks,

David / dhildenb

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 3/3] virtio_balloon: Introduce memory recover

2022-05-30 Thread David Hildenbrand


> +
>  struct virtio_balloon {
>   struct virtio_device *vdev;
>   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq;
> @@ -126,6 +133,16 @@ struct virtio_balloon {
>   /* Free page reporting device */
>   struct virtqueue *reporting_vq;
>   struct page_reporting_dev_info pr_dev_info;
> +
> + /* Memory recover VQ - VIRTIO_BALLOON_F_RECOVER */
> + struct virtqueue *recover_vq;
> + spinlock_t recover_vq_lock;
> + struct notifier_block memory_failure_nb;
> + struct list_head corrupted_page_list;
> + struct list_head recovered_page_list;
> + spinlock_t recover_page_list_lock;
> + struct __virtio_balloon_recover in_vbr;
> + struct work_struct unpoison_memory_work;

I assume we want all that only with CONFIG_MEMORY_FAILURE.

>  };
>  
>  static const struct virtio_device_id id_table[] = {
> @@ -494,6 +511,198 @@ static void update_balloon_size_func(struct work_struct 
> *work)
>   queue_work(system_freezable_wq, work);
>  }
>  
> +/*
> + * virtballoon_memory_failure - notified by memory failure, try to fix the
> + *  corrupted page.
> + * The memory failure notifier is designed to call back when the kernel 
> handled
> + * successfully only, WARN_ON_ONCE on the unlikely condition to find out any
> + * error(memory error handling is a best effort, not 100% coverd).
> + */
> +static int virtballoon_memory_failure(struct notifier_block *notifier,
> +   unsigned long pfn, void *parm)
> +{
> + struct virtio_balloon *vb = container_of(notifier, struct 
> virtio_balloon,
> +  memory_failure_nb);
> + struct page *page;
> + struct __virtio_balloon_recover *out_vbr;
> + struct scatterlist sg;
> + unsigned long flags;
> + int err;
> +
> + page = pfn_to_online_page(pfn);
> + if (WARN_ON_ONCE(!page))
> + return NOTIFY_DONE;
> +
> + if (PageHuge(page))
> + return NOTIFY_DONE;
> +
> + if (WARN_ON_ONCE(!PageHWPoison(page)))
> + return NOTIFY_DONE;
> +
> + if (WARN_ON_ONCE(page_count(page) != 1))
> + return NOTIFY_DONE;

Relying on the page_count to be 1 for correctness is usually a bit
shaky, for example, when racing against isolate_movable_page() that
might temporarily bump upo the refcount.

> +
> + get_page(page); /* balloon reference */
> +
> + out_vbr = kzalloc(sizeof(*out_vbr), GFP_KERNEL);

Are we always guaranteed to be able to use GFP_KERNEL out of MCE
context? (IOW, are we never atomic?)

> + if (WARN_ON_ONCE(!out_vbr))
> + return NOTIFY_BAD;
> +
> + spin_lock(&vb->recover_page_list_lock);
> + balloon_page_push(&vb->corrupted_page_list, page);
> + spin_unlock(&vb->recover_page_list_lock);
> +
> + out_vbr->vbr.cmd = VIRTIO_BALLOON_R_CMD_RECOVER;

This makes me wonder if we should have a more generic guest->host
request queue, similar to what e.g., virtio-mem uses, instead of adding
a separate VIRTIO_BALLOON_VQ_RECOVER vq.

> + set_page_pfns(vb, out_vbr->pfns, page);
> + sg_init_one(&sg, out_vbr, sizeof(*out_vbr));
> +
> + spin_lock_irqsave(&vb->recover_vq_lock, flags);
> + err = virtqueue_add_outbuf(vb->recover_vq, &sg, 1, out_vbr, GFP_KERNEL);
> + if (unlikely(err)) {
> + spin_unlock_irqrestore(&vb->recover_vq_lock, flags);
> + return NOTIFY_DONE;
> + }
> + virtqueue_kick(vb->recover_vq);
> + spin_unlock_irqrestore(&vb->recover_vq_lock, flags);
> +
> + return NOTIFY_OK;
> +}
> +
> +static int recover_vq_get_response(struct virtio_balloon *vb)
> +{
> + struct __virtio_balloon_recover *in_vbr;
> + struct scatterlist sg;
> + unsigned long flags;
> + int err;
> +
> + spin_lock_irqsave(&vb->recover_vq_lock, flags);
> + in_vbr = &vb->in_vbr;
> + memset(in_vbr, 0x00, sizeof(*in_vbr));
> + sg_init_one(&sg, in_vbr, sizeof(*in_vbr));
> + err = virtqueue_add_inbuf(vb->recover_vq, &sg, 1, in_vbr, GFP_KERNEL);
> + if (unlikely(err)) {
> + spin_unlock_irqrestore(&vb->recover_vq_lock, flags);
> + return err;
> + }
> +
> + virtqueue_kick(vb->recover_vq);
> + spin_unlock_irqrestore(&vb->recover_vq_lock, flags);
> +
> + return 0;
> +}
> +
> +static void recover_vq_handle_response(struct virtio_balloon *vb, unsigned 
> int len)
> +{
> + struct __virtio_balloon_recover *in_vbr;
> + struct virtio_balloon_recover *vbr;
> + struct page *page;
> + unsigned int pfns;
> + u32 pfn0, pfn1;
> + __u8 status;
> +
> + /* the response is not expected */
> + if (unlikely(len != sizeof(struct __virtio_balloon_recover)))
> + return;
> +
> + in_vbr = &vb->in_vbr;
> + vbr = &in_vbr->vbr;
> + if (unlikely(vbr->cmd != VIRTIO_BALLOON_R_CMD_RESPONSE))
> + return;
> +
> + /* to make sure the contiguous balloon PFNs */
> + for (pfn

Re: [PATCH 0/3] recover hardware corrupted page by virtio balloon

2022-05-30 Thread David Hildenbrand
On 27.05.22 08:32, zhenwei pi wrote:
> On 5/27/22 02:37, Peter Xu wrote:
>> On Wed, May 25, 2022 at 01:16:34PM -0700, Jue Wang wrote:
>>> The hypervisor _must_ emulate poisons identified in guest physical
>>> address space (could be transported from the source VM), this is to
>>> prevent silent data corruption in the guest. With a paravirtual
>>> approach like this patch series, the hypervisor can clear some of the
>>> poisoned HVAs knowing for certain that the guest OS has isolated the
>>> poisoned page. I wonder how much value it provides to the guest if the
>>> guest and workload are _not_ in a pressing need for the extra KB/MB
>>> worth of memory.
>>
>> I'm curious the same on how unpoisoning could help here.  The reasoning
>> behind would be great material to be mentioned in the next cover letter.
>>
>> Shouldn't we consider migrating serious workloads off the host already
>> where there's a sign of more severe hardware issues, instead?
>>
>> Thanks,
>>
> 
> I'm maintaining 1000,000+ virtual machines, from my experience:
> UE is quite unusual and occurs randomly, and I did not hit UE storm case 
> in the past years. The memory also has no obvious performance drop after 
> hitting UE.
> 
> I hit several CE storm case, the performance memory drops a lot. But I 
> can't find obvious relationship between UE and CE.
> 
> So from the point of my view, to fix the corrupted page for VM seems 
> good enough. And yes, unpoisoning several pages does not help 
> significantly, but it is still a chance to make the virtualization better.
> 

I'm curious why we should care about resurrecting a handful of poisoned
pages in a VM. The cover letter doesn't touch on that.

IOW, I'm missing the motivation why we should add additional
code+complexity to unpoison pages at all.

If we're talking about individual 4k pages, it's certainly sub-optimal,
but does it matter in practice? I could understand if we're losing
megabytes of memory. But then, I assume the workload might be seriously
harmed either way already?


I assume when talking about "the performance memory drops a lot", you
imply that this patch set can mitigate that performance drop?

But why do you see a performance drop? Because we might lose some
possible THP candidates (in the host or the guest) and you want to plug
does holes? I assume you'll see a performance drop simply because
poisoning memory is expensive, including migrating pages around on CE.

If you have some numbers to share, especially before/after this change,
that would be great.

-- 
Thanks,

David / dhildenb

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_balloon: check virtqueue_add_outbuf() return value

2022-05-30 Thread David Hildenbrand
On 30.05.22 04:16, Bo Liu (刘波)-浪潮信息 wrote:
> Adding this patch can avoid unnecessary VM exits and reduce the number of VM 
> exits
> 

... in corner cases where virtqueue_add_outbuf() fails? Why do we care
about that corner case?

Looks like unnecessary code churn to me, unless I am missing something
important.

-- 
Thanks,

David / dhildenb

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization